Pairwise value implementation (!72) · Merge requests · Eric Dignum / compassproject

Alexis Carras requested to merge pairwise-value-implementation into policy-learning Dec 28, 2022

Implemented a new kind of Q-learning module that ranks each possible move (from the current school) individually, so the state vector is different. This required a few changes in the loop and a whole new file. Along the way I also made a few bug fixes (i.e., all code in this branch should be better than that in the target branch).
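For reviewers unfamiliar with the idea, here is a rough sketch of what ranking each move individually can look like. It is not the code in this branch: the function names, the feature encoding (current school, candidate school, and their difference), and the linear value function are all assumptions chosen to keep the example short.

```python
import random

# Illustrative sketch only; names and encoding are hypothetical, not from this branch.

def pair_features(current, candidate):
    """Build the input for one (current school, candidate school) pair."""
    # Assumed encoding: both feature lists concatenated, plus their difference.
    diff = [c - s for s, c in zip(current, candidate)]
    return current + candidate + diff

def pair_value(weights, features):
    """Linear value estimate for a single pair."""
    return sum(w * x for w, x in zip(weights, features))

def rank_moves(weights, current, candidates):
    """Score every candidate separately and return indices, best first."""
    scores = [pair_value(weights, pair_features(current, c)) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

def choose_move(weights, current, candidates, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over the individually scored moves."""
    if rng.random() < epsilon:
        return rng.randrange(len(candidates))
    return rank_moves(weights, current, candidates)[0]

def td_update(weights, features, reward, next_best_value, alpha=0.1, gamma=0.9):
    """One Q-learning step on the pair that was actually chosen."""
    td_error = reward + gamma * next_best_value - pair_value(weights, features)
    return [w + alpha * td_error * x for w, x in zip(weights, features)]
```

Because every move is scored on its own pair vector, the number of candidate schools can vary between steps without changing the input size of the value function.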
