Eval Script

Evaluating Agents with HackAtari Modifications

HackAtari makes it easy to evaluate RL agents not only on standard Atari games, but also on modified versions of those games. Comparing an agent's performance on the original game with its performance under a modification supports research on generalization, interpretability, and adaptability.

Steps to Evaluate Agents Using the Eval Script

Install the additional packages needed by the eval script

pip install torch
pip install stable_baselines3
pip install rliable

Choose Your Modification(s)
See modification_list.md for the full list, or print the modifications available for a game in code:

from hackatari.core import HackAtari
env = HackAtari('Pong')
print(env.available_modifications)

Use the eval script

# Start with the baseline performance on the unmodified game
python eval.py -g Freeway -a path_to_model

# Evaluate the same model on a modified version of the game
python eval.py -g Freeway -a path_to_model -m all_black_cars

# Save the results to a JSON file
python eval.py -g Freeway -a path_to_model -m all_black_cars -out results.json
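When several modifications need to be evaluated, the commands above can be generated in a loop. The following is a minimal sketch, not part of HackAtari: it uses only the flags shown above (-g, -a, -m, -out), and the helper name build_eval_command and the results_<name>.json file naming are assumptions made for illustration.

```python
import shlex


def build_eval_command(game, agent_path, modification=None, out_path=None):
    """Build an argument list for eval.py using only the flags shown above.

    This helper is hypothetical; it is not provided by HackAtari.
    """
    cmd = ["python", "eval.py", "-g", game, "-a", agent_path]
    if modification is not None:
        cmd += ["-m", modification]
    if out_path is not None:
        cmd += ["-out", out_path]
    return cmd


game = "Freeway"
agent = "path_to_model"
# None stands for the unmodified baseline run (no -m flag).
modifications = [None, "all_black_cars"]

commands = []
for modif in modifications:
    tag = modif or "baseline"
    # The output file naming scheme is an assumption, chosen for illustration.
    commands.append(build_eval_command(game, agent, modif, f"results_{tag}.json"))

for cmd in commands:
    # Print each command; to execute it instead, use subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building each command as a list rather than a single string avoids shell quoting problems if the model path contains spaces.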

For the full list of parameters, check the eval script (eval.py).
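Once a baseline run and a modified run have each been saved with -out, the two can be compared numerically. The sketch below is a rough illustration under stated assumptions: the layout of the JSON file is not documented here, so the key name episode_returns is hypothetical and must be replaced with whatever eval.py actually writes. The numbers in the demo are dummy values, not real results.

```python
import json
from statistics import mean


def load_returns(path, key="episode_returns"):
    """Load a list of per-episode returns from a results file.

    The key name is an assumption; inspect the file written by eval.py
    and adjust it to the real schema.
    """
    with open(path) as f:
        return json.load(f)[key]


def relative_change(baseline_returns, modified_returns):
    """Return the change in mean return relative to the baseline.

    A value of -0.5 means the modified run scored 50% lower on average.
    """
    base = mean(baseline_returns)
    if base == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(modified_returns) - base) / abs(base)


# Dummy values for illustration only, not measured results.
demo_baseline = [20.0, 22.0]
demo_modified = [10.0, 11.0]
print(f"relative change: {relative_change(demo_baseline, demo_modified):+.1%}")
```

With real files, the two lists would come from load_returns called on the baseline and modified result files. A large negative value suggests the agent depends on features that the modification changed.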