Reproducibility

Achieving reproducibility in deep RL is difficult. Even when the random seed is fixed, libraries such as PyTorch use some algorithms and implementations that are nondeterministic. PyTorch has several options that allow the user to turn off some aspects of this nondeterminism, but even then results are usually reproducible only if the runs are executed on the same hardware.

We provide a global seeding class Seeder that allows the user to set a global seed for all packages currently used by the framework (NumPy, PyTorch, and Python’s random package). It also sets the PyTorch options that turn off nondeterminism. When using this seeding functionality, you must set the environment variable CUBLAS_WORKSPACE_CONFIG to either ":16:8" (limits performance) or ":4096:8" (uses slightly more memory) before starting a run. See this page for more details.
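
As a rough illustration of what this kind of global seeding involves, the sketch below seeds the three packages and switches on PyTorch's deterministic mode. It is not the framework's Seeder implementation: the function name seed_everything is hypothetical, and the NumPy and PyTorch imports are guarded only so that the sketch still runs where those packages are not installed. Setting CUBLAS_WORKSPACE_CONFIG from inside Python is also an assumption; it should take effect only if it happens before CUDA is initialized, so exporting the variable in the shell before launching remains the safer route.

```python
import os
import random

# Assumption: set this before any CUDA work starts. Exporting it in the
# shell before launching the run is the more robust option.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")


def seed_everything(seed: int) -> None:
    """Seed Python, NumPy, and PyTorch (hypothetical helper, not the Seeder API)."""
    random.seed(seed)

    try:
        import numpy as np
    except ImportError:
        np = None
    if np is not None:
        np.random.seed(seed)  # seed must fit in 32 bits

    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None:
        torch.manual_seed(seed)
        # Raise an error when an op has no deterministic implementation.
        torch.use_deterministic_algorithms(True)
        # Disable cuDNN autotuning, which may pick different kernels per run.
        torch.backends.cudnn.benchmark = False


# Seeding twice with the same value yields the same random stream.
seed_everything(1234)
first = [random.random() for _ in range(3)]
seed_everything(1234)
second = [random.random() for _ in range(3)]
```

Here first and second hold identical values, because the Python generator was reset to the same state before each draw. The same holds for NumPy and PyTorch draws on the same machine, subject to the hardware caveat above.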

The Seeder class also provides a function get_new_seed() that returns a new random seed each time it is called. This is useful in multi-agent setups where you want each agent to be seeded differently.
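
One way such a function could behave is sketched below. The text above does not say how the new seeds are derived, so everything here is an assumption: the class name SeedSource is hypothetical, and drawing seeds from a private generator that is itself initialized with the global seed is just one plausible design.

```python
import random


class SeedSource:
    """Hypothetical stand-in for a seed provider; not the framework's Seeder."""

    def __init__(self, global_seed: int) -> None:
        # A private generator, so drawing seeds does not disturb the global RNG.
        self._rng = random.Random(global_seed)

    def get_new_seed(self) -> int:
        # A 32-bit range keeps the value valid for numpy.random.seed as well.
        return self._rng.randrange(2**32)


source = SeedSource(global_seed=42)
agent_seeds = [source.get_new_seed() for _ in range(3)]

# Rebuilding the source from the same global seed reproduces the sequence.
replay = SeedSource(global_seed=42)
replayed_seeds = [replay.get_new_seed() for _ in range(3)]
```

With this design, each agent receives its own seed, yet the whole set of agent seeds is still determined by the single global seed, so a run can be repeated by fixing that one value.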