hive.runners.multi_agent_loop module

class hive.runners.multi_agent_loop.MultiAgentRunner(environment, agents, logger, experiment_manager, train_steps, test_frequency, test_episodes, stack_size, self_play, max_steps_per_episode=27000)[source]

Bases: Runner

Runner class used to implement a multiagent training loop.

Initializes the Runner object.

Parameters

environment (BaseEnv) – Environment used in the training loop.
agents (list[Agent]) – List of agents that interact with the environment.
logger (ScheduledLogger) – Logger object used to log metrics.
experiment_manager (Experiment) – Experiment object that saves the state of the training.
train_steps (int) – How many steps to train for. If this is -1, there is no limit for the number of training steps.
test_frequency (int) – After how many training steps to run testing episodes. If this is -1, testing is not run.
test_episodes (int) – How many episodes to run testing for.
stack_size (int) – The number of frames in an observation sent to an agent.
max_steps_per_episode (int) – The maximum number of steps to run an episode for.
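
The -1 sentinels for train_steps and test_frequency can be read as in the sketch below. It illustrates only the semantics documented above. The helper names keep_training and should_test are hypothetical and not part of hive, and testing exactly on multiples of test_frequency is an assumption; the actual runner may schedule these checks differently.

```python
def keep_training(step: int, train_steps: int) -> bool:
    """Hypothetical helper: True while training should continue."""
    # A value of -1 means there is no limit on the number of training steps.
    return train_steps == -1 or step < train_steps


def should_test(step: int, test_frequency: int) -> bool:
    """Hypothetical helper: True when a testing phase is due at this step."""
    # A value of -1 disables testing entirely.
    if test_frequency == -1:
        return False
    # Assumption: testing runs on every positive multiple of test_frequency.
    return step > 0 and step % test_frequency == 0
```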

run_one_step(observation, turn, episode_metrics)[source]

Run one step of the training loop.

If it is the agent’s first turn during the episode, do not run an update step. Otherwise, run an update step based on the previous action and accumulated reward since then.

Parameters

observation – Current observation that the agent should create an action for.
turn (int) – Agent whose turn it is.
episode_metrics (Metrics) – Keeps track of metrics for current episode.
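
The turn-based update rule described for run_one_step can be sketched with toy objects. This is a minimal illustration under stated assumptions, not hive's implementation: ToyAgent, TurnState, the transition dictionary keys, and this standalone run_one_step function (with a different signature from the method) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent; not hive's Agent class."""
    updates: list = field(default_factory=list)

    def act(self, observation):
        return 0  # always pick action 0 in this toy example

    def update(self, transition):
        self.updates.append(transition)


@dataclass
class TurnState:
    """Per-agent bookkeeping kept between that agent's consecutive turns."""
    started: bool = False
    last_observation: object = None
    last_action: object = None
    accumulated_reward: float = 0.0


def run_one_step(agent, state, observation):
    # On the agent's first turn there is no previous action, so skip the update.
    if state.started:
        agent.update({
            "observation": state.last_observation,
            "action": state.last_action,
            "reward": state.accumulated_reward,
            "done": False,
        })
    action = agent.act(observation)
    # Remember this turn and reset the reward accumulator.
    state.started = True
    state.last_observation = observation
    state.last_action = action
    state.accumulated_reward = 0.0
    return action


agent, state = ToyAgent(), TurnState()
run_one_step(agent, state, "obs0")   # first turn: no update is run
state.accumulated_reward += 1.0      # reward from the agent's own move
state.accumulated_reward += 0.5      # reward received while other agents moved
run_one_step(agent, state, "obs1")   # second turn: one update with reward 1.5
```

Accumulating reward between turns matters in multi-agent settings because an agent can receive reward as a consequence of other agents' moves before it acts again.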

run_end_step(episode_metrics, done=True)[source]

Run the final step of an episode.

After an episode ends, iterate through the agents and update them with the final step in the episode.

Parameters

episode_metrics (Metrics) – Keeps track of metrics for current episode.
done (bool) – Whether this step was terminal.
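
The end-of-episode pass can be sketched as follows. This is an illustration of the idea only: RecordingAgent, the pending mapping, and the transition dictionary keys are hypothetical, and this standalone function does not mirror the signature or internals of the real method.

```python
class RecordingAgent:
    """Hypothetical agent that records the transitions it is updated with."""

    def __init__(self):
        self.updates = []

    def update(self, transition):
        self.updates.append(transition)


def run_end_step(agents, pending, done=True):
    """Give every agent that acted its final transition of the episode.

    `pending` maps agent index -> (observation, action, accumulated reward)
    for the last action that agent took.
    """
    for index, agent in enumerate(agents):
        if index not in pending:
            continue  # this agent never took a turn, so there is nothing to update
        observation, action, reward = pending[index]
        agent.update({
            "observation": observation,
            "action": action,
            "reward": reward,
            "done": done,
        })


agents = [RecordingAgent(), RecordingAgent()]
pending = {0: ("obs_a", 1, 1.0), 1: ("obs_b", 0, -1.0)}
run_end_step(agents, pending)
```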

run_episode()[source]: Run a single episode of the environment.

hive.runners.multi_agent_loop.set_up_experiment(config)[source]

Returns a MultiAgentRunner object based on the config and any command-line arguments.

Parameters

config – Configuration for the experiment.

hive.runners.multi_agent_loop.main()[source]