hive.runners.single_agent_loop module

class hive.runners.single_agent_loop.SingleAgentRunner(environment, agent, loggers, experiment_manager, train_steps, eval_environment=None, test_frequency=-1, test_episodes=1, stack_size=1, max_steps_per_episode=1000000000.0, seed=None)

Bases: Runner

Runner class used to implement a single-agent training loop.

Initializes the SingleAgentRunner.

Parameters
  • environment (BaseEnv) – Environment used in the training loop.

  • agent (Agent) – Agent that will interact with the environment.

  • loggers (List[ScheduledLogger]) – List of loggers used to log metrics.

  • experiment_manager (Experiment) – Experiment object that saves the state of the training.

  • train_steps (int) – How many steps to train for. This is the number of times that agent.update is called. If this is -1, there is no limit for the number of training steps.

  • eval_environment (BaseEnv) – Environment used to evaluate the agent. If None, the environment parameter (which is a function) is used to create a second environment.

  • test_frequency (int) – After how many training steps to run testing episodes. If this is -1, testing is not run.

  • test_episodes (int) – How many episodes to run during each test phase.

  • stack_size (int) – The number of frames in an observation sent to an agent.

  • max_steps_per_episode (int) – The maximum number of steps to run an episode for.

  • seed (int) – Seed used to set the global seed for libraries used by Hive and seed the Seeder.
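The interaction between train_steps, test_frequency, and test_episodes can be pictured with a small, self-contained sketch. It is not Hive's implementation: phase_schedule is a hypothetical helper, and it assumes that a test phase runs each time the training step count reaches a multiple of test_frequency.

```python
def phase_schedule(train_steps, test_frequency=-1, test_episodes=1):
    """List (training_step, test_episodes) pairs, one per test phase.

    Hypothetical helper for illustration only. Assumes a test phase runs
    whenever the training step count reaches a multiple of test_frequency.
    """
    if test_frequency == -1:
        return []  # -1 disables testing
    if train_steps == -1:
        # -1 means unlimited training, so no finite schedule can be listed.
        raise ValueError("unbounded training has no finite schedule")
    return [
        (step, test_episodes)
        for step in range(test_frequency, train_steps + 1, test_frequency)
    ]


print(phase_schedule(train_steps=10, test_frequency=4, test_episodes=2))
print(phase_schedule(train_steps=10))
```

With these toy values, test phases of two episodes each would fall after training steps 4 and 8, and the default test_frequency of -1 yields no test phases at all.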

run_one_step(environment, observation, episode_metrics, transition_info, agent_traj_state)

Run one step of the training loop.

Parameters
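  • environment (BaseEnv) – Environment in which the agent will take a step.

  • observation – Current observation that the agent should create an action for.

  • episode_metrics (Metrics) – Keeps track of metrics for current episode.

  • transition_info (TransitionInfo) – Used to keep track of the most recent transition for the agent.

  • agent_traj_state – Trajectory state object that will be passed to the agent when act and update are called. The agent returns a new trajectory state object to replace the state passed in.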
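As a rough illustration of what one step involves, the sketch below threads a trajectory state through act and update and accumulates per-episode metrics. Everything in it is a stand-in: CountingAgent, ToyEnv, one_step, and the layout of update_info are hypothetical, a plain dict replaces Metrics, and frame stacking via transition_info is left out. Only the idea that act and update receive the trajectory state and hand back a replacement comes from this page.

```python
class CountingAgent:
    """Toy agent (hypothetical): its trajectory state counts calls to act."""

    def act(self, observation, agent_traj_state):
        calls = 0 if agent_traj_state is None else agent_traj_state
        return 0, calls + 1  # fixed action, replacement trajectory state

    def update(self, update_info, agent_traj_state):
        return agent_traj_state  # a real agent would learn from update_info


class ToyEnv:
    """Toy environment (hypothetical): reward 1.0 per step, ends after 3 steps."""

    def __init__(self):
        self.t = 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3  # next observation, reward, terminated


def one_step(environment, agent, observation, episode_metrics, agent_traj_state):
    """Act, step the environment, update the agent, and record metrics."""
    action, agent_traj_state = agent.act(observation, agent_traj_state)
    next_observation, reward, terminated = environment.step(action)
    update_info = {  # assumed layout, for illustration only
        "observation": observation,
        "action": action,
        "reward": reward,
        "next_observation": next_observation,
        "terminated": terminated,
    }
    agent_traj_state = agent.update(update_info, agent_traj_state)
    episode_metrics["reward"] += reward
    episode_metrics["episode_length"] += 1
    return terminated, next_observation, agent_traj_state


metrics = {"reward": 0.0, "episode_length": 0}
terminated, obs, state = one_step(ToyEnv(), CountingAgent(), 0, metrics, None)
print(terminated, obs, state, metrics)
```

Returning the new trajectory state from each call, rather than storing it on the agent, keeps the per-episode state in the caller's hands, which is the pattern the agent_traj_state argument suggests.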

run_end_step(environment, observation, episode_metrics, transition_info, agent_traj_state)

Run the final step of an episode.

After an episode ends, set the truncated value to true.

Parameters
  • environment (BaseEnv) – Environment in which the agent will take a step.

  • observation – Current observation that the agent should create an action for.

  • episode_metrics (Metrics) – Keeps track of metrics for current episode.

  • transition_info (TransitionInfo) – Used to keep track of the most recent transition for the agent.

  • agent_traj_state – Trajectory state object that will be passed to the agent when act and update are called. The agent returns a new trajectory state object to replace the state passed in.

run_episode(environment)

Run a single episode of the environment.

Parameters

environment (BaseEnv) – Environment in which the agent will take a step in.
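The role of max_steps_per_episode in an episode loop can be sketched as follows. This is a simplified illustration, not Hive's code: run_episode_sketch, make_step_fn, and step_fn are hypothetical names, and the sketch assumes that an episode counts as truncated exactly when the step cap is reached before the environment terminates it.

```python
def run_episode_sketch(step_fn, max_steps_per_episode=1_000_000_000):
    """Call step_fn until it reports termination or the step cap is hit.

    step_fn is a hypothetical callable that returns True once the episode
    has terminated. Returns (steps_taken, terminated, truncated).
    """
    steps = 0
    terminated = False
    while not terminated and steps < max_steps_per_episode:
        terminated = step_fn()
        steps += 1
    # Assumption: if the loop stopped without termination, the cap ended it.
    truncated = not terminated
    return steps, terminated, truncated


def make_step_fn(episode_length):
    """Build a toy step function that terminates after episode_length calls."""
    count = 0

    def step_fn():
        nonlocal count
        count += 1
        return count >= episode_length

    return step_fn


print(run_episode_sketch(make_step_fn(5)))
print(run_episode_sketch(make_step_fn(5), max_steps_per_episode=3))
```

In the first call the toy episode terminates on its own after five steps; in the second the cap of three steps cuts it short, so it is reported as truncated rather than terminated.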