Reinforcement Learning Classes

End-Effector

class gym_agx.rl.end_effector.EndEffectorConstraint(end_effector_dof, compute_forces_enabled, velocity_control, compliance_control, velocity_index, compliance_index)

End-effector Constraint Types.

class Dof(value)

An enumeration.

X_TRANSLATION = (0,)
Y_TRANSLATION = (1,)
Z_TRANSLATION = (2,)
X_ROTATION = (3,)
Y_ROTATION = (4,)
Z_ROTATION = (5,)
X_COMPLIANCE = (6,)
Y_COMPLIANCE = (7,)
Z_COMPLIANCE = (8,)
LOCK = 9
property is_active
class gym_agx.rl.end_effector.EndEffector(name, controllable, observable, max_velocity=1, max_angular_velocity=1, max_acceleration=1, max_angular_acceleration=1, min_compliance=0, max_compliance=1000000.0)
action_indices = {}
add_constraint(name, end_effector_dof, compute_forces_enabled=False, velocity_control=False, compliance_control=False)

Add constraints which make up the end-effector.

Parameters
  • name (str) -- Name of the constraint. Should be consistent with name of constraint in simulation

  • end_effector_dof (EndEffectorConstraint.Dof) -- DoF of end-effector that this constraint controls

  • compute_forces_enabled (bool) -- Force and torque can be measured (should be consistent with simulation)

  • velocity_control (bool) -- Is velocity controlled

  • compliance_control (bool) -- Is compliance controlled
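A minimal sketch of configuring an end-effector with the documented API (the body name, constraint name, and limits below are illustrative, not from a shipped scene):

    from gym_agx.rl.end_effector import EndEffector, EndEffectorConstraint

    # Illustrative gripper with one velocity-controlled translational DoF.
    gripper = EndEffector(
        name='gripper',           # assumed to match a rigid body in the simulation
        controllable=True,
        observable=True,
        max_velocity=0.5,         # per-DoF velocity limit
        max_acceleration=1.0,     # per-DoF acceleration limit
    )
    gripper.add_constraint(
        name='gripper_joint_x',   # must match the constraint name in the simulation
        end_effector_dof=EndEffectorConstraint.Dof.X_TRANSLATION,
        compute_forces_enabled=True,
        velocity_control=True,
    )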

apply_control(sim, action, dt)

Apply control to simulation.

Parameters
  • sim (agxSDK.Simulation) -- AGX simulation object

  • action (np.ndarray) -- Action from Gym interface

  • dt (float) -- Action time-step, needed to compute velocity and acceleration

Returns

Applied actions
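As a sketch, assuming the gripper configured above and a running simulation, control could be applied once per environment step like this (the single-element action layout is an assumption based on the one controlled DoF):

    import numpy as np

    # sim is an agxSDK.Simulation; dt is the action time-step in seconds.
    action = np.array([0.1])  # one velocity command for the controlled DoF
    applied_actions = gripper.apply_control(sim, action, dt=0.05)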

get_velocity(sim, constraint_dof)

Get current velocity of the end-effector.


Parameters
  • sim (agxSDK.Simulation) -- AGX simulation object

  • constraint_dof (EndEffectorConstraint.Dof) -- Degree of freedom to read velocity from

Returns

End-effector velocity and boolean indicating if it is linear or angular

rescale_velocity(velocity, current_velocity, dt, linear)

Rescales velocity according to velocity and acceleration limits. Note that this is done DoF-wise only.

Parameters
  • velocity (float) -- Action from Gym interface

  • current_velocity (float) -- Current velocity of the end-effector

  • dt (float) -- Action time-step

  • linear (bool) -- Boolean to differentiate between linear and angular scaling

Returns

Rescaled velocity
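Conceptually, the DoF-wise rescaling clamps the commanded velocity so that neither the velocity limit nor the acceleration limit is exceeded within one time-step. A rough sketch of that idea (not the library's exact code; the limit arguments are assumptions):

    import numpy as np

    def rescale_velocity_sketch(velocity, current_velocity, dt,
                                max_velocity, max_acceleration):
        # Limit the change in velocity implied over one time-step...
        max_delta = max_acceleration * dt
        velocity = np.clip(velocity,
                           current_velocity - max_delta,
                           current_velocity + max_delta)
        # ...and then clamp the velocity itself.
        return np.clip(velocity, -max_velocity, max_velocity)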

rescale_compliance(compliance)

Rescales compliance between limits defined at initialization of end-effector object.

Parameters

compliance (float) -- Action from Gym interface

Returns

Rescaled compliance
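Assuming actions normalized to [-1, 1] (an assumption, since the action range is set by the Gym interface), a plausible linear mapping onto the compliance limits looks like:

    def rescale_compliance_sketch(compliance, min_compliance=0.0,
                                  max_compliance=1e6):
        # Map an action in [-1, 1] linearly onto [min_compliance, max_compliance].
        return min_compliance + (compliance + 1.0) / 2.0 * (
            max_compliance - min_compliance)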

Observation

class gym_agx.rl.observation.ObservationType(value)

Observation Types.

DLO_POSITIONS = 'dlo_positions'
DLO_ROTATIONS = 'dlo_rotations'
DLO_ANGLES = 'dlo_angles'
DLO_CURVATURE = 'dlo_curvature'
DLO_TORSION = 'dlo_torsion'
IMG_RGB = 'img_rgb'
IMG_DEPTH = 'img_depth'
EE_FORCE_TORQUE = 'ee_force_torque'
EE_POSITION = 'ee_position'
EE_ROTATION = 'ee_rotation'
EE_VELOCITY = 'ee_velocity'
EE_ANGULAR_VELOCITY = 'ee_angular_velocity'
class gym_agx.rl.observation.ObservationConfig(goals, observations=None)
get_observations(sim, rti, end_effectors, cable=None, goal_only=False)

Main function which gets observations based on the configuration. To avoid repeated reads of the same observation, goals can be obtained at the same time by taking the union of the two sets.

Parameters
  • sim (agx.Simulation) -- AGX Dynamics simulation object

  • rti (list) -- agxOSG.RenderToImage buffers to render image observations

  • end_effectors (list) -- List of EndEffector objects required to obtain observations of the end-effectors in the simulation

  • cable (agx.Cable) -- If the simulation contains an AGX Cable structure, there are special functions to obtain its state

  • goal_only (bool) -- If set to True, only goals will be retrieved

Returns

Dictionaries with observations and achieved goals, or just desired goals
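A sketch of a typical configuration, assuming DLO curvature is used as the goal observation (the choice of observations is illustrative, and the unpacking follows the documented return of observation and achieved-goal dictionaries):

    from gym_agx.rl.observation import ObservationConfig, ObservationType

    obs_config = ObservationConfig(goals=[ObservationType.DLO_CURVATURE])
    obs_config.set_dlo_positions()
    obs_config.set_ee_pose()

    # During a step; sim, rti, gripper and cable come from the environment setup.
    observations, achieved_goals = obs_config.get_observations(
        sim, rti, [gripper], cable=cable)
    desired_goals = obs_config.get_observations(
        sim, rti, [gripper], cable=cable, goal_only=True)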

set_dlo_positions()

3D coordinates of DLO segments.

set_dlo_rotations()

Quaternions of DLO segments.

set_dlo_poses()

3D coordinates and quaternions of DLO segments.

set_dlo_angles()

Inner angles of DLO segments.

set_img_rgb(image_size=None)

RGB image of scene containing DLO and end-effector(s).

Parameters

image_size (tuple) -- tuple with dimensions of image

set_img_depth(image_size=None)

Depth image of scene containing DLO and end-effector(s).

Parameters

image_size (tuple) -- tuple with dimensions of image
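Image observations can be enabled with an explicit resolution, e.g. (the resolution is illustrative):

    # Request 64x64 RGB and depth observations.
    obs_config.set_img_rgb(image_size=(64, 64))
    obs_config.set_img_depth(image_size=(64, 64))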

set_dlo_frenet_curvature()

Discrete Frenet curvature of DLO.

set_dlo_frenet_torsion()

Discrete Frenet torsion of DLO.

set_dlo_frenet_values()

Discrete Frenet curvature and torsion of DLO.

set_ee_position()

3D coordinates of end-effector(s).

set_ee_rotation()

Quaternions of end-effector(s).

set_ee_velocity()

Linear velocity of end-effector(s).

set_ee_angular_velocity()

Angular velocity of end-effector(s).

set_ee_pose()

3D coordinates and quaternions of end-effector(s).

set_ee_force_torque()

Forces and torques sensed by end-effector(s).

set_all_ee()

Pose, velocities and force-torques sensed by end-effector(s).

gym_agx.rl.observation.get_cable_segment_rotations(cable)

Get AGX Cable segments' center of mass rotations.

Parameters

cable -- AGX Cable object

Returns

NumPy array with segments' rotations

gym_agx.rl.observation.get_cable_segment_positions(cable)

Get AGX Cable segments' center of mass positions.

Parameters

cable -- AGX Cable object

Returns

NumPy array with segments' positions

gym_agx.rl.observation.get_cable_segment_positions_and_velocities(cable)

Get AGX Cable segments' center of mass positions and velocities.

Parameters

cable -- AGX Cable object

Returns

NumPy array with segments' positions and velocities
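These helpers can also be called directly, given an agx.Cable from the simulation (per-segment array layout is an assumption based on the descriptions above):

    from gym_agx.rl.observation import (
        get_cable_segment_positions,
        get_cable_segment_rotations,
    )

    positions = get_cable_segment_positions(cable)  # one 3D position per segment
    rotations = get_cable_segment_rotations(cable)  # one quaternion per segment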

gym_agx.rl.observation.get_ring_segment_positions(sim, ring_name, num_segments=None)

Get ring segments' positions.

Parameters
  • sim -- AGX Dynamics simulation object

  • ring_name -- name of ring object

  • num_segments -- number of segments making up the ring (possibly saves search time)

Returns

NumPy array with segments' positions

gym_agx.rl.observation.get_rigid_body_position(sim, key)

Get position of AGX rigid body.

Parameters
  • sim -- AGX Dynamics simulation object

  • key -- name of rigid body

Returns

NumPy array with rigid body position

gym_agx.rl.observation.get_rigid_body_rotation(sim, name)

Get rotation of AGX rigid body.

Parameters
  • sim -- AGX Dynamics simulation object

  • name -- name of rigid body

Returns

NumPy array with rigid body rotation

gym_agx.rl.observation.get_rigid_body_velocity(sim, name)

Get velocity of AGX rigid body.

Parameters
  • sim -- AGX Dynamics simulation object

  • name -- name of rigid body

Returns

NumPy array with rigid body velocity

gym_agx.rl.observation.get_rigid_body_angular_velocity(sim, name)

Get angular velocity of AGX rigid body.

Parameters
  • sim -- AGX Dynamics simulation object

  • name -- name of rigid body

Returns

NumPy array with rigid body angular velocity
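The rigid-body getters follow the same pattern; 'gripper' is assumed to be the name of a rigid body in the simulation:

    from gym_agx.rl.observation import (
        get_rigid_body_position,
        get_rigid_body_rotation,
        get_rigid_body_velocity,
        get_rigid_body_angular_velocity,
    )

    position = get_rigid_body_position(sim, 'gripper')
    rotation = get_rigid_body_rotation(sim, 'gripper')
    velocity = get_rigid_body_velocity(sim, 'gripper')
    angular_velocity = get_rigid_body_angular_velocity(sim, 'gripper')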

gym_agx.rl.observation.get_constraint_force_torque(sim, name, constraint_name)

Gets force and torque on a rigid object, computed by the constraint named 'constraint_name'.

Parameters
  • sim -- AGX Simulation object

  • name -- name of rigid body

  • constraint_name -- Name indicating which constraint contains force torque information for this object

Returns

force and torque
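For example, reading the force and torque computed by the constraint configured earlier (names are illustrative; the pair unpacking assumes the two values are returned together):

    force, torque = get_constraint_force_torque(sim, 'gripper', 'gripper_joint_x')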

Reward

class gym_agx.rl.reward.RewardType(value)

Reward Types.

SPARSE = 'sparse'
DENSE = 'dense'
class gym_agx.rl.reward.RewardConfig(reward_type, reward_range, set_done_on_success=True, **kwargs)
compute_reward(achieved_goal, desired_goal, info)

This function should return a reward computed based on the achieved_goal and desired_goal dictionaries. These may contain more than a single observation, which means the reward can weight different parts of the goal differently. The info dictionary should be populated and returned with any relevant information useful for analysing results.

Parameters
  • achieved_goal (dict) -- dictionary of observations of achieved state

  • desired_goal (dict) -- dictionary of observations of desired state

  • info (dict) -- information dictionary, which should be updated, and can be used to include more information needed for reward computations

Returns

float reward

is_success(achieved_goal, desired_goal)

This function should return a boolean based on the achieved_goal and desired_goal dictionaries.

Parameters
  • achieved_goal (dict) -- dictionary of observations from achieved state

  • desired_goal (dict) -- dictionary of observations from desired state

Returns

success

abstract reward_function(achieved_goal, desired_goal, info)

This abstract method should define how the reward is computed.

Parameters
  • achieved_goal (dict) -- dictionary of observations from achieved state

  • desired_goal (dict) -- dictionary of observations from desired state

  • info (dict) -- information dictionary, which should be updated, and can be used to include more information needed for reward computations

Returns

reward, info

abstract scale_reward(reward)

This abstract method should define how the dense reward is scaled. This function is always called, for dense rewards, after the reward_function returns a reward value.

Parameters

reward -- reward output from reward_function

Returns

scaled reward

abstract success_condition(achieved_goal, desired_goal)

This abstract method returns a boolean indicating if the desired_goal is achieved. Since the goals may be composed of several observations, different conditions can be checked at the same time.

Parameters
  • achieved_goal (dict) -- dictionary of observations from achieved state

  • desired_goal (dict) -- dictionary of observations from desired state

Returns

success boolean
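A minimal sketch of a concrete subclass, assuming DLO curvature is the goal observation and that reward_range is stored as an attribute by the base class (both assumptions; the goal key and threshold are illustrative):

    import numpy as np

    from gym_agx.rl.reward import RewardConfig, RewardType

    class CurvatureReward(RewardConfig):
        def reward_function(self, achieved_goal, desired_goal, info):
            # 'dlo_curvature' is an assumed goal key, matching
            # ObservationType.DLO_CURVATURE.
            distance = np.linalg.norm(
                np.asarray(achieved_goal['dlo_curvature'])
                - np.asarray(desired_goal['dlo_curvature']))
            info['distance'] = distance
            return -distance, info

        def scale_reward(self, reward):
            # Clip the dense reward into the configured range
            # (the reward_range attribute is an assumption).
            return float(np.clip(reward, *self.reward_range))

        def success_condition(self, achieved_goal, desired_goal):
            distance = np.linalg.norm(
                np.asarray(achieved_goal['dlo_curvature'])
                - np.asarray(desired_goal['dlo_curvature']))
            return bool(distance < 0.05)  # illustrative threshold

    reward_config = CurvatureReward(
        reward_type=RewardType.DENSE, reward_range=(-1.0, 0.0))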