REDQ simple example

2D vehicle control learning examples with REDQ

Reinforcement learning algorithm

  • Agent
    • Soft Actor-Critic (SAC)
      • supports tuning the update-to-data (UTD) ratio G
    • Randomized Ensembled Double Q-learning (REDQ); see the target-computation sketch after this list
      • v1 : N critics and N critic optimizers
      • v2 : N critics and 1 critic optimizer
  • Etc.
    • multi-step Q learning
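
A rough sketch of the REDQ ingredient named above, in-target minimization over a random subset of the critic ensemble, is shown below. This is illustrative Python/PyTorch, not the repository's code: the `actor.sample` interface, the subset size M, and the entropy coefficient `alpha` are assumptions.

```python
# Illustrative sketch only (assumed interfaces, not the repository's code).
import random
import torch

def redq_bellman_target(target_critics, actor, reward, next_obs, done,
                        gamma=0.99, alpha=0.2, M=2):
    """REDQ target: minimum over a random M-subset of the N target critics."""
    with torch.no_grad():
        next_action, next_log_prob = actor.sample(next_obs)   # SAC-style stochastic policy
        subset = random.sample(list(target_critics), M)       # in-target minimization over M of N critics
        q_values = torch.stack([q(next_obs, next_action) for q in subset])
        min_q = q_values.min(dim=0).values                    # pessimistic ensemble estimate
        # Soft Bellman backup with the SAC entropy bonus; every critic in the
        # ensemble regresses toward this shared target (repeated per update).
        return reward + gamma * (1.0 - done) * (min_q - alpha * next_log_prob)
```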

Environment

  • Unicycle model

  • Lidar-like sensor model

  • Task : reach the goal point (the green circle) while avoiding collisions with the walls

  • Observation : scan measurement values in different directions (see the processing sketch after this list)

    • default number of scan values N : 9
    • maximum distance : 10 m
    • history window length H : H consecutive observations are concatenated into the shape (1, N*H)
    • angle range : [-120 deg, 120 deg]
    • min-max normalized to [0, 1]
    • Gaussian sensor noise with std 0.2 added
  • Action : angular velocity

    • action range : [-pi/4 rad/s, pi/4 rad/s]
  • Linear velocity

    • train : 3 m/s, constant
    • test : 1.5 m/s, constant
  • Disturbance

    • zero-mean Gaussian noise with std 0.3 on the linear velocity
    • zero-mean Gaussian noise with std 0.1 on the angular velocity
  • Reward

    • -5 if a collision happens
    • 0.5 if the forward distance measurement > 5, otherwise 0
  • screenshot of the environment (image in the repository)
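
As a concrete illustration of the observation and reward described above, the sketch below clips and min-max normalizes the N = 9 scan values, adds Gaussian sensor noise, stacks H consecutive observations into a (1, N*H) vector, and applies the stated reward rule. All names, the value H = 4, and the noise placement (after normalization) are assumptions for illustration, not the repository's code.

```python
# Illustrative sketch only (assumed names and H value, not the repository's code).
from collections import deque
import numpy as np

N_SCANS = 9          # number of scan directions
MAX_RANGE = 10.0     # maximum sensor distance [m]
NOISE_STD = 0.2      # Gaussian sensor noise (std)
H = 4                # history window length (assumed value)

history = deque(maxlen=H)

def process_scan(raw_scan):
    """Normalize one scan, add noise, and stack it with the previous H-1 scans."""
    scan = np.clip(raw_scan, 0.0, MAX_RANGE) / MAX_RANGE             # min-max normalize to [0, 1]
    scan = scan + np.random.normal(0.0, NOISE_STD, size=scan.shape)  # sensor noise (placement assumed)
    history.append(scan)
    frames = list(history)
    while len(frames) < H:          # pad with the oldest frame until H observations exist
        frames.insert(0, frames[0])
    return np.concatenate(frames).reshape(1, N_SCANS * H)

def compute_reward(collision, forward_distance):
    """Reward rule stated above: -5 on collision, +0.5 if the forward distance > 5, else 0."""
    if collision:
        return -5.0
    return 0.5 if forward_distance > 5.0 else 0.0

obs = process_scan(np.random.uniform(0.0, MAX_RANGE, size=N_SCANS))
print(obs.shape)  # (1, 36) with N_SCANS = 9 and H = 4
```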

Train the agent

  • Soft Actor-Critic
    • After training, the SAC agent's parameters are saved in the directory /maze_example/savefile/sac or /maze_example/savefile/sac_g20. The --G flag sets the update-to-data (UTD) ratio (see the sketch after the commands below).
cd REDQ_simple_example
python maze_example/train_sac_agent.py --max_train_eps 100
python maze_example/train_sac_agent.py --max_train_eps 100 --G 20
  • REDQ
    • After training, the REDQ agent's parameters are saved in the directory /maze_example/savefile/redq or /maze_example/savefile/redq_v2
cd REDQ_simple_example
python maze_example/train_redq_agent.py --max_train_eps 100 --version v2
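
The --G flag used above corresponds to the update-to-data (UTD) ratio mentioned earlier: roughly, how many critic gradient updates are performed per environment step. A schematic sketch follows; the `agent`, `buffer`, and `env` interfaces are placeholders, not the repository's API.

```python
# Schematic sketch of a UTD-ratio training step (placeholder interfaces).
def train_one_env_step(agent, buffer, env, obs, G=20, batch_size=128):
    """Take one environment step, then perform G critic updates (UTD ratio G)."""
    action = agent.act(obs)
    next_obs, reward, done, info = env.step(action)
    buffer.add(obs, action, reward, next_obs, done)
    for _ in range(G):                               # G > 1 reuses replay data more aggressively
        agent.update_critics(buffer.sample(batch_size))
    agent.update_actor(buffer.sample(batch_size))    # the actor is typically updated once per step
    return next_obs, done
```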

Results

  • Training stops early when the agent reaches the target for 10 consecutive episodes (see the sketch below).
  • In more complex environments, the performance gap between SAC-G and REDQ might widen further.
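
A minimal, self-contained sketch of the early-stopping rule above; `run_episode` is a stand-in for the repository's actual episode loop.

```python
# Illustrative sketch of the early-stopping rule (stand-in episode function).
import random

SUCCESS_STREAK = 10   # stop after this many consecutive goal-reaching episodes
MAX_TRAIN_EPS = 100

def run_episode():
    """Stand-in for one training episode; returns True if the goal was reached."""
    return random.random() < 0.7

consecutive_successes = 0
for episode in range(MAX_TRAIN_EPS):
    consecutive_successes = consecutive_successes + 1 if run_episode() else 0
    if consecutive_successes >= SUCCESS_STREAK:
        print(f"Early stop at episode {episode}: {SUCCESS_STREAK} consecutive successes")
        break
```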

comparison plot of the training results (image in the repository)

Reference
