Commit ae2a064

Update 04_ppo_with_sb3.ipynb
1 parent 0bbf7e0 commit ae2a064

1 file changed

Lines changed: 1 addition & 1 deletion

examples/04_ppo_with_sb3.ipynb

@@ -130,7 +130,7 @@
 "\n",
 "Now take a look at information you've logged over training; did we learn?\n",
 "\n",
-"One important metric for assess the effectiveness of your policy is the average cumulative reward per episode. In our case, the **maximum** achievable return per episode is approximately between 9 and 10 (it varies per traffic scene and per agent). With the configurations above, your policy should approach this value in 150,000 steps. Here, steps (the `global_step`) represents the total number of **frames** our policy network has seen, you can think of it as the accumulated experience."
+"One important metric for assess the effectiveness of your policy is the average cumulative reward per episode. In our case, the **maximum** achievable return per episode is 1 per agent. With the configurations above, your policy should approach this value in 150,000 steps. Here, steps (the `global_step`) represents the total number of **frames** our policy network has seen, you can think of it as the accumulated experience."
 ]
 }
 ],
