In your Python file Ex4.7-A.py, line 51, I think it should read
temp[((value_A_Changed, value_B_Changed),reward)] = temp.get( ((value_A_Changed, value_B_Changed),reward), 0 )
instead of
temp[((value_A_Changed, value_B_Changed),reward)] = temp.get( (value_A_Changed, value_B_Changed), 0 )
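The second version will always return 0: the keys stored on this line have the form ((value_A_Changed, value_B_Changed), reward), so the bare key (value_A_Changed, value_B_Changed) never exists in temp and get() falls back to its default.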
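To illustrate the lookup behavior, here is a minimal sketch. The dictionary contents and the values 3, 2, 10 and 0.25 are made up for the example and are not taken from Ex4.7-A.py; only the shape of the key, ((A, B), reward), follows the line quoted above.

```python
# Hypothetical stand-in for `temp`; keys have the form ((a, b), reward).
temp = {((3, 2), 10): 0.25}

state = (3, 2)   # plays the role of (value_A_Changed, value_B_Changed)
reward = 10

# Lookup with the state alone: no such key is stored, so the default 0 comes back.
wrong = temp.get(state, 0)

# Lookup with the full (state, reward) key: the stored value comes back.
right = temp.get((state, reward), 0)

print(wrong, right)
```

With the state-only key, whatever value was accumulated earlier is ignored and the entry starts again from the default each time.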
I reran the script with this change and could not reproduce the book's answer. I am attaching the optimal policy map that I got.
