* WHAT IS A POMDP?
A POMDP is a seven-tuple: (S, A, T, R, O, _||_, d).
S: set of states
A: set of actions
T: transition function...gives Pr(s'| a, s)
R: reward function...gives R(s, a)
O: set of observations
_||_: observation function...gives Pr(o | a, s')
d: discount factor
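
As an illustration of how T and _||_ work together, here is a minimal Python sketch of the standard belief update: b'(s') is proportional to Pr(o | a, s') * sum over s of Pr(s' | a, s) * b(s). The function name, the dictionary layout, and the choice to treat missing entries as probability 0 are assumptions made for this example, not part of the format or of any particular solver.

```python
def belief_update(belief, action, observation, states, T, O):
    """Return the posterior belief after taking `action` and seeing `observation`.

    belief: dict mapping state -> probability
    T: dict mapping (action, start_state, end_state) -> Pr(s' | a, s)
    O: dict mapping (action, end_state, observation) -> Pr(o | a, s')
    Missing entries are treated as probability 0 (an assumption of this sketch).
    """
    unnormalized = {}
    for s_next in states:
        # Prediction step: probability of landing in s_next under the old belief.
        predicted = sum(T.get((action, s, s_next), 0.0) * belief[s] for s in states)
        # Correction step: weight by how likely the observation is in s_next.
        unnormalized[s_next] = O.get((action, s_next, observation), 0.0) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}
```

The discount factor d and the reward function R play no role in the update itself; they only matter when values of beliefs are computed.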
* INPUT FORMAT
1. discount: %f
2. states: <space separated list of states>
3. actions: <space separated list of actions>
4. observations: <space separated list of observations>
5. T : <action> : <start-state> : <end-state> %f
6. O : <action> : <end-state> : <observation> %f
7. R : <action> : <start-state> %f
8. start: <space separated |S| values giving the initial belief for each state>. If no start belief is provided, the agent is assumed to be in any state with equal probability.
Note: the above 8 parameters can be in any order.
9. Wildcard interpretation is implemented.
10. Lines starting with # will be treated as comments.
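
To make the format concrete, here is a small example file and a minimal Python parser sketch. The two-state problem, the names EXAMPLE and parse_pomdp, and the dictionary layout are all hypothetical. The sketch does not handle wildcards, because the wildcard syntax is not specified here, and it assumes that state, action, and observation names contain no spaces or colons.

```python
EXAMPLE = """\
# two-state toy problem (hypothetical)
discount: 0.95
states: left right
actions: listen
observations: hear-left hear-right
T : listen : left : left 1.0
T : listen : right : right 1.0
O : listen : left : hear-left 0.85
O : listen : left : hear-right 0.15
O : listen : right : hear-left 0.15
O : listen : right : hear-right 0.85
R : listen : left -1.0
R : listen : right -1.0
"""


def parse_pomdp(text):
    """Parse the input format into plain Python containers (no wildcard support)."""
    model = {"discount": None, "states": [], "actions": [], "observations": [],
             "T": {}, "O": {}, "R": {}, "start": None}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [f.strip() for f in line.split(":")]
        key = fields[0]
        if key == "discount":
            model["discount"] = float(fields[1])
        elif key in ("states", "actions", "observations"):
            model[key] = fields[1].split()
        elif key == "start":
            model["start"] = [float(v) for v in fields[1].split()]
        elif key in ("T", "O"):
            # Last field holds "<name> <probability>".
            name, value = fields[3].split()
            model[key][(fields[1], fields[2], name)] = float(value)
        elif key == "R":
            # Last field holds "<start-state> <reward>".
            state, value = fields[2].split()
            model["R"][(fields[1], state)] = float(value)
        else:
            raise ValueError(f"unrecognized line: {raw!r}")
    # No start line: fall back to a uniform belief over all states.
    if model["start"] is None and model["states"]:
        n = len(model["states"])
        model["start"] = [1.0 / n] * n
    return model
```

Because each line is dispatched on its leading keyword, the parameters may appear in any order, and the uniform start belief is filled in only after the whole file has been read.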
* MEMORY REQUIREMENTS
1. O(|S||S||A| + |O||A||S|) to store the POMDP, plus run-time requirements.
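
For a rough sense of scale, the bound above can be evaluated directly. This Python sketch assumes dense tables with one entry per (action, state, state) and per (action, state, observation) combination; the function name is hypothetical and the actual storage layout may differ.

```python
def model_table_entries(num_states, num_actions, num_observations):
    """Count table entries for dense T and observation tables."""
    transition_entries = num_states * num_states * num_actions
    observation_entries = num_observations * num_actions * num_states
    return transition_entries + observation_entries
```

For example, 2 states, 3 actions, and 2 observations give 2*2*3 + 2*3*2 = 24 entries. The transition term grows quadratically in |S|, so it dominates for problems with many states.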