To specify a POMDP, we need observation probabilities conditioned on state and action. I don't understand the need for the action here. The agent can't view the actual states, only the (probabilistic) tokens emitted by them. What is the role of the action here? asked 11 Jun '14, 20:29 sosha
I strongly recommend studying MDPs, and with them dynamic programming, before you jump into POMDPs. A POMDP is an MDP, and an MDP is a type of stochastic dynamic programming, so it is a special case of dynamic programming (DP). We want to find the optimal solution; the solution is the optimal decision, and a decision in DP is an action. Without actions (decisions), this problem (the POMDP) would be a statistical estimation problem. Our goal in DP (and of course in a POMDP as well) is finding the optimal rule of actions. The difference from other optimization problems is that the optimal solution in DP is a decision policy (a rule) rather than a single decision, because DP suggests a different action (decision) for each state.

==============

Actions (from www.pomdp.org): "The actions are the set of possible alternative choices you can choose to make. The main problem of solving an MDP is to find the best action to take in each particular state of the world."

answered 11 Jun '14, 21:41 ksphil

@ksphil: Please note that I am not asking about the need for "actions" in the MDP/POMDP setting. I am aware of the general MDP problem: to find the optimal policy (i.e. a mapping from states to actions, or in the POMDP case, from belief states to actions). I want to know the role of actions in specifying the "observation probabilities". We can't see the environment states, but we can see the probabilistic observations emitted by each state, so shouldn't an observation probability be specified as $p(o \mid s)$, where $o$ is the observation and $s$ is the environment state? I am asking why it is $p(o \mid s, a)$, where $a$ is the action. I hope I have made my point clear.
(12 Jun '14, 00:19)
sosha
@ksphil: This was too long for a comment, so I made it an answer. Once you have read it I will delete it.
(12 Jun '14, 00:20)
sosha
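For concreteness, the question above is asking where the action enters the observation model. The standard POMDP belief update (a textbook formula, not taken from this thread) makes that explicit:

$$
b'(s') \;=\; \frac{O(o \mid s', a)\,\sum_{s} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)}
$$

Here $T(s' \mid s, a)$ is the transition probability, $O(o \mid s', a)$ is the observation probability, and the denominator is a normalizing constant. The observation term conditions on the action $a$ just taken as well as on the resulting state $s'$; in problems where the sensor does not depend on the action, $O(o \mid s', a)$ reduces to $O(o \mid s')$.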
Oh, I misunderstood your question. So your question is why the observation probability is a function of the action. It depends on the problem. In some problems the observation probability is a function of the state only, not of the action (probably you are thinking of this case). But in other problems, the observation probability is a function of the action as well. As a result of an action we may change our state; since we are not sure where we were, the new state is also unknown. However, the action and the new observation together can give us more information than the observation alone.

Say there are 5 rooms that form a circle: 4 cold rooms and one warm room. Assume we know the location of the warm room (say it is room 4), but we don't know where we are. By sensing a cold temperature, we know we are not in the warm room, but we don't know which of the 4 cold rooms we are in. Now let's move clockwise (to the next room), and suppose we feel a cold temperature again in the new room. Then we know that we were not in room 3, and that we are now not in room 4 or room 5; we must now be in room 1, 2, or 3.

This example is an extreme case, with no uncertainty in the observation and no uncertainty in the state transition. But without using the action (the fact that we moved in a known direction), we would only know that we are not in room 4. In general, an action can give us additional information, which means the observation probability can be a function of the action as well as the state. The following paper seems a good reference; you can find more general examples there.
(12 Jun '14, 01:58)
ksphil
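The five-room example in the answer above can be sketched as an explicit belief update. This is a minimal illustration only; the 0-based room indices, function names, and the deterministic sensor and transition models are my own choices, not from the thread (the thread's "room 4" is index 3 here):

```python
# Belief update for the five-room example: rooms 0..4 arranged in a
# circle, room 3 (the thread's "room 4") is warm, the rest are cold.
N = 5
WARM = 3

def normalize(b):
    total = sum(b)
    return [p / total for p in b]

def observe(belief, obs):
    """Condition the belief on a temperature reading.
    The sensor is deterministic: p(o | s) = 1 if the room's
    temperature matches the reading, else 0."""
    return normalize([
        p if (("warm" if s == WARM else "cold") == obs) else 0.0
        for s, p in enumerate(belief)
    ])

def move_clockwise(belief):
    """Deterministic transition: room s -> room (s + 1) mod N,
    so the new belief in room s is the old belief in room s - 1."""
    return [belief[(s - 1) % N] for s in range(N)]

# Start with no idea where we are.
b = [1.0 / N] * N
b = observe(b, "cold")   # not the warm room: uniform over 4 rooms
b = move_clockwise(b)    # act: step to the next room
b = observe(b, "cold")   # cold again: rules out two more rooms
print([round(p, 3) for p in b])  # → [0.333, 0.333, 0.333, 0.0, 0.0]
```

After the second cold reading, the belief is concentrated on rooms 1, 2, and 3 (indices 0, 1, 2), exactly as argued in the answer. Without conditioning on the action, the second reading would only repeat what the first one told us: that we are not in the warm room.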
