# A basic doubt on belief MDP for a POMDP

The Wikipedia page on POMDPs (http://en.wikipedia.org/wiki/Partially_observable_Markov_decision_process#Belief_MDP) describes the belief MDP. My question is about its transition function. To be an MDP, shouldn't it have a transition *probability*? That is, the transition probability function should map the set $$B \times A \times B$$ to $$[0,1]$$. From the wiki page we know how to update the belief given the previous belief state, the action, and the observation. Since the observation set is finite, the transition probability from a belief state is distributed over a finite number of successor belief states, one per observation. Wouldn't such a "transition probability" (rather than "transition function") description be more appropriate for the belief MDP? Then value iteration and policy iteration would work just as for general MDPs. I am confused about why the Wikipedia page writes it that way.

By the way, how do I write math formulas here? The usual style that works on Math Stack Exchange/Overflow is not working here.

asked 13 Jun '14, 09:28 by sosha

> From the FAQ: "Hey, how do I get that fancy math stuff? We have MathJax installed. Use LaTeX between pairs of "\$" for display math and backslash-backslash-( and backslash-backslash-) for inline math. Within LaTeX, all backslashes must be doubled." (13 Jun '14, 11:14) ksphil
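To make the construction in the question concrete, here is a small sketch (my own illustration, not from the thread) of the belief update by Bayes' theorem and the induced transition probability over beliefs: each observation $o$ yields one successor belief with probability $P(o \mid b, a)$, so the distribution is supported on finitely many beliefs. The array layout (`T[a][s][s']`, `O[a][s'][o]`) and function names are assumptions for illustration.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes update of a belief vector b after action a and observation o.

    Assumed layout: T[a][s][s'] = P(s' | s, a), O[a][s'][o] = P(o | s', a).
    Returns the normalized new belief and P(o | b, a).
    """
    # Unnormalized new belief: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) b(s)
    b_new = O[a][:, o] * (T[a].T @ b)
    p_obs = b_new.sum()                 # this normalizer is exactly P(o | b, a)
    if p_obs == 0.0:
        return None, 0.0                # observation o impossible under (b, a)
    return b_new / p_obs, p_obs

def belief_transition(b, a, T, O, n_obs):
    """Induced belief-MDP transition: finitely many successor beliefs,
    one per possible observation, with tau(b, a, b'_o) = P(o | b, a)."""
    successors = []
    for o in range(n_obs):
        b_next, p_o = belief_update(b, a, o, T, O)
        if p_o > 0.0:
            successors.append((b_next, p_o))
    return successors
```

The probabilities over the successors sum to one, so `belief_transition` is a bona fide transition probability from $B \times A$ into distributions over $B$, which is the point of the question.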

Technically, you are right. But I guess it is written that way for two reasons:

1. A transition function can be a stochastic function; "function" does not mean it is deterministic.
2. Most POMDP problems have deterministic transition functions rather than transition probabilities that change with the action.

(From here on, this is purely my personal opinion.) The key point of a POMDP is that we cannot observe our current state. Under that condition, if we also include uncertainty in the transition caused by the action, the problem becomes really hard unless the system is simple. It is not only hard to solve; the optimal policy may not be efficient. We don't know where we are, and we don't know where we are going: almost clueless. That is why most of this work is done in artificial intelligence (in CS or EE), where the transition is deterministic in most cases. There is a related question asking for examples of POMDPs: https://www.or-exchange.org/questions/9638/examples-of-pomdps-where-the-actions-impact-the-transitions-of-the-underlying-markov-chain

answered 13 Jun '14, 11:10 by ksphil

> @ksphil: I did not mean that I want a stochastic transition function. I just want to remove the observation parameter from the deterministic transition function and make it stochastic. That way it will look like the transition probability of an MDP. (13 Jun '14, 11:41) sosha

> @ksphil: Another thing: from your answer it sounds like, in a POMDP with belief vectors as states, we are *assuming* that transitions are deterministic. I think it is not an "assumption". They have to be deterministic because the uncertainty is already contained in the states. So the next belief vector can be found from the previous belief state, the action, and the observation simply by Bayes' theorem. (13 Jun '14, 12:01) sosha

> What I meant by transition was the transition of the actual state; I guess you meant the transition of the belief. And I didn't mean "we assume"; I meant that most problems do indeed have deterministic state transitions. As for the belief-vector transition: since the observation itself is stochastic, we cannot say the transitions are deterministic. Given a realization of the observation, yes, it is deterministic. I am sorry that I cannot go deeper than this, because my knowledge of POMDPs (beyond general MDPs) is limited. (13 Jun '14, 13:04) ksphil
