Answers to: how to calculate optimal policy for a MDP with deterministic transition http://www.or-exchange.com/questions/9868/how-to-calculate-optimal-policy-for-a-mdp-with-deterministic-transition<p>Suppose that for an MDP all the transitions are deterministic. Can there then be any easier algorithm (other than value iteration/policy iteration) to calculate the optimal policy?<br>
</p>
Fri, 26 Feb 2016 23:25:55 -0500
Answer by ksphil http://www.or-exchange.com/questions/9868/how-to-calculate-optimal-policy-for-a-mdp-with-deterministic-transition/13410
<p>Nothing different. The deterministic case is a special case of the stochastic case.</p>
<p>Usually we don't call a deterministic problem an MDP, even though it is not technically wrong to do so. We just call it dynamic programming (DP includes MDPs).</p>
<p>For a finite-time-horizon problem, use backward induction.</p>
<p>For an infinite-time-horizon problem, use value/policy iteration.</p>
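The infinite-horizon case can be sketched as plain value iteration over a deterministic transition table. This is a minimal illustration, not code from the thread: the states, actions, rewards, and discount factor below are all made up. Note that the Bellman update is just a table lookup plus an addition, with no expectation over successor states.

```python
# Toy deterministic MDP: states 0..3, state 3 is absorbing.
# next_state[s][a] and reward[s][a] are ordinary lookups --
# no expectation over successor states is needed.
gamma = 0.9
next_state = {
    0: {"right": 1, "jump": 2},
    1: {"right": 2, "jump": 3},
    2: {"right": 3, "jump": 3},
    3: {"stay": 3},
}
reward = {
    0: {"right": 0.0, "jump": -1.0},
    1: {"right": 0.0, "jump": 2.0},
    2: {"right": 1.0, "jump": 0.0},
    3: {"stay": 0.0},
}

# Value iteration: V(s) <- max_a [ r(s,a) + gamma * V(next_state(s,a)) ]
V = {s: 0.0 for s in next_state}
for _ in range(200):  # iterate to (approximate) convergence
    V = {s: max(reward[s][a] + gamma * V[next_state[s][a]]
                for a in next_state[s])
         for s in next_state}

# Greedy policy with respect to the converged values
policy = {s: max(next_state[s],
                 key=lambda a: reward[s][a] + gamma * V[next_state[s][a]])
          for s in next_state}
```

With these toy numbers the greedy policy heads to state 1 and takes the high-reward "jump" from there; the same loop works unchanged for any deterministic transition table.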
<p>When the transition is deterministic, the calculation is much easier, since you don't need to compute an expectation of the next-period value functions. You just add the next-period value function evaluated at the unique successor state.</p>
Answer by adityam (Wed, 24 Feb 2016 00:49:36 -0500) http://www.or-exchange.com/questions/9868/how-to-calculate-optimal-policy-for-a-mdp-with-deterministic-transition/13402
<p>The only advantage of deterministic transitions is that, for finite-horizon problems, you can run the value iteration either forward or backward in time. Bellman's book on dynamic programming has several examples of Markov decision processes with deterministic transition functions.</p>
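A backward-induction sketch for a small finite-horizon deterministic problem (toy dynamics and rewards, purely illustrative). Because the transitions are deterministic, the same recursion could equally be run forward in time as a cost-to-come computation, which is the flexibility mentioned above.

```python
# Finite-horizon deterministic control: maximize total reward over T stages.
# Backward pass computes the reward-to-go V[t][s]; with deterministic
# dynamics a forward pass over cost-to-come would recover the same
# optimal trajectory. All names and numbers here are made up.
T = 3
states = [0, 1]
actions = [0, 1]

def step(s, a):
    """Deterministic transition: the single successor state."""
    return (s + a) % 2

def r(s, a):
    """Stage reward; (s, a) == (0, 1) is the only high-reward move."""
    return 1.0 if (s, a) == (0, 1) else 0.2

# Backward induction: V[T] = 0, then
# V[t](s) = max_a [ r(s,a) + V[t+1](step(s,a)) ]
V = [{s: 0.0 for s in states} for _ in range(T + 1)]
pi = [{} for _ in range(T)]
for t in range(T - 1, -1, -1):
    for s in states:
        best_a = max(actions, key=lambda a: r(s, a) + V[t + 1][step(s, a)])
        pi[t][s] = best_a
        V[t][s] = r(s, best_a) + V[t + 1][step(s, best_a)]
```

Each stage is a plain lookup-and-add, exactly as in the answer above; a stochastic MDP would instead need a sum over successor states weighted by transition probabilities at every stage.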