Suppose that for an MDP all the transitions are deterministic. Is there any other easy algorithm (other than value iteration/policy iteration) to calculate the optimal policy?
asked
sosha |

The only advantage of deterministic transitions is that for finite-horizon problems you can run the dynamic-programming recursion either forward in time or backward in time. Bellman's book on dynamic programming has several examples of Markov decision processes with deterministic transition functions.
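To illustrate the forward-in-time option, here is a minimal sketch of forward DP (best reward-to-arrive), which is only valid because the transitions are deterministic. The toy transition function `f`, reward `r`, horizon `T`, and start state are made up for this example:

```python
import math

# Hypothetical toy problem: states 0..3, two actions, horizon 3.
states = [0, 1, 2, 3]
actions = ["stay", "up"]
T = 3
start = 0

def f(s, a):
    # Deterministic transition: "up" moves one state up, capped at 3.
    return min(s + 1, 3) if a == "up" else s

def r(s, a):
    # Reward 1 whenever the move lands in state 3.
    return 1.0 if f(s, a) == 3 else 0.0

# G[t][s]: best total reward accumulated on any path arriving at s at time t.
G = {0: {s: -math.inf for s in states}}
G[0][start] = 0.0
for t in range(T):
    G[t + 1] = {s: -math.inf for s in states}
    for s in states:
        if G[t][s] == -math.inf:  # s not reachable at time t
            continue
        for a in actions:
            s2 = f(s, a)
            G[t + 1][s2] = max(G[t + 1][s2], G[t][s] + r(s, a))

# Read off the best reachable terminal state.
best_terminal = max(states, key=lambda s: G[T][s])
```

With stochastic transitions this forward recursion is not available, because a "state arrived at" is no longer a deterministic function of the path taken.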
answered
adityam |

Nothing fundamentally different: the deterministic case is a special case of the stochastic case. Usually we don't call a deterministic problem an MDP, even though it is technically not wrong to do so; we just call it dynamic programming (DP includes MDPs). For a finite time horizon, solve by backward induction; for an infinite time horizon, by value/policy iteration. When the transition is deterministic, the calculation is much easier, since you don't need to compute the expectation of the next-period value function. You just add the next-period value function evaluated at the deterministic successor state instead.
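Here is a minimal sketch of that backward induction on a deterministic finite-horizon problem. The toy transition function `f`, reward `r`, and horizon `T` are invented for illustration; the point is the recursion `V[t][s] = max_a (r(s,a) + V[t+1][f(s,a)])`, where the expectation collapses to a single lookup at the successor state:

```python
# Hypothetical toy problem: states 0..3, two actions, horizon 3.
states = [0, 1, 2, 3]
actions = ["stay", "up"]
T = 3

def f(s, a):
    # Deterministic transition: "up" moves one state up, capped at 3.
    return min(s + 1, 3) if a == "up" else s

def r(s, a):
    # Reward 1 whenever the move lands in state 3.
    return 1.0 if f(s, a) == 3 else 0.0

# V[t][s]: optimal value-to-go from s at time t; terminal values are zero.
V = {T: {s: 0.0 for s in states}}
policy = {}
for t in range(T - 1, -1, -1):
    V[t] = {}
    policy[t] = {}
    for s in states:
        # No expectation over next states: just add V at the unique
        # successor f(s, a).
        best_a = max(actions, key=lambda a: r(s, a) + V[t + 1][f(s, a)])
        V[t][s] = r(s, best_a) + V[t + 1][f(s, best_a)]
        policy[t][s] = best_a
```

In the stochastic case the lookup `V[t + 1][f(s, a)]` would be replaced by a sum over next states weighted by transition probabilities, which is the only change.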
answered
ksphil |