Suppose all the transitions in an MDP are deterministic. Is there any other easy algorithm (besides value iteration / policy iteration) to compute the optimal policy?

asked 20 Jun '14, 11:35

sosha

The only advantage of deterministic transitions is that for finite-horizon problems you can run the value iteration recursion either forward in time or backward in time. Bellman's book on dynamic programming has several examples of Markov decision processes with deterministic transition functions.
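To make the backward-in-time recursion concrete, here is a minimal sketch of backward induction on a finite-horizon MDP with deterministic transitions. The two-state problem, its dynamics, and its rewards are entirely made up for illustration; only the algorithm itself is the point.

```python
# Backward induction for a finite-horizon MDP with deterministic
# transitions. The tiny example problem below is hypothetical.

T = 3                       # horizon length
states = ["s0", "s1"]
actions = ["a", "b"]

# Deterministic dynamics: (state, action) -> next state (hypothetical).
step = {("s0", "a"): "s1", ("s0", "b"): "s0",
        ("s1", "a"): "s0", ("s1", "b"): "s1"}

# Per-stage reward (hypothetical numbers).
reward = {("s0", "a"): 1.0, ("s0", "b"): 0.0,
          ("s1", "a"): 0.0, ("s1", "b"): 2.0}

V = {s: 0.0 for s in states}        # terminal value V_T = 0
policy = {}
for t in reversed(range(T)):        # t = T-1, ..., 0
    V_next, V = V, {}
    for s in states:
        # Deterministic case: the next-period value is read off
        # directly at the single successor state -- no expectation.
        best_a = max(actions,
                     key=lambda a: reward[s, a] + V_next[step[s, a]])
        policy[t, s] = best_a
        V[s] = reward[s, best_a] + V_next[step[s, best_a]]
```

After the loop, `V[s]` is the optimal value over the remaining horizon from state `s` at time 0, and `policy[t, s]` stores the optimal action at each stage.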


answered 24 Feb '16, 00:49

adityam

Nothing different: the deterministic case is a special case of the stochastic case.

Usually, we don't call a deterministic problem an MDP, even though it is not technically wrong to do so. We just call it dynamic programming (DP includes MDPs).

For a finite-horizon problem, use backward induction.

For an infinite-horizon problem, use value or policy iteration.

When the transition is deterministic, the calculation is much easier, since you don't need to compute an expectation of the next-period value function. You just add the next-period value at the single successor state instead.
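As a sketch of that simplification, here is discounted value iteration where the usual sum over transition probabilities collapses to one lookup. The two-state problem, discount factor, and rewards are hypothetical; the update rule is the standard Bellman update specialized to deterministic dynamics.

```python
# Value iteration with deterministic transitions: the expectation over
# next states collapses to a single next-state lookup. The example
# problem (states, rewards, gamma) is hypothetical.

gamma = 0.9
states = [0, 1]
actions = ["stay", "move"]
step = {(0, "stay"): 0, (0, "move"): 1,
        (1, "stay"): 1, (1, "move"): 0}
reward = {(0, "stay"): 0.0, (0, "move"): 1.0,
          (1, "stay"): 2.0, (1, "move"): 0.0}

V = {s: 0.0 for s in states}
for _ in range(1000):
    # Deterministic Bellman update: r(s,a) + gamma * V(next_state),
    # instead of r(s,a) + gamma * sum_s' p(s'|s,a) * V(s').
    V_new = {s: max(reward[s, a] + gamma * V[step[s, a]] for a in actions)
             for s in states}
    if max(abs(V_new[s] - V[s]) for s in states) < 1e-10:
        V = V_new
        break
    V = V_new
```

For this example the fixed point can be checked by hand: staying in state 1 forever gives 2 / (1 - 0.9) = 20, and moving from state 0 gives 1 + 0.9 * 20 = 19.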


answered 26 Feb '16, 23:25

ksphil


Seen: 1,531 times

Last updated: 26 Feb '16, 23:25

OR-Exchange! Your site for questions, answers, and announcements about operations research.