Consider the Bellman optimality equation for the value function. If we use an iterative algorithm that repeatedly applies the Bellman optimality operator, will it converge to the optimal value function?

I see that books talk about policy iteration algorithms, which alternate policy evaluation and policy improvement steps. These are shown to converge.
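For reference, a minimal sketch of the policy iteration scheme the books describe is below. The 2-state, 2-action MDP (transition matrices `P`, rewards `R`, discount `gamma`) is entirely made up for illustration:

```python
import numpy as np

gamma = 0.9
# Made-up MDP: P[a][s, s'] = transition probability under action a,
# R[a][s] = expected immediate reward for taking action a in state s.
P = [np.array([[0.8, 0.2], [0.1, 0.9]]),
     np.array([[0.5, 0.5], [0.3, 0.7]])]
R = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]

policy = np.zeros(2, dtype=int)  # start from an arbitrary policy
while True:
    # Policy evaluation: solve V = R_pi + gamma * P_pi V exactly as a linear system.
    P_pi = np.array([P[policy[s]][s] for s in range(2)])
    R_pi = np.array([R[policy[s]][s] for s in range(2)])
    V = np.linalg.solve(np.eye(2) - gamma * P_pi, R_pi)
    # Policy improvement: act greedily with respect to V.
    Q = np.array([R[a] + gamma * P[a] @ V for a in range(2)])  # (actions, states)
    new_policy = np.argmax(Q, axis=0)
    if np.array_equal(new_policy, policy):
        break  # greedy policy unchanged => policy is optimal
    policy = new_policy

print(policy, V)
```

Since there are finitely many deterministic policies and each improvement step is strictly better (or leaves the policy unchanged), this terminates in finitely many iterations.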

Why do the books not do the obvious thing? There must be some reason.
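The "obvious thing" can be sketched as follows: repeatedly apply the Bellman optimality backup until the values stop changing in sup-norm. The 2-state, 2-action MDP here is made up purely for illustration; since the backup is a gamma-contraction in the sup-norm, the iterates converge to the unique fixed point (this is value iteration):

```python
import numpy as np

gamma = 0.9
# Made-up MDP: P[a][s, s'] = transition probability, R[a][s] = expected reward.
P = [np.array([[0.8, 0.2], [0.1, 0.9]]),
     np.array([[0.5, 0.5], [0.3, 0.7]])]
R = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]

V = np.zeros(2)
for _ in range(10000):
    # Bellman optimality backup:
    # V(s) <- max_a [ R(s,a) + gamma * sum_{s'} P(s'|s,a) V(s') ]
    V_new = np.max([R[a] + gamma * P[a] @ V for a in range(2)], axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:  # sup-norm stopping test
        V = V_new
        break
    V = V_new

print(V)  # approximate fixed point of the Bellman optimality operator
```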

asked 18 Jun '14, 13:15

sosha








OR-Exchange! Your site for questions, answers, and announcements about operations research.