Hello guys,

I'm trying to relate stochastic programming (e.g. using recourse models), Markov decision processes, and simulation modeling together. I know what each is separately, but I don't know when each is more suitable. I expect there is an overlap. Any pointers to books or articles that discuss these topics together?


asked 14 Jan '16, 09:24

anahana's gravatar image

accept rate: 0%

MDP (Markov decision processes) : Mainly for sequential decision making problem. in other word, you have to make decisions again and again(repeat many times, even infinite times), and each time you make a decision, you will get some new information (in most cases, you observe the realization of some random variables). The general Dynamic programming doesn't require the Markovian property, but if the state transition is not Markovian, the problem may be intractable.

Stochastic Programming with multiple stage : It also can handle sequential decision making problem. It doesn't require Markovian property at all. However, it will consider the all the possible paths in the problem, it cannot solve a problem with large numbers of decision epochs. Mostly not more than 2 stage.

Therefore, they are trying to solve the same problem in theory, but in practice, what they can solve are quite different.

Simulation models : It is universal. It can solve any problem. Many stage, Markovian or not. This is the strength of simulation. The weakness of simulation is that it doesn't give you the exact solution. And, it could be difficult to get the insight of properties or characteristics of optimal solutions.

MDP + Simulation : look for Approximate Dynamic Programming, which uses simulation techniques inside of dynamic programming/MDP

Stochastic Programming + Simulation : look for SAA (Sample Average Approximation, which uses simulation techniques inside of stochastic programming (but it doesn't focus on multiple stage problem)

I have never seen MDP+Stochastic programming, but a small size Stochastic programming can be employed inside of MDP.


answered 27 Jan '16, 12:42

ksphil's gravatar image

accept rate: 14%

Thanks Phil, much appreciated. You gave me a few pointers to follow. If I understood correctly, stochastic programming will optimize the decisions made now taking into account the "average" case(s) that might happen in later stage(s). However, a MDP will optimize the next step based on what has been observed so far. If this is the case, can I assume that a MDP is somehow greedy? Put differently, if I have a problem with two stages and solve it with a MPD model and a stochastic model, which will give a better answer? I think the stochastic model will. Thanks.

(28 Jan '16, 04:45) anahana

MDP is not a greedy algorithm. Stochastic Program and MDP both provide the exact solutions. Since MDP will observe and make a decision at every time point, it may seem greedy. But how we solve the MDP is basically the backward induction, considering all the possibilities in the future.

I used the words, 'all the possible paths'. for Stochastic programming. All the possible paths means, Stochastic Programming also can incorporate with observation of informations on each decision epoch.

(28 Jan '16, 13:37) ksphil

Thank you Phil. This helps a lot! Have a nice weekend.

(29 Jan '16, 03:37) anahana
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "Title")
  • image?![alt text](/path/img.jpg "Title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported



Asked: 14 Jan '16, 09:24

Seen: 1,603 times

Last updated: 29 Jan '16, 03:37

OR-Exchange! Your site for questions, answers, and announcements about operations research.