I have to solve a large number (from 2 to 10 million) of very similar, small problems (maybe 100 constraints each), where only a small number of constraints, always the same ones, change. Essentially, I want to minimize the cost of covering consumption. The rules are more or less the following:
I am planning to solve this using Spark, and I am not familiar with the performance of LP algorithms. My question is: should I try to solve a very large number of small problems, or should I combine them into one large problem, repeating the constraints between products for each customer? Can I somehow leverage the fact that the individual problems are very similar? Thanks.

asked 10 May '18, 06:22
Luis Sisamon
If only a few constraints change each time, it is probably fastest to use dual simplex on all problems other than maybe the first. This assumes a solver that can "hot start" after a change to a model, using the previous basis as a starting point. CPLEX has that ability, and I'm pretty sure most other good LP solvers do as well. I would test that theory by running a few problems, first solving them independently and then with hot starts and dual simplex, and comparing end-to-end run times. I don't see any advantage to merging multiple problems into a single larger problem. Depending on what actually changes from one problem to the next, there might be some utility in doing a clustering analysis of sorts, trying to group problems so that there are minimal differences in constraints within a cluster.

answered 10 May '18, 11:25
Paul Rubin ♦♦

Thanks for your answer. I have been considering the clustering option; I may not be able to cluster all cases, but I can probably cluster some 80-90% of them and then handle the rest individually. I will look into the hot start option; I can probably sort the problems by calculating a distance between usage profiles. I still need to figure out how best to make use of the parallel capabilities of Spark.
(10 May '18, 12:30)
Luis Sisamon
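The sorting-by-distance idea above can be sketched as a simple greedy clustering over hypothetical usage-profile vectors. This is only a minimal illustration, assuming Euclidean distance and a caller-chosen tolerance; in practice you would pick a distance and threshold that reflect which constraints actually differ between problems.

```python
import numpy as np

def cluster_by_profile(profiles, tol):
    """Greedy clustering: assign each problem to the first cluster whose
    representative profile lies within `tol` (Euclidean distance);
    otherwise open a new cluster with this profile as representative."""
    reps = []     # one representative profile per cluster
    labels = []   # cluster index assigned to each problem
    for p in profiles:
        p = np.asarray(p, dtype=float)
        for i, r in enumerate(reps):
            if np.linalg.norm(p - r) <= tol:
                labels.append(i)
                break
        else:
            reps.append(p)
            labels.append(len(reps) - 1)
    return labels

# Hypothetical usage profiles: two tight groups plus one outlier.
profiles = [[1.0, 2.0], [1.1, 2.0], [5.0, 5.0], [5.1, 4.9], [20.0, 0.0]]
print(cluster_by_profile(profiles, tol=0.5))  # → [0, 0, 1, 1, 2]
```

Problems within one cluster could then be solved sequentially with hot starts, and the clusters themselves distributed across Spark partitions.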
On a related note, can a solver 'hot start' from an infeasible solution? For example, let's say I solve an LP and get a feasible solution, but then I need to add some additional constraints to the same problem; could I hot start even though the new constraints make the last feasible solution infeasible?
(11 May '18, 10:08)
gtg489p
Yes. It is common (dare I say standard practice) in IP solvers to hot start the solution to a node problem, with added cuts, from the solution to the parent problem. Dual simplex generally works well for this. The new constraints make the previous primal solution superoptimal/infeasible, but only make the previous dual solution suboptimal (it remains dual feasible).
(11 May '18, 10:13)
Paul Rubin ♦♦
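The claim that added constraints preserve dual feasibility can be checked numerically on a toy example. The sketch below uses a hypothetical tiny LP (min c'x s.t. Ax >= b, x >= 0, whose dual is max b'y s.t. A'y <= c, y >= 0): adding a primal constraint adds a dual variable, and padding the old dual point with a zero multiplier leaves the dual constraints untouched.

```python
import numpy as np

# Hypothetical tiny LP: min c'x  s.t.  A x >= b, x >= 0.
# Its dual: max b'y  s.t.  A'y <= c, y >= 0.
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([4.0, 6.0])

y = np.array([1.0, 1.0])                   # a dual-feasible point: A'y = [3, 2] <= c
assert np.all(A.T @ y <= c + 1e-9)

# Add a new primal constraint (row), e.g. x1 + 3 x2 >= 5.
A_new = np.vstack([A, [1.0, 3.0]])
b_new = np.append(b, 5.0)

# Pad the old dual point with a zero multiplier for the new row:
y_new = np.append(y, 0.0)
assert np.all(A_new.T @ y_new <= c + 1e-9)  # still dual feasible
print("padded dual point remains feasible")
```

This is why dual simplex can resume from the old basis: it starts dual feasible and works toward restoring primal feasibility under the enlarged constraint set.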
