Hi, I am working on a bin packing problem where, rather than optimizing for the least number of needed bins, we would like to optimize for the most remaining space amongst all bins (including bins containing an item). To achieve the spreading, we came up with the idea of minimizing the sum of the squared used space in each bin; that way, the fuller a bin is, the more it is penalized. However, this makes the problem an MIQP due to the quadratic component in the objective, presumably slowing down the optimization. Does anyone have a suggestion on how to better formulate the objective to achieve the same or a similar effect?

I've created two different models using OPL and CPLEX 12.3. Using 80 bins and 4 items (as a test), the first model solves in ~9 minutes on an 8-thread machine; the second model I ran for 30 minutes, at which point it still had a gap of 47.51% and I gave up. 9 minutes isn't terrible, but I'd like to use about 320 items, and scaling is very poor (8 items took over an hour). I'm listing my two models and data below. Does anyone have a suggestion on how to tweak the models for better runtime/scaling?

Also, one thing that confused me: when I ran model 1, CPLEX told me it took 65859055 iterations to finish; however, I'd expect 40960000 (80^4) to be enough to explore every possible assignment of the decision variables. Why the disparity?

Thanks in advance!

Model 1:

{string} bins = ...;
{string} items = ...;
float itemSize[items] = ...;
float binCapacity[bins] = ...;
range r = 0..1;
dvar int X[items][bins] in r;

minimize sum(u in bins) pow(((sum(a in items) itemSize[a]*X[a][u])), 2);

subject to {
    /* Item in only one bin constraint */
    forall(i in items)
        sum(b in bins)
          X[i][b] == 1;

    /* Capacity Constraint */
    forall(b in bins)
          sum(i in items) X[i][b]*itemSize[i] <= binCapacity[b];
};

execute DISPLAY {
  writeln("Mapping:");
  for (var i in items) {
    for (var b in bins) {
      if (X[i][b] == 1) {
        writeln("Item " + i + " is in bin " + b);
      }
    }
  }
}

Data 1:

bins = {"01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79", "80"};
items = {"01", "02", "03", "04"};
itemSize = #[
  "01": 0.56583464,
  "02": 0.60144615,
  "03": 0.4545771,
  "04": 0.47858784
]#;
binCapacity = #[
  "01": 2,
  "02": 2,
  "03": 2,
  "04": 2,
  "05": 2,
  "06": 2,
  "07": 2,
  "08": 2,
  "09": 2,
  "10": 2,
  "11": 2,
  "12": 2,
  "13": 2,
  "14": 2,
  "15": 2,
  "16": 2,
  "17": 2,
  "18": 2,
  "19": 2,
  "20": 2,
  "21": 2,
  "22": 2,
  "23": 2,
  "24": 2,
  "25": 2,
  "26": 2,
  "27": 2,
  "28": 2,
  "29": 2,
  "30": 2,
  "31": 2,
  "32": 2,
  "33": 2,
  "34": 2,
  "35": 2,
  "36": 2,
  "37": 2,
  "38": 2,
  "39": 2,
  "40": 2,
  "41": 2,
  "42": 2,
  "43": 2,
  "44": 2,
  "45": 2,
  "46": 2,
  "47": 2,
  "48": 2,
  "49": 2,
  "50": 2,
  "51": 2,
  "52": 2,
  "53": 2,
  "54": 2,
  "55": 2,
  "56": 2,
  "57": 2,
  "58": 2,
  "59": 2,
  "60": 2,
  "61": 2,
  "62": 2,
  "63": 2,
  "64": 2,
  "65": 2,
  "66": 2,
  "67": 2,
  "68": 2,
  "69": 2,
  "70": 2,
  "71": 2,
  "72": 2,
  "73": 2,
  "74": 2,
  "75": 2,
  "76": 2,
  "77": 2,
  "78": 2,
  "79": 2,
  "80": 2
]#;

Model 2:

int binCount = 80;
int itemCount = 4;

float itemSize[1..itemCount] = ...;
int binCapacity[1..binCount] = ...;
dvar int X[1..itemCount] in 1..binCount;

minimize sum(i in 1..binCount) pow(sum(j in 1..itemCount) (X[j] == i) * itemSize[j], 2);

subject to {
    // bin Capacity constraint
    forall(i in 1..binCount)
      (sum(j in 1..itemCount) (X[j] == i) * itemSize[j]) <= binCapacity[i];
};

execute DISPLAY {
  writeln("Mapping:");
  for (var i = 1; i <= itemCount; ++i) {
    writeln("Item " + i + " is in bin " + X[i]);
  }
}

Data 2:

itemSize = #[
  1: 0.56583464,
  2: 0.60144615,
  3: 0.4545771,
  4: 0.47858784
]#;
binCapacity = #[
  1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 2, 8: 2, 9: 2, 10: 2,
  11: 2, 12: 2, 13: 2, 14: 2, 15: 2, 16: 2, 17: 2, 18: 2, 19: 2, 20: 2,
  21: 2, 22: 2, 23: 2, 24: 2, 25: 2, 26: 2, 27: 2, 28: 2, 29: 2, 30: 2,
  31: 2, 32: 2, 33: 2, 34: 2, 35: 2, 36: 2, 37: 2, 38: 2, 39: 2, 40: 2,
  41: 2, 42: 2, 43: 2, 44: 2, 45: 2, 46: 2, 47: 2, 48: 2, 49: 2, 50: 2,
  51: 2, 52: 2, 53: 2, 54: 2, 55: 2, 56: 2, 57: 2, 58: 2, 59: 2, 60: 2,
  61: 2, 62: 2, 63: 2, 64: 2, 65: 2, 66: 2, 67: 2, 68: 2, 69: 2, 70: 2,
  71: 2, 72: 2, 73: 2, 74: 2, 75: 2, 76: 2, 77: 2, 78: 2, 79: 2, 80: 2
]#;

asked 26 Dec '11, 18:34 by Halcyon

Given a fixed set of items and a fixed set of bins, the total unused space across all bins is a constant. Can you clarify your goal?

(27 Dec '11, 10:39) Paul Rubin ♦♦
  1. Paul is correct. The objective function of your first model is a constant, which is (total item sizes)^2. This is due to the item-to-bin allocation constraint and the binary nature of X[i][u].
  2. I solved your first model with data 1 via CPLEX 12.3 in GAMS on a single x64 thread; CPLEX performed just 7 iterations and took only 0.135 sec (the objective function value is 4.411873, which is equal to what I said in the first item). Perhaps you should double-check your code.
(27 Dec '11, 13:48) Ehsan ♦

Hi Paul and Ehsan, thanks for the feedback. I don't believe it should be a constant, or at least, if it is, then I have built the model incorrectly. The goal is to load-balance the items across the bins. I was expecting "pow(((sum(a in items) itemSize[a]*X[a][u])), 2)" to give me the total size of the items in bin u, squared. Is this not the case?

Re Ehsan's point 2: I'm wondering if GAMS is reformulating the problem differently from my OPL formulation?

Thanks!

(27 Dec '11, 17:06) Halcyon

Your objective function is not identically a constant, but total used space (and hence total free space) is. Squaring seems to be aimed at some sort of load "spreading" (echoing the title of your question), but it's still unclear what you want. Possible (distinct) goals include maximizing the minimum unused space in any bin, minimizing the spread between the most and least empty space in a bin, minimizing the squared deviation of actual unused space from "average" unused space, minimizing the spread between the greatest and least percentage unused space, ad nauseam.

(28 Dec '11, 13:13) Paul Rubin ♦♦

Hi Paul, understood. I need to balance feasibility and solution time against the size of the problem, but minimizing the spread between the greatest and least percentage unused space, or minimizing the squared deviation of actual unused space from "average" unused space, would be good to try. I'd prefer to avoid min max unless it is the only feasible option, for the reasons I put in my reply to Marco.

(28 Dec '11, 14:25) Halcyon

@Paul: Thanks so much. You're right. As a matter of fact, the special structure of the data sets @Halcyon used led me to that belief. I've updated my answer to reflect this.

(28 Dec '11, 14:36) Ehsan ♦

A. This sounds like "load balancing". I understand that you do not wish to solve a bin packing problem but a closely related machine scheduling problem. Think of your items as jobs and the bins as machines. You would like to "schedule" the jobs on machines (assign the items to bins) such that the latest finishing time (the "makespan") is minimized. This should have the desired effect. You can achieve this with a linear objective function as follows: introduce a (continuous) variable, say z. Then you need to replace the constraints

sum(i in items) .... <= binCapacity[b];

by

sum(i in items) .... <= z;

and have "minimize z" as the objective function. This is a "min max" objective function, and hopefully does the job.

B. By all means use binary variables for modeling (if you have the choice between integer and binary variables); this usually gives a stronger linear relaxation, which in turn helps the branch-and-bound algorithm. This appears to be one reason why the second model is much weaker.

C. The models have a lot of "symmetry", another reason for bad performance. That means the solver does not know that there is no essential difference between bin 1 and bin 2 ... and bin 80. You can help the solver by introducing "an artificial distinction" between bins, e.g., by introducing constraints that enforce that you open bin i+1 only if bin i is already open. This works for "standard" bin packing, but possibly not with your "balancing" objective function. Hm.
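
A sketch of what such symmetry-breaking constraints could look like, assuming identical bins; the open[] indicator is a hypothetical addition, written here with integer-indexed bins for readability (constraints only, no objective):

int binCount = 80;
int itemCount = 4;

dvar boolean X[1..itemCount][1..binCount];
dvar boolean open[1..binCount];   /* hypothetical: 1 if bin b is used */

subject to {
    /* Each item in exactly one bin */
    forall(i in 1..itemCount)
        sum(b in 1..binCount) X[i][b] == 1;
    /* Items may only go into open bins */
    forall(i in 1..itemCount, b in 1..binCount)
        X[i][b] <= open[b];
    /* Bin b+1 may only be open if bin b is already open */
    forall(b in 1..binCount-1)
        open[b+1] <= open[b];
};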

D. The number of iterations reported is most probably the number of simplex iterations, which is not the number of combinations for packing items. The solver solves the linear relaxation of your model first (and in every node of the branch-and-bound tree); this has nothing to do with the number of solutions to the integer program. Anyway, the number of iterations is extremely large for such a toy model.

E. What comes to mind as a remedy for several of your problems: an entirely different formulation as a set covering model may help. Enumerate all possibilities ("patterns") to pack a bin, and choose a subset of patterns such that every item is contained in one pattern. This will require column generation (and possibly branch-and-price) to solve the model, which is not easy (and possibly not possible) in OPL.

My answer may leave you with more questions than before; feel free to follow up on this post. One helpful piece of information would be: do you need optimal solutions, or "only" good quality?

answered 27 Dec '11, 06:47 by Marco Luebbecke ♦

Hi Marco, thanks for this! I've been playing around with your min-max objective function and it definitely got me more scalability than I previously had. The obvious downfall of min max, however, is when one large job nearly fills a machine by itself; as I understand it, that then places nearly no constraint on the filled capacity of the other machines, except that the sum on any of them needs to be less than the big job.

In regards to E, do you have a link or more details on how this would work?

An optimal solution would be best, or within some percentage of optimal.

(28 Dec '11, 13:35) Halcyon

Ok, I understand the min max drawback. What about introducing an extra variable for each bin, and minimizing their sum?

When you do not need an optimal solution, you could pre-generate "some" possible packings for one bin, say several thousand. Maybe you can use your problem knowledge to decide what "promising" patterns may look like. The objective coefficient of a pattern is the amount of wasted space. Introduce a binary variable for each pattern (= chosen for some bin or not). Constraints are: for each item i, the sum over all variables/patterns that contain i must be >= 1.
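
A minimal OPL sketch of this pattern-based master problem; the pattern data (nbPatterns, waste, contains) would come out of the pre-generation step, and those names are hypothetical:

{string} items = ...;
int nbPatterns = ...;                   /* hypothetical: number of pre-generated packings */
range patterns = 1..nbPatterns;
float waste[patterns] = ...;            /* hypothetical: unused space in pattern p */
int contains[items][patterns] = ...;    /* hypothetical: 1 if pattern p includes item i */

dvar boolean use[patterns];             /* pattern chosen for some bin, or not */

minimize sum(p in patterns) waste[p]*use[p];

subject to {
    /* Every item must appear in at least one chosen pattern */
    forall(i in items)
        sum(p in patterns) contains[i][p]*use[p] >= 1;
    /* With a limited number of bins you would also add, e.g.:
       sum(p in patterns) use[p] <= 80; */
};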

(28 Dec '11, 17:19) Marco Luebbecke ♦

You can see here http://www.springerlink.com/content/ubawrxrkf6nqeulj/ how it works for the bin packing problem when you want an optimal solution, i.e., when you need column generation to solve the linear relaxation.

(needed to add a 2nd comment due to char count constraint)

(28 Dec '11, 17:19) Marco Luebbecke ♦

"What about introducing an extra variable for each bin, and minimizing their sum?" I assume you mean a variable for the sum of the load in each bin, then minimizing the sum of those? In this case the sum is constant no matter the packing.. unless I misunderstood?

(29 Dec '11, 00:11) Halcyon

You are right, that won't work. Last try (not exact): introduce a variable for "each" load level (in some discretization, say z_5%, z_10%, z_15%, etc.), and use for each bin b and "load percentage" p

sum(i in items) .... <= p * binCapacity[b] + z_p;

This penalizes the load in each bin above a certain level, and could even be weighted in the objective function to penalize "larger percentages more".
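
Reading z_p as one nonnegative excess variable per bin and level (my reading; the comment above writes z_p without a bin index), a minimal OPL sketch on top of Model 1:

{int} levels = {25, 50, 75};       /* hypothetical thresholds, in percent of capacity */
dvar float+ excess[bins][levels];  /* load above each threshold, per bin */

/* Weight higher thresholds more, so fuller bins are penalized progressively */
minimize sum(b in bins, p in levels) p * excess[b][p];

subject to {
    forall(b in bins, p in levels)
        sum(i in items) itemSize[i]*X[i][b] <= (p/100.0)*binCapacity[b] + excess[b][p];
    /* ... plus the assignment and capacity constraints from Model 1 */
};

Because a heavily loaded bin pays excess at every threshold it crosses, this approximates the convex (squared) penalty with purely linear constraints.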

(29 Dec '11, 03:32) Marco Luebbecke ♦

Hello,

I did my thesis on using constraint programming to solve this kind of problem (here). You might have a look at this article: Pierre Schaus, Pascal Van Hentenryck, Jean-Charles Régin: Scalable Load Balancing in Nurse to Patient Assignment Problems. CPAIOR 2009: 248-262.

answered 04 Jan '12, 06:44 by pierre schaus

Letting load[i] be the actual load in the i-th bin, something like

maximize d1 + d2
s.t. load[i] <= binCapacity[i] - d1*binCapacity[i] for all i
     load[i] >= d2*binCapacity[i]                  for all i
     0 <= d1, d2 <= 1
     ...

shrinks the range of bin loadings (in percentage terms).
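
In OPL terms, a minimal sketch of this idea on top of Model 1 (load, d1, and d2 are new names; X, itemSize, and binCapacity are as in the question):

dvar float d1 in 0..1;
dvar float d2 in 0..1;
dvar float+ load[bins];

maximize d1 + d2;

subject to {
    forall(b in bins)
        load[b] == sum(i in items) itemSize[i]*X[i][b];
    forall(b in bins) {
        load[b] <= binCapacity[b] - d1*binCapacity[b];  /* max load fraction <= 1 - d1 */
        load[b] >= d2*binCapacity[b];                   /* min load fraction >= d2 */
    }
    /* ... plus the assignment constraint on X from Model 1 */
};

One caveat: if some bins stay empty (as with 80 bins and 4 items), the empty bins pin d2 to 0, so the lower bound only bites when every bin is expected to carry load.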

answered 04 Jan '12, 07:41 by Paul Rubin ♦♦

I'm sorry: I read the objective function of your model #1 incorrectly last night, and therefore I implemented your first model in GAMS incorrectly. I have to correct my previous statements as follows (apparently, I was way too sleepy while answering your question):

a) If the number of items is less than the number of bins, as in your examples, the optimal objective value is a constant determined by your inputs, equal to sum(i, (itemSize(i))^2). There are obviously multiple optimal solutions to this problem: allocate the items to distinct bins. However, if the number of bins is less than the number of items, then finding the optimal solution and the optimal objective value is not trivial.

b) Since I implemented your model incorrectly, the solution time I reported was wrong and your reported solution time was correct. However, I found an insight into why CPLEX might be very slow to solve the model: in your data sets, the number of bins is larger than the number of items. Apparently, as there are multiple optimal solutions (n!, where n is the number of items), CPLEX has a really hard time closing the gap, as it has to explore a large number of nodes with equal objective function values in the branch & bound tree.

c) I developed another mathematical model in GAMS for your problem. The model minimizes the squared unused capacity of the bins (please note the difference in the objective function). For this, I had to add a binary variable denoting the selection of bins. The model is as follows:

Sets        i items / i1 * i4 /
            j bins  / b1 * b80/;

Parameter Size(i) Size of i_th item/
i1        0.56583464
i2        0.60144615
i3        0.4545771
i4        0.47858784/;

Parameter Cap(j) Capacity of j_th bin;

Cap(j) = 2;

variable Z Objective function value;
integer variable X(i,j) Amount of item i allocated to bin j;
binary variable Y(j) Selecting bin j;

Equations
objfun
AllocationConstraint
CapacityConstraint
AllocationSelection;

objfun..                            Z =e= sum(j, Power((Cap(j) * Y(j) - sum(i, Size(i) * X(i,j))),2));
AllocationConstraint(i)..           sum(j, X(i,j)) =e= 1;
CapacityConstraint(j)..             sum(i, Size(i) * X(i,j)) =l= Cap(j);
AllocationSelection(i,j)..          X(i,j) =l= Y(j);

model binpacking /all/;
solve binpacking using miqcp minimizing Z;

This model can be solved effectively for your data sets (in about 10 seconds on four x64 threads). Another positive side of this model is that it tends to keep the number of used bins small, even if the number of available bins is larger than the number of items. Feel free to ask questions if you don't understand the logic behind my model.

P.S. GAMS does not reformulate the models; it just acts as a medium for transforming your model into what can be understood by the solvers. However, there are some solvers that can do this.

answered 28 Dec '11, 11:25 by Ehsan ♦, edited 28 Dec '11, 14:34

Hi Ehsan, thanks for the suggestions. One problem is that I actually want to maximize the sum of the (remaining space per bin)^2, and when I do that, it causes the decision variable Y to be selected for every bin, since the inequality is just <=. If you run your model on the data, you'll find it packs everything into one bin, which is the opposite of the goal.

Thanks!

(28 Dec '11, 14:11) Halcyon

@Halcyon: Are you sure about this? I've run the model multiple times and it gives me a solution with an objective function value of 1.804221 and just two bins used (items 1 and 4 in one bin, items 2 and 3 in the other). As a matter of fact, a solution with only one used bin should be infeasible, as the total size of the items is about 2.1, which is larger than the capacity of the bins.

(28 Dec '11, 14:43) Ehsan ♦

@Ehsan Ah right, I had a typo in a constraint; I'm now seeing 1/4 in one bin and 2/3 in another as well. However, ideally these should all be in separate bins. In the larger data sets I have, # items >> # bins, so the problem is more interesting.

(28 Dec '11, 14:50) Halcyon