# dividing into roughly equal sized groups, with a sorted list

 4 I have a problem, and it seems like it should be something that someone has studied before. I have a sorted list of N elements, and I want to divide them into K groups, by choosing K-1 split points between them. There may be elements with the same value, and we want to have items with same value in the same group. Find K groups as close in size to round(N/K) as possible. For example, divide these 32 elements in to 4 groups of size 8: 1 1 1 1 2 2 3 3 3 3 3 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 7 8 9 10 10  One solution would be these 3 break points: 1 1 1 1 2 2 | 3 3 3 3 3 3 3 3 | 4 4 4 4 5 5 5 5 5 | 6 6 6 6 7 8 9 10 10 1 1 1 1 2 2 = 6 elements, error = abs(8-6)=2 3 3 3 3 3 3 3 3 = 8 elements, error = abs(8-8)=0 4 4 4 4 5 5 5 5 5 = 9 elements, error = abs(8-9)=1 6 6 6 6 7 8 9 10 10 = 9 elements, error = abs(8-9)=1  total error = 4 Does this look familiar to anyone? I'd like an approximation algorithm if possible. Thanks, Craig Schmidt asked 06 Jan '12, 15:40 Craig 151●1●3 accept rate: 0% fbahr ♦ 4.5k●5●15

 2 I think a dynamic program (which would run in polynomial time). For each break position x, define the value v(x,k) = best objective if the k'th break is in position x. Then you calculate v(x,k) by running through all x'
 2 For what it's worth I implemented a simple MiniZinc (CP) model for this problem: equal sized groups.mzn . It assumes that k is fixed (searching for a variable k is quite harder to do in MiniZinc). The main constraints are the following where a is the array of values, k is the number of buckets. x is an array of the break points (decision variable), the array s is the number of elements in each bucket. % Set the break points (x) and sum the number of elements in each bucket (s) constraint s[1] = x[1] / forall(i in 2..k-1) ( s[i] = x[i]-x[i-1] ) / s[k] = n-x[k-1] ;  The following constraint ensure that we don't have a break point between elements with the same values: forall(j in 2..n) ( a[j-1] = a[j] -> not(exists(p in 1..k-1) ( x[p] = j-1) ) % alt. version % a[j-1] = a[j] -> forall(p in 1..k-1) (x[p] != j-1 ) )  The objective is to minimize z which is defined like this: int: gsize = round(int2float(n) / int2float(k)); % group size var 0..n: z = sum(i in 1..k) ( abs(s[i]-gsize) );  Using some random data, it's quite fast for arrays of size < 100, and some solvers (such as Chuffed, a lazy SAT solver) have no problems with arrays of size 1000, at least when k is not too large (say < 20 or so); however Chuffed have to work a lot for arrays of size 10000. (I don't know if these sizes are in the range of the application.) I have assumed that the "grouping" constraint is hard in the requirements, i.e. that element with the same values must be in the same bucket. However, when k > (number of distinct values in the array) this is not possible. The model check for this via the variable nval, and in those cases it just skip this constraint (though this is not shown in the code above). Also, I have not been able to identify any upper bound on the decision variables (i.e. lower than n, the size of the array) since I don't know if there are some restrictions of the distribution of the given values. This is unfortunate since this model would do (much) better if the domains of the decision variables (x, s, and z) where smaller. Update: I have now changed the model with two improvements: 1) Tighter domain for x by using just the valid break points. set of int: valid_breaks = {j-1 | j in 2..n where a[j] != a[j-1]}; % ... array[1..k-1] of var valid_breaks: x;  Many solvers are faster with this variant, though the Chuffed solver don't like it. 2) Replaced the implication constraint shown above with the following by eliminating the invalid breakpoints, i.e. the positions between elements of the same values; this is useful if we can't use the valid breakpoints as a domain for x: set of int: non_valid_breaks = {j-1 | j in 2..n where a[j] = a[j-1]}; % ... forall(j in non_valid_breaks, p in 1..k-1) ( x[p] != j )  Now Chuffed solves the 10000 problem instance in about 0.1 second (though it takes 30 seconds to generate the FlatZinc .fzn file), and both fzn2smt and G12's LazyFD is much faster for the 1000 size instance (just a second or two). answered 08 Jan '12, 04:58 Hakan Kjelle... 481●1●2●7 accept rate: 0%
 2 Thanks to everyone who answered my question. It is really interesting to see some different approaches. I came up with a recursive, heuristic approach, to my question. I considered two forms of the problem. The first, as I stated it above, we want exactly K groups. The second, we want at most K groups. This relaxation can give better solutions in terms of error. We want K groups out of the N data points. Let L = int(round(float(N)/float(K))) be the desired size of each group. We first try to build a chunk of size L. There may be a group of identical points that goes across the ideal chunk boundary. We don't know if it is better to include or exclude this group, so we try both and choose the solution with the best error. For each case, we call our function again with the remaining data and K-1 groups. If we can build a chunk of size L then we just do so, and recursively call the algorithm on K-1 groups. In the exact case you have an additional restriction on the size of the group. You need to make sure that the remaining data has at least K-1 distinct values, or you won't be able to produce K-1 groups. Splitting our N points is an O(N) operation. The recursion will have to explore 2^K solutions in the worst case, although in practice you won't have a block overlapping the desired boundary each time. Thus the overall time is O(N 2^K). If K is small, as it is for my cases, this will be fast enough. If anyone is interested, here is a python implementation, which solves my example problem for all values of K. # This code is licensed under the MIT license # Copyright (c) 2012 Craig Schmidt # # Permission is hereby granted, free of charge, to any person obtaining a copy of this # software and associated documentation files (the "Software"), to deal in the Software # without restriction, including without limitation the rights to use, copy, modify, merge, # publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons # to whom the Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all copies or # substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR # PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE # FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER # DEALINGS IN THE SOFTWARE. # Craig Schmidt's email is firstname@firstnamelastname.com import random # how many unique values are there in the sorted list data def count_unique(data): unique = 1 # element 0 is always unique to start for i in range(1,len(data)): if data[i] != data[i-1]: unique += 1 # is unique if not the same as previous value return unique # find no more or exactly K splits of size L in *sorted* array data # exact = True, we find exactly K splits # exact = False, we find no more than K splits # the error for segment x will be abs(L-len(x)) # we want to find the splits with the overall smallest error # returns (total_error, splits), where splits is a list of lists def find_splits(data, K, L, exact): # shouldn't end up calling it with empty data assert len(data) > 0, "empty," + str(K) + "," + str(L) # if K is 1, then just return rest of data if K == 1: return (abs(L-len(data)),[data]) # we want to find out how far to go in making a new block # Ideally, we want to make a block of size min(len(data),L) # let B be the size of the block B = min(len(data),L) # however, if exact==True, we need to make sure # we have at least K-1 unique values left in data[B:] # otherwise, we won't be able to make K-1 chunks with that data if exact: unique = count_unique(data) # total in all data # there is no solution in this case so return an error value if unique < K: return (1e100,[[]]) # need to leave K-1 unique values in rest of data # leaving us this many for the current block availtouse = unique - (K-1) assert availtouse >= 1 # find the point in [0,..B-1] where we equal availtouse unique values # that's the last index i we can use and still end up with K spilts, with B = i+1 used = 0 for i in range(B): if (i==0) or (data[i] != data[i-1]): used += 1 if used == availtouse: break B = i+1 # see we can make the next split of size B, # or if there are identical elements across the split # note: may want a tolerance here if comparing floats if (len(data) >= B+1) and (data[B-1] == data[B]): # now we need to find the size of the block of values of values # equal to data[B-1] and data[B] first = B-1 while (first > 0) and (data[first-1] == data[B-1]): first -= 1 last = B while (last < len(data)-1) and (data[last+1] == data[B]): last += 1 assert data[first] == data[last] # here a large block fills all remaining data, so return it if (first == 0) and (last == len(data) - 1): return (abs(L-len(data)),[data]) # we have the special case where first == 0, # so there is a large block filling all of B # in this case excluding it would leave nothing in current, # so we can only really include it # i.e. just do the with case if first == 0: # include block current = data[:last+1] # include element last currenterror = abs(L-len(current)) rest = data[last+1:] # last + 1 to end (besterror,bestrest) = find_splits(rest,K-1,L,exact) return (currenterror+besterror,[current] + bestrest) # also the case where last == len(data) - 1, # so the block goes all the way to the end of the data, # in that case, including it will not leave any data left to make another block elif last == len(data) - 1: # included case give just a single segment included = [data] includederror = abs(L-len(data)) # compare to excluding it withoutcurrent = data[:first] # don't include element first assert len(withoutcurrent) > 0 withoutcurrenterror = abs(L-len(withoutcurrent)) withoutrest = data[first:] assert len(withoutrest) > 0 (withoutbesterror,withoutbest) = find_splits(withoutrest,K-1,L,exact) combined = [withoutcurrent] + [withoutbest] combinederror = withoutcurrenterror+withoutbesterror if combinederror < includederror: return (combinederror,combined) else: return (includederror,included) else: # we have a block that can be included or excluded with no problems # we don't know whether it is better to include the block data[first:last+1] or not # so recurse on both, picking the better withcurrent = data[:last+1] # do include the element last assert len(withcurrent) > 0 withcurrenterror = abs(L-len(withcurrent)) withrest = data[last+1:] assert len(withrest) > 0 (withbesterror,withbest) = find_splits(withrest,K-1,L,exact) withoutcurrent = data[:first] # don't include the element first assert len(withoutcurrent) > 0 withoutcurrenterror = abs(L-len(withoutcurrent)) withoutrest = data[first:] assert len(withoutrest) > 0 (withoutbesterror,withoutbest) = find_splits(withoutrest,K-1,L,exact) # choose whether it is better to include it or not if withbesterror < withoutbesterror: return (withcurrenterror+withbesterror,[withcurrent] + withbest) else: return (withoutcurrenterror+withoutbesterror,[withoutcurrent] + withoutbest) # can make ideal sized split, with no overlapping ties else: current = data[:B] # 0 ... B-1 currenterror = abs(L-len(current)) # is exactly size B, but may have wanted L rest = data[B:] # B to end if len(rest) > 0: # still stuff to do (besterror,bestrest) = find_splits(rest,K-1,L,exact) return (currenterror+besterror,[current] + bestrest) else: # B is all the data, we're done return (currenterror,[current]) def test1(exact): data = [1,1,1,1,2,2,3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5,5,6,6,6,6,7,8,9,10,10] N = len(data) for K in range(1,len(data)+1): L = int(round(float(N)/float(K))) (error, result) = find_splits(data,K,L,exact) # in exact mode should always return K partitions if no error if exact and error != 1e100: assert len(result) == K print "K=" + str(K), "L=" + str(L), "partition=" + str(len(result)), "error=" + str(error), result if __name__ == '__main__': import sys test1(exact=False) print "____________________________________________" test1(exact=True)  This gives the following output. The first set of output is the non-exact case with <= K chunks, and the second is the exact case with K chunks. K=1 L=32 partition=1 error=0 [[1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8, 9, 10, 10]] K=2 L=16 partition=2 error=4 [[1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8, 9, 10, 10]] K=3 L=11 partition=3 error=9 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4], [5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8, 9, 10, 10]] K=4 L=8 partition=4 error=4 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4, 5, 5, 5, 5, 5], [6, 6, 6, 6, 7, 8, 9, 10, 10]] K=5 L=6 partition=5 error=8 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4, 5, 5, 5, 5, 5], [6, 6, 6, 6, 7, 8], [9, 10, 10]] K=6 L=5 partition=6 error=6 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6, 7], [8, 9, 10, 10]] K=7 L=5 partition=6 error=6 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6, 7], [8, 9, 10, 10]] K=8 L=4 partition=6 error=8 [[1, 1, 1, 1], [2, 2, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8, 9, 10, 10]] K=9 L=4 partition=6 error=8 [[1, 1, 1, 1], [2, 2, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8, 9, 10, 10]] K=10 L=3 partition=7 error=13 [[1, 1, 1, 1], [2, 2, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8, 9], [10, 10]] K=11 L=3 partition=7 error=13 [[1, 1, 1, 1], [2, 2, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8, 9], [10, 10]] K=12 L=3 partition=7 error=13 [[1, 1, 1, 1], [2, 2, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8, 9], [10, 10]] K=13 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=14 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=15 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=16 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=17 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=18 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=19 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=20 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=21 L=2 partition=8 error=16 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9, 10, 10]] K=22 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=23 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=24 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=25 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=26 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=27 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=28 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=29 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=30 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=31 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=32 L=1 partition=10 error=22 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] ____________________________________________ K=1 L=32 partition=1 error=0 [[1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8, 9, 10, 10]] K=2 L=16 partition=2 error=4 [[1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8, 9, 10, 10]] K=3 L=11 partition=3 error=9 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4], [5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8, 9, 10, 10]] K=4 L=8 partition=4 error=4 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4, 5, 5, 5, 5, 5], [6, 6, 6, 6, 7, 8, 9, 10, 10]] K=5 L=6 partition=5 error=8 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4, 5, 5, 5, 5, 5], [6, 6, 6, 6, 7, 8], [9, 10, 10]] K=6 L=5 partition=6 error=6 [[1, 1, 1, 1, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6, 7], [8, 9, 10, 10]] K=7 L=5 partition=7 error=9 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6, 7], [8, 9, 10, 10]] K=8 L=4 partition=8 error=10 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8, 9], [10, 10]] K=9 L=4 partition=9 error=14 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7, 8], [9], [10, 10]] K=10 L=3 partition=10 error=18 [[1, 1, 1, 1], [2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5, 5], [6, 6, 6, 6], [7], [8], [9], [10, 10]] K=11 L=3 partition=1 error=1e+100 [[]] etc  answered 09 Jan '12, 14:41 Craig 151●1●3 accept rate: 0%
 0 Can't say it looks familiar, but it's a variant of a set partitioning problem, so it could be modelled easily as a mixed integer linear program. (Of course, "modeled easily" and "solved easily" are not synonymous, at least for large instances.) It could also be modeled as a constraint programming problem. The CP might be easier to solve if the objective were to minimize the maximum deviation from target size, rather than the sum of deviations. (Once the CP achieves a certain maximum deviation, it will try to achieve a smaller one, and reducing the maximum allowed deviation will eliminate certain combinations from consideration.) That might be true for the MIP formulation as well. answered 06 Jan '12, 18:03 Paul Rubin ♦ 12.0k●4●12 accept rate: 18%
 0 this is a combinatorial problem, the numbers are irrelevant. this problem has an "interval property", and actually I cannot believe that this problem is NP-hard. it is certainly polytime solvable when K is fixed because there are only (N-1 choose K-1) possible solutions which we can exhaustively inspect in polytime. but when K is part of the input, hell, there is some graph problem behind this... answered 06 Jan '12, 20:14 Marco Luebbecke ♦ 3.2k●1●5●15 accept rate: 16%
 0 If I recall correctly [disclaimer: that's not a very strong statement], @Mark once posed a similar question. Unfortunately, I can't find it from his user profile (there's a possibility that this question got lost in a database corruption). To my best knowledge, I proposed hierarchical clustering as a solution technique (for which performance guarantees are studied here). The top-rated answer, though, was a two-pass merge routine - afaicr proposed by either @Paul Rubin or (1 of the 5) @Matthew Saltzman(s). answered 07 Jan '12, 06:50 fbahr ♦ 4.5k●5●15 accept rate: 13% @Florian: I vaguely recall that, and I think you're right that Matt proposed a two-pass merge. I can't find it either (tried both Matt's answers and general searches), so it does look as though the site has lost some data. :-( (07 Jan '12, 10:24) Paul Rubin ♦
 0 you can also give it a thought to model it like min. t (t = sum(K-no.of elements)) subject to: for each Kth element, the no.of elements should be <= N/K Hope its not too vague! answered 10 Jan '12, 06:54 Pavan 300●1●2●19 accept rate: 0%
 toggle preview community wiki

By Email:

Markdown Basics

• *italic* or _italic_
• **bold** or __bold__
• image?![alt text](/path/img.jpg "Title")
• numbered list: 1. Foo 2. Bar
• to add a line break simply add two spaces to where you would like the new line to be.
• basic HTML tags are also supported

Tags: