**Problem Name:** Marriage Ceremonies**LightOJ ID:** 1011**Keywords:** dynamic programming, bitmask calculations

Given a matrix of values for all possible matches between \(N\) men with \(N\) women, we need to find out the maximum possible value after matching all people. It is, as one would suspect, completely reasonable to solve it using D.P. What makes this problem interesting is that the D.P. function depend on sets of values, rather than just one or two scalar variables, as is usually the case.

Consider for example the following definition:

- \(f(i, S)\)
- The maximum result value after matching \(i\) men with the set of women \(S\), where \(S\) has exactly \(i\) elements.

To calculate \(f\), first consider the basic case: where \(i=1\). In this case:

where \(p(i, j)\) is the priority between the *ith* male and the *jth* female, as given in the input.

Then, for \(f(i, S)\) where \(i > 1\), we have:

In other words, for every new man (the *ith* man) that we want to match with someone, we try adding his priority against every woman (\(S_j\)) and the maximum from marrying all men up to \(i-1\) with the set of all women except \(S_j\). For example, to calculate \(f(3, \{1, 2, 3\})\), we require the values of \(f(2, \{2, 3\})\), \(f(2, \{1, 3\})\) and \(f(2, \{1, 2\})\).

All of this is good in abstract, but how to represent the sets efficiently? This is where the *bitmask* concept kicks in. You should be familiar with binary representation of integers, and with bitwise operations. The idea is to use a single integer to represent all possible sets of values, which is feasible in this case because \(N\) is at most 16, so the space required for our D.P. structure would be \(16 \times 2^{16}\).

Now, although a recursive implementation may be easier to write and understand, let’s try writing an iterative version of our D.P. function. Consider the following C++ fragment:

typedef unsigned int u32; // dp[i][j] will contain the maximum priority of marrying i men with the set // of women defined by the bitset j int dp[MAXN + 1][1 << MAXN]; void run_dp() { for (int i = 1; i <= N; ++i) dp[1][1 << (i - 1)] = pri[1][i]; u32 top = 1 << N; for (int i = 2; i <= N; ++i) { u32 b = (1 << i) - 1; for (u32 j = b; j < top; j = next_popcount(j)) { b = j; // women still untested int max = 0; while (b != 0) { u32 w = b & -b; // bit of the next woman to consider int idx = pos[w]; int curr = pri[i][idx] + dp[i - 1][j & ~w]; if (curr > max) max = curr; b &= ~w; // w has been tested, remove it } dp[i][j] = max; } } }

In lines 9-10, the first row of values is filled as described above (`pri`

is the matrix of values from the input). Note that the set \(\{1\}\) is represented as an integer with the least significant bit set, the set \(\{2\}\) is represented by an integer with the second least significant bit set, and so on.

Then all the other rows are filled in the loop at line 14. Note how all possible sets of exactly \(i\) women are represented by the \(j\) variable. It is set initially to the value calculated on line 15, which uses simple bitwise manipulations to obtain an integer with the first \(i\) bits set.

A few other relevant details: first, a function called `next_popcount`

is used, and it must return the next integer, strictly greater than its argument, that has the same number of bits set (that number is sometimes called the “popcount”). A concise implementation of this function, courtesy of hakmem 175, is this:

u32 next_popcount(u32 n) { u32 c = (n & -n); u32 r = n+c; return (((r ^ n) >> 2) / c) | r; }

Second, the `pos`

array is simply an auxiliary variable to easily map single-bit bitmasks to the positions they represent. For example `pos[1 << 0]`

is 1, `pos[1 << 1]`

is 2, and so on.

Finally, have you noticed the curious expression of the form `(x & -x)`

that is used a couple of times? It simply produces an integer where the only bit set is the least significant bit set in the `x`

variable. That’s an interesting trick that is useful to remember.

A final note on using bitmask calculations for D.P. programs; sometimes writing a recursive implementation may not only be easier to write, it could also run much faster. Sometimes filling the whole D.P. structure (which requires calculations for every possible combination of bits) is not really necessary, so writing a recursive function where only the required values are calculated can be order(s) of magnitude faster. This is the case, for example, in the problem *Brush (IV)* (LightOJ ID 1018).

Nice Tutorial , thanks a lot. This has helped me very much!

ReplyDeleteMany many thanks for this very helpful task.

ReplyDeleteI can't understand the basics here :(

ReplyDelete