Thursday 30 August 2012

Turn the Lights Off

Problem Name: Turn the Lights Off
UVa ID: 10309
Keywords: brute force

I find this problem interesting for a number of reasons, but the first that comes to mind is that it involves a game with simple rules which anyone can pick up quickly, but its mathematical analysis is not trivial at all —and games of this kind are just fantastic, aren’t they?

Let’s start by mentioning that the solution I implemented can be classified as “brute force”, but if you think that means a naive brute force over all \(2^{100}\) configurations, then we’re not on the same page yet.

Actually, the first thing I did when I read this problem was to think about all the types of “blind” strategies and possible pruning that could be performed while looking for the solution through a breadth–first search. The results were not very encouraging. As you may have noticed, the space of possibilities is insanely large (\(2^{100}\)), and a BFS could end up using equally lunatic amounts of memory. The problem boils down to the simple fact that representing a “state” for this problem is not cheap: a boolean matrix of size \(10 \times 10\) is simply too much for any simple–minded approach.

Now, fortunately for us, some pretty smart people have worked hard on analysing this game in the past, and we can read about their findings through the magic of the Web :). First of all, as far as I know, the original game that implemented these mechanics is called Lights Out and was developed by a company called Tiger Toys in 1995, in case you want to do some research by yourself. If you do, one of the first websites you’ll probably come across is Jaap’s Puzzle Page, which has a lot of information about it. There’s a related page dedicated exclusively to the mathematics behind Lights Out, which is fascinating to read about, but which I won’t go into here, mainly because my solution didn’t involve any sophisticated mathematical analysis.

What I do want to mention, though, is that there exists a heuristic that can be used to solve these puzzles, which can be summarised like this: press some buttons (i.e. switch some lights) in the top row, then “chase the lights” from the second row to the bottom. What this means is: starting with the second row, press a button only if the light to the north of the current position is on. Go on like this row by row until you reach the bottom. If in the end you haven’t cleared the grid, go back to the beginning, press a different combination of buttons in the top row, and repeat the process until you have solved it. Interesting, right?
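
To make the chasing step concrete, here is a minimal C++ sketch, assuming the grid is stored as a \(10 \times 10\) boolean matrix where true means “on” (the names press and chase are mine, not taken from any reference implementation):

const int N = 10;   // grid size for this problem

// Press the switch at (r, c): toggle that light and its four neighbours.
void press(bool g[N][N], int r, int c) {
    g[r][c] = !g[r][c];
    if (r > 0)     g[r - 1][c] = !g[r - 1][c];
    if (r < N - 1) g[r + 1][c] = !g[r + 1][c];
    if (c > 0)     g[r][c - 1] = !g[r][c - 1];
    if (c < N - 1) g[r][c + 1] = !g[r][c + 1];
}

// "Chase the lights": for each row after the first, press a switch exactly
// when the light directly above it is still on. Returns the number of presses.
int chase(bool g[N][N]) {
    int presses = 0;
    for (int r = 1; r < N; ++r)
        for (int c = 0; c < N; ++c)
            if (g[r - 1][c]) { press(g, r, c); ++presses; }
    return presses;
}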

But it gets better. In a paper —referenced in Jaap’s page— called “Turning Lights Out with Linear Algebra” we can find the following good news:

Lights Out can be generalized to an \(n \times n\) array of lights. (…) Of course if the dimension of the null space is zero, every configuration is winnable and the solution is unique (if no buttons are pressed more than once).

And then there’s a table where we find that the dimension of the null space for games of size \(10 \times 10\) is zero. In other words, for this problem every configuration has a solution, and if we find it by pressing each switch no more than once, then it’s guaranteed to be the best solution.

Now we’re ready to formulate our solution: if we try pressing every combination of buttons in the top row and then “chase the lights” downwards, we only have to check \(2^{10}\) alternatives, which is guaranteed to be enough to find our answer. I was truly amazed by how nicely this idea translates into elegant code, and it has a complexity of \(O(2^n n^2)\) (where \(n=10\) for this problem).
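
As a rough illustration, the enumeration could look like the following sketch (solve is a made-up name), reusing press and chase from the snippet above. Since the solution is unique, we can stop at the first top-row combination that clears the grid, and only the bottom row needs checking, because chasing turns off every row above it:

#include <cstring>

// Try every subset of top-row presses (2^N masks); return the number of
// presses of the unique combination that clears the grid, or -1 if none does.
int solve(const bool init[N][N]) {
    for (int mask = 0; mask < (1 << N); ++mask) {
        bool g[N][N];
        std::memcpy(g, init, sizeof g);
        int presses = 0;
        for (int c = 0; c < N; ++c)        // press the chosen top-row switches
            if (mask & (1 << c)) { press(g, 0, c); ++presses; }
        presses += chase(g);               // rows 0..N-2 are now all off
        bool cleared = true;
        for (int c = 0; c < N; ++c)        // so only the bottom row matters
            if (g[N - 1][c]) cleared = false;
        if (cleared) return presses;
    }
    return -1;                             // never happens for 10x10 grids
}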

By the way, if you have some time to spare, maybe you’d like to give the heuristic explained here a try, on a game of “Lights Out” where the size of the grid increases at each level: here’s the game of Tacoyaki+. Have fun :).


References

  • Marlow Anderson and Todd Feil, “Turning Lights Out with Linear Algebra”, Mathematics Magazine, Vol. 71, No. 4 (October 1998), pp. 300–303.

Tuesday 14 August 2012

Olympic Swimming

Problem Name: Olympic Swimming
UVa ID: 11546
LightOJ ID: 1423
Keywords: sweep line, hashing

The London 2012 Olympics closing ceremony took place just a couple of days ago, and it has given me a nice excuse to work on a problem that somehow involves an Olympic event :). Actually, I had this problem near the front of my queue of “to do” problems, and it coincidentally emerged during these days of “Olympics fever”. The real reason I find it noteworthy, however, is that it also gives me the opportunity to talk about a couple of interesting concepts that I have not covered in previous posts.

The problem can be summarised as follows: imagine a swimming pool of length \(L\) metres (\(L \in \mathbb{Z}\)). This pool has \(K\) lanes and marks at every metre, so you can imagine it as the concatenation of \(L\) sections, each one 1 metre long and spanning all \(K\) lanes. The pool has a number of hurdles or obstacles (which don’t make much sense in real swimming events, but never mind that), such that each hurdle is placed on a particular lane, between two consecutive marks.

The question is: if you can divide the pool only at a mark, what is the longest possible segment of the pool in which all the lanes have the same number of hurdles (not necessarily at the same locations)?

Let’s start by looking at an example to help us analyse this problem. Consider a pool of length 10 and with three lanes. The lanes have 3, 4 and 2 hurdles, respectively, as depicted in the following figure:

[figure: a pool of length 10 with three lanes, each hurdle sitting between two consecutive marks]

Try to find two marks \((i, j)\) such that all lanes have the same amount of hurdles in that segment of the pool and \(j - i\) is maximised. After some trial and error, it should become clear that the longest segment is \((5, 9)\), with a length of 4 and one hurdle per lane.

Now we can discuss the first interesting concept that this problem leads us into, which is the sweep line algorithm. The idea behind this is simple (although the specific details for certain problems might not be simple at all): you have a geometric plane and a set of points that relate to your model of a problem; you use an imaginary line that is moved over the plane, and every time it hits one or more points, some type of calculation is performed. In this way you’re computing a certain value gradually with an algorithm that has, at first, a complexity of \(O(n)\) —not counting the operations you perform at each “collision”. This type of algorithm can be applied to many interesting geometry–related problems, like finding the closest pair of points in a set (see problem UVa 10245) or computing the outline of the union of a set of rectangles with a common baseline (see problem UVa 105).

We’ll apply the general principle to this problem, but first let us give it a little more thought and find out how a sweep–line strategy may be useful. One of the first things to notice is that if you start at the mark \(i=0\) and move forward all the way to \(i=L\), the number of hurdles at each lane can only increase or stay the same at each step, so maybe a prefix sum can help us get closer to the answer. Let’s start by defining the following:

\(P_{i}\)
A tuple with the prefix sums of hurdles at position \(i\). Each element in the tuple corresponds to one lane.

According to this definition, the values of \(P\) for our example are:

[table: the prefix-sum tuples \(P_0\) through \(P_{10}\) for the example pool]

Do you notice anything particularly interesting in this list, keeping in mind that we know that our answer is in the segment \((5, 9)\)? You should see that the relative differences among the elements of \(P_5\) are the same as those in \(P_9\). In fact, those two tuples form the most distant pair that has the same internal relative differences. To see it more clearly, let’s introduce a second definition that simplifies things for us:

\(R_i\)
A tuple where each element \(R_{i, j} = \max(P_i) - P_{i, j}\). For example, if \(P_i = \langle 2, 3, 1 \rangle\), then \(R_i = \langle 1, 0, 2 \rangle\).

Now we have enough elements to formulate our main algorithm. Using a sweep line, we calculate \(R_i\) and \(R_{i+1}\) at each point \(i\) given in the input, and every time we find an \(R\) tuple that we have seen before, we calculate the distance between the new mark and the old one; our final answer is the overall maximum of those distances.

This brings us to the second concept that I want to talk about —hashing. You see, a sweep line by itself is just a very basic approach, but what’s important is the type of operations you do at each critical point you find in your sweep, which commonly requires keeping track of data (inserting, querying, deleting) very quickly. For this reason, a sweep line algorithm usually comes along with some kind of efficient data structure such as B-trees or hash maps.

In Java you could use a java.util.HashMap (implemented internally using a hashing algorithm) and in C++ you could use STL’s map (implemented internally as an ordered tree) or write your own hashing algorithm. Given the tight time limit for this problem (at least in the UVa judge), it’s probably better to use your own hashing algorithm. For a good introduction to many good hashing algorithms (and hashing in general), you can read this excellent article: The Art of Hashing.

I’d recommend you choose a simple algorithm, one that doesn’t perform many computations per byte. Some good ones are the Bernstein hash, the Shift-Add-XOR hash and the FNV hash. I personally use the FNV algorithm, but in practice they tend to perform very similarly.
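
As an illustration, a minimal 32-bit FNV-1a sketch that hashes the bytes of an \(R\) tuple into a bucket index could look like this (fnv1a and bucket_of are names I made up; the constants are the standard FNV offset basis and prime, and the \(2^{15}\) buckets match what I mention below):

#include <cstdint>

// Minimal 32-bit FNV-1a over the bytes of an array of ints (e.g. an R tuple).
uint32_t fnv1a(const int *data, int len) {
    const unsigned char *bytes = (const unsigned char *) data;
    uint32_t h = 2166136261u;                // FNV offset basis
    for (int i = 0; i < len * (int) sizeof(int); ++i) {
        h ^= bytes[i];                       // XOR in one byte...
        h *= 16777619u;                      // ...then multiply by the FNV prime
    }
    return h;
}

// Map a K-element tuple to one of 2^15 buckets.
inline uint32_t bucket_of(const int *tuple, int k) {
    return fnv1a(tuple, k) & ((1u << 15) - 1);
}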

I will not post the code for a hash table here because if you are about to write your own hash map implementation for the first time, I think it would be better to do it from scratch and use it as an exercise to learn the concepts in depth (and also you end up with something written in your own style). However, I’ll give you the general idea of how all of this glues together.

First, you have to implement your hash table. There are many ways to do it, but a fairly simple and effective method is keeping a number of “buckets” to place your data into; for any piece of data you want to store, you find its corresponding bucket using your hashing algorithm. The right number of buckets depends on the problem, your hashing algorithm and the size of your input, but in my experience you quickly reach a point where using more buckets has no significant impact on the efficiency of your program, so you don’t need to choose a very big number; for this problem, for example, I’ve used \(2^{15}\). Since you have to assume that in general there will be collisions in your hashing procedure, each bucket is, in reality, a linked list (or vector) of data. Don’t worry if you think that using linked lists will be slow: with a fairly decent hashing algorithm the number of collisions is small, and the resulting average performance is really good, perhaps even an order of magnitude better than using other data structures like trees inside your buckets, because a linked list or vector has minimal overhead.

With your hash table in place, you apply the sweep line strategy. To put it all together, the algorithm for this problem could look something like this (a code sketch follows the list):

  1. Read the data that describes the pool, and sort the hurdles, first by their mark \(p_i\), and then by their lane.
  2. Let \(A=0\), where \(A\) represents our final answer.
  3. Let \(R\) be the tuple \(\langle 0, 0, \ldots, 0 \rangle\) (\(K\) zeros). Add \(R\) to your hash table, associated with the value 0 (the mark that corresponds to the tuple \(R\)).
  4. Sweep line. For each hurdle in your data:
    1. Let \(m\) be the mark of the current hurdle. The tuple \(R_m\) is the same \(R\) we had previously, so calculate the difference \(d = m - \text{HT}(R)\) where \(\text{HT}(R)\) is the value associated to \(R\) in the hash table. Let \(A = \text{max}(A, d)\).
    2. Take from your input all other hurdles that are also at \(m\), and with that calculate \(R_{m+1}\).
    3. Add the new \(R\) to the hash table associated with the mark \(m+1\). If \(R\) was already in the table, calculate \(d = m + 1 - \text{HT}(R)\) and \(A = \text{max}(A, d)\).
  5. In the end, the current \(R\) should be the value of \(R_L\), so calculate \(A = \text{max}(A, L - \text{HT}(R))\).
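
The following sketch puts those steps together (Hurdle and longest_segment are illustrative names, and lanes are 0-indexed). For brevity it uses STL’s map where my actual solution would plug in the custom hash table discussed above:

#include <algorithm>
#include <map>
#include <vector>
using namespace std;

struct Hurdle { int mark, lane; };

// HT maps an R tuple to the first mark where that tuple was seen.
int longest_segment(int L, int K, vector<Hurdle> hurdles) {
    sort(hurdles.begin(), hurdles.end(),      // step 1
         [](const Hurdle &a, const Hurdle &b) {
             return a.mark != b.mark ? a.mark < b.mark : a.lane < b.lane;
         });
    vector<int> P(K, 0), R(K, 0);
    map<vector<int>, int> HT;
    HT[R] = 0;                                // step 3
    int A = 0;                                // step 2
    for (size_t i = 0; i < hurdles.size(); ) {
        int m = hurdles[i].mark;
        A = max(A, m - HT[R]);                // step 4.1: R_m is the current R
        while (i < hurdles.size() && hurdles[i].mark == m)
            ++P[hurdles[i++].lane];           // step 4.2: absorb hurdles at m
        int mx = *max_element(P.begin(), P.end());
        for (int j = 0; j < K; ++j) R[j] = mx - P[j];
        if (HT.count(R)) A = max(A, m + 1 - HT[R]);   // step 4.3
        else HT[R] = m + 1;
    }
    return max(A, L - HT[R]);                 // step 5
}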

The final value of \(A\) should be our answer. Our resulting algorithm starts by sorting the input —with a complexity of \(O(n K \log{n K})\)— then does a sweep line (\(O(n)\)) with a hashing algorithm at each critical point (\(O(K)\)), which gives us an overall complexity of \(O(n K \log{n K})\).

Friday 3 August 2012

Goldbach and Euler

Problem Name: Goldbach and Euler
UVa ID: 10311
Keywords: number theory, primes, sieve

When you come across a problem titled after two well-known, brilliant mathematicians, you know it’s going to be good :).

What the problem is asking for can be expressed as follows: given an integer \(n\) (which can go as high as \(10^8\)), find —if possible— two prime numbers \(p_1\) and \(p_2\) such that:

  • \(p_1 + p_2 = n\)
  • \(p_2 > p_1\)
  • \(p_2 - p_1\) is minimised

It is clear that for this problem it is necessary to find out which numbers are primes in the range \((0 : 10^8)\). Let’s focus on that first.

As you probably know, one good way to find prime numbers computationally is using the classic Sieve of Eratosthenes (or a similar but possibly faster alternative, such as the Sieve of Atkin). No matter what kind of “sieve” you use, you have to implement it in a way that allows you to store information about \(10^8\) numbers, which is not a trivial job.

Let us generalise the problem and say that you want to find all prime numbers up to an integer \(N\). If you write a regular implementation using a boolean (bool in C++ or similar) data type for each integer in the range \([1:N]\), then you would end up using \(N\) bytes of memory, which for this problem would mean about 95MB —just “a tad” heavy.

Fortunately, there are ways to reduce the memory requirements with relatively few changes that are easy to code. The main idea is that a boolean value can be stored in a single bit, so you can store 8 boolean values in a single byte. In addition, consider that most prime numbers are odd; in fact, there is only one even prime number, which is 2. With this in mind, you can further cut your memory requirements in half by simply not storing information for even numbers (you have to handle the case of 2 separately, of course).

So, with these adjustments, we can now implement a prime-finding sieve that requires only \(N \div (8 \times 2)\) bytes. If we store this information in 4-byte integers, then we require only \(N \div 64\) integers. The following is a C++ implementation of this type of sieve, using the standard Eratosthenes algorithm, which has served me well for some time:

#include <vector>
using namespace std;

const int MAX = 100000000;  // 10^8
const int LMT =     10000;  // sqrt(MAX)

// array of data; each bit tells if a certain (odd) number is composite or not
int _c[(MAX>>6)+1];

// we use a STL vector to store the primes
vector<int> primes;

// possibly the most important piece for a bitwise sieve
// these are the macros that check and set a bit
#define IsComp(n)  (_c[n>>6]&(1<<((n>>1)&31)))
#define SetComp(n) _c[n>>6]|=(1<<((n>>1)&31))

void prime_sieve() {
    for (int i = 3; i <= LMT; i += 2)
        if (!IsComp(i))
            for (int j = i*i; j <= MAX; j += i+i)
                SetComp(j);

    primes.push_back(2);
    for (int i = 3; i <= MAX; i += 2)
        if (!IsComp(i))
            primes.push_back(i);
}

bool is_prime(int n) {
    if (n == 2) return true;  // the only even prime
    if (n < 2 || n % 2 == 0) return false;
    return !IsComp(n);
}

As you can see, this implementation reserves about \(10^8 \div 64\) integers for the main array, which represents about 6MB of memory, a good deal better than the original 95MB. However, it also uses memory to store the actual prime numbers in a vector. Since there are about 5.8 million primes under \(10^8\), this requires an extra \(4 \times 5.8 \times 10^6\) bytes (about 22MB).

Having solved the problem of calculating and storing the primes, let’s move on to the problem of finding \(p_1\) and \(p_2\). The first idea that comes to mind is simply trying every candidate for \(p_1\), starting at \(n \div 2\) and going downwards, checking at each step that \(p_1\) and \(p_2 = n - p_1\) are both prime.

However, it’s easy to see that this simple idea would take too long in practice. There’s one thing we can do that significantly improves the complexity of the algorithm: instead of iterating over all integers, or even over odd integers only, iterate only over the prime numbers. Since we already have a vector full of primes below \(10^8\), we can binary search for the location of \(n \div 2\) and go down from there. That makes for a worst case of about \(5.8 \times 10^6\) iterations instead of \(10^8 \div 2\).

We have improved things but, can we do even better? The discussion found in the problem statement about Goldbach’s conjecture and Euler’s ideas on the subject is very interesting in itself, but it also points to something that could be useful: an even number greater than 2 can always be expressed as the sum of two primes (well, it’s a conjecture, but it has been verified for a range of numbers much larger than \((0 : 10^8)\), so we can assume it’s true for our purposes). This means that we can be reasonably confident that the process of finding \(p_1\) will succeed for most even numbers, hopefully quickly enough that it doesn’t have to check millions of primes. Note that I said most even numbers; this is because we need two distinct primes \(p_1\) and \(p_2\), remember? For example, 6 can be expressed as the sum of two primes (3 + 3) but not as the sum of two distinct primes.

Okay, we have concluded that for even numbers our algorithm may not be so bad, but what about odd numbers? Well, it turns out that for odd numbers the algorithm can be \(O(1)\)! If \(n\) is odd, then one of the two primes \(p_1\) and \(p_2\) has to be even, while the other has to be odd (if you don’t see this clearly, remember that even numbers have the form \(2k\) and odd numbers the form \(2k + 1\), so the sum of two odd primes is always even). Do you see where this is going? If one of the primes has to be even, then that prime must be 2 (there are no other even primes). So, if \(n\) is odd we simply check whether \(n-2\) is prime; if it is, then \(p_1=2\) and \(p_2=n-2\), otherwise there is no solution.
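
Putting both cases together, the search could look like the following sketch (goldbach is a hypothetical name), building on the primes vector and is_prime function from the sieve above:

#include <algorithm>

// Find primes p1 < p2 with p1 + p2 = n and p2 - p1 minimal.
// Returns false when no such pair exists.
bool goldbach(int n, int &p1, int &p2) {
    if (n % 2 == 1) {                    // odd n: the even prime must be 2
        if (n - 2 > 2 && is_prime(n - 2)) { p1 = 2; p2 = n - 2; return true; }
        return false;
    }
    // even n: binary search for the largest prime below n/2, then walk down
    int i = upper_bound(primes.begin(), primes.end(), (n - 1) / 2) - primes.begin();
    while (--i >= 0) {
        p1 = primes[i];                  // p1 < n/2 guarantees p1 < p2
        p2 = n - p1;
        if (is_prime(p2)) return true;
    }
    return false;
}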

We have covered all possible cases now, and after testing, the solution produces an answer reasonably fast. Prime numbers are lovely :).