Catalan Numbers

So we have just talked about implicit data structures which a programmer can define to store trees and other constructs, but it turns our that some basic principles of mathematics can be used as implicit data structures themselves. A great example of this is the Catalan numbers, and in this section we will explore some of the many (many many many) different problems which can be converted to and from this special set of integers.

The Catalan Numbers follow a simple rule. The n'th Catalan number is equal to...

C_nfrac 1n12n choose nfrac 2nn1nprod limits _k2nfrac nkkqquad mbox for ngeq 0

[shamelessly copied from Wikipedia]

So the first ten Catalan numbers are: 

n n'th Catalan Number
0 1
1 1
2 2
3 5
4 14
5 42
6 132
7 429
8 1,430
9 4,862
10 16,796

As you can see, the Catalan numbers grow very quickly, similar (but not as quick) as factorial numbers.

Catalan Number Problems

The following problems can all convert between one another, and all of them have distinct properties shared because of their relationship to the set of Catalan numbers. We will only cover a handful of these (there are easily over 100 different problems which we could discuss), but hopefully that should provide substantial insight in this topic.

Ballot Sequences:

Imagine we are counting the results of a local election between Alice and Bob. Suspense is high, and people wait with bated breath for the outcome. Suppose the election is a tie, how many ways can you count the votes such that Alice is never behind? In other words, if there are 100 ballots, and you start tallying from the first cast to the last, then how many orderings are there such that each step of the way, Alice has at least as many votes as Bob.


ABABAABABB - Correct Ballot Sequence
BA - Incorrect Ballot Sequence
ABBA - Incorrect Ballot Sequence
AABB - Correct Ballot Sequence

Well, it turns out that if there are N votes cast for Alice, and N votes cast for Bob (2N votes in total) then the number of orderings is the N'th Catalan number!

Balanced Parentheses

Next, imagine you are a Lisp Programmer, for those unfamiliar, here is a little bit of old style Lisp code: 

(defun fibonacci(n)
    ((eq n 1) 0)
    ((eq n 2) 1)
    ((+ (fibonacci (- n 1)) (fibonacci (- n 2))))))))
and you are really concerned about getting your parentheses in order. If you have N '(' characters, and N ')' characters, how many orderings exist such that the overall string makes sense? Once again, this turns out to follow the Catalan numbers.


() - Good
)( - Bad
(()(()())) - Good
()())(() - Bad

Relating the Two

So how can we compare these two?

This one is very straightforward: Lets equate 'A' to '(' and 'B' to ')'. By doing this we can convert between the two.

ABAABABB becomes ()(()())

Note that a ballot sequence meets our ordering requirements if and only if we can formulate it as a valid string of parenthesis.