Zachary W. Huang

The Coolest Math Problem I've Ever Done

December 27, 2023

Generating Functions

One of the coolest math concepts I actually never learned properly is generating functions. That is, until taking Ma 2 at Caltech, a class on ordinary differential equations. Now how might this be the case?

The essence of generating functions is that there is a relationship between a sequence of numbers and its generating function, and we can prove things about one using the other. What this means is that we can use methods of solving “continuous” problems (like ODEs) in order to solve “discrete” ones (like counting problems) and vice versa. This is very cool (and I’ll show you why).

First, let’s go over some definitions. So, for some infinite sequence of real numbers $(a_n)_{n \geq 0}$ (a way of denoting $a_0, a_1, a_2, ...$ ), we define its ordinary generating function as the following power series:

A(t) = \sum_{n=0}^\infty a_n t^n

Note that there are a few details to worry about if you want to be completely rigorous — there is a difference between a formal generating function (which is an algebraic object) and the numeric function it corresponds to (a mapping from real numbers to real numbers), but these definitions should coincide as long as we stay in the generating function’s radius of convergence.

Suppose we have sequences $(a_n)_{n \geq 0}$ and $(b_n)_{n \geq 0}$ with ordinary generating functions $A(t) = \sum_{n=0}^\infty a_n t^n$ and $B(t) = \sum_{n=0}^\infty b_n t^n$ . Now, let’s make some observations about how sequences and their generating functions relate.

Addition: the sequence $(c_n)_{n \geq 0}$ defined by $c_n = a_n + b_n$ is associated with the generating function

C(x) = \sum_{n=0}^\infty (a_n + b_n) t^n

Shifting: the sequence $(a_{n+k})_{n \geq 0}$ (a way of denoting $a_k, a_{k+1}, a_{k+2}, ...$ ) for some constant $k$ has the generating function

\sum_{n=0}^\infty a_{n + k} t^n = \frac{1}{t^k}(A(t) - a_0 - a_1 t - ... - a_{k-1} t^{k-1})

We get this by simply taking out the first $k$ terms of the generating function of $(a_n)_{n \geq 0}$ and then scaling each term by $t^{-k}$ to correct the exponents.

Multiplication/Convolution: the sequence $(a * b)_{n \geq 0}$ , defined by $(a * b)_n = \sum_{k=0}^{n} a_n b_{n-k}$ (this operation is also known as a discrete convolution) has the generating function

A(t) \cdot B(t)

We can see this by multiplying $\sum_{n=0}^\infty a_n t^n$ and $\sum_{n=0}^\infty b_n t^n$ and then grouping the terms by exponent.

Derivatives: the sequence $(d_n)_{n \geq 0}$ , defined by $d_n = n \cdot a_n$ has the generating function

D(x) = t \cdot A'(x)

We see this by $t \cdot \frac{d}{dt} (\sum_{n=0}^\infty a_n t^n) = t \cdot \sum_{n=0}^\infty n \cdot a_n t^{n-1} = \sum_{n=0}^\infty (n \cdot a_n) t^n$ .

Now, let’s see if these can help us solve problems.

The Problem

Disclaimer: if you’re currently taking Ma 2 analytical at Caltech, it may be an honor code violation to read the rest of this blog post. Sorry!

Let me introduce the problem of the day.

For all positive $n$ , find $A_n$ , the number of strings of length $n$ which consist of the characters *, (, and ) such that all the parentheses in the string are “balanced”. This means that every left parenthesis is correctly matched with a right parenthesis, so you can’t have something like ”())(“.

Now, if you’re familiar with the balanced parentheses problem (basically the above problem but without astericks), then you’ll should be able to find an easy combinatorial solution to the problem. However, for the sake of this blog post, let’s try to apply generating functions instead.

First, let’s look at some small cases. We have:

\begin{align*} A_0 &= | \{ \epsilon \} | = 1\\ A_1 &= |\{ * \}| = 1\\ A_2 &= |\{ **, () \}| = 2\\ A_3 &= |\{ ***, *(), (*), ()* \}| = 4\\ \cdots \end{align*}

where $\epsilon$ is the empty string.

Now, let’s try to describe this problem recursively. Suppose we want to construct a length $n$ string that satisfies the above. Say we have $n$ blank slots to fill up with characters.

\_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_

We will assume that we already know exactly how many ways there are to generate strings of length $< n$ . Then all we need to do is fill up one or two of the slots to reduce the problem down to one we already know how to solve (this is basically the same as strong induction).

One thing we can do is place two matching parentheses into the slots and then fill up the rest by recursion.

( \: \_ \: \_ \: \_ \: \_ \: \_ \: ) \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_

Note that in order to prevent overcounting, the left parenthesis must be in first slot. The number of ways to fill in the blanks now is $A_k A_{n-k-2}$ , where $k$ is the number of slots in between the parentheses. Summing this up for all possible $k$ , we get $\sum_{k=0}^{n-2} A_k A_{n-k-2}$ .

Another thing we can do is to place an asterick in the first slot (again, to prevent overcounting) and then recurse.

* \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_ \: \_

There is only one way to do this, and it yields $A_{n-1}$ possibilities.

Thus, we have the recursion:

A_n = A_{n-1} + \sum_{k=0}^{n-2} A_k A_{n-k-2}

or equivalently, by $n \rightarrow n + 2$ ,

A_{n+2} = A_{n+1} + (A * A)_n

What this means is that the sequence $(A_{n+2})_{n \geq 0}$ is equivalent to the sequence defined by $(A_{n+1} + (A * A)_n)_{n \geq 0}$ , and therefore, these must have the same generating function.

Let $f(t) = \sum_{n = 0}^\infty A_n t^n$ be the generating function for the sequence $(A_n)_{n \geq 0}$ .

Using the definitions from above, we have:

\frac{1}{t^2}(f - 1 - t) = \frac{1}{t}(f - 1) + f^2

Simplifying, we get:

f - 1 - t = t(f - 1) + t^2 f^2

t^2 f^2 + (t - 1) f + 1 = 0

Using the quadratic formula, we get:

f = \frac{1 - t \pm \sqrt{-3t^2 - 2t + 1}}{2t^2}

for all positive $t$ in the radius of convergence of $f$ . However, since $\lim_{t \rightarrow 0} \frac{1 - t - \sqrt{-3t^2 - 2t + 1}}{2t^2}$ converges by L’Hopital’s rule and $\lim_{t \rightarrow 0} \frac{1 - t + \sqrt{-3t^2 - 2t + 1}}{2t^2}$ diverges, we must take $f(t) = \frac{1 - t - \sqrt{-3t^2 - 2t + 1}}{2t^2}$ as the only possibility.

Great! We now have an explicit formula for the generating function of $(A_n)_{n \geq 0}$ . Now, if we can turn this back into a power series (e.g. in the form $\sum_{n=0}^\infty g(n) t^n$ for some expression $g(n)$ in terms of $n$ ), then we’ll have a closed form for $A_n$ .

A useful tool here would be the generalized binomial theorem, which states:

(1 + x)^\alpha = \sum_{n=0}^\infty \binom{\alpha}{n} x^n

where the binomial coefficient is generalized as

\binom{\alpha}{n} = \frac{\alpha(\alpha - 1)(\alpha - 2) \cdots (\alpha - n + 1)}{n!}

for all reals $\alpha$ and all nonnegative integers $k$ .

Yes, this means expressions like $\binom{1/2}{4}$ are completely valid and well-defined 🤯.

In other words, we have:

\begin{align*} \sqrt{-3t^2 - 2t + 1} &= \big(t - (3t^2 + 2t) \big)^{1/2}\\ &= \sum_{n=0}^\infty \binom{1/2}{n} (-1)^n (3t^2 + 2t)^n\\ &= \sum_{n=0}^\infty \binom{1/2}{n} (-1)^n (3t + 2)^n t^n\\ &= \sum_{n=0}^\infty \binom{1/2}{n} (-1)^n \Big[\sum_{k=0}^n (3t)^k 2^{n-k} \binom{n}{k} \Big] t^n \end{align*}

To make this a bit simpler, we can define the coefficient

A(n, k) = \binom{1/2}{n} (-1)^n \: 3^k \: 2^{n-k} \binom{n}{k}

so that

\sqrt{-3t^2 - 2t + 1} = \sum_{n=0}^\infty \sum_{k=0}^n A(n, k) \: t^{n+k}

Since we want this expression in the form of a power series, we want to group the terms by exponent. This requires a sort of diagonalization, as seen in the following figure (by lining up the $t$ s).

\begin{smallmatrix} A(0, 0)\\ & + & A(1, 0) \: t & + & A(1, 1) \: t^2\\ &&& + & A(2, 0) \: t^2 & + & A(2, 1) \: t^3 & + & A(2, 2) \: t^4\\ &&&&& + & A(3, 0) \: t^3 & + & A(3, 1) \: t^4 & + & A(3, 2) \: t^5 & + & A(3, 3) \: t^6\\ &&&&&&& + & A(4, 0) \: t^4 & + & A(4, 1) \: t^5 & + & A(4, 2) \: t^6 & + & A(4, 3) \: t^7 & + & A(4, 4) \: t^8\\ \\ &&&&&&\cdots \end{smallmatrix}

What we notice is that for every term $t^m$ , there are $\lfloor \frac{m}{2} \rfloor + 1$ terms contributing to its coefficient in the final sum. In addition, the sum looks something like: $\big[A(m, 0) + A(m - 1, 1) + ... + A(m / 2, m / 2)\big]$ .

Thus, we can reindex our summation in terms of an exponent $m$ and some “offset” $i$ . This gives:

\sqrt{-3t^2 - 2t + 1} = \sum_{m=0}^\infty \sum_{i=0}^{\lfloor m/2 \rfloor} A(m - i, i) \: t^m

We’re almost done! Substituting this power series into the original expression for $f$ , we get the following (with some simplification):

\begin{align*} f(t) &= \frac{1}{2t^2} \Big[1 - t - \big( \sum_{m=0}^\infty \sum_{i=0}^{\lfloor m/2 \rfloor} A(m - i, i) \: t^m \big)\Big]\\ &= \frac{1}{2t^2} \Big[1 - t - \big( 1 - t + \sum_{m=2}^\infty \sum_{i=0}^{\lfloor m/2 \rfloor} A(m - i, i) \: t^m \big)\Big] & \text{taking out the first two terms of the summation}\\ &= -\frac{1}{2t^2} \Big[\sum_{m=2}^\infty \sum_{i=0}^{\lfloor m/2 \rfloor} A(m - i, i) \: t^m \Big]\\ &= \sum_{m=2}^\infty \; -\frac{1}{2} \sum_{i=0}^{\lfloor m/2 \rfloor} A(m - i, i) \: t^{m - 2}\\ &= \sum_{n=0}^\infty \; -\frac{1}{2} \sum_{i=0}^{\lfloor n/2 \rfloor + 1} A(n + 2 - i, i) \: t^n & \text{by substituting $m \rightarrow n + 2$}\\ \end{align*}

And since we originally defined $f(t) = \sum_{n = 0}^\infty A_n t^n$ , we must therefore have:

\begin{align*} A_n &= -\frac{1}{2} \sum_{i=0}^{\lfloor n/2 \rfloor + 1} A(n + 2 - i, i)\\ &= \boxed{-\frac{1}{2} \sum_{i=0}^{\lfloor n/2 \rfloor + 1} \binom{1/2}{n + 2 - i} (-1)^{n + 2 - i} \: 3^i \: 2^{n + 2 - 2i} \binom{n + 2 - i}{i}}\\ \end{align*}

Ok, I admit, this is absolutely horrendous. Not only do we have a summation to the floor of $\frac{n}{2}$ , but there are exponents strewn all over the place, and we have the binomial coefficient of $\frac{1}{2}$ , for christ’s sake! Like, what does that even mean?!

But it’s correct. And its a closed form!

Taking a step back, I was just genuinely amazed by the fact that we were able to take a discrete problem, convert it into a continuous one, solve it, and then convert it back into a discrete solution. These are the kinds of crazy antics I would expect from a 3Blue1Brown video!

Wrapping Up

I wrote a quick script in Julia to try to verify the identity, and it seems match with the original recursive definition.

using Memoization

# The closed form solution found above, with some slight modifications to avoid integer overflow
# Note: make sure you're using Julia version 1.10 or later, as the generalized
# binomial coefficient is not implemented in earlier versions.
function closed_form(n::Integer)
    total = BigInt(0)

    for i in 0:floor(Integer, n / 2 + 1)
        total += (-1)^(n + 1 - i) *
                 binomial(1 // 2, BigInt(n + 2 - i)) *
                 BigInt(3)^i *
                 (1 // BigInt(2))^(-n - 1 + 2i) *
                 binomial(BigInt(n + 2 - i), i)
    end

    return total
end

# Implements the recursion: A_n = A_{n-1} + (A * A)_{n-2}
# Memoized as to not take exponential time
@memoize function recursive_form(n::Integer)
    if n == 0 || n == 1
        return BigInt(1)
    end

    total = recursive_form(n - 1)
    for k in 0:(n-2)
        total += recursive_form(k) * recursive_form(n - 2 - k)
    end

    return total
end


# Testing computational efficiency for fun
# tldr: the recursive solution is wayyy faster and more efficient
T = 500

# 18.047373 seconds (649.79 M allocations: 14.587 GiB, 26.89% gc time, 1.04% compilation time)
@time A1 = collect(closed_form(n) for n = 0:T);

# 0.075175 seconds (1.17 M allocations: 36.745 MiB, 13.54% gc time, 25.54% compilation time)
@time A2 = collect(recursive_form(n) for n = 0:T);

@assert A1 == A2 # verify correctness up to the first T terms of A

# first 15 terms: [1, 1, 2, 4, 9, 21, 51, 127, 323, 835, 2188, 5798, 15511, 41835, 113634]
println(A1[1:15])

It turns out that the numbers we have found are known as the Motzkin numbers (OEIS A001006). Another closed form for $A_n$ is:

A_n = \sum_{k=0}^{\lfloor n / 2 \rfloor} \binom{n}{2k} \: C_k

where $C_n = \frac{1}{n + 1} \binom{2n}{n}$ are the Catalan numbers.

The Catalan numbers are actually the solution to the problem introduced above without astericks (in other words, the number of length- $n$ strings with balanced parentheses). With this in mind, the formulation of $A_n$ in terms of $C_n$ above is pretty trivial — we simply choose the $2k$ out of $n$ slots to be parentheses (there are $\binom{2k}{n}$ ways of doing this), count how many ways there are to fill in these slots ( $C_k$ ways), and then fill the rest of the slots with astericks. Since we can have at most $\lfloor n / 2 \rfloor$ pairs of parentheses in a length $n$ string, $k$ can go from $0$ up to $\lfloor n / 2 \rfloor$ .

As an exercise for the reader, you can try to derive the Catalan numbers using the method I used above to derive the Motzkin numbers (in other words, solve $C_{n+1} = (C * C)_n$ — why is this recursion correct?). It should be quite a bit cleaner than the above problem.

back