The subject of enumerative combinatorics is counting. Generally, there is some set $A$ and we wish to calculate the size $|A|$ of $A$.
There are a number of basic principles that we can use to solve such problems.
The sum principle: Consider $n$ sets $A_i$, for $1 \leq i \leq n$, that are pairwise disjoint, namely $A_i \cap A_j = \emptyset$ for all $i \neq j$. Then
\[\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i=1}^{n} |A_i|.\]For example, if there are $n$ ways to pick an object from the first pile and $m$ ways to pick an object from the second pile, there are $n+m$ ways to pick an object altogether.
The product principle: If we need to do $n$ things one after the other, and there are $c_1$ ways to do the first, $c_2$ ways to do the second, and so on, the number of possible courses of action is
\[\prod_{i=1}^{n} c_i.\]For example, the number of possible three-letter words in which a letter appears at most once that can be constructed using the English alphabet is $26 \cdot 25 \cdot 24$: There are 26 possibilities for the first letter, then 25 possibilities for the second, and finally 24 possibilities for the third.
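Counts this small can be confirmed by brute force. Here is a minimal Python sketch (this and the other sketches below are sanity checks, not part of the formal development) that enumerates all such three-letter words and compares the count to the product:

```python
from itertools import permutations
from string import ascii_lowercase

# Enumerate all ordered triples of distinct letters and count them.
words = list(permutations(ascii_lowercase, 3))
assert len(words) == 26 * 25 * 24  # product principle: 15600
print(len(words))  # 15600
```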
The bijection principle: As we have seen, there exists a bijection from $A$ to $B$ if and only if the size of $A$ equals the size of $B$. Thus, one way to count the number of elements in a set $A$ is to show that there is a bijection from $A$ to some other set $B$ of known size.
Choosing an ordered sequence of objects with repetition. How many ways are there to pick an ordered sequence of $k$ objects from a pool with $n$ types of objects, when repetitions are allowed? (That is, we can pick an object of the same type more than once.) By the product principle, there are $n$ options for the first object, $n$ options for the second, and so on. Overall, we get $n^k$ possible sequences. What follows is a somewhat more formal argument by induction. Observe that the number of sequences as above is the same as the number of functions from a set of $k$ elements to a set of $n$ elements.
Theorem 9.2.1.
Given sets $A$ and $B$, such that $|A| = k$ and $|B| = n$, the number of functions $f: A \to B$ is $n^k$.
Proof.
Induction on $k$. If $k = 0$ the set $A$ has no elements and there is only one mapping from $A$ to $B$, the empty mapping. (Recall that a function $f: A \to B$ is a subset of $A \times B$, and if $A = \emptyset$ then $A \times B = \emptyset$.) We suppose the claim holds for $|A| = m$ and treat the case $|A| = m + 1$. Consider some element $a \in A$. To specify a function $f: A \to B$ we can specify $f(a) \in B$ and a mapping $f': A \setminus \{a\} \to B$. There are $n$ possible values of $f(a) \in B$, and for each of these there are $n^m$ mappings $f'$ by the induction hypothesis. This results in $n^{m+1}$ mappings $f$ and completes the proof by induction.
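For small sets the theorem can be checked by direct enumeration: a function $f: A \to B$ is determined by the tuple of its values, so listing all value tuples lists all functions. A minimal sketch:

```python
from itertools import product

def count_functions(k: int, n: int) -> int:
    """Count functions f: A -> B with |A| = k, |B| = n by listing value tuples."""
    return sum(1 for _ in product(range(n), repeat=k))

for k in range(5):
    for n in range(1, 5):
        assert count_functions(k, n) == n ** k
```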
Choosing an ordered sequence of distinct objects without repetition.
How many ways are there to pick an ordered sequence of $k$ objects from a set of $n$ objects when only one copy of each object is available, so there can be no repetitions? Again, we can use the product principle. Observe that the first object in the sequence can be chosen from $n$ distinct objects. Once the first one is picked, there are only $n - 1$ possibilities for the second object. After that, there are $n - 2$ objects to choose from, and so on. Overall, we get that the desired quantity is
\[n(n - 1) \cdot \ldots \cdot (n - k + 1) = \prod_{i=0}^{k-1} (n - i).\]This is called a falling factorial and denoted by $(n)_k$ or $n^{\underline{k}}$. We again provide a more formal proof by induction, observing that the number of ways to pick an ordered sequence of $k$ objects from a collection of $n$ distinct ones without replacement is equal to the number of one-to-one functions $f: A \to B$, where $|A| = k$ and $|B| = n$.
Theorem 9.2.2. Given sets $A$ and $B$, such that $|A| = k$ and $|B| = n$, the number of one-to-one functions $f: A \to B$ is $(n)_k$.
Proof.
Induction on $k$. When $|A| = 0$, there is one mapping $f$ as described, the empty mapping, and $(n)_0$ is the empty product, equal to 1.
Suppose the claim holds for $|A| = m$ and consider the case $|A| = m + 1$. Fix an element $a \in A$. To specify $f$ we specify $f(a)$ and a mapping $f': A \setminus \{a\} \to B$.
There are $n$ possible values for $f(a) \in B$. Consider a specific such value $f(a) = b$. Since $f$ is one-to-one, no element of $A \setminus \{a\}$ can be mapped to $b$. Thus $f'$ has to be a one-to-one mapping from $A \setminus \{a\}$ to $B \setminus \{b\}$.
By the induction hypothesis, the number of such mappings is $(n - 1)_m$. The number of possible mappings $f$ is thus $n \cdot (n - 1)_m = (n)_{m+1}$.
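Again, small cases can be checked by enumeration. The sketch below counts one-to-one functions directly and compares against the falling factorial; it relies only on the standard library (math.perm, available since Python 3.8, computes exactly $(n)_k$):

```python
from itertools import permutations
from math import perm  # math.perm(n, k) is the falling factorial (n)_k

def count_injections(k: int, n: int) -> int:
    """Count one-to-one functions from a k-set into an n-set."""
    return sum(1 for _ in permutations(range(n), k))

for n in range(6):
    for k in range(n + 1):
        assert count_injections(k, n) == perm(n, k)
```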
Permutations. How many ways are there to arrange $n$ people in a row? How many ordered $n$-tuples are there of distinct integers from the set $\{1, 2, \ldots, n\}$? How many distinct rearrangements are there of the integers $1, 2, \ldots, n$? How many bijections are there from the set $\{1, 2, \ldots, n\}$ to itself? The answer to these questions is the same and follows from Theorem 9.2.2. A bijection from a set $A$ to itself is called a permutation of $A$. The number of permutations of the set $\{1, 2, \ldots, n\}$ is precisely the number of one-to-one functions from this set to itself, and this number is $(n)_n = n \cdot (n - 1) \cdot \ldots \cdot 2 \cdot 1$. This quantity is called “$n$ factorial” and is denoted by $n!$. We can now observe that
\[(n)_k = \frac{n!}{(n - k)!}.\]It is important to remember that $0! = 1$, since $0!$ is the empty product. Here is a list of values of $n!$ for $0 \leq n \leq 10$:
\[\begin{aligned} 0! & = 1 \\ 1! & = 1 \\ 2! & = 2 \\ 3! & = 6 \\ 4! & = 24 \\ 5! & = 120 \\ 6! & = 720 \\ 7! & = 5040 \\ 8! & = 40320 \\ 9! & = 362880 \\ 10! & = 3628800 \\ \end{aligned}\]

Seating at a round table. We’ve arranged $n$ people in a row, now it’s time to sit them down. So how many ways are there to seat $n$ people at a round table? Let’s be precise about what we mean: Two seating arrangements are considered identical if every person has the same neighbor to her right. In other words, rotations around the table do not matter. Here is how this problem can be tackled: Fix one person $a$ and sit her down anywhere. This now fixes $n - 1$ possible positions for the others: “first person to the right of $a$”, “second person to the right of $a$”, and so on until “$(n - 1)$-st person to the right of $a$”. The number of ways to arrange the others in these $n - 1$ positions is $(n - 1)!$, which is also the answer to the original question.
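A quick computational check of the $(n - 1)!$ answer: enumerate all row arrangements, identify each with its rotation class by rotating person $0$ to the front, and count the distinct classes. A minimal sketch:

```python
from itertools import permutations
from math import factorial

def count_circular(n: int) -> int:
    """Count seatings of n people at a round table, up to rotation.

    A seating is a tuple of people in clockwise order; two tuples that are
    rotations of one another are the same seating.
    """
    seen = set()
    for p in permutations(range(n)):
        i = p.index(0)           # canonical form: rotate person 0 to the front
        seen.add(p[i:] + p[:i])
    return len(seen)

for n in range(1, 7):
    assert count_circular(n) == factorial(n - 1)
```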
Choosing an unordered collection of distinct objects without repetition. How many ways are there to pick a set of $k$ objects from a set of $n$ objects? Since we are picking a set, we do not care about order, and there are no repetitions. Notice that every such set can be ordered in $k!$ ways. That is, each set corresponds to $k!$ distinct ordered $k$-tuples. Now, we know that the number of ordered $k$-tuples that can be picked from a collection of $n$ distinct objects is $(n)_k$. Thus if we denote by $X$ the number of sets of cardinality $k$ that can be picked from a collection of $n$ distinct objects, we get
\[X \cdot k! = (n)_k, \qquad\text{that is,}\qquad X = \frac{(n)_k}{k!} = \frac{n!}{k!(n - k)!}.\]This quantity $X$ is denoted by $\binom{n}{k}$, read “$n$ choose $k$”. This is such an important quantity that we emphasize it again: The number of $k$-element subsets of an $n$-element set is
\[\binom{n}{k} = \frac{n!}{k!(n - k)!} = \frac{\prod_{i=0}^{k-1} (n - i)}{k!}\]We can see that $\binom{n}{0} = \binom{n}{n} = 1$, and we define $\binom{n}{k}$ to be zero when $k > n$.
The identity $\binom{n}{k} = \binom{n}{n - k}$ is an immediate consequence of the definition. Another important identity is
\[\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}.\]Proof. Consider an $n$-element set $A$ and a specific element $a \in A$. Any $k$-element subset of $A$ either contains $a$ or it doesn’t. If it contains $a$, we need to pick the remaining $k-1$ elements from $A \setminus \{a\}$. The number of ways to do this is $\binom{n-1}{k-1}$. If the $k$-element subset does not contain $a$, we need to pick all $k$ elements from $A \setminus \{a\}$, and the number of ways to do this is $\binom{n-1}{k}$. This concludes the proof of the identity.
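Both the formula and the recurrence are easy to validate by machine for small parameters; the following sketch counts subsets by brute force and checks the identity just proved:

```python
from itertools import combinations
from math import comb

for n in range(8):
    for k in range(n + 2):  # include k > n, where the count is 0
        # Count k-element subsets of an n-element set directly ...
        direct = sum(1 for _ in combinations(range(n), k))
        assert direct == comb(n, k)
        # ... and check C(n, k) = C(n-1, k-1) + C(n-1, k).
        if n >= 1 and k >= 1:
            assert comb(n, k) == comb(n - 1, k - 1) + comb(n - 1, k)
```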
Choosing an unordered collection of objects with repetition. How many ways are there to pick a multiset of $k$ elements from a set of $n$ elements? Here is how we can count this: Write down $k$ copies of “$x$” in a row, representing the $k$ objects we pick. Also write down $n - 1$ copies of “$|$” in a row, serving as dividers between the different types of objects. For example, the sequence “$xx|x|||xx$” represents the multiset $\{a, a, b, e, e\}$ of 5 objects picked from the set $\{a, b, c, d, e\}$ of size 5. The total number of symbols is $k + n - 1$, and we need to choose $k$ of them to be $x$’s (or, equivalently, $n - 1$ of them to be “$|$”s).
Thus the desired number of ways to pick a multiset of $k$ objects from a set of $n$ objects is
\[\binom{n + k - 1}{k}\]also written as
\(\binom{n + k - 1}{n - 1}\).
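In Python, multisets of size $k$ drawn from $n$ types are exactly what itertools.combinations_with_replacement enumerates, so the formula admits a direct sanity check:

```python
from itertools import combinations_with_replacement
from math import comb

# A multiset of size k from n types corresponds to a sorted k-tuple.
for n in range(1, 7):
    for k in range(6):
        direct = sum(1 for _ in combinations_with_replacement(range(n), k))
        assert direct == comb(n + k - 1, k) == comb(n + k - 1, n - 1)
```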
Recall that $\binom{n}{k}$ is the number of $k$-element subsets of an $n$-element set, and
\[\binom{n}{k} = \frac{n!}{k!(n- k)!} = \frac{\prod_{i=0}^{k-1} (n- i)}{k!}.\]The quantities $\binom{n}{k}$ are called binomial coefficients because of their role in the Binomial Theorem, apparently known to the 11th century Persian scholar Omar Khayyam. Before we state and prove the theorem let us consider some important identities that involve binomial coefficients. One that follows immediately from the algebraic definition is
\[\binom{n}{k} = \binom{n}{n - k}.\]This also has a nice combinatorial interpretation: Choosing a $k$-element subset $B$ of an $n$-element set $A$ uniquely identifies the complement $A \setminus B$ of $B$ in $A$, which is an $(n-k)$-element subset of $A$. This defines a bijection between the $k$-element and the $(n-k)$-element subsets of $A$, which implies the identity.
Another relation between binomial coefficients is called Pascal’s rule, although it was known centuries before Pascal’s time in the Middle East and India:
\[\binom{n}{k - 1} + \binom{n}{k} = \binom{n + 1}{k}.\]This can be easily proved algebraically:
\[\binom{n}{k - 1} + \binom{n}{k} = \frac{n!}{(k - 1)!(n + 1 - k)!} + \frac{n!}{k!(n - k)!} = \frac{n!k}{k!(n + 1 - k)!} + \frac{n!(n + 1 - k)}{k!(n + 1 - k)!} = \frac{n!k + n!(n + 1 - k)}{k!(n + 1 - k)!} = \frac{(n + 1)!}{k!(n + 1 - k)!} = \binom{n + 1}{k}.\]Pascal’s rule also has a combinatorial interpretation: $\binom{n+1}{k}$ is the number of $k$-element subsets of an $(n+1)$-element set $A$. Fix an element $a \in A$. A subset of $A$ either contains $a$ or it doesn’t. The $k$-element subsets of $A$ that do not contain $a$ are in fact $k$-element subsets of $A \setminus \{a\}$, and their number is $\binom{n}{k}$. The $k$-element subsets of $A$ that do contain $a$ bijectively correspond to $(k - 1)$-element subsets of $A \setminus \{a\}$, the number of which is $\binom{n}{k-1}$. The identity follows.
Another illuminating identity is the Vandermonde convolution:
\[\binom{m + n}{l} = \sum_{k=0}^{l} \binom{m}{k} \binom{n}{l - k}.\]We only give a combinatorial argument for this one. We are counting the number of ways to choose an $l$-element subset of an $(m + n)$-element set $A$. Fix an $m$-element subset $B \subseteq A$. Any $l$-element subset $S$ of $A$ has $k$ elements from $B$ and $l-k$ elements from $A \setminus B$, for some $0 \leq k \leq l$. For a particular value of $k$, the number of $k$-element subsets of $B$ that can be part of $S$ is $\binom{m}{k}$ and the number of $(l - k)$-element subsets of $A \setminus B$ is $\binom{n}{l-k}$. We can now use the sum principle to sum over the possible values of $k$ and obtain the identity. An interesting special case is
\[\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.\]It follows from the Vandermonde convolution by taking $l = m = n$ and remembering that $\binom{n}{k} = \binom{n}{n-k}$.
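Both the convolution and its special case can be verified mechanically for small parameters:

```python
from math import comb

# Vandermonde: C(m+n, l) = sum over k of C(m, k) * C(n, l-k).
for m in range(8):
    for n in range(8):
        for l in range(m + n + 1):
            assert comb(m + n, l) == sum(comb(m, k) * comb(n, l - k)
                                         for k in range(l + 1))

# Special case: sum of C(n, k)^2 equals C(2n, n).
for n in range(10):
    assert sum(comb(n, k) ** 2 for k in range(n + 1)) == comb(2 * n, n)
```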
Theorem 10.2.1. For $n \in \mathbb{N}$ and $x, y \in \mathbb{R}$,
\[(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.\]Proof. By induction on $n$. When $n = 0$ both sides evaluate to 1. Assume the claim holds for $n = m$ and consider the case $n = m + 1$.
\[(x + y)^{m+1} = (x + y) \cdot (x + y)^m\] \[= (x + y) \cdot \sum_{k=0}^{m} \binom{m}{k} x^k y^{m-k}\] \[= x \cdot \sum_{k=0}^{m} \binom{m}{k} x^k y^{m-k} + y \cdot \sum_{k=0}^{m} \binom{m}{k} x^k y^{m-k}\] \[= \sum_{k=0}^{m} \binom{m}{k} x^{k+1} y^{m-k} + \sum_{k=0}^{m} \binom{m}{k} x^k y^{m+1-k}\] \[= \sum_{k=1}^{m+1} \binom{m}{k-1} x^k y^{m+1-k} + \sum_{k=0}^{m} \binom{m}{k} x^k y^{m+1-k}\] \[= \left( x^{m+1} + \sum_{k=1}^{m} \binom{m}{k-1} x^k y^{m+1-k} \right) + \left( y^{m+1} + \sum_{k=1}^{m} \binom{m}{k} x^k y^{m+1-k} \right)\] \[= x^{m+1} + y^{m+1} + \sum_{k=1}^{m} \left( \binom{m}{k-1} + \binom{m}{k} \right) x^k y^{m+1-k}\] \[= x^{m+1} + y^{m+1} + \sum_{k=1}^{m} \binom{m+1}{k} x^k y^{m+1-k}\] \[= \sum_{k=0}^{m+1} \binom{m+1}{k} x^k y^{m+1-k}.\]Here the fifth line follows from the fourth by noting that
\[\sum_{k=0}^{m} f(k) = \sum_{k=1}^{m+1} f(k - 1),\]and the eighth line follows from the seventh by Pascal’s rule. The other steps are simple algebraic manipulation. This completes the proof by induction.
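This does not replace the proof, but exact rational arithmetic lets us spot-check the theorem for small $n$ at a few sample points:

```python
from fractions import Fraction
from math import comb

# Exact check of the binomial theorem at a few rational points.
for n in range(8):
    for x, y in [(Fraction(3), Fraction(-2)), (Fraction(1, 2), Fraction(5))]:
        lhs = (x + y) ** n
        rhs = sum(comb(n, k) * x**k * y**(n - k) for k in range(n + 1))
        assert lhs == rhs
```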
The binomial theorem can be used to immediately derive an identity we have seen before: By substituting $x = y = 1$ into the theorem we get
\[\sum_{k=0}^{n} \binom{n}{k} = 2^n.\]Here is another interesting calculation: Putting $x = -1$ and $y = 1$ yields
\[\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0\]for $n \geq 1$. This implies
\[\sum_{k \text{ odd}} \binom{n}{k} = \sum_{k \text{ even}} \binom{n}{k} = 2^{n-1}.\]

We have seen the sum principle that states that for $n$ pairwise disjoint sets $A_1, A_2, \ldots, A_n$,
\[\left| \bigcup_{i=1}^{n} A_i \right| = \sum_{i=1}^n |A_i|.\]What happens when the sets are not pairwise disjoint? We can still say something. Namely, the sum $\sum_{i=1}^n |A_i|$ counts every element of $\bigcup_{i=1}^n A_i$ at least once, and thus even with no information about the sets we can still assert that
\[\left| \bigcup_{i=1}^{n} A_i \right| \leq \sum_{i=1}^n |A_i|.\]However, with more information we can do better. For a concrete example, consider a group of people, 10 of whom speak English, 8 speak French, and 6 speak both languages. How many people are in the group? We can sum the number of English- and French-speakers, getting $10 + 8 = 18$. Clearly, the bilinguals were counted twice, so we need to subtract their number, getting the final answer $18 - 6 = 12$. This argument can be carried out essentially verbatim in a completely general setting, yielding the following formula:
\[|A \cup B| = |A| + |B| - |A \cap B|.\]What if there are three sets? Suppose in addition to the above English and French speakers, we have 14 German-language enthusiasts, among which 8 also speak English, 5 speak French, and 2 speak all three languages. How many people are there now? We can reason as follows: The sum $10 + 8 + 14 = 32$ counts the people speaking two languages twice, so we should subtract their number, getting $32 - 6 - 8 - 5 = 13$. But now the trilinguals have not been counted: They were counted three times in the first sum, and then subtracted three times as part of the bilinguals. So the final answer is obtained by adding their number: $13 + 2 = 15$. In general,
\[|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.\]In the case of arbitrarily many sets we obtain the inclusion-exclusion principle:
\[\left| \bigcup_{i=1}^n A_i \right| = \sum_{k=1}^n (-1)^{k-1} \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} \left| A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k} \right|.\]Proof. Each element in $\bigcup_{i=1}^n A_i$ is counted exactly once on the left side of the formula. Consider such an element $a$ and let the number of sets $A_i$ that contain $a$ be $j$. Then $a$ is counted
\[\binom{j}{1} - \binom{j}{2} + \cdots + (-1)^{j-1} \binom{j}{j}\]times on the right side. But recall from our exploration of binomial coefficients that
\[0 = \sum_{i=0}^j (-1)^i \binom{j}{i} = \binom{j}{0} - \sum_{i=1}^j (-1)^{i-1} \binom{j}{i}\](the first equality holds because $j \geq 1$), which implies
\[\binom{j}{1} - \binom{j}{2} + \cdots + (-1)^{j-1} \binom{j}{j} = 1,\]meaning that $a$ is counted exactly once on the right side as well. This establishes the inclusion-exclusion principle.
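The formula translates directly into code; the sketch below computes the union size by inclusion-exclusion and compares it against the actual union:

```python
from itertools import combinations

def union_size_by_ie(sets):
    """Compute |A_1 ∪ ... ∪ A_n| via inclusion-exclusion."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total

A = {1, 2, 3, 4, 5}; B = {4, 5, 6, 7}; C = {1, 5, 7, 8, 9}
assert union_size_by_ie([A, B, C]) == len(A | B | C)
```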
Given a set $A = \{a_1, a_2, \ldots, a_n\}$, we know that the number of bijections from $A$ to itself is $n!$. How many such bijections are there that map no element $a \in A$ to itself? That is, how many bijections are there of the form $f: A \to A$, such that $f(a) \neq a$ for all $a \in A$? These are called derangements, or bijections with no fixed points.
We can reason as follows: Let $S_i$ be the set of bijections that map the $i$-th element of $A$ to itself. We are looking for the quantity
\[n! - \left| \bigcup_{i=1}^n S_i \right|.\]By the inclusion-exclusion principle, this is
\[n! - \sum_{k=1}^n (-1)^{k-1} \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} \left| S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k} \right|.\]Consider an intersection $S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}$. Its elements are the permutations that map $a_{i_1}, a_{i_2}, \ldots, a_{i_k}$ to themselves. The number of such permutations is $(n-k)!$, hence
\[|S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}| = (n-k)!.\]This allows expressing the number of derangements as
\[n! - \sum_{k=1}^n (-1)^{k-1} \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} (n - k)! = n! - \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} (n - k)!\] \[= \sum_{k=0}^n (-1)^k \binom{n}{k} (n - k)!\] \[= \sum_{k=0}^n (-1)^k \frac{n!}{k!}\] \[= n! \sum_{k=0}^n \frac{(-1)^k}{k!}.\]Now, $\sum_{k=0}^n \frac{(-1)^k}{k!}$ is the beginning of the Maclaurin series of $e^{-1}$. (No, you are not required to know this for the exam.) This means that as $n$ gets larger, the number of derangements rapidly approaches $\frac{n!}{e}$. In particular, if we just pick a random permutation of a large set, the chance that it will have no fixed points is about $1/e$. Quite remarkable, isn’t it!?
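All three quantities, the brute-force count, the alternating-sum formula, and the rounded $n!/e$, can be compared for small $n$ (it is a standard fact, consistent with the series above, that the count is the integer nearest to $n!/e$ for $n \geq 1$):

```python
from itertools import permutations
from math import e, factorial

def derangements_brute(n: int) -> int:
    """Count permutations of {0, ..., n-1} with no fixed points, by enumeration."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

for n in range(8):
    formula = sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))
    assert derangements_brute(n) == formula
    if n >= 1:
        # Consistent with the series: the count is the integer nearest n!/e.
        assert formula == round(factorial(n) / e)
```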
If we put more than $n$ pigeons into $n$ pigeonholes, at least one pigeonhole will house two or more pigeons. This trivial observation is the basis of ingenious combinatorial arguments and is the subject of this chapter. Let’s begin with the various guises of the pigeonhole principle that are encountered in combinatorics.
Basic form. If $m$ objects are put in $n$ boxes and $n < m$, then at least one box contains at least two objects. The one-line proof is by contradiction: If every box contains at most one object, there are at most $n \cdot 1 = n$ objects. A more rigorous formulation of the principle is as follows: Given two sets $A$ and $B$, with $|A| = m > n = |B|$, for any function $f : A \to B$ there exists $b \in B$ such that
\[\left| \{x \in A : f(x) = b\} \right| > 1.\]General form.
If $m$ objects are put in $n$ boxes, then at least one box contains at least $\lceil \frac{m}{n} \rceil$ objects. The proof is again by contradiction: If every box contains at most $\lceil \frac{m}{n} \rceil - 1 < \frac{m}{n}$ objects, there are fewer than $n \left( \frac{m}{n} \right) = m$ objects. The more rigorous formulation is: Given two sets $A$ and $B$, with $|A| = m$ and $|B| = n$, for any function $f : A \to B$ there exists $b \in B$ such that
\[\left| \{x \in A : f(x) = b\} \right| \geq \left\lceil \frac{m}{n} \right\rceil.\]Dijkstra’s form. For a nonempty finite collection of integers (not necessarily distinct), the maximum value is at least the average value. It is a good exercise to verify that this is equivalent to the general form above.
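As a toy illustration of the general form, scatter $m$ objects into $n$ boxes at random; no matter how the throws land, the fullest box obeys the bound:

```python
from collections import Counter
from math import ceil
import random

# Throw m objects into n boxes at random; some box must get >= ceil(m/n).
m, n = 23, 5
boxes = Counter(random.randrange(n) for _ in range(m))
assert max(boxes.values()) >= ceil(m / n)  # here ceil(23/5) = 5
```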
Let’s begin with some easy applications of the pigeonhole principle.
First application. There are two San Franciscans with the exact same number of hairs on their heads. Indeed, according to P&G Hair Facts, the average person’s head has about 100,000 hairs, while “some people have as many as 150,000.” So it seems safe to bet that every San Franciscan has at most 700,000 hairs on his or her head. On the other hand, the year 2000 US Census counted 776,733 San Francisco residents. The pigeonhole principle implies that at least two of them have the exact same number of hairs.
Second application. At a cocktail party with six or more people, there are three mutual acquaintances or three mutual strangers. Indeed, pick an arbitrary person $a$. By the pigeonhole principle, among the other five or more people, either there are three of $a$’s acquaintances, or three people who are strangers to $a$. Let’s say there are three that are $a$’s acquaintances; the other case is analogous. If those three are mutual strangers we are done. Otherwise there are two among them, call them $b$ and $c$, who know each other. Then $a$, $b$ and $c$ are mutual acquaintances and we are done.
Third application. Consider an infinite two-dimensional plane, every point of which is colored either red or blue; then there are two points one yard apart that are the same color. Indeed, take an arbitrary equilateral triangle with a side length of one yard. By the pigeonhole principle, two of its vertices have the same color.
Fourth application. Consider the numbers $1, 2, \ldots, 2n$, and take any $n + 1$ of them; then there are two numbers in this sample that are coprime. Indeed, consider the pairs $\{1, 2\}, \{3, 4\}, \ldots, \{2n - 1, 2n\}$. By the pigeonhole principle, both numbers from one of these pairs are in the sample. These numbers differ by 1 and are thus coprime. (This follows from the same argument as in Euclid’s proof of the infinitude of primes.)
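This claim is also easy to confirm exhaustively for small $n$:

```python
from itertools import combinations
from math import gcd

# Check: every (n+1)-subset of {1, ..., 2n} contains two coprime numbers.
for n in range(1, 7):
    for sample in combinations(range(1, 2 * n + 1), n + 1):
        assert any(gcd(a, b) == 1 for a, b in combinations(sample, 2))
```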
The following lemma comes from a classical 1935 paper by Paul Erdős and George Szekeres titled “A combinatorial problem in geometry”:
Lemma 12.3.1. In any ordered sequence of $n^2 + 1$ distinct real numbers $a_1, a_2, \ldots, a_{n^2+1}$, there is either a monotone increasing subsequence of length $n + 1$ or a monotone decreasing subsequence of length $n + 1$. Namely, there is a set of indices $1 \leq i_1 < i_2 < \cdots < i_{n+1} \leq n^2 + 1$, such that either $a_{i_1} > a_{i_2} > \cdots > a_{i_{n+1}}$ or $a_{i_1} < a_{i_2} < \cdots < a_{i_{n+1}}$.
Proof. For $1 \leq i \leq n^2 + 1$, let $\eta_i$ be the length of the longest monotone increasing subsequence that starts at $a_i$. If some $\eta_i > n$, we are done. Otherwise, by the pigeonhole principle, there exist $1 \leq j \leq n$ and a set of indices $i_1 < i_2 < \cdots < i_m$ of size $m \geq \lceil (n^2 + 1)/n \rceil = n + 1$, such that $\eta_{i_1} = \eta_{i_2} = \cdots = \eta_{i_m} = j$. Now, consider two consecutive numbers $a_{i_k}$ and $a_{i_{k+1}}$. If $a_{i_k} < a_{i_{k+1}}$, then prepending $a_{i_k}$ to the longest increasing subsequence starting at $a_{i_{k+1}}$ gives an increasing subsequence of length $j + 1$ starting at $a_{i_k}$, which is a contradiction. Hence $a_{i_k} > a_{i_{k+1}}$ for every $k$, so $a_{i_1} > a_{i_2} > \cdots > a_{i_m}$, giving us a decreasing subsequence of length at least $n + 1$.
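The lemma can be verified exhaustively for small $n$ with a quadratic dynamic program that computes the longest increasing and decreasing subsequence lengths (the helper name is an arbitrary choice):

```python
from itertools import permutations

def longest_monotone(seq):
    """Lengths of the longest increasing and decreasing subsequences (O(n^2) DP)."""
    n = len(seq)
    inc, dec = [1] * n, [1] * n
    for j in range(n):
        for i in range(j):
            if seq[i] < seq[j]:
                inc[j] = max(inc[j], inc[i] + 1)
            if seq[i] > seq[j]:
                dec[j] = max(dec[j], dec[i] + 1)
    return max(inc), max(dec)

# Every ordering of n^2 + 1 distinct numbers has a monotone run of length n + 1.
n = 2
for p in permutations(range(n * n + 1)):
    assert max(longest_monotone(p)) >= n + 1
```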
Here is another pigeonhole gem:
Proposition 12.3.2. Given a sequence of $n$ not necessarily distinct integers $a_1, a_2, \ldots, a_n$, there is a nonempty consecutive subsequence $a_i, a_{i+1}, \ldots, a_j$ whose sum $\sum_{m=i}^j a_m$ is a multiple of $n$. (The subsequence might consist of a single element.)
Proof. Consider the collection
\[\left( \sum_{i=1}^0 a_i, \sum_{i=1}^1 a_i, \sum_{i=1}^2 a_i, \ldots, \sum_{i=1}^n a_i \right).\]This collection has size $n + 1$ and its first element is the empty sum $\sum_{i=1}^0 a_i = 0$. There are only $n$ possible remainders modulo $n$, thus by the pigeonhole principle, there are two numbers in the above collection of size $n + 1$ that leave the same remainder. Let these be $\sum_{i=1}^l a_i$ and $\sum_{i=1}^k a_i$, with $l < k$. By a lemma we once proved (two integers that leave the same remainder modulo $n$ differ by a multiple of $n$), it follows that
\[n \Bigg| \left( \sum_{i=1}^k a_i - \sum_{i=1}^l a_i \right),\]which implies
\[n \Bigg| \sum_{i=l+1}^k a_i.\]
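The proof is constructive: tracking remainders of prefix sums actually finds the subsequence. A minimal sketch (the helper name is an arbitrary choice):

```python
def consecutive_multiple(a):
    """Return (i, j) with sum(a[i:j]) divisible by n = len(a), via prefix sums.

    Pigeonhole: among the n + 1 prefix sums, two agree modulo n.
    """
    n = len(a)
    seen = {0: 0}  # remainder of the empty prefix -> index
    total = 0
    for j, x in enumerate(a, start=1):
        total += x
        r = total % n
        if r in seen:
            return seen[r], j
        seen[r] = j

i, j = consecutive_multiple([3, 7, 1, 8, 4])
assert sum([3, 7, 1, 8, 4][i:j]) % 5 == 0
```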
Analysis of algorithms is concerned with estimating how many steps various algorithms make while solving problems of various sizes. In particular, given an algorithm, we want to make statements like “For input of size $n$, the algorithm will terminate in at most $f(n)$ steps.” If we try to accurately estimate the number of steps, a cumbersome bound like
\[f(n) = \frac{1}{11} n^3 + 12n^2 + 15 \frac{1}{2} n + \log_3 n + 17\]might arise. Such precision only complicates matters and does not add to our understanding of the algorithm’s efficiency. The following notational convention allows us to simplify bounds by concentrating on their “main terms.”
Definition 13.1.1. For two functions $f, g : \mathbb{N}^+ \to \mathbb{R}$,
$f(n) = O(g(n))$ if and only if there exist a positive constant $c \in \mathbb{R}$ and a constant $n_0 \in \mathbb{N}$, such that $|f(n)| \leq c|g(n)|$ for all $n \geq n_0$. Similarly, $f(n) = \Omega(g(n))$ if and only if there exist such constants with $|f(n)| \geq c|g(n)|$ for all $n \geq n_0$, and $f(n) = \Theta(g(n))$ if and only if both $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.
Asymptotic notation does wonders to the above ugly bound: We can now say that $f(n) = \Theta(n^3)$, which makes it easier to see how the number of steps performed by the algorithm grows as $n$ gets larger and larger. Notice how the asymptotic notation swallowed all the constants and lower-order terms! To prove that $f(n) = \Theta(n^3)$ we need to show that there exist positive constants $c_1, c_2 \in \mathbb{R}$ and a constant $n_0 \in \mathbb{N}$, such that $c_1 n^3 \leq f(n) \leq c_2 n^3$ for all $n \geq n_0$. (We dropped the absolute values that come from Definition 13.1.1 since $f(n)$ and $n^3$ are nonnegative for $n \in \mathbb{N}^+$.) We can take $n_0 = 1$, $c_1 = \frac{1}{11}$, and $c_2 = 45.6$. For the lower bound, clearly $f(n) \geq \frac{1}{11} n^3$ when $n \in \mathbb{N}^+$. For the upper bound, note that in this range $n^3 \geq n^2 \geq n \geq \log_3 n$, and $n^3 \geq 1$. All these inequalities can be proved by elementary algebraic manipulation. Thus we get
\[f(n) \leq \frac{1}{11} n^3 + 12n^3 + 15 \frac{1}{2} n^3 + n^3 + 17n^3 \leq 45.6 n^3.\]We can also perfectly well say that $f(n) = O(n^4)$ or that $f(n) = O(n^{25})$; these bounds are considerably less informative but correct. On the other hand, the bound $f(n) = O(n^2)$ (or even $f(n) = O(n^{2.99})$) is not correct. Indeed, we have seen that $f(n) \geq \frac{1}{11} n^3$. On the other hand, for any positive constant $c \in \mathbb{R}$, $\frac{1}{11} n^3 \geq cn^2$ for all $n \geq 11c$. Thus there is no positive constant $c \in \mathbb{R}$ and a constant $n_0 \in \mathbb{N}$ so that $f(n) \leq cn^2$ for all $n \geq n_0$.
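The constants $c_1 = \frac{1}{11}$ and $c_2 = 45.6$ are easy to check numerically over a large range (not a proof, but a useful sanity check):

```python
from math import log

def f(n):
    return n**3 / 11 + 12 * n**2 + 15.5 * n + log(n, 3) + 17

# Check the sandwich (1/11) n^3 <= f(n) <= 45.6 n^3 on a range of inputs.
for n in range(1, 10001):
    assert n**3 / 11 <= f(n) <= 45.6 * n**3
```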
Asymptotic notation is asymmetric, so we never write a statement like $O(g(n)) = f(n)$; the $O$, $\Omega$, and $\Theta$ are always present on the right side of the equality sign. (However, we can write $n^2 + O(n) = \Theta(n^2)$, for example.) The right way to think of statements like $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ is as inequalities; always remember what the notation means according to Definition 13.1.1.
The following asymptotic inequalities can all be easily proved and are very useful. Do the proofs as an exercise. You might find induction or tools from elementary calculus helpful for some of these. You’ll also need simple properties of logarithms, like the identity
\[\log_a n = \frac{\log_b n}{\log_b a}.\]

For two constants $u, v \in \mathbb{R}$, if $u < v$ then $n^u = O(n^v)$. (“A bigger power swallows a smaller one.”)
If $f(n)$ is a degree-$d$ polynomial in $n$ then $f(n) = O(n^d)$. If the coefficient of $n^d$ in $f(n)$ is nonzero then $f(n) = \Theta(n^d)$.
For any real constants $b > 1$ and $p$, $n^p = O(b^n)$. (“An exponential swallows a power.”)
For any real constants $q > 0$ and $p$, $(\ln n)^p = O(n^q)$. (“A power swallows a logarithm.”)
For any real constants $a, b > 1$, $\log_a n = \Theta(\log_b n)$. This implies that we can write bounds like $O(\log n)$, $O(n \log n)$, etc., without specifying the base of the logarithm. (“Asymptotic notation swallows bases of logarithms.”)
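A few numeric experiments make these facts plausible (they prove nothing, of course); this sketch shows a power overtaking a logarithm and the constant ratio between logarithms of different bases:

```python
from math import log

# "A power swallows a logarithm": (ln n)^3 / n^0.5 shrinks as n grows.
for n in [10**3, 10**6, 10**9, 10**12]:
    print(n, log(n) ** 3 / n ** 0.5)

# Base change: log_2 n / log_10 n is the constant log_2 10 for every n.
n = 10**6
assert abs(log(n, 2) / log(n, 10) - log(10, 2)) < 1e-9
```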
We conclude this lecture by demonstrating how new asymptotic inequalities can be derived from existing ones. These are often used in the analysis of algorithms, although they are so much a part of the folklore that they are rarely referred to explicitly.
Proposition 13.2.1. The following hold:
(a) If $f(n) = O(g(n))$ and $p \in \mathbb{N}$ is a constant then $p \cdot f(n) = O(g(n))$.
(b) If $f(n) = O(h(n))$ and $g(n) = O(w(n))$ then $f(n) + g(n) = O(\max(|h(n)|, |w(n)|))$.
(c) If $f(n) = O(h(n))$ and $g(n) = O(w(n))$ then $f(n) \cdot g(n) = O(h(n) \cdot w(n))$.
Proof. We prove each claim individually.
(a) If $f(n) = O(g(n))$ then there exists a positive constant $c \in \mathbb{R}$ and a constant $n_0 \in \mathbb{N}$, such that $|f(n)| \leq c|g(n)|$ for all $n \geq n_0$. Thus for $p \in \mathbb{N}$, $|p \cdot f(n)| = p|f(n)| \leq (pc)|g(n)|$ for all $n \geq n_0$, and by Definition 13.1.1, $p \cdot f(n) = O(g(n))$.
(b) If $f(n) = O(h(n))$ and $g(n) = O(w(n))$ then there exist two positive constants $c_1, c_2 \in \mathbb{R}$ and constants $n_1, n_2 \in \mathbb{N}$, such that $|f(n)| \leq c_1|h(n)|$ for all $n \geq n_1$ and $|g(n)| \leq c_2|w(n)|$ for all $n \geq n_2$. Then
\[|f(n) + g(n)| \leq |f(n)| + |g(n)| \leq c_1|h(n)| + c_2|w(n)| \leq (c_1 + c_2) \max(|h(n)|, |w(n)|)\]for all $n \geq \max(n_1, n_2)$, and by Definition 13.1.1, $f(n) + g(n) = O(\max(|h(n)|, |w(n)|))$.
(c) If $f(n) = O(h(n))$ and $g(n) = O(w(n))$ then there exist two positive constants $c_1, c_2 \in \mathbb{R}$ and constants $n_1, n_2 \in \mathbb{N}$, such that $|f(n)| \leq c_1|h(n)|$ for all $n \geq n_1$ and $|g(n)| \leq c_2|w(n)|$ for all $n \geq n_2$. Then
\[|f(n) \cdot g(n)| = |f(n)| \cdot |g(n)| \leq (c_1|h(n)|) \cdot (c_2|w(n)|) = (c_1c_2)|h(n) \cdot w(n)|\]for all $n \geq \max(n_1, n_2)$, and by Definition 13.1.1, $f(n) \cdot g(n) = O(h(n) \cdot w(n))$.
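As an empirical illustration of part (c), take concrete functions with known constants: $f(n) = 3n + 5 = O(n)$ with $c_1 = 4$ (for $n \geq 5$) and $g(n) = 2n^2 + n = O(n^2)$ with $c_2 = 3$, so the product should satisfy $f(n)g(n) \leq 12n^3$ once $n \geq 5$:

```python
# Sanity check of Proposition 13.2.1(c) with f(n) = 3n + 5 and g(n) = 2n^2 + n:
# f = O(n) with c1 = 4, g = O(n^2) with c2 = 3, so f * g <= 12 n^3 for n >= 5.
for n in range(10, 10000):
    f, g = 3 * n + 5, 2 * n**2 + n
    assert f * g <= 12 * n**3
```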