This tutorial introduces the reader to Big-O, a mathematical notation that helps us describe the limiting behavior of an algorithm's timing or spacing function as the problem size tends towards large values or infinity.
Given multiple algorithms that solve the same problem, which one do you choose? For example, there are many famous algorithms that can be used to sort an array of directly comparable values. This means you have a choice. There are often important trade-offs between each choice that impact, in a very practical way, the use of one algorithm over another in any given application.
You're likely familiar with deriving a timing or spacing function for an algorithm. Given such a function, we can classify the algorithm into a complexity class. We can then say that algorithms whose functions are in the same complexity class behave similarly. Furthermore, if two algorithms are in different complexity classes, then one is clearly better than the other with respect to the unit used for the function (e.g., the set of key processing steps or the unit of space measurement).
Here is a list of symbols and notation that we will use in this tutorial:
Symbol / Notation | Meaning |
---|---|
$\mathbb{N}_0$ | set of natural numbers, including $0$ |
$\mathbb{R}^+$ | set of positive real numbers |
$x \in S$ | $x$ is an element of a set denoted by $S$ (i.e., $x$ is in $S$). |
$f : A \to B$ | $f$ is a function with domain set $A$ and codomain set $B$ (i.e., $f$ is a function from $A$ to $B$). |
$a \to b$ | $a$ implies $b$ |
$a \leftrightarrow b$ | $a$ if, and only if $b$ (sometimes written "$a$ iff $b$") |
${\rm O}(\cdot)$ | Big-O of $\cdot$ |
Please bear with us through the next few sentences; we'll explain the math as we go. Big-O defines a set of functions with a similar upper bound. Here is an informal definition for membership in a particular Big-O set, assuming the functions we wish to classify are strictly positive for large input values:
A Big-O set characterizes a function according to its growth rate -- two functions in the same Big-O set grow similarly for sufficiently large values of $n$ (i.e., in their asymptotic limit).
If we're classifying a timing function, for example, the assumption of strictly positive functions makes sense as you cannot have a negative number of computing steps in practice. The same notion intuitively holds for spacing functions as well.
There is a more mathematically formal definition for Big-O. If you're interested in seeing it, then consult the appendix near the end of this tutorial. For this class, we'll assume that the informal definition is okay for most of our purposes.
Here are some examples of how to check and/or derive the Big-O sets of simple functions:
Consider $T(n) = 2n^2 + 3n - 2$. In this case, $T(n) \in {\rm O}(n^2)$. If this is true, then it means that $T(n) \leq c \cdot n^2$ for some $c$ and for all sufficiently large $n$. To prove this, we'll start by writing out the equation for $T(n)$, as provided, then take steps to strategically increase the right-hand side (RHS) of the equation. Since we're making the RHS bigger each time, we change the $=$ to $\leq$. Each step is an implication and the direction matters (i.e., you cannot simply follow the steps in reverse). We're not going to increase the terms at random -- instead, we increase them so that we can eventually factor out an $n^2$ term. $$\begin{eqnarray} && T(n) &=& 2n^2 + 3n - 2 \\ &\to& T(n) &\leq& 2n^2 + 3n + 2 \\ &\to& T(n) &\leq& 2n^2 + 3n^2 + 2 \\ &\to& T(n) &\leq& 2n^2 + 3n^2 + 2n^2 \\ &\to& T(n) &\leq& (2 + 3 + 2)n^2 \\ &\to& T(n) &\leq& 7n^2 \end{eqnarray}$$ This gives us $T(n) \leq c \cdot n^2$ where $c = 7$. In a Discrete Mathematics course, you would also need to prove that the inequality holds for sufficiently large $n$, perhaps using something like mathematical induction.
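The inequality we just derived can also be spot-checked empirically. The sketch below (our own illustration, not part of the derivation above) evaluates both sides for a range of values of $n$. A finite scan like this is evidence, not a proof, but it can catch algebra mistakes:

```java
// Spot-check that T(n) = 2n^2 + 3n - 2 stays below 7n^2.
// This is a sanity check only -- the actual proof is the derivation
// in the tutorial (plus induction for sufficiently large n).
public class BigOSpotCheck {
    static long t(long n) {
        return 2 * n * n + 3 * n - 2;
    }

    static boolean boundedBy(long c, long n) {
        return t(n) <= c * n * n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1_000_000; n++) {
            if (!boundedBy(7, n)) {
                throw new AssertionError("bound fails at n = " + n);
            }
        }
        System.out.println("2n^2 + 3n - 2 <= 7n^2 for all n checked");
    }
}
```

Note the use of `long` rather than `int`: at $n = 10^6$, $7n^2 = 7 \cdot 10^{12}$, which overflows a 32-bit integer.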
Now consider $T(n) \leq 3n^2 + 39$. In this case, $T(n) \in {\rm O}(n^2)$ as well. This can be shown: $$\begin{eqnarray} && T(n) &\leq& 3n^2 + 39 \\ &\to& T(n) &\leq& 3n^2 + 39n^2 \\ &\to& T(n) &\leq& (3 + 39)n^2 \\ &\to& T(n) &\leq& 42n^2 \end{eqnarray}$$ This gives us $T(n) \leq c \cdot n^2$ where $c = 42$. At this point, we also see that ${\rm O}(n^2)$ is a set of functions, as the two functions $T(n) = 2n^2 + 3n - 2$ and $T(n) \leq 3n^2 + 39$ are both in ${\rm O}(n^2)$ (i.e., in "Big-O of $n^2$").
Now consider $T(n) = n^3 + 2\log_4(n) + 6$. In this case, $T(n) \in {\rm O}(n^3)$. This can be shown (this time we'll simplify the notation by omitting the implied implication symbol): $$\begin{eqnarray} T(n) &=& n^3 + 2\log_4(n) + 6 \\ &\leq& n^3 + 2n^3 + 6 \\ &\leq& n^3 + 2n^3 + 6n^3 \\ &\leq& (1 + 2 + 6)n^3 \\ &\leq& 9n^3 \end{eqnarray}$$ This gives us $T(n) \leq c \cdot n^3$ where $c = 9$.
Triangle Inequality: Now consider a $T(n)$ that is any polynomial (i.e., a sum where each term is a constant multiplied by a power of $n$). If you combine like terms and write it in standard form (i.e., with decreasing powers of $n$), then you have something like this: $$ T(n) = c_{k}n^{k} + c_{k-1}n^{k-1} + \cdots + c_2n^2 + c_1n^1 + c_0n^0 $$ In this case, $T(n) \in {\rm O}(n^k)$. This can be shown using the usual strategy, taking advantage of the triangle inequality: $$\begin{eqnarray} T(n) &=& c_{k}n^{k} + c_{k-1}n^{k-1} + \cdots + c_2n^2 + c_1n^1 + c_0n^0 && \\ &\leq& |c_{k}n^{k} + c_{k-1}n^{k-1} + \cdots + c_2n^2 + c_1n^1 + c_0n^0| && \\ &\leq& |c_{k}|n^{k} + |c_{k-1}|n^{k-1} + \cdots + |c_2|n^2 + |c_1|n^1 + |c_0|n^0 && \text{by the triangle inequality} \\ &\leq& |c_{k}|n^{k} + |c_{k-1}|n^{k} + \cdots + |c_2|n^k + |c_1|n^k + |c_0|n^k && \\ &\leq& (|c_k| + |c_{k-1}| + \cdots + |c_2| + |c_1| + |c_0|)n^k \end{eqnarray}$$ This gives us $T(n) \leq c \cdot n^k$ where $c = |c_k| + |c_{k-1}| + \cdots + |c_2| + |c_1| + |c_0|$. This is a powerful result: for any polynomial, we can take the highest order term, drop its coefficient, and that gives us the Big-O. We'll refer to this result as the quick derivation theorem.
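The proof above translates directly into something you can run. The helper below is our own illustrative sketch (the names `eval` and `witnessC` are ours, not from the tutorial): it takes a polynomial's coefficients, uses the sum of their absolute values as the witness constant $c$, and checks $T(n) \leq c \cdot n^k$ over a range of $n$:

```java
// Sketch of the triangle-inequality argument: for a polynomial given
// by coefficients c[0..k] (c[i] multiplies n^i), the constant
// c = |c_k| + ... + |c_0| witnesses T(n) <= c * n^k for n >= 1.
public class PolyBound {
    // Evaluate the polynomial at n.
    static double eval(double[] coeffs, double n) {
        double sum = 0;
        for (int i = 0; i < coeffs.length; i++) {
            sum += coeffs[i] * Math.pow(n, i);
        }
        return sum;
    }

    // The witness constant: sum of absolute values of the coefficients.
    static double witnessC(double[] coeffs) {
        double c = 0;
        for (double ci : coeffs) {
            c += Math.abs(ci);
        }
        return c;
    }

    public static void main(String[] args) {
        double[] t = {-2, 3, 2}; // T(n) = 2n^2 + 3n - 2, from earlier
        int k = t.length - 1;    // highest power: 2
        double c = witnessC(t);  // |2| + |3| + |-2| = 7
        for (double n = 1; n <= 10_000; n++) {
            if (eval(t, n) > c * Math.pow(n, k)) {
                throw new AssertionError("bound fails at n = " + n);
            }
        }
    }
}
```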
Now consider $T(n) = 4n^2 \log(n) + 3n + 2\log(n) + 5$. Then, using the quick derivation theorem, we have that $T(n) \in {\rm O}(n^2\log(n))$ since $4n^2 \log(n)$ is the highest order term in the polynomial-like function.
Now consider $T(n) = n\log_2(n) + 2n + 1$. In this case, $T(n) \in {\rm O}(n\log(n))$ where the logarithm is in any base. This can be shown, taking advantage of the change of base identity for logarithms (the inequality step assumes $n$ is large enough that $\log(n) \geq 1$): $$\begin{eqnarray} T(n) &=& n\log_2(n) + 2n + 1 \\ &=& n\left(\frac{\log(n)}{\log(2)}\right) + 2n + 1 \\ &=& \left(\frac{1}{\log(2)}\right) n\log(n) + 2n + 1 \\ &\leq& \left(\frac{1}{\log(2)}\right) n\log(n) + 2n\log(n) + 1n\log(n) \\ &\leq& \left(\frac{1}{\log(2)} + 2 + 1\right) n\log(n) \end{eqnarray}$$ This gives us $T(n) \leq c \cdot n\log(n)$ where $c = \frac{1}{\log(2)} + 2 + 1$. The $\frac{1}{\log(2)}$ term is constant, so it's allowed to be included in $c$. Essentially, logarithms of different bases only differ by a constant factor. Therefore, we can usually factor out the change of base into the constant.
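To make the change-of-base step concrete, here is a small check of our own (not part of the tutorial's derivation) using the natural logarithm as the "any base" logarithm. It verifies $n\log_2(n) + 2n + 1 \leq c \cdot n\ln(n)$ with $c = 1/\ln(2) + 3$, starting at $n = 3$ so that $\ln(n) \geq 1$:

```java
// Check: n*log2(n) + 2n + 1 <= (1/ln(2) + 3) * n * ln(n), for n >= 3.
// Starting at n = 3 matters: the derivation replaces 2n + 1 with
// 2n*log(n) + 1n*log(n), which needs log(n) >= 1 (n >= e for ln).
public class LogBaseBound {
    static double t(double n) {
        // log2(n) computed via the change-of-base identity.
        return n * (Math.log(n) / Math.log(2)) + 2 * n + 1;
    }

    public static void main(String[] args) {
        double c = 1 / Math.log(2) + 3;
        for (double n = 3; n <= 100_000; n++) {
            if (t(n) > c * n * Math.log(n)) {
                throw new AssertionError("bound fails at n = " + n);
            }
        }
    }
}
```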
There are infinitely many Big-O sets / classes. However, we usually classify the timing or spacing functions for algorithms into one of the notable complexity classes below:
Set | Name | Example |
---|---|---|
${\rm O}(1)$ | Constant | $f(n) = 42$ |
${\rm O}(\log(n))$ | Logarithmic | $f(n) = 2 \log(n) + 1$ |
${\rm O}(n)$ | Linear | $f(n) = 4 n + 2$ |
${\rm O}(n\log(n))$ | Linearithmic | $f(n) = 3 n\log(n) + 2n$ |
${\rm O}(n^2)$ | Quadratic | $f(n) = 5 n^2 + 3n$ |
${\rm O}(n^3)$ | Cubic | $f(n) = 2 n^3 + n^2 + 2$ |
${\rm O}(n^k)$ | Polynomial | $f(n) = c_{k}n^{k} + c_{k-1}n^{k-1} + \cdots + c_1n^1 + c_0n^0$ |
${\rm O}(k^n)$ | Exponential | $f(n) = 2^n + 5$ |
In the table above, we assume, for our purposes, that both $n$ and $k$ are integers greater than $1$. As an exercise, take each example in the table above and show that it has the claimed Big-O.
Now consider $T(n) = 4n^2 + n + 2$. Then, using the quick derivation theorem, we have that $T(n) \in {\rm O}(n^2)$. By the nested classes theorem, we also have that $T(n) \in {\rm O}(n^3)$, $T(n) \in {\rm O}(n^3\log(n))$, etc. $$\begin{eqnarray} T(n) &=& 4n^2 + n + 2 && \\ &\leq& (4 + 1 + 2) n^2 \\ &\leq& c \cdot n^2 ~;~ c = 7 &\to& T(n) \in {\rm O}(n^2) \\ &\leq& c \cdot n^3 ~;~ c = 7 &\to& T(n) \in {\rm O}(n^3) \\ &\leq& c \cdot n^3 \log(n) ~;~ c = 7 &\to& T(n) \in {\rm O}(n^3 \log(n))\\ \end{eqnarray}$$ In general, while you can establish that a function is in multiple Big-O sets, you usually want to pick the best Big-O that you can establish the inequality for. Since Big-O sets convey an upper bound, we define best as the smallest set that works.
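A quick empirical illustration of the nesting (ours, not from the derivation above): the same constant $c = 7$ that witnesses $T(n) \in {\rm O}(n^2)$ also witnesses membership in the larger class ${\rm O}(n^3)$, since $n^2 \leq n^3$ for $n \geq 1$:

```java
// T(n) = 4n^2 + n + 2 is in O(n^2), and therefore also in every
// larger class such as O(n^3): the same c = 7 works for both bounds.
public class NestedClasses {
    static double t(double n) {
        return 4 * n * n + n + 2;
    }

    public static void main(String[] args) {
        for (double n = 1; n <= 10_000; n++) {
            if (t(n) > 7 * n * n) {
                throw new AssertionError("O(n^2) bound fails at " + n);
            }
            if (t(n) > 7 * n * n * n) {
                throw new AssertionError("O(n^3) bound fails at " + n);
            }
        }
    }
}
```

Even so, reporting $T(n) \in {\rm O}(n^3)$ here would be technically true but unhelpful; ${\rm O}(n^2)$ is the smallest of these classes we can establish, so it's the one to report.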
Here is a diagram that shows the plots for some of the characteristic functions in the list of notable classes:
The Big-O notation can be used not only with the $T(n)$ or $S(n)$ that results from an algorithm analysis diagram; it can also be used in the diagramming process itself. Consider the following method that we want to analyze. Here, we'll denote the problem size $n$ as the length of the array and the set of key processing steps as only including print-like statements:
```java
void printA(int[] a) {
    for (int i = 0; i < a.length; i++) { // -------------------------\
        for (int j = 0; j < i; j++) { // ------------------\         |
            System.out.print(a[i] + " "); // --> 1         | <= n    | n
            System.out.print(a[i] + " "); // --> 1         |         |
        } // for ------------------------------------------/         |
        System.out.println(); // ---------------------> 1            |
    } // for --------------------------------------------------------/
} // printA
```
We can substitute each part with Big-O notation, as follows:
```java
void printA(int[] a) {
    for (int i = 0; i < a.length; i++) { // -------------------------\
        for (int j = 0; j < i; j++) { // ------------------\         |
            System.out.print(a[i] + " "); // --> O(1)      | O(n)    | O(n)
            System.out.print(a[i] + " "); //               |         |
        } // for ------------------------------------------/         |
        System.out.println(); // ---------------------> O(1)         |
    } // for --------------------------------------------------------/
} // printA
```
Now, using the usual "multiply-across and add up", we get the following:
$$\begin{eqnarray} T(n) &\leq& {\rm O}(1) \cdot {\rm O}(n) \cdot {\rm O}(n) \\ &+& {\rm O}(1) \cdot {\rm O}(n) \end{eqnarray}$$
If we simplify this, we get $T(n) \in {\rm O}(n^2)$ print-like statements. This result happens because when you multiply Big-O sets, you multiply the inner functions, but when you add them up, you only keep the highest order term.
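We can sanity-check this analysis by counting the print-like statements directly. The sketch below is our own instrumented version of `printA`: it has the same loop structure but counts instead of printing. For this particular method the exact count is $2 \cdot (0 + 1 + \cdots + (n-1)) + n = (n^2 - n) + n = n^2$, comfortably consistent with $T(n) \in {\rm O}(n^2)$:

```java
// Instrumented version of printA: same loop structure, but it counts
// the print-like statements instead of executing them. The count
// works out to exactly n^2, so T(n) <= 1 * n^2 here.
public class PrintCount {
    static long countPrints(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                count += 2; // the two System.out.print calls
            }
            count += 1; // the System.out.println call
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 1000; n++) {
            if (countPrints(n) != (long) n * n) {
                throw new AssertionError("count mismatch at n = " + n);
            }
        }
    }
}
```

Getting an exact closed form like this is a bonus; the Big-O analysis above did not need it, which is exactly why Big-O is convenient.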
Congratulations! You've made it through the introductory tutorial for Big-O!
Big-O defines a set or class of functions with a similar upper bound. The formal definition for membership in a particular Big-O set, from Discrete Mathematics, is: $$ f \in {\rm O}(g) \leftrightarrow \exists c \in \mathbb{R}^+ ~ \exists n_0 \in \mathbb{N}_0 ~ \forall n \geq n_0 \left[\, f(n) \leq c \cdot g(n) \,\right] $$
It's okay if you don't know what some of these symbols mean -- if you don't, then you'll learn about them in the Discrete Mathematics class. In this tutorial, we work with the informal definition presented earlier.
We have already established that Big-O can be used to describe a set of functions with a similar upper bound. There is also Big-$\Omega$ (Big-Omega) for lower bounds and Big-$\Theta$ (Big-Theta) for tight bounds. There are even "little" versions, Little-o and Little-$\omega$ (Little-omega), for strict inequalities (e.g., $<$ instead of $\leq$ in the case of Little-o).