Big-O Notation
Big-O notation describes the time and space complexity of an algorithm at large scale, in the worst-case scenario. If you are looking for an item in a list, Big-O assumes the item is the last one in the list (or missing entirely) and describes how much time and space it takes to find that out.
Common Notations
Big-O notation is usually written as `O(n)`, with a mathematical formula inside the parentheses. The `O` stands for the "order" of the algorithm's growth, and `n` represents the number of elements. I'll be using time as my main vector here, but the same applies to space, too.
| Time | Notation | Description |
|---|---|---|
| Constant | O(1) | This always takes the same length of time, regardless of size. The code does not depend on the size of the problem. |
| Logarithmic | O(log(n)) | log(n) is the inverse of exponentiation. The problem is cut in half each pass through the process. |
| Linear | O(n) | Time grows in step with the number of elements in the list. Simple iterative or recursive programs. |
| Log-linear | O(n * log(n)) | Usually the result of a sort plus an iteration. |
| Quadratic | O(n^2) | Time grows with the square of the number of elements. |
| Cubic | O(n^3) | Time grows with the cube of the number of elements. |
| Exponential | O(b^n) | Time multiplies by a constant base b for each added element. Often from recursive calls that branch. |
| Factorial | O(n!) | Time grows factorially, e.g. generating every permutation of a list. |
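To make a few of those classes concrete, here is a minimal Python sketch; the function names and shapes are my own illustration, not from any particular source:

```python
def get_first(items):
    # O(1): one step no matter how long the list is
    return items[0]

def find_item(items, target):
    # O(n): in the worst case, checks every element
    for item in items:
        if item == target:
            return True
    return False

def all_pairs(items):
    # O(n^2): pairs every element with every element
    pairs = []
    for a in items:
        for b in items:
            pairs.append((a, b))
    return pairs
```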
Rules of Thumb
How complexity changes is easier to think about
Instead of looking at one test case and saying "in this case, there are X elements and it takes Y long, so the complexity is [...]", consider how the running time changes as the input varies. A single measurement is very difficult to interpret without that greater context.
Constants can always be removed from a Big-O notation
Since Big-O only cares about behavior at the largest scales, any constant that is added or multiplied stops mattering as the number of elements in a list goes up. The one caveat is a very, very large constant, but even that is eventually dwarfed if the number of elements grows large enough.
You can also think of this as the idea that constants found with multiplication and division can always be reduced down to 1. If an algorithm is `n / 2` space, you could reduce this constant to `n / 1` or `n`; or if an algorithm is `4n` space, you can reduce this to `1n` or `n` space.
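As a sketch of why that works (both functions below are my own illustration): a loop that passes over the list four times is still O(n), because 4n and n grow at the same rate.

```python
def sum_once(items):
    # f(n) = n steps -> O(n)
    total = 0
    for item in items:
        total += item
    return total

def sum_four_passes(items):
    # f(n) = 4n steps: the constant 4 drops away, so this is still O(n)
    total = 0
    for _ in range(4):
        for item in items:
            total += item
    return total / 4  # same answer, four times the work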
Use the largest exponent
If there is a formula that determines the length of time an algorithm will take, like `log(n)^3 + 15n^2 + 2n^3`, then Big-O keeps only the fastest-growing term (`n^3`) and uses that: `O(n^3)`.
Big-O is "rounding" to the simplest notation that is closest to the real outcome.
The only exception to this is the log-linear complexity: the `log(n)` there multiplies `n` rather than being a smaller added term, so it is not dropped, and since `O(n * log(n))` is so common, it sticks around.
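A quick numeric check (n = 1,000 is an arbitrary choice of mine) shows why the largest term wins; the `2n^3` term is already nearly all of the total:

```python
import math

n = 1_000
terms = {
    "log(n)^3": math.log(n) ** 3,  # ~330
    "15n^2": 15 * n**2,            # 15,000,000
    "2n^3": 2 * n**3,              # 2,000,000,000
}
total = sum(terms.values())
for name, value in terms.items():
    print(f"{name}: {value / total:.2%} of the total")
```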
Add or Multiply: Discerning the time per operation
In general, it is best to go as deep as you can within loops and subprocesses to discern each part's complexity before trying to find the complexity of the whole algorithm.
In a program, if you have a loop over your list that will operate `n` times, you would write this as `f(n) = n = O(n)`. If within that loop you loop over the whole list again twice, we can call the complexity of this inner loop `O(2 * n)`. Combining this inner `O(2 * n)` complexity that is happening on each iteration within the outer `O(n)` complexity:

- `f(n) = O(n) * O(2 * n)`
- `O(n) * O(2 * n) = O(n * 2n)`
- Since `2` is a constant, we can remove it: `O(n * n) = O(n^2)`
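A minimal sketch of the shape being described (the double inner pass is my own illustration):

```python
def outer_with_double_inner(items):
    count = 0
    for _ in items:          # outer loop: O(n)
        for _ in range(2):   # inner loop covers the whole list twice,
            for _ in items:  # i.e. O(2 * n) per outer iteration
                count += 1
    return count             # n * 2n = 2n^2 steps -> O(n^2)
```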
Time and Space
When talking about Big-O, these are the two complexities we worry about: how long it will take to do the thing, and how much space is needed (always speaking in the worst-case scenario).
Time
The time complexity of an algorithm is determined by the number of and speed of each of the operations that occur within the algorithm.
For instance, if we had an algorithm that was meant to count the number of items in a list, we would always have to traverse the list one time. This would lead to a time complexity of `O(n)`, where `n` is the number of items in the list.
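A minimal sketch of that counting algorithm, assuming we can't simply ask the list for its length:

```python
def count_items(items):
    count = 0
    for _ in items:  # touches each of the n items exactly once -> O(n)
        count += 1
    return count
```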
Let's say we had an algorithm that determined the sums of each value in a list paired with every other value in the list:
```python
totals = []
for first in nums:
    for second in nums:
        totals.append(first + second)
```
The operations in this algorithm are as follows, with the time cost of each operation in parentheses:
- Iterate through `nums` (`n`)
- Iterate through `nums` again (`n`)
- Sum two numbers (`1`)
- Add to a list (`1`)
So this example will always have to traverse the list of `n` items `n` times, plus two constant operations. Since we see a complexity greater than constant time (`n`), we can leave out the constant complexities (see the rules of thumb above). We're left with an `n * n` complexity, so this has a Big-O of `O(n^2)`.
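As a quick sanity check (the input list here is my own example), the number of appended totals comes out to exactly n * n:

```python
nums = [1, 2, 3, 4]
totals = []
for first in nums:
    for second in nums:
        totals.append(first + second)
assert len(totals) == len(nums) ** 2  # 16 totals for n = 4
```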
Another way to do this algorithm is:
```python
totals = []
for i in range(len(nums)):
    for j in range(i, len(nums)):
        totals.append(nums[i] + nums[j])
```
Since we know we will have already added together any pair of numbers that occurs before index `i`, we can focus only on the numbers ahead of and including the current index `i`. This is a much tighter loop now, so let's go through the operations:
- Iterate through `nums` as `i` (`n`)
- Iterate through `nums` from `i` to the end (`n - i`)
- Sum two numbers (`1`)
- Add to a list (`1`)
How do we notate the second step there? We know that the time needed for the second iteration diminishes with every pass of the `i` loop, since it is `n - i` and `i` increases by one each pass. Averaged over the whole run, the inner loop does `n + (n - 1) + ... + 1` steps, which works out to about `n / 2` per pass.
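A quick check of that averaging argument (n = 100 is an arbitrary choice):

```python
n = 100
inner_steps = sum(n - i for i in range(n))  # n + (n - 1) + ... + 1
assert inner_steps == n * (n + 1) // 2      # 5050: n passes averaging ~n / 2 steps
```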
So like before, we are going to drop the constant operations (steps 3 and 4) since we see a more dominant complexity of `n`. We are left with `n * n/2` as the complexity. However, we can still drop the constant in this equation, namely the `/2` part: while it makes a difference in actual running time, it doesn't change the scale of growth, which is what the notation describes. So our final result is `n * n`, or `O(n^2)`.
Space
The space complexity is determined by the memory needed to complete the algorithm (in the worst case, of course).
Let's start with an example of finding all positive non-zero duplicate integers in a list, where each number can appear at most twice (e.g. `[1,2,3,3,4,5,5]`).
```python
seen_numbers = set()
duplicates = []
for num in nums:
    if num in seen_numbers:
        duplicates.append(num)
    else:
        seen_numbers.add(num)
```
We're not worrying about time of operation here, but about how much memory we need to complete the operations. Let's look at our two data structures and how much space they need, calling `n` the number of items in `nums` and `k` the number of duplicate numbers:

- `seen_numbers`: this will hold `n - k` items
- `duplicates`: this will hold `k` items
So our space is `(n - k) + k`, which simplifies to `n`, resolving to `O(n)` space, since our space usage grows in step with the input list.
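Running it on the example list bears this out (`n = 7`, `k = 2`):

```python
nums = [1, 2, 3, 3, 4, 5, 5]  # n = 7, k = 2
seen_numbers = set()
duplicates = []
for num in nums:
    if num in seen_numbers:
        duplicates.append(num)
    else:
        seen_numbers.add(num)
print(len(seen_numbers))  # 5, i.e. n - k
print(len(duplicates))    # 2, i.e. k
```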
Another way we can do this same algorithm is as follows:
```python
nums.sort()
duplicates = []
for i in range(len(nums) - 1):
    if nums[i] == nums[i + 1]:
        duplicates.append(nums[i])
```
Now we only have one data structure we've created, which holds `k` items. At worst, this could hold `n / 2` items, in the case where every number in the list has a duplicate. Since we can remove the constant from the equation, we are left with `n` space, as the size of the array will grow linearly with the size of the input.
One last way to do this:
```python
nums.sort()
i = 0
while i < len(nums) - 1:
    # Skip first duplicate number of the two
    if nums[i] == nums[i + 1]:
        i += 1
    # Remove every number except the duplicate
    nums.pop(i)
# Remove last element if it wasn't a duplicate
if i < len(nums):
    nums.pop(-1)
```
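For example, starting from `[1, 2, 3, 3, 4, 5, 5]`, this loop whittles `nums` down in place to `[3, 5]`; the only extra storage used along the way is the single integer `i`.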
We don't create any new data structures in this, but we do create a variable `i`. This makes our space complexity a constant `O(1)`, since it doesn't scale at all with how big our input is.
Last modified: 202401040446