Including Time Complexities and Code
Have you ever tried to sort a deck of cards by hand? You probably, intuitively, used the Insertion Sort algorithm. The good news is that you do end up with a sorted deck; the bad news is that it can, unfortunately, take a while! This article explains why these algorithms work and how long they take.
Computational efficiency is key in software engineering, and a common time sink is routine tasks involving sorting. As an example, the following domains are directly impacted by the ability to sort something quickly:
- Searching: searching is much more efficient on a sorted set
- Selecting: selecting the n-th element is much more efficient on a sorted set
- Duplicates: identifying duplicates is much quicker on a sorted set
- Distributional Analysis: much quicker on a sorted set.
Moreover, as we enter an age of Big Data, we need to make sure we’re using the right sorting algorithm to manage our data efficiently and, as researchers and practitioners, reach conclusions quicker.
Trust me, it can take a while to sort millions of rows if you use the wrong algorithm.
Now, before we get into the specifics of each of these algorithms, we first need to decide on a metric for comparing their speed, typically their worst-case behaviour. Let’s therefore first go over the concept of Time Complexity.
Time Complexity and Big O Notation
In layman’s terms, the computational complexity of an algorithm is the amount of time it takes to run. Often this is written in “Big O Notation”, which is a method of describing the limiting (or worst-case) behaviour of an algorithm.
So, for example, if you wanted to count the number of items in a list, you could just go through the list one by one, and this would take n steps. Even for a list that’s monumentally large, this algorithm still takes one step per item. Therefore the notation, in this case, would be O(n).
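As a minimal sketch of that linear behaviour, here is a hypothetical count_items helper (the name is my own) that touches each element exactly once:

```python
def count_items(items):
    # One step per element: a list of n items takes n steps, i.e. O(n)
    count = 0
    for _ in items:
        count += 1
    return count

print(count_items([3, 1, 4, 1, 5]))  # 5
```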
Here are some other common examples:
O(n²): This is easier to explain with an example. Say you’re checking whether any element in your list is duplicated: for each element, you have to compare it against every other element in the list. Thus, you’re doing n checks across n items, i.e. n*n = n² checks.
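A naive sketch of that duplicate check makes the n*n structure visible as two nested loops (the function name is my own):

```python
def has_duplicates(items):
    # Compare every element against every other element: n * n checks -> O(n^2)
    for i in range(len(items)):
        for j in range(len(items)):
            if i != j and items[i] == items[j]:
                return True
    return False

print(has_duplicates([1, 2, 3, 2]))  # True
print(has_duplicates([1, 2, 3]))     # False
```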
O(log n): Say you have a list; you cut it in half, and again, and again, until you only have one item left. Each halving is a single step, so you take roughly log(n) steps in total.
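Counting those halvings directly shows the logarithmic growth (halving_steps is a name of my own choosing):

```python
def halving_steps(n):
    # Repeatedly cut the problem size in half until one item remains
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(halving_steps(1024))  # 10, since 2**10 == 1024
```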
A chart of growth rates makes the ordering clear: O(log(n)) is quicker than O(n), which in turn is quicker than O(n²).
Now that we’ve covered the concepts of Time Complexity and Big O Notation, let’s move on to our algorithms.
Bubble Sort

The O.G. (maybe a bit of a stretch…) of sorting algorithms is the Bubble Sort. It’s the most commonly known sorting algorithm, and you’ll probably have been asked about it in an interview or two.
The algorithm takes repeated steps through a list and compares adjacent items, swapping them if they’re in the wrong order. The algorithm keeps looping through the list until the list is ordered. Now depending on how you aim to order your list (highest to lowest or vice versa), this can naturally lead to differences in the time complexity of the problem.
Now, assuming everything starts in exactly the wrong order, then for every swap you make, you have to check every other item in your list. In this unfortunate setting, Bubble Sort would take you roughly O(n²) time. Therefore, Bubble Sort is not a practical sorting algorithm for large lists.
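To see where that quadratic figure comes from, we can count the comparisons directly. This is a minimal instrumented sketch (the comparison counter is my addition, not part of the standard algorithm):

```python
def bubble_sort_comparisons(array):
    # Run a full Bubble Sort, counting every adjacent comparison made
    comparisons = 0
    n = len(array)
    for i in range(n):
        for j in range(n - i - 1):
            comparisons += 1
            if array[j] > array[j + 1]:
                array[j], array[j + 1] = array[j + 1], array[j]
    return comparisons

# A reversed (worst-case) list of 100 items needs 100 * 99 / 2 = 4950 comparisons
print(bubble_sort_comparisons(list(range(100, 0, -1))))  # 4950
```

The count grows as n(n-1)/2, which is why doubling the list size roughly quadruples the work.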
Let’s move onto something a bit better.
Example code at the end
Merge Sort

Here we begin to get a bit fancier. Merge Sort works by recursively splitting an unsorted list in half until there is one element per group. Then, you compare the elements and regroup them, repeating until the list is merged and sorted.
This sounds a bit complicated, but Swfung8 has created a great GIF showing how it works in practice.
Now, the overall time complexity of Merge Sort is O(n log n): the splitting phase produces roughly log(n) levels of sub-lists, and the regrouping requires about n steps at each level, so the two phases come together to O(n log n), which is much faster than Bubble Sort.
Note, if we want to get even fancier, we can also talk about the space complexity of an algorithm (the amount of memory it requires). For Merge Sort, this is O(n): the algorithm needs extra memory proportional to the size of the input, which can slow things down for very large datasets.
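Since the snippet at the end of the article only shows the merge step, here is one possible complete sketch of the whole algorithm (the function and variable names are my own):

```python
def merge_sort(lst):
    # Base case: a list of zero or one items is already sorted
    if len(lst) <= 1:
        return lst
    # Split phase: halve the list recursively (~log n levels)
    mid = len(lst) // 2
    left_list = merge_sort(lst[:mid])
    right_list = merge_sort(lst[mid:])
    # Merge phase: ~n steps per level to recombine in order
    result, i, j = [], 0, 0
    while i < len(left_list) and j < len(right_list):
        if left_list[i] < right_list[j]:
            result.append(left_list[i])
            i += 1
        else:
            result.append(right_list[j])
            j += 1
    # One of the two halves may still have leftover (already sorted) items
    return result + left_list[i:] + right_list[j:]

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```

The O(n) space cost shows up in the slicing and the result list: each level of recursion builds new lists rather than sorting in place.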
Example code at the end
Quick Sort

Like Merge Sort, Quick Sort is a Divide and Conquer algorithm, and it has a reputation for being around two to three times quicker in practice. It picks an element as a pivot and then partitions the list around that pivot. The sub-arrays are then sorted recursively, which can be done in-place, requiring only a small additional amount of memory.
In most standard implementations it performs significantly faster than Merge Sort, and it rarely reaches its worst-case complexity of O(n²); on average, it runs in O(n log n) time.
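The in-place behaviour mentioned above can be sketched with a Lomuto-style partition. The pivot choice (last element) and all names here are my own; the example code at the end of the article uses extra lists instead:

```python
def quick_sort_inplace(lst, low=0, high=None):
    # Sort lst between indices low and high by swapping within the list itself
    if high is None:
        high = len(lst) - 1
    if low < high:
        pivot = lst[high]  # last element as pivot (one common choice)
        i = low - 1
        for j in range(low, high):
            if lst[j] <= pivot:
                i += 1
                lst[i], lst[j] = lst[j], lst[i]
        lst[i + 1], lst[high] = lst[high], lst[i + 1]
        p = i + 1                              # pivot now sits at index p
        quick_sort_inplace(lst, low, p - 1)    # sort the left partition
        quick_sort_inplace(lst, p + 1, high)   # sort the right partition
    return lst

print(quick_sort_inplace([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Because every swap happens inside the original list, the only extra memory is the recursion stack.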
Example code at the end
Insertion Sort

Now, Insertion Sort deserves a quick shout-out here because it actually reflects how people sort cards by hand. On each iteration, you remove one card from the deck, find where it belongs within the already-sorted cards in your hand, and insert it in its place, repeating until the whole deck is sorted.
The time complexity, as you can guess, is quite poor: in the worst case, every insertion has to scan through the entire sorted portion, so you can reach a quadratic worst-case time complexity of O(n²).
Note that this algorithm has an extremely useful property: because it’s so simple, it can be coded very efficiently, and it works ‘online’, in that it can sort a list as it receives the elements. However, it’s still pretty slow overall, so I’d recommend Quick Sort over it for large lists.
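That ‘online’ property, keeping a running list sorted as new items arrive, can be sketched with the standard library’s bisect module, which performs the same find-the-slot-and-insert step as Insertion Sort:

```python
import bisect

sorted_hand = []
for card in [7, 2, 9, 4]:             # cards arriving one at a time
    bisect.insort(sorted_hand, card)  # insert each into its sorted place
    print(sorted_hand)
# Final hand: [2, 4, 7, 9]
```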
Example code at the end
Conclusion

So, as you can see, the story develops from Bubble Sort to the widely used Quick Sort algorithm, with Insertion Sort showing us the benefits of a simple, intuitive construction, despite it being relatively time-inefficient.
A lot of these algorithms come built into your language or its libraries (for example, the default sort in pandas, in Python, uses Quick Sort), so I wouldn’t worry about implementing them yourself too much. Although, if you expect your data to already be sorted in a particular way, then selectively choosing which sorting algorithm you use can definitely improve efficiency.
Either way, by taking a few extra minutes to think about the problem you’re trying to solve, you can definitely make headway in making quicker and more efficient programs =]
Thanks for reading again!! Let me know if you have any questions and I’ll be happy to help.
Keep up to date with my latest work here!
Example Code

Bubble Sort:

def bubble_sort(array):
    n = len(array)
    for i in range(n):
        already_sorted = True
        for j in range(n - i - 1):
            if array[j] > array[j + 1]:
                array[j], array[j + 1] = array[j + 1], array[j]
                already_sorted = False
        if already_sorted:
            break  # no swaps on this pass, so the list is sorted
    return array
Merge Sort (merge step, assuming the list has a length greater than 1):

while i < len(left_list) and j < len(right_list):
    if left_list[i] < right_list[j]:
        result.append(left_list[i]); i += 1
    else:
        result.append(right_list[j]); j += 1
Quick Sort:

def quick_sort(lst):
    if len(lst) <= 1:
        return lst
    less, equal, greater = [], [], []
    pivot = lst[len(lst) // 2]  # middle element as pivot
    for x in lst:
        if x < pivot:
            less.append(x)
        elif x == pivot:
            equal.append(x)
        else:  # x > pivot
            greater.append(x)
    return quick_sort(less) + equal + quick_sort(greater)
Insertion Sort:

def insertion_sort(lst):
    for i in range(1, len(lst)):
        current = lst[i]
        # Shift larger elements right until current's slot is found
        while i > 0 and lst[i - 1] > current:
            lst[i] = lst[i - 1]
            i = i - 1
        lst[i] = current
    return lst