Sorting Algorithms - A Review

Today we'll review the most common sorting algorithms in computer science. Sorting a list of objects is an extremely common problem when building data driven applications. We may need to sort a list of phone numbers in numerical order, or a list of customer names in lexigraphical order. When we have a large scale distributed system, the number of records that need to be sorted can top 10 million. At that scale, the efficiency of our sorting algorith becomes key.

We'll take a look at our basic sorting algorithms and try to parse out the runtime efficiency of each. Note that the implementation of these algorithms will be in Java, but you can implement them in any coding language.

The Problem

Before we start, think about this problem:

  • Given a list of unsorted objects, design an algorithm to efficienctly put them into sorted order.

Now there are a few questions you should think of right off the bat:

1. What are our objects? Are the letters, numbers, words, things?

This is a good question because it can dictate your actual sorting mechanism. Letters need to be sorted alphabetically or lexigraphically, numbers need to be sorted numerically. If it's a collection of object, you'll need to use an equals() method to determine order. Whenever you come across a sorting problem, always ask yourself what you are sorting.

In our case, let's assume they are positive integers.

2. What is "sorted order"? Is it ascending or descending?

Ascending order means that elements to the left are smaller than elements to the right. Another way to say this is as we traverse along the list, the elements ascend or get larger. This can be written as follows: 0 <= i <= n - 1; array[i] <= array[i + 1]

Descending order means that elements to the left are larger than elements to the right. As you traverse along the list, elements will descend in value or get smalled. This can be wrtten as: 0 <= i <= n - 1; array[i] >= array[i + 1]

For our problem, let's assume we want the integers sorted into ascending order.

With these points clarified, how would you go about solving this problem?

Say your list of integers is java [6, 5, 1, 4, 3]. How would you sort these? Take some time and think about how you would form a solution....

With some ideas in mind, let's take a look at the most intuitive approach: selection sort.

Selection Sort

This algorithm partiitons the input list into two parts. One partition is the already sorted sublist and the second partition is the sublist of remaining elements to be sorted. The algorithm will work its way across the list form left to right, finding the smallest element in the list, swapping it with the leftmost unsorted element, and updating the boundaries of the sublists. It will repeat this iteration until the whole list is sorted.

Selection Sort

Implementation

To implement this sorting algorithm, we'll need a few pieces of functionality.

  1. We'll need a method to swap array elements so that we can move the lowest element in the unsorted list to the leftmost index.

This could be achieved simply:


public static void swap(int i, int j, int[] arr) {
    int temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

  1. We'll need a way to iterate over the array multiple times, while keeping track of the two sub arrays.

We can use a nested for loop to achieve this. The outer for loop will allow use to iterate over every element in the array while the inner for loop will allow us to iterate over all of the so-far unsorted elements.

  1. We'll want to keep track of some sort of running minimum as we're iterating over the unsorted items.

When all of this comes together our algorithm will look like this:

public class Sort {

    public static void main(String[] args) {
        int[] unsorted = { 6, 5, 1, 4, 3 };
        selectionSort(unsorted);
        printArrayOf(unsorted);
    }

    public static int[] selectionSort(int[] arr) {
        int stop = arr.length;
        for (int i = 0; i < stop; i++) {
            int min = i;
            for (int j = i + 1; j < stop; j++) {
                if (arr[j] < arr[min]) min = j;
            }
            if (min != i) swap(i, min, arr);
        }
        return arr;
    }

    private static void swap(int i, int j, int[] arr) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

        private static void printArrayOf(int[] arr) {
        for (int value : arr) {
            System.out.print(" " + value + " ");
        }
    }

}

The runtime complexity of this sorting algortihm is O(n^2) where n is the size of the list. This is because for every item in the list, we have to visit every item, thus we have to visit n^2 items. This, unfortunately, is not very efficient.

Bubble Sort

Our next most common sorting algorithm is Bubble Sort. Bubble sort compares two adjacent elements and swaps them if they are not in the desired order. It continues until the list is sorted. Another way to think about this is to imagine a sliding window of size 2, moving down the array and comparing the elements in the window.

Below is a visual representation of bubble sort:

Bubble Sort

When wathcing this visual, notice how the largest element will get shifted to the rightmost postion for every pass over the array. This is where bubble sort gets its name, as the largest elements "bubble up" to the end of the list with each pass.

Implementation

Bubble sort requires looping over the list as well as swapping elements when they are out of order. Thus we'll need our swap() method from the Selection Sort implementation and also the familiar nested for loops.

    
 public class Sort {   

    public static void main(String[] args) {
        int[] unsorted = { 6, 5, 1, 4, 3 };
        bubbleSort(unsorted);
        printArrayOf(unsorted);
    }
    
    public static int[] bubbleSort(int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            for (int j = 0; j < arr.length - i; j++) {
                if (arr[j + 1] < arr[j]) swap(j, j+1, arr);
            }
        }
        return arr;
    }

    private static void swap(int i, int j, int[] arr) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

    private static void printArrayOf(int[] arr) {
        for (int value : arr) {
            System.out.print(" " + value + " ");
        }
    }
 }

Note the bounds of our for loops are different then they were for Selection Sort. The outer for loop runs until the counter reaches arr.length, but the nested for loop runs until the counter reaches arr.length - i. This is due to the nature of the bubble sort algorithm. For each iteration of the outer loop, the largest element in the array is moved to the rightmost index. Thus it becomes sorted. Our inner loop will not need to check those elements, so we can end our inner loop an ith distance from the end of the array. (If this still doesn't make sense, try drawing out the state of the list at each iteration of the outer for loop and you should see the pattern).

The runtime complexity is the same as selection sort: O(n^2) in the worst case.

Insertion Sort

Next up in the ranks of sorting algorithms is Insertion Sort--an algorithm that may be the most intuitive of the three so far. Insertion sort takes a look at each item in the list and inserts it into the correct spot in the list to achieve sorted order.

Insertion Sort

Think about how this could be implemented in Java. We'll need one for loop iterating over the whole list, but how will we achieve the back-tracking for each element to find its correct spot in the sorted list? Also, notice that as we find the correct spot for each element, that we'll have to shift all the elements in the array up a position as we move the given item back. We can achieve this using a correctly tailored while loop.

Implementation

Check out the implementation of Insertion Sort below.

public class Sort {

    public static void main(String[] args) {
        int[] unsorted = { 6, 5, 1, 4, 3 };
        insertionSort(unsorted);
        printArrayOf(unsorted);
    }

    private static int[] insertionSort(int[] arr) {
        int j;
        int element;
        for (int i = 0; i < arr.length; i++) {
            element = arr[i];
            j = i - 1;
            while (j >= 0 && arr[j] > element) {
                arr[j + 1] = arr[j];
                j--;
            }
            arr[j + 1] = element;
        }
        return arr;
    }

    private static void printArrayOf(int[] arr) {
        for (int value : arr) {
            System.out.print(" " + value + " ");
        }
    }

}

The key to this implementation is getting the numbers right. There's room for off-by-one errors everywhere. We start with two pointers: element is the item we are evaluating to see where it best belongs in the list and j is our pointer to back track along the list to find the correct spot to insert element.

Our first for loop is straight forward. It loops from one end of the list to the other. Inside our for loop we set element to the value in the array at index i. Then we set j to i -1. This way we can compare element with the items behind it. As i increments and we look at items further down the list, we'll have to back track further and further. This is where the while loop comes in. While our j counter is greater than 0 (we haven't gone off the frotnend of the list) and the value in the array at index j is greater than element, then we haven't found the right spot to insert element. So we shift the item at arr[j] to index j + 1 and then we decrement j. This achieves the shifting of items up the array as we backtrack. Once the condition in our while loop fails, we know we've found the index to place element. We set it at the index j + 1.

The algorithm continues this process for all items in the array, ending with a sorted list.

Again, as we've seen with the above algorithms, Insertion Sort has a nested loop structure. If you think about the worst possible case (the array being reverse sorted) then this algorithm would take O(n^2) time to run, where n is the size of the array.

Summary

Let's look at the runtime of all the algorithms we've seen in this post.

SelectionBubbleInsertion
Best CaseO(n^2)O(n)O(n)
Avg. CaseO(n^2)O(n^2)O(n^2)
Worst CaseO(n^2)O(n^2)O(n^2)

In the average and worst cases, these algorithms are all O(n^2), which if you remember your Big-O Complexity chart, is not very good.

Conlcusion

In my next post, we'll look at Merge Sort and Quick Sort, two of the most ubiquitous sorting algorithms in computer science.