Adventures in Merge Sorting with Parallelism

Some parallel algorithms aren't actually faster

Feb. 19, 2019, 9:14 p.m.

I was recently tasked with implementing a parallel version of a common sorting algorithm, specifically MergeSort. When I first started thinking about how to tackle this problem I thought it would be a simple task. Naively, I assumed that I could just hand sublists to different threads, and have them sort them and merge them all together at the end. I set off to implement this design in Java and ended up with something like the following, with each of the Threads running a Sorter that followed the basic implementation of the MergeSort algorithm:

private void sortParallel() {
    int itemsPerSublist = randomIntList.size() / NUMBER_OF_THREADS;
    List<Thread> threads = new ArrayList<>();
    List<Sorter<Integer>> sorters = new ArrayList<>();
    for (int i = 0; i < NUMBER_OF_THREADS; i++) {
        List<Integer> sublist = new ArrayList<>();
        for (int j = 0; j < itemsPerSublist; j++) {

    Sorter<Integer> sorter = new Sorter<>(sublist);
    Thread sorterThread = new Thread(sorter);;

    for (Thread thread : threads) {

    List<Integer> result = sorters.get(0).getAllElements();
    for (int i = 1; i < sorters.size(); i++) {
        result = mergeLists(result, sorters.get(i).getAllElements());

I was surprised to find that, depending on the number of threads I chose to run this algorithm with, the sequential version was much faster! In fact, the more threads I used the slower the parallel version got. Even though I was benchmarking this algorithm with millions of elements to sort, the overhead of: creating the Threads, waiting for them all to halt, and merging the resulting lists together to get the final sorted list was way too heavy to justify the speed up from completing the initial sorting in parallel. I won't bore you with hard numbers here, but suffice it to say if more than around four Threads were used it was nearly twice as slow as the sequential version of MergeSort!

I went back to the drawing board still believing it possible to achieve an actual speed up with a parallel version of MergeSort. I soon came up with a different way to parallelize MergeSort that involved passing a synchronized list of the currently sorted elements around amongst the Threads. This required quite a few changes to my algorithm of which the most interesting follow:

public class Sorter<T extends Comparable> implements Runnable {
    private List<T> elementsToSort;
    private List<T> sortedElements;

    public Sorter(List<T> elementsToSort, List<T> sortedElements) {
        this.elementsToSort = elementsToSort;
        this.sortedElements = sortedElements;

    public void run() {
        List<T> newlySortedElements = mergeSort(elementsToSort);

    private synchronized void mergeWithCurrentResults(List<T> newlySortedElements) {

    private List<T> mergeSort(List<T> elements) {

    private void mergeWithSortedElements(List<T> elements) {
        int index = 0;
        while (index < sortedElements.size() && !elements.isEmpty()) {
            if (sortedElements.get(index).compareTo(elements.get(0)) > 0) {
                sortedElements.add(index, elements.remove(0));

        // There may be leftover elements to add at the end

Upon benchmarking this version of my parallel MergeSort algorithm I found that there was an actual speedup over the sequential version even when using more than two or even eight Threads! With some benchmarking and rethinking of my solution I was able to succeed at creating a much quicker parallel version of the MergeSort algorithm. If you care to look at the entirety of the final algorithm it can be found here.

Comments about Adventures in Merge Sorting with Parallelism

No one has left a comment yet, be the first to voice your opinion on Adventures in Merge Sorting with Parallelism!