Comparative Analysis of Sorting Algorithms

As part of a typical university algorithms exercise, I ran a small comparative analysis of the most popular sorting algorithms. The idea was to study complexity and see how different approaches to the same problem can produce very different execution times.

This is a simple academic analysis, but I wanted to document it in case it helps other computer science students in the future.

I started with a small Java program that generates random five-digit numbers and stores them in a text file, so I could benchmark all algorithms against exactly the same dataset. The program is available in this repository:

# File path
> algorithms/java/RandomNumbers.java

# Compile and run the generator
$ javac RandomNumbers.java && java RandomNumbers

That produces numbers/numbers.txt containing the n random numbers configured in the program. In my experiments I generated up to 1,000,000,000 values (about 5 GB), so I did not include that dataset in the repository.
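The actual generator is the Java program above; for readers who prefer C (the language of the sort implementations below), an equivalent generator might look like this — the function name and explicit seed parameter are my own illustration, not the repository's API:

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative C equivalent of RandomNumbers.java: write `count`
   random five-digit numbers (10000-99999), one per line, to `path`.
   Returns 0 on success, -1 if the file cannot be opened. */
int writeRandomNumbers(const char *path, long count, unsigned seed) {
    FILE *f = fopen(path, "w");
    if (!f) return -1;
    srand(seed);
    for (long i = 0; i < count; i++)
        fprintf(f, "%d\n", 10000 + rand() % 90000);
    return fclose(f);
}
```

Writing the numbers to a file once, rather than regenerating them per run, is what guarantees every algorithm on every machine sorts exactly the same data.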


Evaluated algorithms

I implemented the following seven algorithms in C, under algorithms/c/sortAlgorithms:

  • Bubble Sort — O(n²)
  • Counting Sort — O(n+k)
  • Heap Sort — O(n log n)
  • Insertion Sort — O(n²)
  • Merge Sort — O(n log n)
  • Quicksort — O(n log n)
  • Selection Sort — O(n²)


Benchmark automation

Since this required many runs, I automated execution with two scripts:

# Base benchmark script
> algorithms/c/benchmark.c

# Run one benchmark
$ gcc benchmark.c -o benchmark.out && ./benchmark.out arg1 arg2

# Multi-run script
> algorithms/c/runTest.c

# Run all tests
$ gcc runTest.c -o runTest.out && ./runTest.out
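At its core, the benchmark times a single sort call with the C clock. A minimal sketch of that timing logic — the `timeRun` and `demoSort` names are illustrative, not the repository's; the real benchmark.c reads numbers.txt and takes its parameters on the command line:

```c
#include <time.h>

/* Tiny sort used only to exercise the timer below (illustrative). */
static void demoSort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i], j = i - 1;
        while (j >= 0 && a[j] > key) { a[j + 1] = a[j]; j--; }
        a[j + 1] = key;
    }
}

/* Time one run of a sorting function, in seconds of CPU time. */
double timeRun(void (*sort)(int *, int), int *data, int n) {
    clock_t start = clock();
    sort(data, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Using `clock()` measures CPU time rather than wall time, which makes the numbers less sensitive to other processes on the droplet.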

I ran this in a small controlled environment to minimize interference from other processes.

I created two DigitalOcean droplets:

Digital Ocean droplets
The two DigitalOcean droplets used as benchmark machines — M1 (1 core, 1 GB RAM) and M2 (2 cores, 2 GB RAM).

The second machine had double the resources, so I expected it to perform better.

I also installed the Java and C toolchains on both machines using this script: ServerConfig/provision.sh

sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y build-essential gcc python-dev python-pip python-setuptools
sudo apt-get install -y git
sudo apt-get install default-jre -y
sudo apt-get install default-jdk -y
sudo apt-get install openjdk-7-jre -y
sudo apt-get install openjdk-7-jdk -y

Results

Both machines used the same random dataset, with growing input sizes. Full results are in results/analysis.ods.

I also used this background execution trick:

$ gcc runTest.c -o runTest.out && ./runTest.out
# Ctrl + Z to suspend the foreground job
$ disown -h %1   # keep the job alive after logout (ignore SIGHUP)
$ bg 1           # resume the suspended job in the background

After 3-4 days, the O(n^2) algorithms had only reached 1,600,000 elements, so I stopped there and analyzed.

M1 = Machine 1 (1 core, 1 GB RAM)
M2 = Machine 2 (2 cores, 2 GB RAM)

Bubble Sort: O(n^2)

Bubble Sort M1
Bubble Sort on Machine 1 — execution time grows sharply as input size increases past ~1 million elements.
Bubble Sort M2
Bubble Sort on Machine 2 — slower per-core clock (1.8 GHz vs 2.4 GHz) results in worse performance despite more RAM.
Bubble Sort M1 vs M2
Bubble Sort M1 vs M2 — single-threaded O(n²) growth clearly visible; M1 outperforms due to higher CPU frequency.
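For reference, a minimal C bubble sort like the one benchmarked here (the repository version may differ in details):

```c
/* Bubble sort: each pass swaps adjacent out-of-order pairs, bubbling
   the largest remaining element to the end. O(n^2) comparisons. */
void bubbleSort(int a[], int n) {
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - 1 - i; j++)
            if (a[j] > a[j + 1]) {
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
            }
}
```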

Counting Sort: O(n+k)

Counting Sort M1
Counting Sort on Machine 1 — near-flat O(n+k) curve, finishing in milliseconds even at 1.6M elements.
Counting Sort M2
Counting Sort on Machine 2 — similar flat growth; the O(n+k) advantage holds regardless of hardware tier.
Counting Sort M1 vs M2
Counting Sort M1 vs M2 — both machines show nearly identical sub-millisecond times, confirming the O(n+k) bound.
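A minimal counting sort sketch for non-negative integers up to `max` (for this dataset, max = 99999); the repository implementation may differ:

```c
#include <stdlib.h>

/* Counting sort for values in [0, max]: O(n + k) time and O(k) extra
   memory, where k = max + 1. */
void countingSort(int a[], int n, int max) {
    int *count = calloc((size_t)max + 1, sizeof *count);
    if (!count) return;                /* allocation failed */
    for (int i = 0; i < n; i++)
        count[a[i]]++;                 /* histogram of values */
    int idx = 0;
    for (int v = 0; v <= max; v++)     /* rewrite values in order */
        while (count[v]-- > 0)
            a[idx++] = v;
    free(count);
}
```

The `calloc` of k counters is exactly the memory cost discussed later: it is tiny for five-digit numbers but explodes if the value range is large.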

Heap Sort: O(n log n)

Heap Sort M1
Heap Sort on Machine 1 — O(n log n) growth, staying well under 2 seconds at 1.6M elements.
Heap Sort M2
Heap Sort on Machine 2 — consistent O(n log n) performance; slightly slower than M1 due to lower clock speed.
Heap Sort M1 vs M2
Heap Sort M1 vs M2 — overlapping curves confirm that single-threaded performance favors higher CPU frequency.
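A minimal in-place heap sort sketch (the repository version may differ):

```c
/* Sift the element at `root` down into place within the max-heap
   stored in a[0..n-1]. */
static void siftDown(int a[], int n, int root) {
    for (;;) {
        int largest = root;
        int left = 2 * root + 1, right = 2 * root + 2;
        if (left < n && a[left] > a[largest]) largest = left;
        if (right < n && a[right] > a[largest]) largest = right;
        if (largest == root) break;
        int tmp = a[root]; a[root] = a[largest]; a[largest] = tmp;
        root = largest;
    }
}

/* Heap sort: build a max-heap, then repeatedly move the root
   (current maximum) to the end of the array. O(n log n), in place. */
void heapSort(int a[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)
        siftDown(a, n, i);
    for (int end = n - 1; end > 0; end--) {
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp;
        siftDown(a, end, 0);
    }
}
```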

Insertion Sort: O(n^2)

Insertion Sort M1
Insertion Sort on Machine 1 — O(n²) curve steeper than Selection Sort but far below Bubble Sort.
Insertion Sort M2
Insertion Sort on Machine 2 — slower per-core speed makes the O(n²) penalty more apparent at larger inputs.
Insertion Sort M1 vs M2
Insertion Sort M1 vs M2 — M1 leads throughout, reinforcing that clock speed dominates for sequential algorithms.
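A minimal insertion sort sketch (the repository version may differ):

```c
/* Insertion sort: grow a sorted prefix by shifting larger elements
   right and dropping each new key into place. O(n^2) worst case. */
void insertionSort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}
```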

Merge Sort: O(n log n)

Merge Sort M1
Merge Sort on Machine 1 — stable O(n log n) curve, competitive with Heap Sort and Quicksort.
Merge Sort M2
Merge Sort on Machine 2 — consistent performance; extra RAM may help with the memory allocations Merge Sort requires.
Merge Sort M1 vs M2
Merge Sort M1 vs M2 — results are close, showing the algorithm's memory usage can partially offset the CPU frequency gap.
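A minimal top-down merge sort sketch; the O(n) auxiliary buffer is the memory cost the captions above refer to (the repository version may differ):

```c
#include <stdlib.h>
#include <string.h>

/* Merge sorted halves a[lo..mid) and a[mid..hi) via the buffer. */
static void merge(int a[], int tmp[], int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof *a);
}

static void msort(int a[], int tmp[], int lo, int hi) {
    if (hi - lo < 2) return;
    int mid = lo + (hi - lo) / 2;
    msort(a, tmp, lo, mid);
    msort(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}

/* Merge sort: O(n log n) time, O(n) auxiliary memory. */
void mergeSort(int a[], int n) {
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (!tmp) return;
    msort(a, tmp, 0, n);
    free(tmp);
}
```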

Quicksort: O(n log n)

Quick Sort M1
Quicksort on Machine 1 — second fastest overall, staying consistently under 0.5 seconds up to 1.6M elements.
Quick Sort M2
Quicksort on Machine 2 — slightly slower than M1 but still well within the O(n log n) fast group.
Quick Sort M1 vs M2
Quicksort M1 vs M2 — both machines track closely, confirming Quicksort's practical efficiency on random data.
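A minimal quicksort sketch using a Lomuto partition with the last element as pivot; the repository's pivot strategy may differ:

```c
static void swapInt(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Lomuto partition: place the pivot (last element) into its final
   position and return that index. */
static int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) swapInt(&a[i++], &a[j]);
    swapInt(&a[i], &a[hi]);
    return i;
}

/* Quicksort over a[lo..hi] inclusive: O(n log n) expected, in place. */
void quickSort(int a[], int lo, int hi) {
    if (lo < hi) {
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }
}
```

A fixed last-element pivot is the simplest choice and behaves well on random data like this benchmark's; production variants randomize the pivot to dodge the O(n²) worst case.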

Selection Sort: O(n^2)

Selection Sort M1
Selection Sort on Machine 1 — O(n²) growth, faster than Bubble Sort but still far behind the O(n log n) group.
Selection Sort M2
Selection Sort on Machine 2 — similar quadratic scaling; fewer swaps than Bubble Sort but same asymptotic bound.
Selection Sort M1 vs M2
Selection Sort M1 vs M2 — M1 consistently faster; clock speed advantage is the deciding factor for single-threaded O(n²) work.
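A minimal selection sort sketch (the repository version may differ):

```c
/* Selection sort: find the smallest remaining element and swap it to
   the front. Always O(n^2) comparisons, but at most n-1 swaps. */
void selectionSort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min]) min = j;
        if (min != i) {
            int t = a[i]; a[i] = a[min]; a[min] = t;
        }
    }
}
```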

Full comparison chart

All algorithms M1
All seven algorithms on Machine 1 — the O(n²) group dominates the scale; the fast group clusters near the x-axis.
All algorithms M2
All seven algorithms on Machine 2 — same pattern as M1; Bubble Sort's curve dwarfs everything else.
All algorithms M1 vs M2
All algorithms on both machines — M1's higher clock speed keeps it faster despite M2's greater core count and RAM.

As expected, Bubble Sort is the clear loser. The fast group (quickSort, mergeSort, heapSort, countingSort) overlaps because of chart scale.


Last seven response-time measurements (seconds)

Machine 1

Size      | BubbleSort   | CountingSort | HeapSort | InsertionSort | MergeSort | QuickSort | SelectionSort
1,000,000 | 5584.254499  | 0.016609     | 0.747395 | 2592.498977   | 0.704281  | 0.291499  | 1935.487457
1,100,000 | 6637.222252  | 0.019187     | 0.925764 | 3171.445715   | 0.653455  | 0.471039  | 2269.966268
1,200,000 | 8045.953682  | 0.023652     | 0.913537 | 3722.638885   | 0.513099  | 0.239454  | 2783.279525
1,300,000 | 10169.383378 | 0.045208     | 0.713308 | 4824.250285   | 0.575149  | 0.261289  | 3514.914589
1,400,000 | 12053.658798 | 0.034613     | 1.489084 | 5658.739951   | 0.676112  | 0.279478  | 4066.729922
1,500,000 | 13798.854123 | 0.027525     | 1.094257 | 6555.365499   | 0.743651  | 0.315602  | 4839.340426
1,600,000 | 15205.680544 | 0.028478     | 0.996648 | 6794.512119   | 0.725347  | 0.325990  | 5056.213092

Machine 2

Size      | BubbleSort   | CountingSort | HeapSort | InsertionSort | MergeSort | QuickSort | SelectionSort
1,000,000 | 7069.317038  | 0.032415     | 0.752168 | 3178.694237   | 0.558200  | 0.315689  | 2454.531144
1,100,000 | 8458.150387  | 0.024157     | 0.842038 | 3666.359787   | 0.557481  | 0.284579  | 2804.449695
1,200,000 | 9495.898708  | 0.026530     | 0.882819 | 4084.581924   | 0.616636  | 0.358502  | 3081.748250
1,300,000 | 10626.023771 | 0.027309     | 0.913814 | 4933.201883   | 0.753890  | 0.401456  | 3912.714921
1,400,000 | 13439.250082 | 0.030009     | 1.061221 | 5790.797804   | 0.633180  | 0.442449  | 4066.729922
1,500,000 | 15102.736592 | 0.031826     | 1.064744 | 6565.630358   | 0.788551  | 0.400238  | 5114.565289
1,600,000 | 16483.694808 | 0.039298     | 1.365129 | 7311.347004   | 0.760618  | 0.449284  | 5676.768371

Fast algorithms only

Fastest sorting algorithms comparison — Machine 1 benchmark results
Fast algorithms on Machine 1 — Counting Sort leads at near-zero time; Quicksort and Heap Sort trail closely.
Fastest sorting algorithms comparison — Machine 2 benchmark results
Fast algorithms on Machine 2 — Counting Sort still dominates; note slight uptick in Heap Sort at larger sizes.
Fastest sorting algorithms comparison — Machine 1 vs Machine 2 side-by-side
Fast algorithms side-by-side on both machines — the performance gap between algorithms outweighs hardware differences.

Counting Sort wins in this setup (O(n+k)), but it has practical limits:

  • It only works on integer ranges.
  • Memory usage can explode when max - min is large.

Quicksort was second and is broadly useful, but it can degrade to O(n^2) when pivot choices interact badly with the input distribution (for example, a naive last-element pivot on already-sorted data).


Machine 1 vs Machine 2

Why was M1 sometimes faster than M2 despite fewer resources?

Because these implementations are single-threaded and were not parallelized. Even with two cores, each run on M2 effectively used only one core.

Per-core frequency mattered:

  • M1: 2399.998 MHz
  • M2: 1799.998 MHz

System architecture screenshots

System architecture of benchmark Machine 1 — DigitalOcean droplet specifications
Machine 1 specs via lscpu: 1 core at 2399.998 MHz — higher clock speed compensated for the smaller resource pool.
System architecture of benchmark Machine 2 — DigitalOcean droplet specifications
Machine 2 specs via lscpu: 2 cores at 1799.998 MHz — more cores and RAM but lower per-core frequency.

Frequency units (decimal SI prefixes; the 1024 factor applies to binary memory prefixes like KiB, not to Hz):

  • 1 Hz = one cycle per second
  • 1 kHz = 1,000 Hz
  • 1 MHz = 1,000 kHz
  • 1 GHz = 1,000 MHz
  • 1 THz = 1,000 GHz

Command used:

$ lscpu

Extended memory-capacity phase

To better use machine resources, I ran only the efficient algorithms at larger scales:

Fast algorithms at maximum memory capacity — Machine 1 benchmark results
Fast algorithms pushed to M1's memory limit — curves diverge as dataset size approaches available RAM.
Fast algorithms at maximum memory capacity — Machine 2 benchmark results
Fast algorithms at M2's memory ceiling — extra RAM extends the viable dataset range before degradation.

This is where M2’s extra RAM clearly helped.

Counting Sort at max scale

Counting Sort at maximum memory scale — Machine 1 benchmark results
Counting Sort at max scale on M1 — still near-linear, but memory constraints cap the practical upper limit.
Counting Sort at maximum memory scale — Machine 2 benchmark results
Counting Sort at max scale on M2 — extra RAM lets it handle larger integer ranges before hitting its memory bound.
Counting Sort at maximum memory scale — Machine 1 vs Machine 2 comparison
Counting Sort M1 vs M2 at max scale — M2's additional RAM provides a clear advantage for large integer datasets.

Heap Sort at max scale

Heap Sort at maximum memory scale — Machine 1 benchmark results
Heap Sort at max scale on M1 — O(n log n) curve remains well-behaved even as dataset size grows large.
Heap Sort at maximum memory scale — Machine 2 benchmark results
Heap Sort at max scale on M2 — slightly slower than M1 but stable; in-place sorting keeps memory usage low.
Heap Sort at maximum memory scale — Machine 1 vs Machine 2 comparison
Heap Sort M1 vs M2 at max scale — the CPU frequency gap widens slightly with larger inputs.

Merge Sort at max scale

Merge Sort at maximum memory scale — Machine 1 benchmark results
Merge Sort at max scale on M1 — auxiliary memory allocations become measurable at very large inputs.
Merge Sort at maximum memory scale — Machine 2 benchmark results
Merge Sort at max scale on M2 — extra RAM reduces memory pressure, keeping the curve comparatively smooth.
Merge Sort at maximum memory scale — Machine 1 vs Machine 2 comparison
Merge Sort M1 vs M2 at max scale — M2's extra RAM helps offset the lower clock speed for this memory-intensive algorithm.

Quick Sort at max scale

Quick Sort at maximum memory scale — Machine 1 benchmark results
Quicksort at max scale on M1 — remains among the fastest; in-place design keeps memory overhead minimal.
Quick Sort at maximum memory scale — Machine 2 benchmark results
Quicksort at max scale on M2 — comparable to M1; randomized input avoids worst-case O(n²) degradation.
Quick Sort at maximum memory scale — Machine 1 vs Machine 2 comparison
Quicksort M1 vs M2 at max scale — curves nearly overlap; neither machine gains a decisive edge over the other.

Final fast-algorithm comparison

All fast algorithms at maximum memory scale — Machine 1 benchmark results
All fast algorithms at M1's memory ceiling — curves separate clearly, revealing the practical ordering: Counting, Quick, Merge, Heap.
All fast algorithms at maximum memory scale — Machine 2 benchmark results
All fast algorithms at M2's memory ceiling — extra RAM lets the comparison extend further before results become unreliable.
All fast algorithms at maximum memory scale — Machine 1 vs Machine 2 comparison
Final comparison at max scale on both machines — behavior aligns with theoretical complexity at large n.

With larger input volumes, curves became more stable and easier to compare against theoretical complexity behavior.


Closing thoughts

This exercise reinforced a few core lessons:

  • Algorithmic complexity matters, but implementation and hardware context matter too.
  • Extra RAM may not speed single-threaded runs directly, but it can increase practical dataset limits.
  • More cores do not help unless the implementation is actually parallel.

If you are getting started in computer science, I hope this serves as a useful reference.

Resources: GitHub repository


“Intelligence consists not only in knowledge, but also in the skill to apply knowledge in practice.”
Aristotle

Sergio Alexander Florez Galeano

CTO & Co-founder at DailyBot (YC S21). I write about building products, startups, and the craft of software engineering.