Competitive Coding

Unit 6: Efficient Sorting Algorithms & Analysis

From O(n²) to O(n log n) — master Merge Sort, Quick Sort, Counting Sort, and sorting-based problem solving for competitive programming.

⏱️ 8 hrs theory + 6 hrs practice | 💰 Earning Potential: ₹8,000–₹30,000/month | 📝 30 MCQs (Bloom's Mapped)

💼 Jobs this unlocks: SDE-I (₹6–12 LPA) | Backend Developer (₹5–10 LPA) | Competitive Programmer

Section A

Opening Hook — The Algorithm Behind Every "Sort By" Button

🔢 How Naukri.com Sorts 10 Crore Resumes in Milliseconds

Every time a recruiter clicks "Sort by Relevance" on Naukri.com, an O(n log n) algorithm fires across 10 crore+ resumes. Amazon India sorts 50 crore products when you click "Price: Low to High." Swiggy sorts 5,000 restaurants by delivery time when you open the app. Every single "Sort by" button you click triggers an O(n log n) algorithm.

These companies can't use Bubble Sort — it would take hours on 10 crore records. They use Merge Sort, Quick Sort, or hybrid sorts like Timsort and IntroSort — algorithms that finish in milliseconds even on massive datasets. The difference between O(n²) and O(n log n) isn't academic — it's the difference between a website that loads instantly and one that crashes.

After this unit, YOU will understand exactly how these systems work — and you'll implement them from scratch, optimize them, and use them to solve competitive programming problems that earn you job offers at these very companies.

🇮🇳 Naukri.com🇮🇳 Amazon India🇮🇳 Flipkart🇮🇳 Paytm🇮🇳 Swiggy🌍 Google

Sorting is the most studied problem in computer science. Donald Knuth devoted an entire volume (Vol. 3) of "The Art of Computer Programming" to sorting and searching alone. More than 50% of all CPU cycles in the world are spent on sorting operations. In competitive programming, 40% of problems require sorting as a preprocessing step.

Section B

Learning Outcomes — Bloom's Taxonomy Mapped

Bloom's Level	Learning Outcome
🔵 Remember	List the time and space complexities of Merge Sort, Quick Sort, and Counting Sort
🔵 Remember	Define stable sorting vs unstable sorting with examples
🔵 Understand	Explain why comparison-based sorting has an Ω(n log n) lower bound using the decision tree argument
🔵 Understand	Explain the divide-and-conquer approach in Merge Sort and Quick Sort with recursion tree diagrams
🟢 Apply	Implement Merge Sort (iterative & recursive) and Quick Sort in C++ and Python
🟢 Apply	Apply Counting Sort to sort elements within a limited integer range
🟢 Analyze	Compare Merge Sort vs Quick Sort on 8 parameters including stability, space, and cache performance
🟢 Analyze	Analyze when to use which sorting algorithm based on input size, data distribution, and constraints
🟠 Evaluate	Evaluate the impact of pivot selection strategies on Quick Sort's worst-case performance
🟠 Evaluate	Assess stability requirements in real-world sorting scenarios (database records, e-commerce)
🟠 Create	Design a frequency-based sorting solution for trending hashtags problem
🟠 Create	Construct an optimized sorting strategy combining multiple algorithms for a competitive programming problem

Section C

Concept Explanation — Efficient Sorting from Scratch

1. Introduction to O(n log n) Sorting

In competitive coding, time limits are strict — usually 1–2 seconds. A modern computer executes roughly 10⁸ operations per second. If you use Bubble Sort (O(n²)) on n = 10⁶ elements, you need 10¹² operations — that's 10,000 seconds. You'd get TLE (Time Limit Exceeded) every single time.

Why O(n²) Sorts Fail in Competitive Programming

n (Input Size)	O(n²) Operations	O(n log n) Operations	Speed Difference
1,000	1,000,000	~10,000	100×
10,000	100,000,000	~133,000	750×
100,000	10,000,000,000	~1,700,000	5,800×
1,000,000	1,000,000,000,000	~20,000,000	50,000×

At n = 10⁶, O(n log n) takes ~0.2 seconds. O(n²) takes ~10,000 seconds. That's the difference between AC (Accepted) and TLE.

📐 The Lower Bound: Why We Can't Beat O(n log n) with Comparisons

THE DECISION TREE ARGUMENT

Any comparison-based sorting algorithm can be modeled as a binary decision tree. Each internal node represents a comparison (a[i] < a[j]?), and each leaf represents a possible sorted permutation.

Key facts:

For n elements, there are n! possible permutations
A binary tree of height h has at most 2ʰ leaves
We need 2ʰ ≥ n!, therefore h ≥ log₂(n!)
By Stirling's approximation: log₂(n!) ≈ n log₂(n) − n log₂(e) ≈ n log₂(n)

Conclusion: Any comparison-based sort needs at least Ω(n log n) comparisons in the worst case. Merge Sort and Heap Sort achieve this bound. Quick Sort achieves it on average.

O(n²) sorts are still useful when: (1) n is very small (n < 50), where constant factors matter more; (2) the array is nearly sorted (Insertion Sort runs in O(n) on nearly sorted data); (3) you need simplicity over speed (embedded systems with tiny memory).

2. Iterative Merge Sort (Bottom-Up)

Intuition: Instead of recursively splitting the array, start from the bottom. First merge pairs of single elements, then pairs of size 2, then pairs of size 4, and so on until the whole array is sorted. It's like building a pyramid from the base up.

Step-by-Step Trace: [38, 27, 43, 3, 9, 82, 10]

Pass	Sub-array Size	Array State
Initial	—	[38, 27, 43, 3, 9, 82, 10]
Pass 1	1 → merge pairs	[27, 38, 3, 43, 9, 82, 10]
Pass 2	2 → merge quads	[3, 27, 38, 43, 9, 10, 82]
Pass 3	4 → merge octets	[3, 9, 10, 27, 38, 43, 82]

C++ Implementation

C++
#include <iostream>
#include <vector>
using namespace std;

void merge(vector<int>& arr, int l, int m, int r) {
    vector<int> temp;
    int i = l, j = m + 1;
    while (i <= m && j <= r) {
        if (arr[i] <= arr[j]) temp.push_back(arr[i++]);
        else temp.push_back(arr[j++]);
    }
    while (i <= m) temp.push_back(arr[i++]);
    while (j <= r) temp.push_back(arr[j++]);
    for (int k = l; k <= r; k++)
        arr[k] = temp[k - l];
}

void iterativeMergeSort(vector<int>& arr) {
    int n = arr.size();
    // Start with sub-arrays of size 1, double each pass
    for (int size = 1; size < n; size *= 2) {
        for (int left = 0; left < n - size; left += 2 * size) {
            int mid = left + size - 1;
            int right = min(left + 2 * size - 1, n - 1);
            merge(arr, left, mid, right);
        }
    }
}

Python Implementation

Python
def iterative_merge_sort(arr):
    n = len(arr)
    size = 1
    while size < n:
        for left in range(0, n - size, 2 * size):
            mid = left + size - 1
            right = min(left + 2 * size - 1, n - 1)
            # Merge arr[left..mid] and arr[mid+1..right]
            merged = []
            i, j = left, mid + 1
            while i <= mid and j <= right:
                if arr[i] <= arr[j]:
                    merged.append(arr[i]); i += 1
                else:
                    merged.append(arr[j]); j += 1
            while i <= mid:
                merged.append(arr[i]); i += 1
            while j <= right:
                merged.append(arr[j]); j += 1
            arr[left:right+1] = merged
        size *= 2
    return arr

Property	Value
Time (Best)	O(n log n)
Time (Average)	O(n log n)
Time (Worst)	O(n log n)
Space	O(n)
Stable?	✅ Yes
In-place?	❌ No
When to use	When recursion stack depth is a concern (very large arrays), or in embedded systems with limited stack

3. Recursive Merge Sort (Top-Down)

Intuition: Imagine you have a deck of unsorted cards. Split the deck in half. Give each half to a friend. Each friend splits their half again and again until they hold just one card (which is trivially sorted). Then everyone merges their sorted halves back together. That's Merge Sort.

Algorithm

Divide: Split the array into two halves
Conquer: Recursively sort each half
Combine: Merge the two sorted halves into one sorted array

Full Recursion Tree: [38, 27, 43, 3, 9, 82, 10]

🌳 Recursion Tree Trace

                    [38, 27, 43, 3, 9, 82, 10]
                   /                            \
          [38, 27, 43, 3]                   [9, 82, 10]
          /            \                    /          \
      [38, 27]      [43, 3]            [9, 82]       [10]
      /     \       /     \            /     \          |
    [38]   [27]   [43]   [3]        [9]    [82]       [10]
      \     /       \     /            \     /          |
      [27, 38]      [3, 43]          [9, 82]         [10]
          \            /                \              /
        [3, 27, 38, 43]                 [9, 10, 82]
                   \                        /
              [3, 9, 10, 27, 38, 43, 82]

The Merge Procedure — Step by Step

Merging [3, 27, 38, 43] and [9, 10, 82]:

Step	Compare	Pick	Result So Far
1	3 vs 9	3	[3]
2	27 vs 9	9	[3, 9]
3	27 vs 10	10	[3, 9, 10]
4	27 vs 82	27	[3, 9, 10, 27]
5	38 vs 82	38	[3, 9, 10, 27, 38]
6	43 vs 82	43	[3, 9, 10, 27, 38, 43]
7	Left exhausted	82	[3, 9, 10, 27, 38, 43, 82]

C++ Implementation

C++
#include <iostream>
#include <vector>
using namespace std;

void merge(vector<int>& arr, int l, int m, int r) {
    int n1 = m - l + 1, n2 = r - m;
    vector<int> L(n1), R(n2);
    for(int i = 0; i < n1; i++) L[i] = arr[l + i];
    for(int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];

    int i = 0, j = 0, k = l;
    while(i < n1 && j < n2) {
        if(L[i] <= R[j]) arr[k++] = L[i++];
        else arr[k++] = R[j++];
    }
    // Copy remaining elements
    while(i < n1) arr[k++] = L[i++];
    while(j < n2) arr[k++] = R[j++];
}

void mergeSort(vector<int>& arr, int l, int r) {
    if(l < r) {
        int m = l + (r - l) / 2;
        mergeSort(arr, l, m);       // Sort left half
        mergeSort(arr, m + 1, r);  // Sort right half
        merge(arr, l, m, r);       // Merge sorted halves
    }
}

int main() {
    vector<int> arr = {38, 27, 43, 3, 9, 82, 10};
    mergeSort(arr, 0, arr.size() - 1);
    for(int x : arr) cout << x << " ";
    return 0;
}

Python Implementation

Python
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

arr = [38, 27, 43, 3, 9, 82, 10]
print(merge_sort(arr))
# Output: [3, 9, 10, 27, 38, 43, 82]

Forgetting remaining elements after the merge loop. After the main while loop in merge, one of the two halves may still have elements left. You MUST copy them. Many students get partially sorted arrays because they forget the two trailing while loops.

Flipkart's search engine uses Merge Sort internally for sorting search results when stability is required — e.g., when products with the same price must maintain their original order (by seller rating). Python's built-in sorted() uses Timsort, which is a hybrid of Merge Sort + Insertion Sort.

4. Quick Sort

Intuition: Pick one element (the "pivot"). Put all smaller elements to its left and all larger elements to its right. Now the pivot is in its correct sorted position. Recursively do the same for the left and right sub-arrays.

Lomuto Partition Scheme

The simpler scheme: pivot is the last element. Maintain a pointer i that tracks the boundary of "elements ≤ pivot."

Full Trace: [10, 80, 30, 90, 40, 50, 70] — Lomuto, Pivot = 70

Step	j	arr[j]	Action	Array State	i
Init	—	—	i = -1, pivot = 70	[10, 80, 30, 90, 40, 50, 70]	-1
1	0	10	10 ≤ 70 → i++, swap(0,0)	[10, 80, 30, 90, 40, 50, 70]	0
2	1	80	80 > 70 → skip	[10, 80, 30, 90, 40, 50, 70]	0
3	2	30	30 ≤ 70 → i++, swap(1,2)	[10, 30, 80, 90, 40, 50, 70]	1
4	3	90	90 > 70 → skip	[10, 30, 80, 90, 40, 50, 70]	1
5	4	40	40 ≤ 70 → i++, swap(2,4)	[10, 30, 40, 90, 80, 50, 70]	2
6	5	50	50 ≤ 70 → i++, swap(3,5)	[10, 30, 40, 50, 80, 90, 70]	3
Final	—	—	swap(i+1, pivot) = swap(4,6)	[10, 30, 40, 50, 70, 90, 80]	4

After partition: 70 is at index 4 — its correct sorted position. Left sub-array [10,30,40,50] and right sub-array [90,80] are recursively sorted.

Hoare Partition Scheme

Faster in practice: uses two pointers starting from both ends, moving inward. Fewer swaps on average than Lomuto.

C++ — Lomuto Partition + Quick Sort

C++
#include <iostream>
#include <vector>
using namespace std;

int partition(vector<int>& arr, int low, int high) {
    int pivot = arr[high];  // Pivot = last element
    int i = low - 1;
    for(int j = low; j < high; j++) {
        if(arr[j] <= pivot) {
            i++;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i + 1], arr[high]);
    return i + 1;
}

void quickSort(vector<int>& arr, int low, int high) {
    if(low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}

int main() {
    vector<int> arr = {10, 80, 30, 90, 40, 50, 70};
    quickSort(arr, 0, arr.size() - 1);
    for(int x : arr) cout << x << " ";
    return 0;
}

C++ — Hoare Partition

C++
int hoarePartition(vector<int>& arr, int low, int high) {
    int pivot = arr[low];
    int i = low - 1, j = high + 1;
    while(true) {
        do { i++; } while(arr[i] < pivot);
        do { j--; } while(arr[j] > pivot);
        if(i >= j) return j;
        swap(arr[i], arr[j]);
    }
}

Python — Quick Sort

Python
def partition(arr, low, high):
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def quick_sort(arr, low, high):
    if low < high:
        pi = partition(arr, low, high)
        quick_sort(arr, low, pi - 1)
        quick_sort(arr, pi + 1, high)

arr = [10, 80, 30, 90, 40, 50, 70]
quick_sort(arr, 0, len(arr) - 1)
print(arr)
# Output: [10, 30, 40, 50, 70, 80, 90]

Randomized Quick Sort

To avoid worst case on sorted/reverse-sorted arrays, choose a random pivot:

C++
#include <cstdlib>

int randomPartition(vector<int>& arr, int low, int high) {
    int randIdx = low + rand() % (high - low + 1);
    swap(arr[randIdx], arr[high]); // Move random pivot to end
    return partition(arr, low, high);
}

Property	Value
Time (Best)	O(n log n)
Time (Average)	O(n log n)
Time (Worst)	O(n²) — when pivot is always min/max
Space	O(log n) — recursion stack
Stable?	❌ No
In-place?	✅ Yes
When to use	General purpose, arrays (not linked lists), when average-case performance matters

Using first or last element as pivot on already sorted arrays gives O(n²) worst case every time. Always use randomized pivot or median-of-three (pick median of first, middle, last elements) in production code.

C++ std::sort() uses IntroSort — a hybrid that starts with Quick Sort, switches to Heap Sort if recursion depth exceeds 2·log₂(n) (to guarantee O(n log n) worst case), and uses Insertion Sort for small sub-arrays (n < 16). This is why sort() is so fast in practice.

5. Sorting Elements by Frequency

Intuition: Instead of sorting by value, sort by how often each element appears. Elements that appear more frequently come first. Ties are broken by the value or by order of first appearance.

Algorithm

Count frequency of each element using a HashMap
Sort using a custom comparator: higher frequency first; if tie, smaller value first

Trace: [2, 5, 2, 8, 5, 6, 8, 8]

Element	Frequency
8	3
2	2
5	2
6	1

Output (sorted by frequency desc, then value asc): [8, 8, 8, 2, 2, 5, 5, 6]

C++ Implementation

C++
#include <iostream>
#include <vector>
#include <unordered_map>
#include <algorithm>
using namespace std;

void sortByFrequency(vector<int>& arr) {
    unordered_map<int, int> freq;
    for(int x : arr) freq[x]++;

    sort(arr.begin(), arr.end(),
        [&freq](int a, int b) {
            if(freq[a] != freq[b])
                return freq[a] > freq[b]; // Higher freq first
            return a < b;             // Tie: smaller value first
        });
}

int main() {
    vector<int> arr = {2, 5, 2, 8, 5, 6, 8, 8};
    sortByFrequency(arr);
    for(int x : arr) cout << x << " ";
    return 0;
}
// Output: 8 8 8 2 2 5 5 6

Python Implementation

Python
from collections import Counter

def sort_by_frequency(arr):
    freq = Counter(arr)
    return sorted(arr, key=lambda x: (-freq[x], x))

arr = [2, 5, 2, 8, 5, 6, 8, 8]
print(sort_by_frequency(arr))
# Output: [8, 8, 8, 2, 2, 5, 5, 6]

Twitter India's trending hashtags use frequency-based sorting. When you see #IPL trending above #CricketWorldCup, it's because #IPL was used more times in the last hour. Same concept — count frequencies, sort descending.

6. Finding Minimum Length Sorted Sub-array

Problem: Given an unsorted array, find the shortest sub-array such that sorting only this sub-array makes the entire array sorted.

Approach (Two Pointers from Both Ends)

From left: Find the first element that is smaller than its predecessor → start of unsorted region
From right: Find the first element that is larger than its successor → end of unsorted region
Find min and max within this sub-array
Extend left boundary to include any element > min of sub-array
Extend right boundary to include any element < max of sub-array

Trace: [2, 6, 4, 8, 10, 9, 15]

Step	Action	Result
1	Scan left → right: 6 > 4 breaks order at index 2	start = 1
2	Scan right → left: 10 > 9 breaks order at index 5	end = 5
3	Min in [6,4,8,10,9] = 4, Max = 10	min=4, max=10
4	No element before start > 4, so start stays = 1	start = 1
5	No element after end < 10, so end stays = 5	end = 5

Answer: Sort sub-array from index 1 to 5 → [6, 4, 8, 10, 9] → sorted as [4, 6, 8, 9, 10]. Full array becomes [2, 4, 6, 8, 9, 10, 15]. Length = 5.

C++ Implementation

C++
int findMinSubarray(vector<int>& arr) {
    int n = arr.size();
    int start = -1, end = -1;

    // Find start: first element out of order from left
    for(int i = 0; i < n - 1; i++) {
        if(arr[i] > arr[i + 1]) { start = i; break; }
    }
    if(start == -1) return 0; // Already sorted

    // Find end: first element out of order from right
    for(int i = n - 1; i > 0; i--) {
        if(arr[i] < arr[i - 1]) { end = i; break; }
    }

    // Find min and max in sub-array
    int subMin = *min_element(arr.begin() + start, arr.begin() + end + 1);
    int subMax = *max_element(arr.begin() + start, arr.begin() + end + 1);

    // Extend boundaries
    for(int i = 0; i < start; i++)
        if(arr[i] > subMin) { start = i; break; }
    for(int i = n - 1; i > end; i--)
        if(arr[i] < subMax) { end = i; break; }

    return end - start + 1;
}

Python Implementation

Python
def find_min_subarray(arr):
    n = len(arr)
    start, end = -1, -1

    for i in range(n - 1):
        if arr[i] > arr[i + 1]:
            start = i; break
    if start == -1: return 0

    for i in range(n - 1, 0, -1):
        if arr[i] < arr[i - 1]:
            end = i; break

    sub_min = min(arr[start:end + 1])
    sub_max = max(arr[start:end + 1])

    for i in range(start):
        if arr[i] > sub_min:
            start = i; break
    for i in range(n - 1, end, -1):
        if arr[i] < sub_max:
            end = i; break

    return end - start + 1

print(find_min_subarray([2, 6, 4, 8, 10, 9, 15]))
# Output: 5

7. Sorting Strings

Lexicographic Order: Strings are compared character by character from left to right, using their ASCII values. "apple" < "banana" because 'a' (97) < 'b' (98). "app" < "apple" because "app" is a prefix and shorter.

Comparison Functions Across Languages

Language	Function	Example
C	`strcmp(s1, s2)`	Returns <0, 0, or >0
C++	`s1 < s2` (operator)	Returns bool
Java	`s1.compareTo(s2)`	Returns int
Python	`s1 < s2` (operator)	Returns bool

C++ — Sort Array of Strings

C++
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    vector<string> words = {"banana", "apple", "cherry",
                              "date", "elderberry"};
    sort(words.begin(), words.end()); // Lexicographic sort
    for(auto& w : words)
        cout << w << " ";
    return 0;
}
// Output: apple banana cherry date elderberry

Python — Sort Strings

Python
words = ["banana", "apple", "cherry", "date", "elderberry"]
words.sort()  # In-place lexicographic sort
print(words)
# Output: ['apple', 'banana', 'cherry', 'date', 'elderberry']

# Sort by string length
words.sort(key=len)
print(words)
# Output: ['date', 'apple', 'banana', 'cherry', 'elderberry']

Time Complexity: O(n · m · log n) where n is the number of strings and m is the average string length.

8. Case-Specific Sorting

Problem: Given a string with uppercase and lowercase letters, sort uppercase and lowercase letters separately, but maintain their original relative positions (uppercase stays where uppercase was, lowercase stays where lowercase was).

Trace: "eDbACa"

Step	Action	Result
1	Extract lowercase: e, b, a → sorted: a, b, e	lower = [a, b, e]
2	Extract uppercase: D, A, C → sorted: A, C, D	upper = [A, C, D]
3	Reconstruct: pos 0 (lower)→a, pos 1 (upper)→A, pos 2 (lower)→b, pos 3 (upper)→C, pos 4 (lower)→e, pos 5 (upper)→D	"aAbCeD"

C++ Implementation

C++
#include <iostream>
#include <algorithm>
using namespace std;

string caseSort(string s) {
    string lower, upper;
    for(char c : s) {
        if(islower(c)) lower += c;
        else upper += c;
    }
    sort(lower.begin(), lower.end());
    sort(upper.begin(), upper.end());

    int li = 0, ui = 0;
    for(int i = 0; i < s.size(); i++) {
        if(islower(s[i])) s[i] = lower[li++];
        else s[i] = upper[ui++];
    }
    return s;
}

int main() {
    cout << caseSort("eDbACa") << endl;
    return 0;
}
// Output: aAbCeD

Python Implementation

Python
def case_sort(s):
    lower = sorted([c for c in s if c.islower()])
    upper = sorted([c for c in s if c.isupper()])
    li = ui = 0
    result = []
    for c in s:
        if c.islower():
            result.append(lower[li]); li += 1
        else:
            result.append(upper[ui]); ui += 1
    return "".join(result)

print(case_sort("eDbACa"))
# Output: aAbCeD

9. Count Distinct Pairs with Difference of K

Problem: Given an array of integers and an integer k, count the number of distinct pairs (a, b) such that |a − b| = k.

Approach 1: Sort + Two Pointers

Sort the array
Use two pointers i and j (j > i)
If arr[j] − arr[i] == k → count it, advance both
If difference < k → advance j
If difference > k → advance i

Approach 2: HashSet

Insert all elements into a set
For each element x, check if (x + k) exists in the set

Trace: arr = [1, 5, 3, 4, 2], k = 3

Sorted: [1, 2, 3, 4, 5]

i	j	arr[j]−arr[i]	Action
0 (1)	1 (2)	1	< 3, advance j
0 (1)	2 (3)	2	< 3, advance j
0 (1)	3 (4)	3	= 3, count! Pair (1,4). Advance both.
1 (2)	4 (5)	3	= 3, count! Pair (2,5). Advance both.

Answer: 2 pairs → (1,4) and (2,5)

C++ — Sort + Two Pointer

C++
int countPairs(vector<int>& arr, int k) {
    sort(arr.begin(), arr.end());
    int count = 0, i = 0, j = 1;
    int n = arr.size();
    while(j < n) {
        int diff = arr[j] - arr[i];
        if(diff == k) {
            count++;
            i++; j++;
            // Skip duplicates
            while(j < n && arr[j] == arr[j-1]) j++;
        } else if(diff < k) {
            j++;
        } else {
            i++;
            if(i == j) j++;
        }
    }
    return count;
}

Python — HashSet Approach

Python
def count_pairs(arr, k):
    s = set(arr)
    count = 0
    for x in s:
        if x + k in s:
            count += 1
    return count

print(count_pairs([1, 5, 3, 4, 2], 3))
# Output: 2

10. Counting Sort

Intuition: Don't compare elements at all! Instead, count how many times each value appears, then reconstruct the sorted array. It's like having 10 boxes labeled 0–9, dropping each number into its box, then reading boxes left to right.

Algorithm (Step-by-Step)

Find the maximum element in the array → determines count array size
Create a count array of size (max + 1), initialized to 0
Count occurrences: for each element, increment count[element]
Compute prefix sums: count[i] += count[i−1] (gives final positions)
Build output: traverse input right-to-left, place each element at count[element]−1, decrement count

Full Trace: [4, 2, 2, 8, 3, 3, 1]

Step	Action	Array/Count State
1	max = 8	count = [0,0,0,0,0,0,0,0,0]
2	Count occurrences	count = [0,1,2,2,1,0,0,0,1]
3	Prefix sums	count = [0,1,3,5,6,6,6,6,7]
4a	Place 1 at pos count[1]−1=0	output[0]=1, count=[0,0,3,5,6,6,6,6,7]
4b	Place 3 at pos count[3]−1=4	output[4]=3, count=[0,0,3,4,6,6,6,6,7]
4c	Place 3 at pos count[3]−1=3	output[3]=3, count=[0,0,3,3,6,6,6,6,7]
4d	Place 8 at pos count[8]−1=6	output[6]=8, count=[0,0,3,3,6,6,6,6,6]
4e	Place 2 at pos count[2]−1=2	output[2]=2, count=[0,0,2,3,6,6,6,6,6]
4f	Place 2 at pos count[2]−1=1	output[1]=2, count=[0,0,1,3,6,6,6,6,6]
4g	Place 4 at pos count[4]−1=5	output[5]=4, count=[0,0,1,3,5,6,6,6,6]

Output: [1, 2, 2, 3, 3, 4, 8]

C++ Implementation

C++
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

void countingSort(vector<int>& arr) {
    int maxVal = *max_element(arr.begin(), arr.end());
    vector<int> count(maxVal + 1, 0);
    vector<int> output(arr.size());

    // Count occurrences
    for(int x : arr) count[x]++;

    // Compute prefix sums
    for(int i = 1; i <= maxVal; i++)
        count[i] += count[i - 1];

    // Build output (right to left for stability)
    for(int i = arr.size() - 1; i >= 0; i--) {
        output[count[arr[i]] - 1] = arr[i];
        count[arr[i]]--;
    }
    arr = output;
}

int main() {
    vector<int> arr = {4, 2, 2, 8, 3, 3, 1};
    countingSort(arr);
    for(int x : arr) cout << x << " ";
    return 0;
}
// Output: 1 2 2 3 3 4 8

Python Implementation

Python
def counting_sort(arr):
    max_val = max(arr)
    count = [0] * (max_val + 1)
    output = [0] * len(arr)

    for x in arr:
        count[x] += 1

    for i in range(1, max_val + 1):
        count[i] += count[i - 1]

    for x in reversed(arr):
        output[count[x] - 1] = x
        count[x] -= 1

    return output

print(counting_sort([4, 2, 2, 8, 3, 3, 1]))
# Output: [1, 2, 2, 3, 3, 4, 8]

Property	Value
Time	O(n + k) where k is the range of input values
Space	O(n + k)
Stable?	✅ Yes (when built right-to-left)
In-place?	❌ No
Limitation	Only works for non-negative integers with small range. If k is huge (e.g., 10⁹), the count array would be impossibly large.

📊 Master Comparison Table

Parameter	Merge Sort	Quick Sort	Counting Sort
Time (Best)	O(n log n)	O(n log n)	O(n + k)
Time (Average)	O(n log n)	O(n log n)	O(n + k)
Time (Worst)	O(n log n)	O(n²)	O(n + k)
Space	O(n)	O(log n)	O(n + k)
Stable?	✅ Yes	❌ No	✅ Yes
In-place?	❌ No	✅ Yes	❌ No
Best Use Case	Linked lists, external sorting, when stability needed	General purpose arrays, when average performance matters	Small integer range (ages, scores, ASCII chars)
When to Avoid	Memory-constrained systems (needs O(n) extra)	Adversarial inputs without randomization	Large value ranges (k >> n)

Indian Analogies for Sorting:
Quick Sort = Partition your bookshelf — pick one book (pivot), put smaller books left, bigger right, repeat for each side. Fast but depends on which book you pick!
Merge Sort = Two friends sort cards — each friend sorts half the deck, then they merge cards together one by one. Always works reliably.
Counting Sort = Organizing exam papers by roll number — you have 100 slots labeled 1–100. Just drop each paper in its slot. No comparisons needed!

Section D

Learn by Doing — 3-Tier Lab Structure

🟢 Tier 1 — GUIDED: Implement Merge Sort Step-by-Step

⏱️ 45–60 minutesBeginnerFollow every step exactly

Step 1: Set Up

Create a new C++ file merge_sort.cpp. Initialize array: int arr[] = {38, 27, 43, 3, 9, 82, 10};

Step 2: Write the Merge Function

Create function merge(arr, left, mid, right) that merges two sorted halves. Use two temporary arrays L[] and R[]. Compare elements one by one.

Step 3: Write the Recursive mergeSort

Base case: if left >= right, return. Find mid = (left + right) / 2. Recurse on both halves. Call merge.

Step 4: Add Print Statements

After each merge call, print the current state of the array. You should see progressive sorting.

Step 5: Verify Output

After merge: [27, 38, 43, 3, 9, 82, 10] After merge: [27, 38, 3, 43, 9, 82, 10] After merge: [3, 27, 38, 43, 9, 82, 10] After merge: [3, 27, 38, 43, 9, 82, 10] After merge: [3, 27, 38, 43, 9, 10, 82] After merge: [3, 9, 10, 27, 38, 43, 82]

Step 6: Test Edge Cases

Test with: already sorted [1,2,3,4,5], reverse sorted [5,4,3,2,1], all same [7,7,7,7], single element [42].

🟡 Tier 2 — SEMI-GUIDED: Quick Sort with Random Pivot

⏱️ 60–90 minutesIntermediateHints provided, you fill the gaps

Your Mission:

Implement Quick Sort with randomized pivot selection. Test on 5 different arrays and count comparisons for each.

Hints:

Use rand() % (high - low + 1) + low to pick random pivot index
Swap random element with last element, then use Lomuto partition
Add a global counter int comparisons = 0; and increment it every time you compare two elements in partition
Test arrays: {1,2,3,4,5} (sorted), {5,4,3,2,1} (reverse), {3,3,3,3,3} (all same), {10,80,30,90,40} (random), {1,5,3,2,4} (random)
Run each test 3 times and note how comparison count varies (because pivot is random)

Stretch Goal: Implement median-of-three pivot selection. Compare the average number of comparisons against random pivot across 10 runs on the same array.

🔴 Tier 3 — OPEN CHALLENGE: Frequency Sort + String Sort Combo

⏱️ 2–3 hoursAdvancedNo hand-holding — real problem solving

The Brief:

Given a list of 20 Indian names, perform the following:

Frequency sort by first letter: Group names by their first letter, sort groups by frequency (most common first letter first)
Within each group: Sort names lexicographically (case-insensitive)
Case-specific sort: For each name, sort the characters while maintaining uppercase/lowercase positions
Solve 2 Codeforces problems: 451B - Sort the Array and 785A - Tom Riddle's Diary

Completing this tier gives you a portfolio-worthy project. "Sorting Algorithm Suite with Custom Comparators" — add it to your GitHub and mention it in your resume's Projects section.

Section E

Problem Set — Tracing, Programming & Interview

🔍 Tracing Questions (5)

T1. Trace Merge Sort on [15, 3, 9, 8, 5, 2, 7, 1]

Show the complete recursion tree with every split and merge step. Write the array state after each merge operation.

Expected final output: [1, 2, 3, 5, 7, 8, 9, 15]

T2. Trace Quick Sort (Lomuto) on [10, 7, 8, 9, 1, 5] — Pivot = last element

Show each partition step: the pivot, the value of i after each comparison, and the array state after partitioning. Then show recursive calls.

T3. Trace Counting Sort on [1, 4, 1, 2, 7, 5, 2]

Show: (1) count array after counting, (2) count array after prefix sums, (3) output array built right-to-left step by step.

T4. How many comparisons does Merge Sort make on [5, 3, 1, 4, 2]?

Count every comparison in every merge step. Draw the recursion tree and count comparisons at each level.

T5. Show the recursion tree for Quick Sort on [3, 1, 2, 5, 4] with last-element pivot

Draw the full tree showing pivot selection, partition result, left sub-array, and right sub-array at each level.

💻 Programming Questions (8)

#	Problem	Difficulty	Key Concept
P1	Implement Merge Sort and count the number of inversions in an array. An inversion is a pair (i, j) where i < j but arr[i] > arr[j].	Intermediate	Modified merge sort
P2	Implement Quick Sort with median-of-three pivot selection (median of first, middle, last elements)	Intermediate	Pivot optimization
P3	Sort an array of 0s, 1s, and 2s in O(n) time and O(1) space (Dutch National Flag problem)	Beginner	Three-way partition
P4	Given two sorted arrays of sizes m and n, merge them into a single sorted array in O(m+n) time	Beginner	Merge procedure
P5	Find the kth smallest element using Quick Select (partition-based) in O(n) average time	Advanced	Quick Select
P6	Sort an array of strings: first by length, then lexicographically for strings of equal length	Beginner	Custom comparator
P7	Implement Counting Sort for characters — sort a string alphabetically	Beginner	Counting sort on chars
P8	Sort a nearly sorted array where each element is at most k positions away from its sorted position. Achieve O(n log k) time.	Advanced	Min-heap / insertion sort

🏢 Industry Problems (3)

I1. Flipkart Gaming Leaderboard

Design a system that maintains a leaderboard of top-100 players sorted by score. Players' scores update in real-time. When a score changes, the leaderboard must update in O(log n). Which data structure + sorting strategy would you use?

I2. Amazon India Product Sorting

You need to sort 10 crore e-commerce products by price. All prices are integers between ₹1 and ₹10,000. Which sorting algorithm gives the best performance? What is the exact time and space complexity?

I3. Twitter/X Trending Topics

Given a real-time stream of hashtags (1 million per minute), maintain the top-10 trending hashtags by frequency. Design the data structures and algorithm. Consider: what happens when a hashtag's count changes?

🎤 Interview Questions (3)

Interview Q1: Why is Quick Sort preferred for arrays but Merge Sort for linked lists?

Answer: Quick Sort benefits from array's O(1) random access for partitioning and is in-place (O(log n) stack space). Merge Sort on arrays needs O(n) extra space. For linked lists, Merge Sort is preferred because: (1) merging linked lists is easy and in-place (just change pointers), (2) Quick Sort's random access (needed for Hoare partition) is O(n) on linked lists, making it slow.

Interview Q2: Can we sort in better than O(n log n)? When and how?

Answer: Yes, but only for non-comparison-based sorts. Counting Sort: O(n+k) when range k is small. Radix Sort: O(d·(n+k)) where d is number of digits. Bucket Sort: O(n) average when input is uniformly distributed. The Ω(n log n) lower bound only applies to comparison-based sorts.

Interview Q3: Why is Quick Sort not stable? Give an example.

Answer: During partitioning, equal elements can be rearranged. Example: Array [(3,a), (2,b), (3,c), (1,d)] sorted by first value. If pivot is 3: during Lomuto partition, (3,a) and (3,c) might swap positions relative to each other. The relative order of equal elements is not preserved. Merge Sort preserves it because it always takes the left element first when values are equal.

Section F

MCQ Assessment Bank — 30 Questions (Bloom's Mapped)

Remember / Identify (Q1–Q5)

What is the worst-case time complexity of Merge Sort?

O(n)
O(n log n)
O(n²)
O(log n)

Remember

✅ Answer: (B) O(n log n) — Merge Sort always divides the array in half (log n levels) and does O(n) work at each level, giving O(n log n) in all cases.

Which of the following sorting algorithms is NOT stable?

Merge Sort
Counting Sort
Quick Sort
Insertion Sort

Remember

✅ Answer: (C) Quick Sort — During partitioning, equal elements can swap relative positions. Merge Sort, Counting Sort, and Insertion Sort all preserve relative order of equal elements.

The space complexity of Merge Sort is:

O(1)
O(log n)
O(n)
O(n²)

Remember

✅ Answer: (C) O(n) — Merge Sort requires a temporary array of size n to merge the two halves. Plus O(log n) for the recursion stack, but O(n) dominates.

Counting Sort has a time complexity of:

O(n log n)
O(n²)
O(n + k)
O(k log k)

Remember

✅ Answer: (C) O(n + k) — where n is the number of elements and k is the range of input values. We iterate through the array (O(n)) and through the count array (O(k)).

What does "in-place sorting" mean?

Sorting without using any variables
Sorting using O(1) or O(log n) extra space
Sorting that doesn't move elements
Sorting that works only on arrays

Remember

✅ Answer: (B) — An in-place sort uses at most O(1) extra space (or O(log n) for recursion stack). Quick Sort is in-place; Merge Sort is not (needs O(n) extra).

Understand / Explain (Q6–Q10)

Why is Merge Sort considered stable?

It uses less memory than Quick Sort
During merging, when two elements are equal, the one from the left half is placed first
It always picks the middle element as pivot
It sorts strings better than numbers

Understand

✅ Answer: (B) — In the merge step, when left[i] == right[j], we pick left[i] first. This preserves the original relative order of equal elements, making Merge Sort stable.

Quick Sort's worst case O(n²) occurs when:

The array has all distinct elements
The pivot always splits the array into two equal halves
The pivot is always the smallest or largest element
The array size is a power of 2

Understand

✅ Answer: (C) — When the pivot is always min or max, one partition has n−1 elements and the other has 0. This gives n levels of recursion with O(n) work each = O(n²). This happens with sorted arrays when first/last element is chosen as pivot.

Counting Sort fails when:

The array has duplicate elements
The range of values (k) is much larger than n
The array has negative numbers (without modification)
Both B and C

Understand

✅ Answer: (D) — Counting Sort needs a count array of size k+1. If k >> n (e.g., k=10⁹), the count array would need gigabytes of memory. It also doesn't handle negative numbers directly (needs offset).

The recursion depth of Merge Sort for an array of n elements is:

O(n)
O(log n)
O(n log n)
O(1)

Understand

✅ Answer: (B) O(log n) — The array is halved at each level of recursion. Starting from n, after log₂(n) halvings, we reach sub-arrays of size 1 (base case).

Q10

How does random pivot selection improve Quick Sort?

It guarantees O(n log n) worst case
It makes the expected running time O(n log n) regardless of input distribution
It reduces space complexity to O(1)
It makes Quick Sort stable

Understand

✅ Answer: (B) — Random pivot doesn't guarantee O(n log n) worst case (that's still O(n²) in extremely unlikely cases), but it makes the expected time O(n log n) for ANY input, including sorted arrays.

Apply / Implement (Q11–Q15)

Q11

After the first merge pass of bottom-up Merge Sort on [5, 1, 3, 2, 8, 6, 4, 7], the array becomes:

[1, 5, 2, 3, 6, 8, 4, 7]
[1, 2, 3, 5, 4, 6, 7, 8]
[1, 3, 5, 2, 6, 8, 4, 7]
[5, 1, 2, 3, 6, 8, 4, 7]

Apply

✅ Answer: (A) — First pass merges pairs of size 1: (5,1)→[1,5], (3,2)→[2,3], (8,6)→[6,8], (4,7)→[4,7]. Result: [1, 5, 2, 3, 6, 8, 4, 7].

Q12

Using Lomuto partition with pivot = 6 on array [3, 8, 2, 6], what is the partition index?

Apply

✅ Answer: (C) 2 — Pivot=6. j=0: 3≤6, i→0, swap(0,0)=[3,8,2,6]. j=1: 8>6, skip. j=2: 2≤6, i→1, swap(1,2)=[3,2,8,6]. Final swap(i+1=2, pivot=3)=[3,2,6,8]. Pivot 6 is at index 2.

Q13

For Counting Sort on [3, 1, 3, 2, 1], what is the count array after counting (before prefix sum)?

[0, 2, 1, 2]
[0, 1, 2, 3]
[2, 1, 2, 0]
[0, 2, 1, 2, 0]

Apply

✅ Answer: (A) [0, 2, 1, 2] — max=3, so count array has indices 0-3. count[0]=0, count[1]=2 (two 1s), count[2]=1 (one 2), count[3]=2 (two 3s).

Q14

What is the result of merging sorted arrays [1, 3, 5] and [2, 4, 6]?

[1, 2, 3, 4, 5, 6]
[2, 4, 6, 1, 3, 5]
[1, 3, 5, 2, 4, 6]
[6, 5, 4, 3, 2, 1]

Apply

✅ Answer: (A) [1, 2, 3, 4, 5, 6] — Two-pointer merge: compare 1 vs 2→take 1, 3 vs 2→take 2, 3 vs 4→take 3, 5 vs 4→take 4, 5 vs 6→take 5, take 6.

Q15

Frequency sort of [4, 5, 6, 5, 4, 3] produces:

[3, 4, 4, 5, 5, 6]
[4, 4, 5, 5, 3, 6]
[5, 5, 4, 4, 3, 6]
[6, 5, 5, 4, 4, 3]

Apply

✅ Answer: (B) [4, 4, 5, 5, 3, 6] — Frequencies: 4→2, 5→2, 3→1, 6→1. Sort by freq desc, then value asc: 4(2), 5(2), 3(1), 6(1).

Analyze / Compare (Q16–Q20)

Q16

For sorting 10 million integers in range [1, 100], which algorithm is optimal?

Merge Sort
Quick Sort
Counting Sort
Bubble Sort

Analyze

✅ Answer: (C) Counting Sort — Range k=100 is tiny. O(n+k) = O(10⁷+100) ≈ O(n). Much faster than O(n log n) ≈ 2.3×10⁸ for Merge/Quick Sort.

Q17

Which sorting algorithm has the best space complexity among the following?

Merge Sort — O(n)
Quick Sort — O(log n)
Counting Sort — O(n+k)
All have the same space complexity

Analyze

✅ Answer: (B) Quick Sort — O(log n) for recursion stack. Merge Sort needs O(n) extra array. Counting Sort needs O(n+k). Quick Sort is the most memory-efficient of the three.

Q18

You need to sort student records by marks, and students with the same marks must appear in the order they were enrolled. Which algorithm should you use?

Quick Sort
Merge Sort
Selection Sort
Heap Sort

Analyze

✅ Answer: (B) Merge Sort — The requirement "same marks → maintain enrollment order" requires a STABLE sort. Among the options, only Merge Sort is stable. Quick Sort, Selection Sort, and Heap Sort are all unstable.

Q19

Lomuto partition performs more swaps than Hoare partition because:

Lomuto uses two nested loops
Lomuto swaps an element every time it finds one ≤ pivot, even if it's already in place
Lomuto uses a random pivot
Hoare doesn't swap elements at all

Analyze

✅ Answer: (B) — Lomuto increments i and swaps every time arr[j] ≤ pivot, even when the element is already on the correct side. Hoare only swaps when elements on both sides are on the wrong side, resulting in ~3× fewer swaps on average.

Q20

For sorting a linked list of 1 million nodes, the best choice is:

Quick Sort
Merge Sort
Counting Sort
Insertion Sort

Analyze

✅ Answer: (B) Merge Sort — Merge Sort on linked lists uses O(1) extra space (merge by pointer manipulation) and O(n log n) time. Quick Sort suffers because random access is O(n) on linked lists.

Evaluate / Judge (Q21–Q25)

Q21

A company needs to sort 100 million user records daily. They're considering Merge Sort vs Quick Sort. Which evaluation is correct?

Quick Sort is always better because it's in-place
Merge Sort is always better because it's stable
Quick Sort with randomized pivot is preferred for speed, but Merge Sort is used if stability is critical
Neither works at this scale; only Counting Sort can handle it

Evaluate

✅ Answer: (C) — Quick Sort with random pivot is 2-3× faster in practice (better cache performance, in-place). But if the sort must be stable (preserve order of equal elements), Merge Sort or Timsort is required. The choice depends on the requirement.

Q22

A student claims: "Randomized Quick Sort guarantees O(n log n) time." Is this correct?

Yes, randomization eliminates the worst case
No, it makes the expected time O(n log n) but worst case is still O(n²)
Yes, but only for arrays with distinct elements
No, randomization makes it O(n)

Evaluate

✅ Answer: (B) — Randomized Quick Sort's EXPECTED time is O(n log n) for any input. But in extremely rare cases (probability ≈ 1/n!), the random pivots could all be worst-case choices, giving O(n²). However, this is so unlikely it never happens in practice.

Q23

IntroSort (used in C++ STL's std::sort) combines Quick Sort, Heap Sort, and Insertion Sort. Why this combination?

To make the sort stable
Quick Sort for speed, Heap Sort to prevent O(n²) worst case, Insertion Sort for small sub-arrays
To reduce space complexity to O(1)
To handle strings better than numbers

Evaluate

✅ Answer: (B) — Quick Sort is fastest on average. If recursion depth exceeds 2·log₂(n) (indicating bad pivots), it switches to Heap Sort (guaranteed O(n log n)). For sub-arrays < 16 elements, Insertion Sort is used (fastest for tiny arrays due to low overhead).

Q24

For an e-commerce site sorting products by price (₹1–₹10,000) and then by rating (1–5 stars), stability matters because:

Unstable sorts are slower
When sorting by rating after price, a stable sort preserves the price order within same ratings
Stability reduces memory usage
Only stable sorts work on floating-point numbers

Evaluate

✅ Answer: (B) — If you first sort by price, then by rating using a stable sort, products with the same rating remain sorted by price. With an unstable sort, the price order within same-rated products gets scrambled.

Q25

Counting Sort on ages (0–150) of 1 billion people would use approximately how much extra memory?

~600 bytes (count array only)
~4 GB (output array of 1 billion integers)
~4 GB + 600 bytes (output array + count array)
~150 GB

Evaluate

✅ Answer: (C) — Count array: 151 × 4 bytes ≈ 604 bytes (tiny). Output array: 10⁹ × 4 bytes ≈ 4 GB. Total extra: ~4 GB. The count array size is negligible; the output array dominates.

Create / Design (Q26–Q30)

Q26

To sort objects with properties {name, age, salary} by salary (primary) and name (secondary), you would:

Sort by name first (stable), then sort by salary (stable)
Sort by salary first, then sort by name
Use Quick Sort with a combined comparator
Both A and C would work correctly

Create

✅ Answer: (D) — Method A: Sort by secondary key (name) first with stable sort, then by primary key (salary) with stable sort. The stability preserves name order within same salary. Method C: Single sort with comparator that compares salary first, then name. Both are correct.

Q27

For competitive programming, you're given an array of 10⁵ integers in range [−10⁹, 10⁹]. The best sorting approach is:

Counting Sort
std::sort() (IntroSort)
Merge Sort
Radix Sort

Create

✅ Answer: (B) std::sort() — Range is too large for Counting Sort. n=10⁵ is moderate. std::sort() (IntroSort) is the fastest practical choice with O(n log n) guaranteed. No need to implement from scratch in contests.

Q28

You're designing a hybrid sort that uses Counting Sort for arrays where max−min ≤ 1000, and Quick Sort otherwise. This is a good design because:

Counting Sort is O(n+k) when k is small, beating O(n log n)
Quick Sort handles the general case efficiently
Both A and B
This design never works in practice

Create

✅ Answer: (C) — When the range is small (k ≤ 1000), Counting Sort runs in O(n+1000) ≈ O(n), faster than O(n log n). For general data, Quick Sort's O(n log n) average with O(log n) space is excellent. The hybrid adaptively picks the best algorithm.

Q29

To create a frequency-based ranking system (like trending hashtags), the most efficient approach is:

Sort all hashtags by name, then count
Use a HashMap for counting + Min-Heap of size k for top-k
Use Bubble Sort on frequencies
Use Counting Sort on hashtag names

Create

✅ Answer: (B) — HashMap gives O(1) per count update. Min-Heap of size k maintains the top-k in O(log k) per update. Total: O(n log k) for n hashtags and top-k. Much better than sorting all n hashtags every time.

Q30

You need to sort 1 TB of data that doesn't fit in RAM (16 GB). The best approach is:

Quick Sort with virtual memory
External Merge Sort: split into chunks, sort each in RAM, merge using k-way merge
Counting Sort with disk-based count array
Bubble Sort with disk caching

Create

✅ Answer: (B) External Merge Sort — Split 1 TB into ~64 chunks of 16 GB each. Sort each chunk in RAM using IntroSort. Write sorted chunks to disk. K-way merge all 64 sorted chunks using a min-heap. This is how databases and Hadoop sort massive datasets.

Section G

Short Answer Questions (8)

SA1. What is the lower bound for comparison-based sorting? Prove it briefly.

Answer: The lower bound is Ω(n log n). Any comparison-based sort can be modeled as a binary decision tree where each leaf represents one of the n! possible permutations. A binary tree with n! leaves has height h ≥ log₂(n!). By Stirling's approximation, log₂(n!) ≈ n·log₂(n) − n·log₂(e) = Θ(n log n). Therefore, at least Ω(n log n) comparisons are needed in the worst case.

SA2. Explain the merge procedure in Merge Sort with an example.

Answer: The merge procedure combines two sorted sub-arrays into one sorted array using two pointers. Example: Merge [2, 5, 8] and [1, 4, 7]. Pointer i starts at index 0 of left, pointer j at index 0 of right. Compare left[i] and right[j]: 2 vs 1 → pick 1. 2 vs 4 → pick 2. 5 vs 4 → pick 4. 5 vs 7 → pick 5. 8 vs 7 → pick 7. Remaining: pick 8. Result: [1, 2, 4, 5, 7, 8]. Time: O(n₁ + n₂) where n₁, n₂ are the sizes of the two sub-arrays.

SA3. What is the difference between Lomuto and Hoare partition schemes?

Answer: Lomuto: Pivot is the last element. One pointer (i) tracks the boundary; another (j) scans left to right. Simpler to implement but does ~3× more swaps. Hoare: Pivot is typically the first element. Two pointers start from both ends and move inward, swapping misplaced elements. Fewer swaps (~n/6 vs n/2 on average). Hoare is faster in practice but slightly harder to implement correctly. Hoare returns an index such that arr[lo..j] ≤ pivot ≤ arr[j+1..hi], while Lomuto places the pivot at its final position.

SA4. Why is Quick Sort not stable? Give a concrete example.

Answer: Quick Sort is not stable because the partitioning step can rearrange equal elements. Example: Array [(5,A), (3,B), (5,C), (2,D)] sorted by first value with Lomuto partition (pivot=2): After partitioning, 2 goes to position 0, but (5,A) and (5,C) might end up as (5,C), (5,A) — their relative order is reversed. The swap operation in partition doesn't respect original ordering of equal elements.

SA5. When would you choose Counting Sort over Quick Sort?

Answer: Choose Counting Sort when: (1) The range of values k is small relative to n (e.g., sorting ages 0–150, exam scores 0–100, ASCII characters 0–127). (2) You need a stable sort. (3) You can afford O(n+k) extra space. If k ≤ n, Counting Sort runs in O(n) — faster than Quick Sort's O(n log n). Avoid Counting Sort when k >> n (e.g., sorting arbitrary 32-bit integers where k=4×10⁹).

SA6. Explain how frequency sort works with an example.

Answer: Frequency sort orders elements by their occurrence count (most frequent first). Steps: (1) Count frequency of each element using a hash map. (2) Sort using a custom comparator: compare by frequency descending; if equal, by value ascending. Example: [4, 6, 2, 6, 4, 4]. Frequencies: 4→3, 6→2, 2→1. Sorted by frequency: [4,4,4,6,6,2]. Time: O(n log n) for the sort step. Space: O(n) for the hash map.

SA7. What is the minimum length unsorted subarray problem? Describe the approach.

Answer: Given an array, find the shortest sub-array that needs to be sorted to make the entire array sorted. Approach: (1) From the left, find the first position where arr[i] > arr[i+1] — this is the start of disorder. (2) From the right, find the first position where arr[i] < arr[i−1] — this is the end. (3) Find the min and max within this window. (4) Extend left until no element before start is > sub-min. (5) Extend right until no element after end is < sub-max. Result: end − start + 1. Time: O(n), Space: O(1).

SA8. How does randomized pivot selection improve Quick Sort's performance?

Answer: Without randomization, an adversary can craft inputs that always give worst-case O(n²) performance (e.g., sorted array with first-element pivot). With random pivot, no fixed input can consistently cause worst case — the expected number of comparisons is 2n·ln(n) ≈ 1.39·n·log₂(n) for any input. This makes the algorithm input-oblivious: its expected performance is O(n log n) regardless of whether the input is sorted, reverse-sorted, or random.

Section H

Long Answer Questions (3)

LA1. Compare Merge Sort and Quick Sort in detail (15 marks)

Requirement: Compare on at least 8 parameters. Include code for both (C++ or Python). Trace both algorithms on the array [12, 11, 13, 5, 6, 7]. Explain when each is preferred with real-world examples.

Expected Answer Structure:

Algorithm Description — Divide-and-conquer approach for each
Code — Complete implementation of both
Trace — Step-by-step execution on [12, 11, 13, 5, 6, 7]
Comparison Table: Time (best/avg/worst), Space, Stability, In-place, Cache performance, Linked list performance, Parallelizability, Worst-case guarantee
When to use Merge Sort: Linked lists, external sorting, when stability needed, when worst-case guarantee required (e.g., real-time systems)
When to use Quick Sort: Arrays, in-memory sorting, general purpose, when average performance matters more than worst case

LA2. Counting Sort — Complete Analysis (10 marks)

Requirement: Explain the algorithm step by step. Provide full trace on [4, 2, 2, 8, 3, 3, 1]. Include C++ code. Analyze time and space complexity. Discuss limitations and when Counting Sort outperforms comparison-based sorts.

Expected Answer Structure:

Algorithm — 5 steps with explanation
Trace — Show count array at each step, prefix sums, output construction
Code — Complete C++ implementation
Complexity: Time O(n+k), Space O(n+k), proof
Stability proof: Why building output right-to-left preserves order
Limitations: Large range, negative numbers, non-integer data
Advantages: Linear time, beats O(n log n) lower bound (non-comparison based)

LA3. Multi-Field Sorting Strategy for 10 Crore Records (15 marks)

Scenario: You are given 10 crore user records with fields: name (string), age (int, 0–150), salary (int, ₹0–₹50,00,000). Design a complete sorting strategy for sorting by multiple fields.

Expected Answer Structure:

For age: Use Counting Sort — range is 0–150 (k=151), O(n+k) ≈ O(n). Perfect for small integer range.
For salary: Radix Sort on salary (7 digits max) — O(d·(n+10)) where d=7. Or Counting Sort with range 50,00,001 if memory permits (~20 MB).
For name: Must use comparison-based sort — O(n·m·log n) where m is average name length. Use Merge Sort for stability.
Multi-field sort strategy: Sort by least significant key first (name), then salary, then age — each with a STABLE sort. This gives correct multi-key ordering.
Alternative: Single sort with compound comparator that compares primary key first, then secondary, then tertiary.
Memory analysis: 10⁸ records × ~50 bytes each = 5 GB. External sort needed if RAM < 5 GB.

Section I

Lab Programs — Complete Code with Output & Viva

Lab 1: Merge Sort Implementation

📁 merge_sort.cpp⏱️ 30 minutesBeginner

Complete Code

C++
#include <iostream>
using namespace std;

void merge(int arr[], int l, int m, int r) {
    int n1 = m - l + 1;
    int n2 = r - m;
    int L[n1], R[n2];

    for(int i = 0; i < n1; i++) L[i] = arr[l + i];
    for(int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];

    int i = 0, j = 0, k = l;
    while(i < n1 && j < n2) {
        if(L[i] <= R[j])
            arr[k++] = L[i++];
        else
            arr[k++] = R[j++];
    }
    while(i < n1) arr[k++] = L[i++];
    while(j < n2) arr[k++] = R[j++];
}

void mergeSort(int arr[], int l, int r) {
    if(l < r) {
        int m = l + (r - l) / 2;
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);
        merge(arr, l, m, r);
    }
}

void printArray(int arr[], int n) {
    for(int i = 0; i < n; i++)
        cout << arr[i] << " ";
    cout << endl;
}

int main() {
    int arr[] = {38, 27, 43, 3, 9, 82, 10};
    int n = sizeof(arr) / sizeof(arr[0]);

    cout << "Original array: ";
    printArray(arr, n);

    mergeSort(arr, 0, n - 1);

    cout << "Sorted array:   ";
    printArray(arr, n);
    return 0;
}

Output

Original array: 38 27 43 3 9 82 10 Sorted array: 3 9 10 27 38 43 82

Viva Questions & Answers

#	Question	Answer
V1	What is the time complexity of Merge Sort in all cases?	O(n log n) in best, average, and worst cases. The array is always split in half (log n levels) and each level does O(n) merge work.
V2	Why does Merge Sort need O(n) extra space?	The merge function creates temporary arrays L[] and R[] to hold the two halves before merging. Total extra space is proportional to n.
V3	Is Merge Sort stable? Why?	Yes. In the merge step, when L[i] == R[j], we pick L[i] first (≤ condition). This preserves the original order of equal elements.
V4	What happens if you forget the trailing while loops in merge?	Remaining elements from the longer sub-array won't be copied. The merged result will be incomplete and the array will have garbage values in those positions.
V5	Why use m = l + (r - l) / 2 instead of m = (l + r) / 2?	To prevent integer overflow. If l and r are both large (near INT_MAX), l + r could overflow. The first formula avoids this.

Lab 2: Quick Sort Implementation

📁 quick_sort.cpp⏱️ 30 minutesIntermediate

Complete Code

C++
#include <iostream>
using namespace std;

int partition(int arr[], int low, int high) {
    int pivot = arr[high]; // Lomuto: pivot = last element
    int i = low - 1;

    for(int j = low; j < high; j++) {
        if(arr[j] <= pivot) {
            i++;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i + 1], arr[high]);
    return i + 1;
}

void quickSort(int arr[], int low, int high) {
    if(low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}

void printArray(int arr[], int n) {
    for(int i = 0; i < n; i++)
        cout << arr[i] << " ";
    cout << endl;
}

int main() {
    int arr[] = {10, 7, 8, 9, 1, 5};
    int n = sizeof(arr) / sizeof(arr[0]);

    cout << "Original array: ";
    printArray(arr, n);

    quickSort(arr, 0, n - 1);

    cout << "Sorted array:   ";
    printArray(arr, n);
    return 0;
}

Output

Original array: 10 7 8 9 1 5 Sorted array: 1 5 7 8 9 10

Viva Questions & Answers

#	Question	Answer
V1	What is the worst-case time complexity of Quick Sort?	O(n²). Occurs when the pivot is always the smallest or largest element (e.g., sorted array with last element as pivot).
V2	How do you avoid the worst case?	Use randomized pivot selection (swap a random element to the pivot position) or median-of-three strategy. This makes expected time O(n log n) for any input.
V3	Is Quick Sort in-place?	Yes. It uses O(log n) extra space for the recursion stack only. No temporary arrays are needed (unlike Merge Sort).
V4	What is the difference between Lomuto and Hoare partition?	Lomuto: pivot at end, single scan left→right, more swaps. Hoare: pivot at start, two pointers from both ends, fewer swaps (~3× fewer on average). Hoare is faster in practice.
V5	Why is Quick Sort not stable?	The partition step swaps elements across the pivot boundary, which can change the relative order of equal elements. Example: [3a, 2, 3b, 1] might become [1, 2, 3b, 3a] after partitioning.

Lab 3: Counting Sort Implementation

📁 counting_sort.cpp⏱️ 25 minutesBeginner

Complete Code

C++
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

void countingSort(int arr[], int n) {
    int maxVal = *max_element(arr, arr + n);
    vector<int> count(maxVal + 1, 0);
    vector<int> output(n);

    // Step 1: Count occurrences
    for(int i = 0; i < n; i++)
        count[arr[i]]++;

    // Step 2: Prefix sums
    for(int i = 1; i <= maxVal; i++)
        count[i] += count[i - 1];

    // Step 3: Build output (right to left for stability)
    for(int i = n - 1; i >= 0; i--) {
        output[count[arr[i]] - 1] = arr[i];
        count[arr[i]]--;
    }

    // Copy back
    for(int i = 0; i < n; i++)
        arr[i] = output[i];
}

int main() {
    int arr[] = {4, 2, 2, 8, 3, 3, 1};
    int n = sizeof(arr) / sizeof(arr[0]);

    cout << "Original array: ";
    for(int i = 0; i < n; i++) cout << arr[i] << " ";
    cout << endl;

    countingSort(arr, n);

    cout << "Sorted array:   ";
    for(int i = 0; i < n; i++) cout << arr[i] << " ";
    cout << endl;
    return 0;
}

Output

Original array: 4 2 2 8 3 3 1 Sorted array: 1 2 2 3 3 4 8

Viva Questions & Answers

#	Question	Answer
V1	What is the time complexity of Counting Sort?	O(n + k) where n is the number of elements and k is the range (max value). We traverse the array once (O(n)) and the count array once (O(k)).
V2	Why does Counting Sort beat the O(n log n) lower bound?	Because it's NOT comparison-based. The Ω(n log n) lower bound only applies to comparison-based sorts. Counting Sort uses element values as indices, not comparisons.
V3	Why do we build the output right-to-left?	To maintain stability. Elements appearing later in the input should appear later among duplicates in the output. Going right-to-left ensures this: the last occurrence of a value gets the highest index.
V4	Can Counting Sort handle negative numbers?	Not directly. You need to add an offset: find the minimum value, add \|min\| to all elements, sort, then subtract \|min\| back. Or use a shifted count array.
V5	When should you NOT use Counting Sort?	When the range k is much larger than n. E.g., sorting [1, 1000000000] — you'd need a count array of size 10⁹ (4 GB!) for just 2 elements. Use comparison-based sort instead.

Lab 4: String Sorting (Lexicographic)

📁 string_sort.cpp⏱️ 20 minutesBeginner

Complete Code

C++
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    vector<string> words = {"banana", "apple", "cherry",
                              "date", "elderberry"};

    // Lexicographic sort (default)
    cout << "Original: ";
    for(auto& w : words) cout << w << " ";
    cout << endl;

    sort(words.begin(), words.end());
    cout << "Lexicographic: ";
    for(auto& w : words) cout << w << " ";
    cout << endl;

    // Sort by length
    sort(words.begin(), words.end(),
        [](const string& a, const string& b) {
            if(a.length() != b.length())
                return a.length() < b.length();
            return a < b; // Same length → lexicographic
        });
    cout << "By length:     ";
    for(auto& w : words) cout << w << " ";
    cout << endl;

    // Case-insensitive sort
    vector<string> mixed = {"Banana", "apple", "Cherry", "date"};
    sort(mixed.begin(), mixed.end(),
        [](string a, string b) {
            transform(a.begin(), a.end(), a.begin(), ::tolower);
            transform(b.begin(), b.end(), b.begin(), ::tolower);
            return a < b;
        });
    cout << "Case-insensitive: ";
    for(auto& w : mixed) cout << w << " ";
    cout << endl;

    return 0;
}

Output

Original: banana apple cherry date elderberry Lexicographic: apple banana cherry date elderberry By length: date apple banana cherry elderberry Case-insensitive: apple Banana Cherry date

Viva Questions & Answers

#	Question	Answer
V1	What does lexicographic order mean?	Dictionary order. Strings are compared character by character from left to right using ASCII values. "apple" < "banana" because 'a' (97) < 'b' (98).
V2	What is the time complexity of sorting n strings of average length m?	O(n · m · log n). We do O(n log n) comparisons, and each string comparison takes O(m) time.
V3	How does a custom comparator work in C++ sort?	You pass a lambda or function that takes two elements and returns true if the first should come before the second. std::sort uses this to determine ordering.
V4	How is "app" compared to "apple" lexicographically?	"app" < "apple" because "app" is a prefix of "apple" and is shorter. When one string is a prefix of another, the shorter one comes first.
V5	How would you sort strings in reverse order?	Use sort(words.begin(), words.end(), greater<string>()) or sort with a custom comparator that returns b < a instead of a < b.

Section J

Industry Spotlight — A Day in the Life

👨‍💻 Amit Kumar, 26 — SDE-I at Flipkart, Bangalore

Background: B.Tech in Computer Science from NIT Kurukshetra. Started competitive programming in 2nd year on Codeforces (reached rating 1800+ Expert). Solved 500+ problems with sorting as a key skill. Cracked Flipkart's interview in final year — 2 out of 3 coding rounds had sorting-based problems.

A Typical Day:

9:30 AM — Morning standup with the Search & Discovery team. Review metrics: search latency, click-through rate, result relevance scores.

10:00 AM — Work on search ranking algorithm. Products are sorted by a relevance score = f(price, rating, reviews, seller_score). Uses a custom comparator with weighted factors.

11:30 AM — Optimize a database query that sorts 2 crore products by multiple fields. Switch from naive sort to indexed sort — response time drops from 800ms to 45ms.

1:00 PM — Lunch at Flipkart campus. Discuss algorithm design with senior engineer.

2:00 PM — Code review for a teammate's implementation of external merge sort for offline analytics pipeline processing 500 GB of user behavior data.

4:00 PM — Write unit tests for the sorting module. Test edge cases: empty arrays, single element, all duplicates, reverse sorted.

5:30 PM — Personal growth: solve 2 problems on Codeforces. Today's theme: sorting + binary search combinations.

Detail	Info
Tools Used Daily	C++, Python, STL sort, custom comparators, MySQL ORDER BY, Redis sorted sets
Entry Salary (2024)	₹6–12 LPA + ESOPs
Mid-Level (3–5 yrs)	₹15–25 LPA
Senior (7+ yrs)	₹30–60 LPA
Companies Hiring	Flipkart, Amazon, Google, Microsoft, Paytm, PhonePe, Swiggy, Razorpay, Zomato, Cred, Meesho

Amit's advice to students: "Sorting isn't just about implementing algorithms — it's about knowing WHEN to use WHICH sort. In my Flipkart interview, they didn't ask me to write Merge Sort from scratch. They asked me: 'Given 10 crore products with prices in range ₹1–₹10,000, how would you sort them?' Counting Sort was the answer. Know your tools."

Section K

Earn With It — Freelance & Income Roadmap

💰 Your Earning Path After This Unit

Portfolio Piece: "Sorting Algorithm Visualizer" — a web app or desktop tool that visually demonstrates Merge Sort, Quick Sort, and Counting Sort with step-by-step animations.

Earning Paths:

• Algorithm tutoring on Chegg/Doubtnut/StudyX: answer DSA questions — ₹8,000–₹15,000/month

• Competitive programming content creation on YouTube/Instagram: teach sorting concepts — ₹5,000–₹20,000/month

• Freelance DSA problem solving and assignment help: ₹10,000–₹30,000/month

• Contribute solutions to GeeksforGeeks/LeetCode: build brand + earn referral traffic — ₹3,000–₹8,000/month

Platform	Best For	Typical Rate
Chegg	Answering CS questions	₹150–₹300/question (₹8K–15K/month)
Doubtnut / StudyX	Indian student doubt solving	₹100–₹200/answer
YouTube	Algorithm tutorial videos	₹5K–₹50K/month (with 10K+ subscribers)
Unacademy	Teaching DSA courses	₹15K–₹40K/month (part-time educator)
Fiverr	Coding assignment help	$10–$30/assignment

⏱️ Time to First Earning: 1–2 weeks (if you sign up on Chegg and start answering sorting-related questions)

Build a YouTube channel explaining algorithms visually. Sorting is the #1 most searched algorithm topic. A 10-minute video on "Merge Sort Explained Simply" can get 50K+ views. Indian CS students desperately need clear algorithm explanations in Hinglish. Start with 3 videos: Merge Sort, Quick Sort, Counting Sort. Use Manim or simple animations.

Section L

Chapter Summary — Key Takeaways

🎯 What You Learned in Unit 6

O(n²) vs O(n log n): The performance gap is massive at scale — 50,000× faster at n=10⁶
Lower Bound: Comparison-based sorting can't do better than Ω(n log n) — proven via decision tree argument
Merge Sort: O(n log n) always, stable, O(n) space. Best for linked lists and external sorting.
Quick Sort: O(n log n) average, in-place, cache-friendly. Fastest in practice for arrays. Use random pivot!
Counting Sort: O(n+k) linear time for small ranges. Beats the lower bound because it's non-comparison-based.
Frequency Sort: HashMap + custom comparator. Used in trending systems.
String Sorting: Lexicographic comparison. O(n·m·log n) for n strings of avg length m.
C++ std::sort uses IntroSort — a hybrid of Quick Sort + Heap Sort + Insertion Sort
Python's sorted() uses Timsort — a hybrid of Merge Sort + Insertion Sort

📊 Complete Sorting Algorithm Comparison

Algorithm	Best	Average	Worst	Space	Stable	In-place	Best Use Case
Bubble Sort	O(n)	O(n²)	O(n²)	O(1)	✅	✅	Teaching, tiny arrays
Selection Sort	O(n²)	O(n²)	O(n²)	O(1)	❌	✅	Minimum swaps needed
Insertion Sort	O(n)	O(n²)	O(n²)	O(1)	✅	✅	Nearly sorted, small n
Merge Sort	O(n log n)	O(n log n)	O(n log n)	O(n)	✅	❌	Linked lists, stability needed
Quick Sort	O(n log n)	O(n log n)	O(n²)	O(log n)	❌	✅	General-purpose arrays
Counting Sort	O(n+k)	O(n+k)	O(n+k)	O(n+k)	✅	❌	Small integer range

Section M

Earning Checkpoint — Are You Ready?

Skill	Tool Practiced	Portfolio Evidence	Ready to Earn?
Merge Sort	C++ / Python	Working implementation with edge case tests	✅ Yes — can solve interview problems
Quick Sort	C++ / Python	Lomuto + Hoare + Randomized implementations	✅ Yes — competitive coding problems
Counting Sort	C++ / Python	Working code with trace verification	✅ Yes — can answer Chegg questions
Frequency Sort	HashMap + Custom Comparator	Solution for frequency-based sorting problems	✅ Yes — trending systems design
String Sorting	C++ STL sort / Python sorted	Custom comparator implementations	✅ Yes — ready for string manipulation tasks
Algorithm Analysis	Complexity tables, traces	Comparison table of 6 algorithms	✅ Yes — can teach/tutor DSA
Problem Solving	Codeforces, LeetCode	5+ sorting problems solved	✅ Yes — competitive programming contests

Minimum Viable Earning Setup after this unit: Chegg/StudyX account + ability to solve sorting problems with clear explanations = ₹8,000–₹15,000/month from answering DSA questions. Combine with a GitHub portfolio of sorting implementations + a YouTube video on "Quick Sort vs Merge Sort" = ₹15,000–₹30,000/month potential.

✅ Unit 6 complete. Ready for Unit 7: Searching & Hashing!

[QR: Link to EduArtha video tutorial — Sorting Algorithms Masterclass]