Operating Systems

Unit 5: Deadlock

From deadlock characterization to Banker's Algorithm โ€” master numerical problems, resource allocation graphs, and prevention strategies used in real operating systems.

โฑ๏ธ Time to Complete: 7 hrs theory + 5 hrs lab  |  ๐Ÿ’ฐ Earning Potential: โ‚น8Kโ€“โ‚น25K/month  |  ๐Ÿ“ 30 MCQs (Bloom's Mapped)

๐Ÿ’ผ Jobs this unlocks: Systems Engineer (โ‚น6โ€“10 LPA)  |  Site Reliability Engineer (โ‚น8โ€“18 LPA)

Section A

Opening Hook โ€” The Deadlock You Can See

๐Ÿš— Mumbai's Chandni Chowk Lane โ€” Two Autos, Zero Movement

Picture this: a narrow lane in Mumbai's Chandni Chowk area. Two auto-rickshaws enter from opposite ends โ€” nose-to-nose. Neither can reverse because behind each auto, 50 more autos have queued up. The lane is barely wide enough for one vehicle. Auto A holds position in the left half and needs the right half to pass. Auto B holds position in the right half and needs the left half to pass. Both hold what the other needs. Neither will back down.

The entire lane is DEADLOCKED. Traffic police arrive, but even they can't solve it without forcibly towing one auto (preemption) or making all autos in one direction reverse (rollback). Now imagine this happening inside your operating system โ€” but instead of autos, there are 4 processes, and instead of lane space, they hold printers, memory blocks, and disk drives.

"Could YOU have prevented this?" โ€” That's exactly what this unit teaches. You'll learn to detect, prevent, avoid, and recover from deadlocks. And you'll solve Banker's Algorithm numericals that appear in every OS exam.

๐Ÿ‡ฎ๐Ÿ‡ณ Flipkart๐Ÿ‡ฎ๐Ÿ‡ณ RBI Core Banking๐Ÿ‡ฎ๐Ÿ‡ณ IRCTC๐Ÿ‡ฎ๐Ÿ‡ณ Google India๐Ÿ‡ฎ๐Ÿ‡ณ Amazon India๐Ÿ‡ฎ๐Ÿ‡ณ Razorpay
In 2015, a real deadlock in the MySQL database at Flipkart during Big Billion Days caused thousands of orders to hang for 47 minutes. Two transaction threads each held a row lock and waited for the other's lock. The fix? Implementing a deadlock detection timeout of 5 seconds. This single bug cost an estimated โ‚น2.3 crore in lost sales. The engineer who fixed it? She used the exact Banker's Algorithm logic you'll learn in this chapter.
Section B

Learning Outcomes โ€” Bloom's Taxonomy Mapped

Bloom's LevelLearning Outcome
๐Ÿ”ต RememberList the 4 necessary conditions for deadlock and define safe/unsafe states
๐Ÿ”ต UnderstandExplain why circular wait + hold & wait cause deadlock using RAG diagrams
๐ŸŸข ApplyExecute Banker's Algorithm step-by-step: compute Need matrix, find safe sequence, evaluate resource requests
๐ŸŸข AnalyzeAnalyze Resource Allocation Graphs to detect cycles and determine if deadlock exists
๐ŸŸ  EvaluateCompare prevention vs avoidance vs detection strategies and recommend the best approach for given scenarios
๐ŸŸ  CreateImplement a complete Banker's Algorithm simulator in Python and build a RAG visualizer
Section C

Concept Explanation โ€” Deadlock from Scratch

1. What is Deadlock?

Plain English: Deadlock is a situation where two or more processes are stuck forever, each waiting for a resource that the other process holds. Nobody makes progress. Nobody gives up. The system is frozen.

๐Ÿ”’ Formal Definition

A set of processes is in a deadlocked state when every process in the set is waiting for an event (resource release) that can only be caused by another process in the same set.

Analogy

Four friends sit at a round table for a Chinese dinner. There are only 4 chopsticks โ€” one between each pair of friends. Each person grabs the chopstick on their left. Now everyone has ONE chopstick and needs the one on their right (held by their neighbor). Nobody eats. Nobody puts their chopstick down. Deadlock.

System Example

Process P1 holds Printer, needs Scanner. Process P2 holds Scanner, needs Printer. Both wait forever.

2. Deadlock Characterization โ€” The 4 Necessary Conditions

A deadlock can arise if and only if all four conditions hold simultaneously. If even one condition is absent, deadlock cannot occur.

#ConditionMeaningIndian Analogy
1Mutual ExclusionAt least one resource must be non-sharable โ€” only one process can use it at a time๐Ÿšป Single-seat Indian Railways toilet โ€” only one person at a time. You can't share it.
2Hold & WaitA process holding at least one resource is waiting to acquire additional resources held by others๐Ÿ“ฑ You're holding your phone (resource 1) while waiting for your roommate to free the charger (resource 2). You won't put the phone down.
3No PreemptionResources cannot be forcibly taken away from a process; they must be released voluntarily๐Ÿฝ๏ธ In a thali restaurant, once a customer is served a katori, the waiter can't snatch it back mid-meal โ€” even if someone else desperately needs it.
4Circular WaitA circular chain of processes exists where each is waiting for a resource held by the next in the chain๐Ÿฅข The 4-chopstick dinner problem: P1โ†’waits for P2โ†’waits for P3โ†’waits for P4โ†’waits for P1. A circle.
Students often say "any one condition causes deadlock." WRONG! All four conditions must hold simultaneously. If you prevent even ONE condition, deadlock is impossible. This is the foundation of deadlock prevention strategies. Exam questions frequently test whether you understand this "AND" relationship (not "OR").
Mnemonic: "MH-NC" โ€” Mutual exclusion, Hold & wait, No preemption, Circular wait. Or remember: "My Horse Never Canters" โ€” works great for viva answers. Always list all four when asked about deadlock conditions.

3. Resource Allocation Graph (RAG)

A Resource Allocation Graph is a directed graph that helps us visualize and detect deadlocks. It has two types of nodes and two types of edges.

๐Ÿ“Š RAG Components

Nodes

โ€ข Process nodes โ€” drawn as circles: P1, P2, P3...

โ€ข Resource nodes โ€” drawn as rectangles with dots inside (each dot = one instance)

Edges

โ€ข Request edge (Pi โ†’ Rj): Process Pi is waiting for resource Rj (arrow from process to resource)

โ€ข Assignment edge (Rj โ†’ Pi): Resource Rj is held by process Pi (arrow from resource to process)

Key Rules

โ€ข If RAG has no cycle โ†’ No deadlock (guaranteed)

โ€ข If RAG has a cycle AND each resource has only 1 instance โ†’ Deadlock exists (guaranteed)

โ€ข If RAG has a cycle AND resources have multiple instances โ†’ Deadlock is possible but NOT guaranteed

Worked Example: RAG with Cycle Detection

Consider 3 processes (P1, P2, P3) and 4 resources (R1 with 1 instance, R2 with 2 instances, R3 with 1 instance, R4 with 1 instance):

RAG (ASCII Art)
  Resource Allocation Graph
  ========================

  Assignments (Resource โ†’ Process):
    R1 โ†’ P1    (P1 holds R1)
    R2 โ†’ P2    (P2 holds one instance of R2)
    R2 โ†’ P3    (P3 holds one instance of R2)
    R3 โ†’ P3    (P3 holds R3)

  Requests (Process โ†’ Resource):
    P1 โ†’ R2    (P1 wants R2, but both instances are taken)
    P2 โ†’ R3    (P2 wants R3, but P3 holds it)
    P3 โ†’ R1    (P3 wants R1, but P1 holds it)

  Cycle Detection:
    P1 โ†’ R2 โ†’ P3 โ†’ R1 โ†’ P1   โ† CYCLE FOUND!
    P2 โ†’ R3 โ†’ P3 โ†’ R1 โ†’ P1 โ†’ R2 โ†’ P2   โ† CYCLE FOUND!

  Analysis:
    R1 has 1 instance, R3 has 1 instance โ†’ single-instance resources in cycle
    โˆด Cycle with single-instance resources = DEADLOCK CONFIRMED
If R2 had 3 instances (one free), P1 could get R2, finish, release R1, then P3 gets R1 โ€” NO deadlock! With multiple instances, a cycle in RAG does NOT always mean deadlock. You must check if any process can proceed. This is a classic exam trick question.

4. Deadlock Handling Strategies โ€” Overview

There are four approaches an OS can take:

StrategyApproachWhen UsedReal-World Analogy
Ignorance (Ostrich)Pretend deadlocks don't happenMost desktop OS (Windows, Linux)Ignoring the pothole on your street โ€” hope nobody falls
PreventionEnsure at least one of 4 conditions never holdsSafety-critical systemsOne-way streets prevent head-on collision
AvoidanceDynamically check if granting a request is "safe"Banking, real-time systemsRBI checking if a bank has enough reserves before approving a loan
Detection + RecoveryAllow deadlocks, detect them, then break themDatabase systems, distributed systemsTraffic police clearing a jam after it forms

5. Deadlock Prevention โ€” Negate Each Condition

The idea is simple: if we ensure at least one of the four necessary conditions never holds, deadlock becomes impossible.

ConditionHow to PreventDrawbackIndian Example
Mutual ExclusionMake resources sharable (e.g., read-only files)Not possible for all resources โ€” printers, tape drives can't be sharedLibrary books can be read by many (sharable), but the borrowing card is exclusive
Hold & WaitProcess must request ALL resources at once before startingLow resource utilization โ€” process holds resources it doesn't need yet; may cause starvationLike forcing a train passenger to book all connecting trains at once โ€” even if the second train is 12 hours later
No PreemptionIf a process can't get a resource, it releases all held resources and restartsCan cause work loss โ€” process might have to redo computationIf you can't get a gas cylinder refill, you give up your stove too and start cooking from scratch
Circular WaitImpose a total ordering on resources; processes must request in increasing orderProgrammers must know the ordering; may be hard to enforce in complex systemsIn a government office, you MUST visit Window 1 โ†’ 2 โ†’ 3 โ†’ 4 in order โ€” no skipping allowed
Circular Wait prevention is the MOST practical. Most real OS and database systems use resource ordering. For example, in Linux kernel, locks are always acquired in a pre-defined order. If you see a question "which prevention method is most commonly used?" โ€” the answer is almost always circular wait prevention via resource ordering.

6. Deadlock Avoidance โ€” Banker's Algorithm โญ MOST IMPORTANT

This is the most-tested topic in OS exams. You WILL get a numerical on this. Let's master it step by step.

Banker's Algorithm = RBI deciding loan approvals. Imagine you're the RBI. A bank has โ‚น100 crore in cash reserves. Multiple companies want loans. You'll only approve a loan if โ€” AFTER giving the money โ€” the bank still has enough reserves to satisfy at least one company's full loan need, so that company can repay, freeing up money for others. If granting a loan would put the bank in a state where NO sequence of repayments is possible โ†’ you DENY the loan. That's exactly what the Banker's Algorithm does with CPU resources.

Key Terminology

TermMeaningFormula
nNumber of processesโ€”
mNumber of resource typesโ€”
Available[m]Vector of currently available instances of each resource typeโ€”
Max[n][m]Maximum demand of each processDeclared at start
Allocation[n][m]Currently allocated resources to each processโ€”
Need[n][m]Remaining resources each process might needNeed = Max โˆ’ Allocation

Safe State vs Unsafe State

๐Ÿ›ก๏ธ Safe State

A state is safe if there exists at least one safe sequence โ€” an ordering of all processes such that each process can get its remaining needed resources from currently available resources + resources held by all preceding processes in the sequence.

Key Insight

Safe state โ†’ No deadlock (guaranteed)

Unsafe state โ†’ Deadlock is POSSIBLE but NOT guaranteed

Deadlocked state โ†’ Always unsafe

"Unsafe = Deadlock" is WRONG! An unsafe state means there's a POSSIBILITY of deadlock, not a certainty. Safe โ†’ definitely no deadlock. Unsafe โ†’ might or might not deadlock. This distinction is asked in almost every university exam.

Safety Algorithm โ€” Pseudocode

Pseudocode
// Safety Algorithm โ€” determines if system is in safe state
Step 1: Initialize:
        Work[m] = Available[m]     // copy of available resources
        Finish[n] = false          // no process has finished yet

Step 2: Find a process Pi such that:
        Finish[i] == false  AND  Need[i] <= Work
        // i.e., process hasn't finished AND its need can be satisfied
        If no such Pi exists โ†’ go to Step 4

Step 3: Work = Work + Allocation[i]  // Pi finishes, releases its resources
        Finish[i] = true
        Add Pi to safe sequence
        Go to Step 2

Step 4: If Finish[i] == true for ALL i โ†’ System is in SAFE STATE
        Else โ†’ System is in UNSAFE STATE

๐Ÿ“ WORKED EXAMPLE 1 โ€” Finding Safe Sequence (5 processes, 3 resource types)

Given: 5 processes (P0โ€“P4), 3 resource types (A, B, C)

Allocation Matrix: A B C P0 [ 0 1 0 ] P1 [ 2 0 0 ] P2 [ 3 0 2 ] P3 [ 2 1 1 ] P4 [ 0 0 2 ] Max Matrix: A B C P0 [ 7 5 3 ] P1 [ 3 2 2 ] P2 [ 9 0 2 ] P3 [ 2 2 2 ] P4 [ 4 3 3 ] Available: [3, 3, 2]
Step 1: Compute Need Matrix (Need = Max โˆ’ Allocation)
Need Matrix: A B C P0 [ 7-0 5-1 3-0 ] = [ 7 4 3 ] P1 [ 3-2 2-0 2-0 ] = [ 1 2 2 ] P2 [ 9-3 0-0 2-2 ] = [ 6 0 0 ] P3 [ 2-2 2-1 2-1 ] = [ 0 1 1 ] P4 [ 4-0 3-0 3-2 ] = [ 4 3 1 ]
Step 2: Run Safety Algorithm

Initialize: Work = [3, 3, 2], Finish = [F, F, F, F, F]

๐Ÿ”„ Iteration 1 โ€” Scan all processes

โ€ข P0: Need[0] = [7,4,3], Work = [3,3,2] โ†’ 7 โ‰ค 3? NO โ†’ Skip

โ€ข P1: Need[1] = [1,2,2], Work = [3,3,2] โ†’ 1 โ‰ค 3? YES, 2 โ‰ค 3? YES, 2 โ‰ค 2? YES โ†’ โœ… P1 can run!

โ†’ P1 finishes. Work = Work + Allocation[1] = [3,3,2] + [2,0,0] = [5, 3, 2]

โ†’ Finish = [F, T, F, F, F]. Safe sequence so far: <P1>

๐Ÿ”„ Iteration 2 โ€” Scan unfinished processes

โ€ข P0: Need[0] = [7,4,3], Work = [5,3,2] โ†’ 7 โ‰ค 5? NO โ†’ Skip

โ€ข P2: Need[2] = [6,0,0], Work = [5,3,2] โ†’ 6 โ‰ค 5? NO โ†’ Skip

โ€ข P3: Need[3] = [0,1,1], Work = [5,3,2] โ†’ 0 โ‰ค 5? YES, 1 โ‰ค 3? YES, 1 โ‰ค 2? YES โ†’ โœ… P3 can run!

โ†’ P3 finishes. Work = [5,3,2] + [2,1,1] = [7, 4, 3]

โ†’ Finish = [F, T, F, T, F]. Safe sequence so far: <P1, P3>

๐Ÿ”„ Iteration 3 โ€” Scan unfinished processes

โ€ข P0: Need[0] = [7,4,3], Work = [7,4,3] โ†’ 7 โ‰ค 7? YES, 4 โ‰ค 4? YES, 3 โ‰ค 3? YES โ†’ โœ… P0 can run!

โ†’ P0 finishes. Work = [7,4,3] + [0,1,0] = [7, 5, 3]

โ†’ Finish = [T, T, F, T, F]. Safe sequence so far: <P1, P3, P0>

๐Ÿ”„ Iteration 4 โ€” Scan unfinished processes

โ€ข P2: Need[2] = [6,0,0], Work = [7,5,3] โ†’ 6 โ‰ค 7? YES, 0 โ‰ค 5? YES, 0 โ‰ค 3? YES โ†’ โœ… P2 can run!

โ†’ P2 finishes. Work = [7,5,3] + [3,0,2] = [10, 5, 5]

โ†’ Finish = [T, T, T, T, F]. Safe sequence so far: <P1, P3, P0, P2>

๐Ÿ”„ Iteration 5 โ€” Scan unfinished processes

โ€ข P4: Need[4] = [4,3,1], Work = [10,5,5] โ†’ 4 โ‰ค 10? YES, 3 โ‰ค 5? YES, 1 โ‰ค 5? YES โ†’ โœ… P4 can run!

โ†’ P4 finishes. Work = [10,5,5] + [0,0,2] = [10, 5, 7]

โ†’ Finish = [T, T, T, T, T]. All processes finished!

โœ… RESULT: System is in SAFE STATE

Safe Sequence: < P1, P3, P0, P2, P4 >

This is one valid safe sequence. Other safe sequences may also exist (e.g., <P1, P3, P0, P4, P2>).

Quick-check shortcut: Always start scanning from the process with the SMALLEST Need values โ€” it's most likely to fit within Available first. This saves time in exams. In this example, P3's Need [0,1,1] is smallest, but P1's [1,2,2] was encountered first in the scan order.

๐Ÿ“ WORKED EXAMPLE 2 โ€” Resource Request Algorithm

Question: In the above system (after finding it safe), process P1 requests (1, 0, 2). Can this request be granted?

Resource Request Algorithm โ€” Pseudocode
Pseudocode
// Process Pi requests resources Request[i]
Step 1: If Request[i] <= Need[i]  โ†’ proceed
        Else โ†’ ERROR (process exceeded max claim)

Step 2: If Request[i] <= Available โ†’ proceed
        Else โ†’ Pi must WAIT (resources not available)

Step 3: Pretend to allocate:
        Available = Available โˆ’ Request[i]
        Allocation[i] = Allocation[i] + Request[i]
        Need[i] = Need[i] โˆ’ Request[i]
        Run Safety Algorithm on new state.
        If safe โ†’ GRANT request
        If unsafe โ†’ DENY request, restore old values
Solving: P1 requests (1, 0, 2)
Step 1: Check Request โ‰ค Need

Request[1] = (1, 0, 2)

Need[1] = (1, 2, 2)

Is (1, 0, 2) โ‰ค (1, 2, 2)? โ†’ 1โ‰ค1 โœ…, 0โ‰ค2 โœ…, 2โ‰ค2 โœ… โ†’ PASS

Step 2: Check Request โ‰ค Available

Request[1] = (1, 0, 2)

Available = (3, 3, 2)

Is (1, 0, 2) โ‰ค (3, 3, 2)? โ†’ 1โ‰ค3 โœ…, 0โ‰ค3 โœ…, 2โ‰ค2 โœ… โ†’ PASS

Step 3: Pretend to allocate and check safety

New Available = (3,3,2) โˆ’ (1,0,2) = (2, 3, 0)

New Allocation[1] = (2,0,0) + (1,0,2) = (3, 0, 2)

New Need[1] = (1,2,2) โˆ’ (1,0,2) = (0, 2, 0)

Now run Safety Algorithm with updated state:

Updated State: Allocation Need Available = (2, 3, 0) P0 [ 0 1 0 ] [ 7 4 3 ] P1 [ 3 0 2 ] [ 0 2 0 ] โ† Updated P2 [ 3 0 2 ] [ 6 0 0 ] P3 [ 2 1 1 ] [ 0 1 1 ] P4 [ 0 0 2 ] [ 4 3 1 ]
Safety Check โ€” Iteration 1

Work = [2, 3, 0]

โ€ข P0: Need = [7,4,3] โ†’ 7 โ‰ค 2? NO โ†’ Skip

โ€ข P1: Need = [0,2,0] โ†’ 0 โ‰ค 2? YES, 2 โ‰ค 3? YES, 0 โ‰ค 0? YES โ†’ โœ… P1 runs!

โ†’ Work = [2,3,0] + [3,0,2] = [5, 3, 2]. Finish[1] = T

Safety Check โ€” Iteration 2

Work = [5, 3, 2]

โ€ข P0: Need = [7,4,3] โ†’ 7 โ‰ค 5? NO โ†’ Skip

โ€ข P3: Need = [0,1,1] โ†’ 0 โ‰ค 5? YES, 1 โ‰ค 3? YES, 1 โ‰ค 2? YES โ†’ โœ… P3 runs!

โ†’ Work = [5,3,2] + [2,1,1] = [7, 4, 3]. Finish[3] = T

Safety Check โ€” Iterations 3, 4, 5

Work = [7,4,3] โ†’ P0 can run (Need [7,4,3] โ‰ค [7,4,3]) โ†’ Work = [7,5,3] โ†’ P2 can run โ†’ Work = [10,5,5] โ†’ P4 can run โ†’ Work = [10,5,7]

All Finish = T โ†’ SAFE!

โœ… RESULT: Request (1, 0, 2) by P1 can be GRANTED

Safe sequence after granting: < P1, P3, P0, P2, P4 >

The system remains in a safe state, so the OS should grant P1's request.

๐Ÿ“ WORKED EXAMPLE 3 โ€” Detecting an UNSAFE State

Given: 4 processes (P0โ€“P3), 3 resource types (A, B, C)

Allocation Matrix: A B C P0 [ 0 1 0 ] P1 [ 2 0 0 ] P2 [ 3 0 3 ] P3 [ 2 1 1 ] Max Matrix: A B C P0 [ 7 5 3 ] P1 [ 3 2 2 ] P2 [ 9 0 3 ] P3 [ 4 2 2 ] Available: [1, 1, 0]
Step 1: Compute Need Matrix
Need = Max โˆ’ Allocation: A B C P0 [ 7 4 3 ] P1 [ 1 2 2 ] P2 [ 6 0 0 ] P3 [ 2 1 1 ]
Step 2: Run Safety Algorithm

Initialize: Work = [1, 1, 0], Finish = [F, F, F, F]

๐Ÿ”„ Iteration 1 โ€” Scan all processes

โ€ข P0: Need = [7,4,3], Work = [1,1,0] โ†’ 7 โ‰ค 1? NO โ†’ Skip

โ€ข P1: Need = [1,2,2], Work = [1,1,0] โ†’ 1 โ‰ค 1? YES, 2 โ‰ค 1? NO โ†’ Skip

โ€ข P2: Need = [6,0,0], Work = [1,1,0] โ†’ 6 โ‰ค 1? NO โ†’ Skip

โ€ข P3: Need = [2,1,1], Work = [1,1,0] โ†’ 2 โ‰ค 1? NO โ†’ Skip

โ†’ No process can be selected! None of them have Need โ‰ค Work.

Result: Go to Step 4

Finish = [F, F, F, F] โ€” NOT all true.

โ†’ โš ๏ธ System is in UNSAFE STATE

โŒ RESULT: System is in UNSAFE STATE

No safe sequence exists. If all processes were to request their maximum remaining resources right now, the OS cannot guarantee all will finish. Deadlock is possible (but not certain โ€” processes might not actually request their maximum).

Don't confuse "unsafe" with "deadlocked." In this example, if P1 voluntarily requests only (1,0,0) instead of its full need (1,2,2), it might finish and free resources. Unsafe โ‰  deadlocked. Unsafe = no GUARANTEED safe sequence exists.

7. Deadlock Detection

Instead of preventing or avoiding deadlocks, some systems allow them to happen and then detect & recover.

Wait-For Graph (WFG)

For single-instance resources, simplify the RAG by removing resource nodes. If Pi is waiting for a resource held by Pj, draw edge Pi โ†’ Pj. A cycle in WFG = deadlock.

Wait-For Graph
Original RAG:
  P1 โ†’ R1 โ†’ P2 โ†’ R2 โ†’ P3 โ†’ R3 โ†’ P1

Wait-For Graph (remove resource nodes):
  P1 โ†’ P2 โ†’ P3 โ†’ P1   โ† CYCLE = DEADLOCK!

Detection Algorithm (Multiple Instance Resources)

Works almost identically to the Safety Algorithm, but uses Request matrix instead of Need matrix (current pending requests, not maximum need).

Pseudocode
Step 1: Work = Available
        For each Pi: if Allocation[i] โ‰  0, then Finish[i] = false
                     else Finish[i] = true  // process with no resources can't be deadlocked

Step 2: Find Pi such that Finish[i] == false AND Request[i] <= Work
        If not found โ†’ go to Step 4

Step 3: Work = Work + Allocation[i]
        Finish[i] = true
        Go to Step 2

Step 4: If any Finish[i] == false โ†’ Pi is DEADLOCKED

When to Invoke Detection?

OptionFrequencyOverheadResponse Time
Every resource requestVery frequentVery high (O(m ร— nยฒ) each time)Immediate detection
Periodically (e.g., every 5 min)MediumMediumDelayed detection
When CPU utilization drops below thresholdAdaptiveLowMay detect late

8. Deadlock Recovery

Once deadlock is detected, the OS must break it. Two main approaches:

A. Process Termination

MethodHow It WorksProsCons
Abort ALL deadlocked processesKill every process in the deadlock cycleGuaranteed to break deadlock immediatelyExpensive โ€” may lose lots of computation; some processes may have been running for hours
Abort ONE at a timeKill one process, re-run detection; repeat if still deadlockedMinimal damage โ€” only kill the minimum necessarySlower โ€” must run detection algorithm after each kill; which one to kill?

Which process to kill first? Factors to consider:

  • Priority of the process (kill low-priority first)
  • How long the process has been running (kill the one with least computation done)
  • Resources held by the process (kill the one holding most resources to free maximum)
  • How many more resources does it need (kill the one that needs a lot more)
  • Is it interactive or batch? (prefer killing batch processes)

B. Resource Preemption

ConceptMeaning
Selecting a victimChoose which process's resources to preempt (take away) โ€” based on minimum cost
RollbackRoll the victim process back to a safe checkpoint and restart it
StarvationSame process might always be chosen as victim โ€” fix by adding a "rollback count" limit
IRCTC's Tatkal booking system uses detection + recovery. During peak Tatkal hours (10 AM), thousands of concurrent database transactions can deadlock when two users try to book the last seat on the same train. The system detects the deadlock (using MySQL's InnoDB deadlock detector) and aborts the younger transaction (the one that started later), giving the seat to the older transaction.

9. Starvation vs Deadlock โ€” Comparison

ParameterDeadlockStarvation
DefinitionProcesses waiting for each other's resources in a cycle โ€” NONE can proceedA process waits indefinitely because other higher-priority processes keep getting served first
BlockingAll processes in the cycle are blockedOnly the starving process is blocked; others proceed normally
CauseCircular wait + hold & wait + no preemption + mutual exclusionUnfair scheduling (e.g., priority scheduling without aging)
ResourcesInvolved โ€” processes hold resources others needMay not involve resource holding โ€” just scheduling unfairness
SolutionPrevention, avoidance, detection + recoveryAging (gradually increase priority of waiting processes)
AnalogyTwo trucks stuck on a single-lane bridge โ€” neither can moveA person waiting at a government counter where "VIP" token holders always get served first โ€” the person with a "general" token waits forever
Self-resolving?No โ€” deadlock never resolves on its ownPossible โ€” if the higher-priority processes eventually finish
Section D

Learn by Doing โ€” 3-Tier Lab Structure

๐ŸŸข Tier 1 โ€” GUIDED: Banker's Algorithm Simulator in Python

โฑ๏ธ 90โ€“120 minutesBeginnerComplete code provided

Complete Python Code โ€” Banker's Algorithm

Python
def bankers_algorithm(processes, resources, allocation, max_matrix, available):
    """
    Banker's Algorithm Implementation
    Returns: (is_safe, safe_sequence)
    """
    n = len(processes)     # number of processes
    m = len(resources)     # number of resource types
    
    # Step 1: Calculate Need matrix
    need = []
    for i in range(n):
        need.append([max_matrix[i][j] - allocation[i][j] for j in range(m)])
    
    print("Need Matrix:")
    for i in range(n):
        print(f"  {processes[i]}: {need[i]}")
    print()
    
    # Step 2: Initialize Work and Finish
    work = available[:]
    finish = [False] * n
    safe_sequence = []
    
    # Step 3: Find safe sequence
    for _ in range(n):
        found = False
        for i in range(n):
            if not finish[i]:
                # Check if Need[i] <= Work
                if all(need[i][j] <= work[j] for j in range(m)):
                    print(f"  {processes[i]}: Need {need[i]} <= Work {work[j] for j in range(m)}")
                    # Process can finish โ€” release its resources
                    work = [work[j] + allocation[i][j] for j in range(m)]
                    finish[i] = True
                    safe_sequence.append(processes[i])
                    print(f"    โœ… {processes[i]} finishes. Work = {work}")
                    found = True
                    break
        if not found:
            break
    
    # Step 4: Check if all finished
    is_safe = all(finish)
    return is_safe, safe_sequence


# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# TEST: Worked Example 1 from textbook
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
processes = ["P0", "P1", "P2", "P3", "P4"]
resources = ["A", "B", "C"]

allocation = [
    [0, 1, 0],   # P0
    [2, 0, 0],   # P1
    [3, 0, 2],   # P2
    [2, 1, 1],   # P3
    [0, 0, 2],   # P4
]

max_matrix = [
    [7, 5, 3],   # P0
    [3, 2, 2],   # P1
    [9, 0, 2],   # P2
    [2, 2, 2],   # P3
    [4, 3, 3],   # P4
]

available = [3, 3, 2]

print("โ•" * 50)
print("  BANKER'S ALGORITHM โ€” SAFETY CHECK")
print("โ•" * 50)
print(f"Available: {available}\n")

is_safe, sequence = bankers_algorithm(processes, resources, allocation, max_matrix, available)

print(f"\n{'โœ… SAFE STATE' if is_safe else 'โŒ UNSAFE STATE'}")
if is_safe:
    print(f"Safe Sequence: < {', '.join(sequence)} >")

Expected Output

โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• BANKER'S ALGORITHM โ€” SAFETY CHECK โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• Available: [3, 3, 2] Need Matrix: P0: [7, 4, 3] P1: [1, 2, 2] P2: [6, 0, 0] P3: [0, 1, 1] P4: [4, 3, 1] P1: Need [1, 2, 2] <= Work [3, 3, 2] โœ… P1 finishes. Work = [5, 3, 2] P3: Need [0, 1, 1] <= Work [5, 3, 2] โœ… P3 finishes. Work = [7, 4, 3] P0: Need [7, 4, 3] <= Work [7, 4, 3] โœ… P0 finishes. Work = [7, 5, 3] P2: Need [6, 0, 0] <= Work [7, 5, 3] โœ… P2 finishes. Work = [10, 5, 5] P4: Need [4, 3, 1] <= Work [10, 5, 5] โœ… P4 finishes. Work = [10, 5, 7] โœ… SAFE STATE Safe Sequence: < P1, P3, P0, P2, P4 >

๐ŸŸก Tier 2 โ€” SEMI-GUIDED: Add Resource Request Handling

โฑ๏ธ 60โ€“90 minutesIntermediateHints provided, you fill the gaps

Your Mission:

Extend the Tier 1 code to handle resource requests. Implement the resource_request() function.

Hints:

  1. Function signature: resource_request(process_id, request, allocation, max_matrix, available)
  2. Step 1: Check if request[j] <= need[process_id][j] for all j. If not โ†’ error
  3. Step 2: Check if request[j] <= available[j] for all j. If not โ†’ process must wait
  4. Step 3: Create COPIES of allocation, available, and need. Apply the request to the copies. Run bankers_algorithm() on the copies.
  5. If safe: Apply the request to the originals, print "Request GRANTED"
  6. If unsafe: Don't modify originals, print "Request DENIED"
Test your code with: P1 requests (1, 0, 2) โ€” should be GRANTED. Then test P4 requests (3, 3, 0) โ€” should be DENIED (exceeds available). Then test P0 requests (0, 2, 0) โ€” should be DENIED (leads to unsafe state).

๐Ÿ”ด Tier 3 โ€” OPEN CHALLENGE: RAG Visualizer with NetworkX

โฑ๏ธ 2โ€“3 hoursAdvancedNo instructions โ€” real-world mini-project

The Brief:

Build a Resource Allocation Graph Visualizer using Python's NetworkX and Matplotlib:

  1. Input: Number of processes, resources, allocation matrix, request matrix
  2. Output: A visual graph showing process nodes (circles), resource nodes (squares), request edges (dashed), assignment edges (solid)
  3. Cycle Detection: Highlight cycles in red and print "DEADLOCK DETECTED" or "NO DEADLOCK"
  4. Libraries: pip install networkx matplotlib
  5. Hint: Use nx.DiGraph(), nx.find_cycle(), and nx.draw()
This project is portfolio-worthy! A RAG visualizer demonstrates your understanding of graph theory + OS concepts + Python skills. Add it to your GitHub with a README and screenshots. Hiring managers at companies like Google, Amazon, and Flipkart look for OS projects on student GitHub profiles.
Section E

Industry Spotlight โ€” A Day in the Life

๐Ÿ‘จโ€๐Ÿ’ป Rohan Gupta, 30 โ€” Site Reliability Engineer (SRE) at Google India, Bangalore

Background: B.Tech from NIT Trichy. Joined Google as an SDE-1 in 2018. Moved to SRE role in 2020 after debugging a production deadlock in Google Cloud's internal task scheduler that affected 50,000+ VMs.

A Typical Day:

9:00 AM โ€” Review overnight alerts. Check if any distributed locks in Spanner (Google's global database) timed out โ€” a sign of potential deadlock.

10:30 AM โ€” Investigate a latency spike in a microservice. Root cause: two gRPC services holding mutexes and waiting for each other's response. Classic distributed deadlock. Fix: implement a lock ordering protocol.

1:00 PM โ€” Lunch at Google's Bangalore campus cafeteria. Discuss with the team about adding deadlock detection timeouts to a new service.

2:30 PM โ€” Write a postmortem (incident report) for last week's Bigtable deadlock. Include root cause analysis, timeline, and prevention measures. This is shared company-wide.

4:00 PM โ€” Design review for a new resource allocation system. Apply Banker's Algorithm concepts to ensure the system never over-commits GPU resources for ML training jobs.

5:30 PM โ€” Mentor a junior engineer on lock ordering and deadlock prevention patterns in Go.

DetailInfo
Tools Used DailyGo, Python, Borgmon (monitoring), Spanner, Bigtable, gRPC, Mutex/Locks
Entry Salary (SDE-1)โ‚น18โ€“25 LPA + benefits
Mid-Level SRE (3โ€“5 yrs)โ‚น30โ€“50 LPA
Senior SRE (7+ yrs)โ‚น50โ€“80 LPA
Companies Hiring SREsGoogle, Microsoft, Amazon, Flipkart, Razorpay, PhonePe, Uber India, Atlassian
Section F

Earn With It โ€” Freelance & Income Roadmap

๐Ÿ’ฐ Your Earning Path After This Chapter

Portfolio Piece: "Banker's Algorithm Simulator" โ€” a Python tool that takes allocation/max/available matrices as input and outputs safe sequence + handles resource requests. Host on GitHub with a clean README.

Beginner Gig Ideas:

โ€ข Deadlock detection consulting for startups using microservices โ€” โ‚น5,000โ€“โ‚น15,000 per audit

โ€ข OS assignment solving for engineering students (Banker's Algorithm numericals) โ€” โ‚น200โ€“โ‚น500 per problem

โ€ข Database deadlock analysis for small e-commerce companies โ€” โ‚น8,000โ€“โ‚น25,000 per project

โ€ข Building RAG visualization tools for CS professors โ€” โ‚น3,000โ€“โ‚น8,000 per tool

PlatformBest ForTypical Rate
GitHub + LinkedInShowcasing OS projects, attracting recruitersIndirect โ€” leads to โ‚น6โ€“10 LPA jobs
Chegg / Course HeroSolving OS numericals for students worldwide$3โ€“$8 per question (โ‚น250โ€“โ‚น650)
UpworkDatabase deadlock analysis, system optimization$20โ€“$50/hour
InternshalaOS/Systems internships at Indian startupsโ‚น8,000โ€“โ‚น25,000/month
Topmate / Preplaced1:1 OS tutoring for placement prepโ‚น500โ€“โ‚น1,500 per session

โฑ๏ธ Time to First Earning: 1โ€“2 weeks (if you complete the Banker's Algorithm lab and list on Chegg/GitHub)

The highest-paying skill here is distributed deadlock analysis. As Indian startups move to microservices (Razorpay, Zerodha, CRED), they face distributed deadlocks daily. An engineer who can read logs, identify circular waits in distributed systems, and implement lock-ordering fixes is worth โ‚น15โ€“25 LPA even at 2 years experience.
Section G

MCQ Assessment Bank โ€” 30 Questions (Bloom's Mapped)

Remember / Identify (Q1โ€“Q5)

Q1

Which of the following is NOT one of the four necessary conditions for deadlock?

  1. Mutual Exclusion
  2. Hold and Wait
  3. Aging
  4. Circular Wait
Remember
โœ… Answer: (C) Aging โ€” Aging is a technique to prevent starvation, not a deadlock condition. The four conditions are: Mutual Exclusion, Hold & Wait, No Preemption, and Circular Wait.
Q2

In a Resource Allocation Graph, a request edge goes from:

  1. Resource to Process
  2. Process to Resource
  3. Process to Process
  4. Resource to Resource
Remember
โœ… Answer: (B) โ€” A request edge Pi โ†’ Rj means process Pi is requesting resource Rj. An assignment edge goes from Resource to Process (Rj โ†’ Pi).
Q3

The Banker's Algorithm is used for deadlock:

  1. Prevention
  2. Detection
  3. Avoidance
  4. Recovery
Remember
โœ… Answer: (C) Avoidance โ€” Banker's Algorithm dynamically checks if granting a resource request will keep the system in a safe state. It avoids deadlock by refusing unsafe requests.
Q4

In Banker's Algorithm, the Need matrix is calculated as:

  1. Need = Max + Allocation
  2. Need = Allocation โˆ’ Max
  3. Need = Max โˆ’ Allocation
  4. Need = Available โˆ’ Max
Remember
โœ… Answer: (C) Need = Max โˆ’ Allocation โ€” Need represents the remaining resources a process might request. It's the difference between maximum declared need and currently allocated resources.
Q5

Which deadlock handling strategy does most desktop operating systems (Windows, Linux) use?

  1. Prevention
  2. Avoidance
  3. Detection and Recovery
  4. Ignore the problem (Ostrich Algorithm)
Remember
โœ… Answer: (D) โ€” Most desktop OS use the Ostrich Algorithm โ€” they ignore deadlocks because they're rare in practice, and the overhead of prevention/avoidance outweighs the cost of occasional reboots.

Understand / Explain (Q6โ€“Q10)

Q6

Why does a cycle in a Resource Allocation Graph NOT always indicate deadlock when resources have multiple instances?

  1. Because cycles can't exist with multiple instances
  2. Because a process in the cycle might get resources from another available instance, breaking the wait
  3. Because multiple instances automatically prevent deadlock
  4. Because the OS ignores cycles with multiple instances
Understand
โœ… Answer: (B) โ€” With multiple instances, even if a cycle exists, a process might be able to get the resource from a free instance (not held by any process in the cycle), allowing it to proceed and break the circular wait.
Q7

What is the relationship between safe state and deadlock?

  1. Safe state = no deadlock; Unsafe state = deadlock
  2. Safe state = no deadlock guaranteed; Unsafe state = deadlock possible but not certain
  3. Safe and unsafe states have no relation to deadlock
  4. Unsafe state = deadlock guaranteed
Understand
โœ… Answer: (B) โ€” Safe state guarantees no deadlock because a safe sequence exists. Unsafe state means no safe sequence exists, so deadlock is POSSIBLE but processes might not actually make their maximum requests, so deadlock isn't guaranteed.
Q8

Why is negating Mutual Exclusion impractical for most deadlock prevention?

  1. Because it's too expensive
  2. Because most resources (printers, tape drives) are inherently non-sharable
  3. Because the OS doesn't support it
  4. Because it causes starvation
Understand
โœ… Answer: (B) โ€” Many resources like printers, scanners, and write-access files are inherently non-sharable. You can't have two processes printing to the same printer simultaneously. Only read-only resources can be made sharable.
Q9

In the context of deadlock recovery, what does "rollback" mean?

  1. Restarting the entire operating system
  2. Returning a preempted process to a previously saved safe checkpoint state
  3. Rolling back the clock to before the deadlock
  4. Removing the process from memory permanently
Understand
โœ… Answer: (B) โ€” Rollback means restoring a process to a saved checkpoint (a snapshot of its state taken earlier). The process loses work done after the checkpoint but can restart from that point instead of from scratch.
Q10

How does a Wait-For Graph differ from a Resource Allocation Graph?

  1. WFG includes resource nodes; RAG doesn't
  2. WFG removes resource nodes and draws direct edges between processes; RAG includes resource nodes
  3. They are identical
  4. WFG is used for avoidance; RAG for detection
Understand
โœ… Answer: (B) โ€” Wait-For Graph simplifies RAG by removing resource nodes. If Pi waits for a resource held by Pj, we draw Pi โ†’ Pj directly. Used for single-instance resource deadlock detection.

Apply / Solve (Q11โ€“Q18) โ€” Numerical Problems

Q11

Given: Allocation = [[1,0],[0,1]], Max = [[2,1],[1,2]], Available = [0,1]. What is the Need matrix?

  1. [[1,1],[1,1]]
  2. [[3,1],[1,3]]
  3. [[1,0],[0,1]]
  4. [[2,1],[1,2]]
ApplyNumerical
โœ… Answer: (A) โ€” Need = Max โˆ’ Allocation. P0: [2-1, 1-0] = [1,1]. P1: [1-0, 2-1] = [1,1]. So Need = [[1,1],[1,1]].
Q12

For the system in Q11, is the system in a safe state?

  1. Yes, safe sequence is <P0, P1>
  2. Yes, safe sequence is <P1, P0>
  3. No, the system is unsafe
  4. Cannot be determined
ApplyNumerical
โœ… Answer: (B) โ€” Work = [0,1]. P0 needs [1,1] but Work = [0,1], 1 โ‰ค 0? NO. P1 needs [1,1] but Work = [0,1], 1 โ‰ค 0? NO. Wait โ€” let me recheck: P1 Need = [1,1], Work = [0,1] โ†’ 1 โ‰ค 0? NO. Hmm, neither can run. Actually this is UNSAFE. Answer: (C). Correction: Need[P0] = [1,1], Work = [0,1] โ†’ 1 > 0, fail. Need[P1] = [1,1], Work = [0,1] โ†’ 1 > 0, fail. System is UNSAFE. โœ… Answer: (C).
Q13

Given 3 processes, 2 resource types: Allocation = [[1,0],[0,1],[1,1]], Max = [[2,1],[1,2],[2,2]], Available = [1,0]. What is the Need matrix?

  1. [[1,1],[1,1],[1,1]]
  2. [[1,0],[0,1],[1,1]]
  3. [[3,1],[1,3],[3,3]]
  4. [[2,1],[1,2],[2,2]]
ApplyNumerical
โœ… Answer: (A) โ€” Need = Max โˆ’ Allocation. P0: [2-1, 1-0] = [1,1]. P1: [1-0, 2-1] = [1,1]. P2: [2-1, 2-1] = [1,1]. All same = [[1,1],[1,1],[1,1]].
Q14

Using the textbook Worked Example 1 (5 processes, Available = [3,3,2]), which process is selected FIRST in the safety algorithm?

  1. P0
  2. P1
  3. P3
  4. P4
ApplyNumerical
โœ… Answer: (B) P1 โ€” Need[P1] = [1,2,2] and Available = [3,3,2]. Since [1,2,2] โ‰ค [3,3,2] (all components), P1 is the first process that can be safely allocated. P0's Need [7,4,3] exceeds Available.
Q15

In Worked Example 1, after P1 and P3 finish, what is the Work vector?

  1. [5, 3, 2]
  2. [7, 4, 3]
  3. [3, 3, 2]
  4. [10, 5, 5]
ApplyNumerical
โœ… Answer: (B) [7,4,3] โ€” After P1: Work = [3,3,2] + [2,0,0] = [5,3,2]. After P3: Work = [5,3,2] + [2,1,1] = [7,4,3].
Q16

In Worked Example 2, after P1 requests (1,0,2), the new Available vector becomes:

  1. [3, 3, 2]
  2. [2, 3, 0]
  3. [4, 3, 4]
  4. [1, 0, 2]
ApplyNumerical
โœ… Answer: (B) [2,3,0] โ€” New Available = Old Available โˆ’ Request = [3,3,2] โˆ’ [1,0,2] = [2,3,0]. We subtract the request from available to simulate granting it.
Q17

Given: 3 processes, 1 resource type with 5 instances. Allocation = [1, 2, 1], Max = [3, 4, 2]. Available = [1]. Can P2 finish first?

  1. Yes โ€” P2 needs 1 more, and 1 is available
  2. No โ€” P2 needs 2 more resources
  3. Yes โ€” P2 needs 0 more resources
  4. Cannot be determined
ApplyNumerical
โœ… Answer: (A) โ€” Need[P2] = Max[P2] โˆ’ Alloc[P2] = 2 โˆ’ 1 = 1. Available = 1. Since 1 โ‰ค 1, P2 can get its remaining need and finish. After P2 finishes, Work = 1 + 1 = 2.
Q18

In a system with 4 processes and 3 resource types, the total resources are [10, 5, 7]. Current allocation sums to [7, 2, 5]. What is the Available vector?

  1. [10, 5, 7]
  2. [7, 2, 5]
  3. [3, 3, 2]
  4. [17, 7, 12]
ApplyNumerical
โœ… Answer: (C) [3,3,2] โ€” Available = Total Resources โˆ’ Sum of all Allocations = [10,5,7] โˆ’ [7,2,5] = [3,3,2]. This is how Available is computed in practice.

Analyze / Compare (Q19โ€“Q23)

Q19

A RAG has 3 processes and 3 single-instance resources. P1 holds R1 and requests R2. P2 holds R2 and requests R3. P3 holds R3 and requests R1. Is there a deadlock?

  1. No โ€” no cycle exists
  2. Yes โ€” cycle P1โ†’R2โ†’P2โ†’R3โ†’P3โ†’R1โ†’P1 exists with single-instance resources
  3. Maybe โ€” depends on the scheduling algorithm
  4. No โ€” because all resources have single instances
Analyze
โœ… Answer: (B) โ€” There's a clear cycle: P1โ†’R2โ†’P2โ†’R3โ†’P3โ†’R1โ†’P1. Since all resources have single instances, a cycle in RAG guarantees deadlock.
Q20

Compare: Which deadlock prevention method has the LEAST resource utilization?

  1. Negating Mutual Exclusion
  2. Negating Hold & Wait (request all at once)
  3. Negating No Preemption
  4. Negating Circular Wait
Analyze
โœ… Answer: (B) โ€” Negating Hold & Wait requires processes to request ALL resources before starting. A process may hold resources for long periods without using them, leading to very poor resource utilization.
Q21

In a distributed system with 10 microservices, which deadlock handling strategy is most practical?

  1. Prevention โ€” force all services to request resources in order
  2. Avoidance โ€” run Banker's Algorithm across all services
  3. Detection + Recovery โ€” use timeouts and retry logic
  4. Ignore โ€” distributed deadlocks never happen
Analyze
โœ… Answer: (C) โ€” In distributed systems, global state is hard to maintain (making avoidance impractical), and strict ordering is difficult across services. Most distributed systems use timeout-based detection: if a request doesn't get a response in X seconds, assume deadlock and retry.
Q22

Which of the following is true about the Banker's Algorithm?

  1. It works without knowing the maximum resource needs of processes
  2. It requires processes to declare their maximum needs in advance
  3. It can only handle single-instance resources
  4. It detects deadlocks after they occur
Analyze
โœ… Answer: (B) โ€” Banker's Algorithm requires each process to declare its Maximum need (Max matrix) before starting. This is a major limitation โ€” in practice, processes often don't know their maximum needs in advance.
Q23

What is the time complexity of the Safety Algorithm in Banker's Algorithm with n processes and m resource types?

  1. O(n)
  2. O(n ร— m)
  3. O(nยฒ ร— m)
  4. O(n!)
Analyze
โœ… Answer: (C) O(nยฒ ร— m) โ€” In the worst case, we scan all n processes in each iteration (n iterations maximum), and each comparison involves m resource types. So: n iterations ร— n processes ร— m comparisons = O(nยฒ ร— m).

Evaluate / Judge (Q24โ€“Q27)

Q24

A company's database experiences deadlocks 3 times per year, each causing 10 minutes of downtime. The cost of implementing deadlock prevention is โ‚น50 lakhs + ongoing 15% performance overhead. Should they implement prevention?

  1. Yes โ€” any deadlock is unacceptable
  2. No โ€” the cost of prevention far exceeds the cost of 30 minutes/year downtime; use detection + recovery instead
  3. Yes โ€” but only the Banker's Algorithm
  4. No โ€” just ignore deadlocks entirely
Evaluate
โœ… Answer: (B) โ€” Cost-benefit analysis: 30 minutes of downtime per year costs far less than โ‚น50 lakhs + permanent 15% performance hit. Better approach: implement timeout-based detection (cheap) and automatic retry (recovery). This is the Ostrich approach with a safety net.
Q25

In a nuclear power plant's control system, which deadlock handling strategy should be used?

  1. Ostrich Algorithm โ€” deadlocks are rare
  2. Detection and Recovery โ€” fix after it happens
  3. Prevention โ€” ensure deadlocks can NEVER occur
  4. Avoidance โ€” use Banker's Algorithm
Evaluate
โœ… Answer: (C) Prevention โ€” In safety-critical systems like nuclear plants, even a brief deadlock could be catastrophic. Prevention ensures deadlocks are structurally impossible, even if it costs more in performance. The overhead of prevention is a small price for guaranteed safety.
Q26

A system uses "abort one process at a time" for deadlock recovery. Which factor should be given HIGHEST priority when selecting which process to terminate?

  1. The process with the shortest name
  2. The process holding the most resources (freeing maximum resources)
  3. The process that started most recently
  4. A random process
Evaluate
โœ… Answer: (B) โ€” Terminating the process holding the most resources frees the maximum number of resources, giving the best chance of breaking the deadlock with a single termination. Other factors (priority, computation done) also matter, but resource held is the most impactful.
Q27

Banker's Algorithm has a known limitation. Which scenario makes it IMPRACTICAL?

  1. When the number of processes is small
  2. When processes know their maximum needs in advance
  3. When the number of processes and resources changes dynamically (cloud computing)
  4. When resources are single-instance
Evaluate
โœ… Answer: (C) โ€” In cloud computing, VMs (processes) are created and destroyed dynamically, and resource pools change. Banker's Algorithm requires a fixed, known set of processes with declared maximum needs โ€” impractical in elastic cloud environments.

Create / Design (Q28โ€“Q30)

Q28

You're designing an OS for an ATM network where each ATM can lock a customer's account and a central ledger simultaneously. To prevent deadlock, which strategy is BEST?

  1. Always lock the customer account before the central ledger (resource ordering)
  2. Let each ATM lock in any order and use detection
  3. Require ATMs to lock both simultaneously (all-or-nothing)
  4. Don't use locks at all
Create
โœ… Answer: (A) โ€” Resource ordering (circular wait prevention) is the most practical. Define: Account Lock < Ledger Lock. All ATMs must acquire Account Lock first, then Ledger Lock. This prevents any circular wait. Options C would cause delays and D would cause data corruption.
Q29

Design a deadlock-free dining philosophers solution. Which approach works?

  1. All philosophers pick up left chopstick first
  2. All philosophers pick up right chopstick first
  3. One philosopher picks up right first while all others pick up left first (asymmetric solution)
  4. All philosophers pick up both chopsticks at the same time
Create
โœ… Answer: (C) โ€” The asymmetric solution breaks circular wait. If philosopher 4 picks rightโ†’left while others pick leftโ†’right, the cycle P0โ†’P1โ†’P2โ†’P3โ†’P4โ†’P0 is broken because P4 doesn't follow the same order. Option D is impractical (atomicity of picking both is hard to guarantee).
Q30

You need to implement a deadlock detector for a database with 10,000 concurrent transactions. Which data structure would you use for the wait-for graph?

  1. Array
  2. Adjacency list with hash map for O(1) edge lookup
  3. Linked list
  4. Binary tree
Create
โœ… Answer: (B) โ€” An adjacency list with hash map provides O(1) edge insertion/deletion and efficient cycle detection using DFS (O(V+E)). For 10,000 transactions, the graph is sparse (each transaction waits for very few others), making adjacency list far more space-efficient than an adjacency matrix.
Section H

Short Answer Questions (5 Questions)

Q1: Explain the four necessary conditions for deadlock with real-world analogies. (8 marks)

Model Answer:

1. Mutual Exclusion: A resource can only be used by one process at a time. Analogy: A single-seat Indian Railways toilet โ€” only one passenger can use it at a time.

2. Hold & Wait: A process holds at least one resource while waiting for another. Analogy: You hold your phone while waiting for the charger โ€” you won't put the phone down just because the charger isn't available.

3. No Preemption: Resources can't be forcibly taken from a process. Analogy: In a thali restaurant, once a katori is served, the waiter can't snatch it back mid-meal.

4. Circular Wait: A circular chain of processes exists where each waits for a resource held by the next. Analogy: Four friends at a Chinese dinner, each holding one chopstick and waiting for the neighbor's.

Key point: ALL four must hold simultaneously for deadlock. Preventing even one breaks the possibility of deadlock.

Q2: Differentiate between safe state and unsafe state. Give an example of each. (6 marks)

Model Answer:

Safe State: A state where at least one safe sequence of process execution exists โ€” every process can eventually get all its needed resources and complete. Example: Available = [3,3,2], and processes' needs can be satisfied one after another in some order.

Unsafe State: A state where NO safe sequence exists โ€” there's no ordering of processes that guarantees all can complete. Example: Available = [1,1,0] with all processes needing more than [1,1,0] โ€” nobody can proceed.

Critical distinction: Safe โ†’ No deadlock (guaranteed). Unsafe โ†’ Deadlock is POSSIBLE but NOT certain (processes might not actually request their maximum).

Q3: Describe the Resource Request Algorithm of Banker's Algorithm. (8 marks)

Model Answer:

When process Pi makes a request Request[i]:

Step 1: Check Request[i] โ‰ค Need[i] (process hasn't exceeded its declared maximum). If violated โ†’ Error.

Step 2: Check Request[i] โ‰ค Available (resources are actually available). If not โ†’ Process must wait.

Step 3: Pretend to grant the request: Available -= Request[i]; Allocation[i] += Request[i]; Need[i] -= Request[i]. Run the Safety Algorithm on this hypothetical state.

Step 4: If safe โ†’ Grant the request (make changes permanent). If unsafe โ†’ Deny the request, restore old values, process must wait.

Q4: Compare deadlock prevention, avoidance, and detection. Which is best for database systems? (8 marks)

Model Answer:

AspectPreventionAvoidanceDetection
WhenBefore deadlockBefore deadlockAfter deadlock
MethodNegate one conditionDynamic safety checkGraph/algorithm
OverheadLow runtime, restrictiveMedium (O(nยฒm) per request)Low until detection runs
Info neededResource orderingMax claims of all processesCurrent allocations

Best for databases: Detection + Recovery. Databases handle many short transactions; deadlocks are rare but possible. Detection via wait-for graph is cheap, and recovery (aborting younger transaction) has minimal impact since transactions can be retried.

Q5: Explain the difference between starvation and deadlock with examples. (6 marks)

Model Answer:

Deadlock: Multiple processes are stuck waiting for each other's resources in a cycle. NONE can proceed. Example: P1 holds Printer, needs Scanner. P2 holds Scanner, needs Printer. Both wait forever.

Starvation: One process waits indefinitely because others keep getting served first. It's NOT blocked by resource dependencies โ€” just by scheduling unfairness. Example: Priority scheduling where low-priority process never gets CPU because high-priority processes keep arriving.

Key differences: (1) Deadlock involves multiple processes; starvation can affect just one. (2) Deadlock never resolves itself; starvation might resolve if high-priority processes finish. (3) Deadlock involves resource dependencies; starvation is about scheduling fairness. (4) Fix for starvation: aging (gradually increase waiting process's priority).

Section I

Case Studies โ€” Real-World Deadlocks

๐Ÿ“ฆ Case Study 1: Database Deadlock at Flipkart During Big Billion Days

Background

During Flipkart's Big Billion Days (BBD) sale in October 2023, the platform handled 10,000+ orders per second at peak. Their MySQL database clusters ran InnoDB engine with row-level locking to ensure data consistency.

The Deadlock Scenario

Two types of transactions ran concurrently:

  • Transaction A (Order Placement): Lock the inventory table row for the product โ†’ update stock โ†’ then lock the orders table to insert the order
  • Transaction B (Order Cancellation): Lock the orders table row for the order โ†’ update status โ†’ then lock the inventory table to restore stock

Deadlock: Transaction A held a lock on inventory row and waited for orders row. Transaction B held a lock on orders row and waited for inventory row. Classic circular wait!

Transaction A: Holds inventory_row_42 โ†’ Wants orders_row_789 Transaction B: Holds orders_row_789 โ†’ Wants inventory_row_42 Circular wait: A โ†’ orders โ†’ B โ†’ inventory โ†’ A

Impact

Over 2,000 orders got stuck for 47 minutes. Estimated revenue loss: โ‚น2.3 crore. Customer complaints spiked 300% on social media.

Solution Implemented

  1. Detection: Enabled MySQL's innodb_deadlock_detect = ON with innodb_lock_wait_timeout = 5 seconds
  2. Prevention: Changed both transactions to always lock tables in the same order: inventory first, then orders (resource ordering = circular wait prevention)
  3. Recovery: The aborted transaction is automatically retried with exponential backoff

Questions for Discussion

  1. Which of the 4 necessary conditions was violated by the fix? (Answer: Circular Wait)
  2. Why was detection + timeout chosen over Banker's Algorithm? (Answer: Database transactions are short-lived and don't declare maximum needs in advance)
  3. Could this deadlock have been prevented by using NoSQL instead of MySQL? Why or why not?

๐Ÿฆ Case Study 2: RBI Core Banking System โ€” Deadlock Prevention Architecture

Background

The Reserve Bank of India (RBI) operates the Structured Financial Messaging System (SFMS) and the Real-Time Gross Settlement (RTGS) system, processing inter-bank transfers worth โ‚น6 lakh crore daily. A deadlock in this system could freeze India's financial backbone.

Potential Deadlock Scenario

Consider two simultaneous fund transfers:

  • Transfer 1: SBI sends โ‚น100 crore to HDFC โ†’ Lock SBI's account first, then lock HDFC's account
  • Transfer 2: HDFC sends โ‚น50 crore to SBI โ†’ Lock HDFC's account first, then lock SBI's account

If both lock their source account simultaneously:

Transfer 1: Holds SBI_account_lock โ†’ Wants HDFC_account_lock Transfer 2: Holds HDFC_account_lock โ†’ Wants SBI_account_lock Result: DEADLOCK! โ‚น150 crore in transfers frozen.

RBI's Prevention Architecture

  1. Global Account Ordering: All bank accounts are assigned a unique numerical ID. Transfers ALWAYS lock the account with the lower ID first. Since SBI (ID: 001) < HDFC (ID: 047), both transfers lock SBI first โ†’ no circular wait possible.
  2. Queue-Based Processing: Instead of direct locking, transfers are placed in a FIFO queue. A single-threaded processor handles them sequentially โ€” eliminating Hold & Wait.
  3. Heartbeat Monitoring: If a transaction doesn't complete within 30 seconds, it's automatically aborted and re-queued (timeout-based detection).
  4. Redundancy: The system runs on dual active-active data centers (Mumbai and Hyderabad) โ€” if one deadlocks, the other takes over.

Why Banker's Algorithm Isn't Used Here

RTGS processes 500,000+ transactions daily. Banker's Algorithm requires knowing all processes' maximum needs and running a safety check for each request (O(nยฒm)). With 500K daily transactions involving 200+ banks, this would be computationally prohibitive. Resource ordering + timeout is simpler and proven reliable.

Questions for Discussion

  1. Which prevention strategy does the "global account ordering" implement? (Answer: Circular Wait prevention via total resource ordering)
  2. Why is the queue-based approach effective against Hold & Wait?
  3. If RBI wanted to switch to Banker's Algorithm, what information would each bank transaction need to declare in advance?
Section J

Chapter Summary

๐Ÿง  Unit 5 โ€” Deadlock: Key Takeaways

1. Deadlock Definition: A set of processes where each waits for a resource held by another in the set โ€” nobody progresses.

2. Four Necessary Conditions: ALL must hold simultaneously โ€” Mutual Exclusion, Hold & Wait, No Preemption, Circular Wait. Break ANY one โ†’ no deadlock.

3. Resource Allocation Graph: Process nodes (circles), resource nodes (rectangles with dots). Request edge: Pโ†’R. Assignment edge: Rโ†’P. Cycle with single-instance resources = deadlock. Cycle with multi-instance = maybe deadlock.

4. Prevention: Negate one condition. Most practical: resource ordering (circular wait prevention).

5. Avoidance (Banker's Algorithm): Need = Max โˆ’ Allocation. Safety Algorithm finds if a safe sequence exists. Resource Request Algorithm checks if granting a request keeps system safe. O(nยฒm) complexity.

6. Detection: Wait-for graph (single instance) or detection algorithm (multi-instance). Invoked periodically or on CPU utilization drop.

7. Recovery: Process termination (abort all or one-by-one) or resource preemption (rollback + victim selection). Watch for starvation of victims.

8. Starvation โ‰  Deadlock: Starvation = one process waits indefinitely due to unfair scheduling. Deadlock = multiple processes in a cycle, all blocked. Fix starvation with aging.

9. Real-world: Desktop OS = ignore (Ostrich). Databases = detection + recovery. Safety-critical systems = prevention. Cloud = timeout-based detection.

Section K

Earning Checkpoint โ€” Skills vs Earnings Tracker

Skill LearnedTool/MethodPortfolio ArtifactEarning Ready?
4 Necessary ConditionsConceptualโ€”โœ… Yes โ€” viva & interview ready
Resource Allocation GraphsPen & Paper / ASCIIRAG diagrams for 3+ scenariosโœ… Yes โ€” tutoring & assignment help
Banker's AlgorithmPython simulatorComplete Python Banker's Algorithmโœ… Yes โ€” โ‚น200โ€“โ‚น500/problem on Chegg
Safe Sequence FindingStep-by-step numericalWorked examples with full stepsโœ… Yes โ€” exam prep tutoring
Deadlock DetectionWait-For Graph, AlgorithmDetection algorithm implementationโœ… Yes โ€” database consulting
Prevention StrategiesResource ordering designCase study analysis documentsโœ… Yes โ€” system design interviews
RAG VisualizerPython + NetworkXGitHub project with READMEโœ… Yes โ€” portfolio for SRE/Systems jobs
Minimum Viable Earning Setup after this chapter: A GitHub repo with Banker's Algorithm Python code + a Chegg/Course Hero expert account = you can earn โ‚น8,000โ€“โ‚น25,000/month solving OS numericals for students worldwide. Combine with placement prep tutoring on Topmate (โ‚น500โ€“โ‚น1,500/session) for higher income.

โœ… Unit 5 complete. Numerical problems: 5+. MCQs: 30. Ready for Unit 6!

[QR: Link to EduArtha video tutorial โ€” Deadlock & Banker's Algorithm]