Operating Systems
Unit 5: Deadlock
From deadlock characterization to Banker's Algorithm โ master numerical problems, resource allocation graphs, and prevention strategies used in real operating systems.
โฑ๏ธ Time to Complete: 7 hrs theory + 5 hrs lab | ๐ฐ Earning Potential: โน8Kโโน25K/month | ๐ 30 MCQs (Bloom's Mapped)
๐ผ Jobs this unlocks: Systems Engineer (โน6โ10 LPA) | Site Reliability Engineer (โน8โ18 LPA)
Opening Hook โ The Deadlock You Can See
๐ Mumbai's Chandni Chowk Lane โ Two Autos, Zero Movement
Picture this: a narrow lane in Mumbai's Chandni Chowk area. Two auto-rickshaws enter from opposite ends โ nose-to-nose. Neither can reverse because behind each auto, 50 more autos have queued up. The lane is barely wide enough for one vehicle. Auto A holds position in the left half and needs the right half to pass. Auto B holds position in the right half and needs the left half to pass. Both hold what the other needs. Neither will back down.
The entire lane is DEADLOCKED. Traffic police arrive, but even they can't solve it without forcibly towing one auto (preemption) or making all autos in one direction reverse (rollback). Now imagine this happening inside your operating system โ but instead of autos, there are 4 processes, and instead of lane space, they hold printers, memory blocks, and disk drives.
"Could YOU have prevented this?" โ That's exactly what this unit teaches. You'll learn to detect, prevent, avoid, and recover from deadlocks. And you'll solve Banker's Algorithm numericals that appear in every OS exam.
Learning Outcomes โ Bloom's Taxonomy Mapped
| Bloom's Level | Learning Outcome |
|---|---|
| ๐ต Remember | List the 4 necessary conditions for deadlock and define safe/unsafe states |
| ๐ต Understand | Explain why circular wait + hold & wait cause deadlock using RAG diagrams |
| ๐ข Apply | Execute Banker's Algorithm step-by-step: compute Need matrix, find safe sequence, evaluate resource requests |
| ๐ข Analyze | Analyze Resource Allocation Graphs to detect cycles and determine if deadlock exists |
| ๐ Evaluate | Compare prevention vs avoidance vs detection strategies and recommend the best approach for given scenarios |
| ๐ Create | Implement a complete Banker's Algorithm simulator in Python and build a RAG visualizer |
Concept Explanation โ Deadlock from Scratch
1. What is Deadlock?
Plain English: Deadlock is a situation where two or more processes are stuck forever, each waiting for a resource that the other process holds. Nobody makes progress. Nobody gives up. The system is frozen.
๐ Formal Definition
A set of processes is in a deadlocked state when every process in the set is waiting for an event (resource release) that can only be caused by another process in the same set.
AnalogyFour friends sit at a round table for a Chinese dinner. There are only 4 chopsticks โ one between each pair of friends. Each person grabs the chopstick on their left. Now everyone has ONE chopstick and needs the one on their right (held by their neighbor). Nobody eats. Nobody puts their chopstick down. Deadlock.
System ExampleProcess P1 holds Printer, needs Scanner. Process P2 holds Scanner, needs Printer. Both wait forever.
2. Deadlock Characterization โ The 4 Necessary Conditions
A deadlock can arise if and only if all four conditions hold simultaneously. If even one condition is absent, deadlock cannot occur.
| # | Condition | Meaning | Indian Analogy |
|---|---|---|---|
| 1 | Mutual Exclusion | At least one resource must be non-sharable โ only one process can use it at a time | ๐ป Single-seat Indian Railways toilet โ only one person at a time. You can't share it. |
| 2 | Hold & Wait | A process holding at least one resource is waiting to acquire additional resources held by others | ๐ฑ You're holding your phone (resource 1) while waiting for your roommate to free the charger (resource 2). You won't put the phone down. |
| 3 | No Preemption | Resources cannot be forcibly taken away from a process; they must be released voluntarily | ๐ฝ๏ธ In a thali restaurant, once a customer is served a katori, the waiter can't snatch it back mid-meal โ even if someone else desperately needs it. |
| 4 | Circular Wait | A circular chain of processes exists where each is waiting for a resource held by the next in the chain | ๐ฅข The 4-chopstick dinner problem: P1โwaits for P2โwaits for P3โwaits for P4โwaits for P1. A circle. |
3. Resource Allocation Graph (RAG)
A Resource Allocation Graph is a directed graph that helps us visualize and detect deadlocks. It has two types of nodes and two types of edges.
๐ RAG Components
โข Process nodes โ drawn as circles: P1, P2, P3...
โข Resource nodes โ drawn as rectangles with dots inside (each dot = one instance)
Edgesโข Request edge (Pi โ Rj): Process Pi is waiting for resource Rj (arrow from process to resource)
โข Assignment edge (Rj โ Pi): Resource Rj is held by process Pi (arrow from resource to process)
Key Rulesโข If RAG has no cycle โ No deadlock (guaranteed)
โข If RAG has a cycle AND each resource has only 1 instance โ Deadlock exists (guaranteed)
โข If RAG has a cycle AND resources have multiple instances โ Deadlock is possible but NOT guaranteed
Worked Example: RAG with Cycle Detection
Consider 3 processes (P1, P2, P3) and 4 resources (R1 with 1 instance, R2 with 2 instances, R3 with 1 instance, R4 with 1 instance):
RAG (ASCII Art) Resource Allocation Graph ======================== Assignments (Resource โ Process): R1 โ P1 (P1 holds R1) R2 โ P2 (P2 holds one instance of R2) R2 โ P3 (P3 holds one instance of R2) R3 โ P3 (P3 holds R3) Requests (Process โ Resource): P1 โ R2 (P1 wants R2, but both instances are taken) P2 โ R3 (P2 wants R3, but P3 holds it) P3 โ R1 (P3 wants R1, but P1 holds it) Cycle Detection: P1 โ R2 โ P3 โ R1 โ P1 โ CYCLE FOUND! P2 โ R3 โ P3 โ R1 โ P1 โ R2 โ P2 โ CYCLE FOUND! Analysis: R1 has 1 instance, R3 has 1 instance โ single-instance resources in cycle โด Cycle with single-instance resources = DEADLOCK CONFIRMED
4. Deadlock Handling Strategies โ Overview
There are four approaches an OS can take:
| Strategy | Approach | When Used | Real-World Analogy |
|---|---|---|---|
| Ignorance (Ostrich) | Pretend deadlocks don't happen | Most desktop OS (Windows, Linux) | Ignoring the pothole on your street โ hope nobody falls |
| Prevention | Ensure at least one of 4 conditions never holds | Safety-critical systems | One-way streets prevent head-on collision |
| Avoidance | Dynamically check if granting a request is "safe" | Banking, real-time systems | RBI checking if a bank has enough reserves before approving a loan |
| Detection + Recovery | Allow deadlocks, detect them, then break them | Database systems, distributed systems | Traffic police clearing a jam after it forms |
5. Deadlock Prevention โ Negate Each Condition
The idea is simple: if we ensure at least one of the four necessary conditions never holds, deadlock becomes impossible.
| Condition | How to Prevent | Drawback | Indian Example |
|---|---|---|---|
| Mutual Exclusion | Make resources sharable (e.g., read-only files) | Not possible for all resources โ printers, tape drives can't be shared | Library books can be read by many (sharable), but the borrowing card is exclusive |
| Hold & Wait | Process must request ALL resources at once before starting | Low resource utilization โ process holds resources it doesn't need yet; may cause starvation | Like forcing a train passenger to book all connecting trains at once โ even if the second train is 12 hours later |
| No Preemption | If a process can't get a resource, it releases all held resources and restarts | Can cause work loss โ process might have to redo computation | If you can't get a gas cylinder refill, you give up your stove too and start cooking from scratch |
| Circular Wait | Impose a total ordering on resources; processes must request in increasing order | Programmers must know the ordering; may be hard to enforce in complex systems | In a government office, you MUST visit Window 1 โ 2 โ 3 โ 4 in order โ no skipping allowed |
6. Deadlock Avoidance โ Banker's Algorithm โญ MOST IMPORTANT
This is the most-tested topic in OS exams. You WILL get a numerical on this. Let's master it step by step.
Key Terminology
| Term | Meaning | Formula |
|---|---|---|
n | Number of processes | โ |
m | Number of resource types | โ |
Available[m] | Vector of currently available instances of each resource type | โ |
Max[n][m] | Maximum demand of each process | Declared at start |
Allocation[n][m] | Currently allocated resources to each process | โ |
Need[n][m] | Remaining resources each process might need | Need = Max โ Allocation |
Safe State vs Unsafe State
๐ก๏ธ Safe State
A state is safe if there exists at least one safe sequence โ an ordering of all processes such that each process can get its remaining needed resources from currently available resources + resources held by all preceding processes in the sequence.
Key InsightSafe state โ No deadlock (guaranteed)
Unsafe state โ Deadlock is POSSIBLE but NOT guaranteed
Deadlocked state โ Always unsafe
Safety Algorithm โ Pseudocode
Pseudocode // Safety Algorithm โ determines if system is in safe state Step 1: Initialize: Work[m] = Available[m] // copy of available resources Finish[n] = false // no process has finished yet Step 2: Find a process Pi such that: Finish[i] == false AND Need[i] <= Work // i.e., process hasn't finished AND its need can be satisfied If no such Pi exists โ go to Step 4 Step 3: Work = Work + Allocation[i] // Pi finishes, releases its resources Finish[i] = true Add Pi to safe sequence Go to Step 2 Step 4: If Finish[i] == true for ALL i โ System is in SAFE STATE Else โ System is in UNSAFE STATE
๐ WORKED EXAMPLE 1 โ Finding Safe Sequence (5 processes, 3 resource types)
Given: 5 processes (P0โP4), 3 resource types (A, B, C)
Step 1: Compute Need Matrix (Need = Max โ Allocation)
Step 2: Run Safety Algorithm
Initialize: Work = [3, 3, 2], Finish = [F, F, F, F, F]
๐ Iteration 1 โ Scan all processes
โข P0: Need[0] = [7,4,3], Work = [3,3,2] โ 7 โค 3? NO โ Skip
โข P1: Need[1] = [1,2,2], Work = [3,3,2] โ 1 โค 3? YES, 2 โค 3? YES, 2 โค 2? YES โ โ P1 can run!
โ P1 finishes. Work = Work + Allocation[1] = [3,3,2] + [2,0,0] = [5, 3, 2]
โ Finish = [F, T, F, F, F]. Safe sequence so far: <P1>
๐ Iteration 2 โ Scan unfinished processes
โข P0: Need[0] = [7,4,3], Work = [5,3,2] โ 7 โค 5? NO โ Skip
โข P2: Need[2] = [6,0,0], Work = [5,3,2] โ 6 โค 5? NO โ Skip
โข P3: Need[3] = [0,1,1], Work = [5,3,2] โ 0 โค 5? YES, 1 โค 3? YES, 1 โค 2? YES โ โ P3 can run!
โ P3 finishes. Work = [5,3,2] + [2,1,1] = [7, 4, 3]
โ Finish = [F, T, F, T, F]. Safe sequence so far: <P1, P3>
๐ Iteration 3 โ Scan unfinished processes
โข P0: Need[0] = [7,4,3], Work = [7,4,3] โ 7 โค 7? YES, 4 โค 4? YES, 3 โค 3? YES โ โ P0 can run!
โ P0 finishes. Work = [7,4,3] + [0,1,0] = [7, 5, 3]
โ Finish = [T, T, F, T, F]. Safe sequence so far: <P1, P3, P0>
๐ Iteration 4 โ Scan unfinished processes
โข P2: Need[2] = [6,0,0], Work = [7,5,3] โ 6 โค 7? YES, 0 โค 5? YES, 0 โค 3? YES โ โ P2 can run!
โ P2 finishes. Work = [7,5,3] + [3,0,2] = [10, 5, 5]
โ Finish = [T, T, T, T, F]. Safe sequence so far: <P1, P3, P0, P2>
๐ Iteration 5 โ Scan unfinished processes
โข P4: Need[4] = [4,3,1], Work = [10,5,5] โ 4 โค 10? YES, 3 โค 5? YES, 1 โค 5? YES โ โ P4 can run!
โ P4 finishes. Work = [10,5,5] + [0,0,2] = [10, 5, 7]
โ Finish = [T, T, T, T, T]. All processes finished!
โ RESULT: System is in SAFE STATE
Safe Sequence: < P1, P3, P0, P2, P4 >
This is one valid safe sequence. Other safe sequences may also exist (e.g., <P1, P3, P0, P4, P2>).
๐ WORKED EXAMPLE 2 โ Resource Request Algorithm
Question: In the above system (after finding it safe), process P1 requests (1, 0, 2). Can this request be granted?
Resource Request Algorithm โ Pseudocode
Pseudocode // Process Pi requests resources Request[i] Step 1: If Request[i] <= Need[i] โ proceed Else โ ERROR (process exceeded max claim) Step 2: If Request[i] <= Available โ proceed Else โ Pi must WAIT (resources not available) Step 3: Pretend to allocate: Available = Available โ Request[i] Allocation[i] = Allocation[i] + Request[i] Need[i] = Need[i] โ Request[i] Run Safety Algorithm on new state. If safe โ GRANT request If unsafe โ DENY request, restore old values
Solving: P1 requests (1, 0, 2)
Step 1: Check Request โค Need
Request[1] = (1, 0, 2)
Need[1] = (1, 2, 2)
Is (1, 0, 2) โค (1, 2, 2)? โ 1โค1 โ , 0โค2 โ , 2โค2 โ โ PASS
Step 2: Check Request โค Available
Request[1] = (1, 0, 2)
Available = (3, 3, 2)
Is (1, 0, 2) โค (3, 3, 2)? โ 1โค3 โ , 0โค3 โ , 2โค2 โ โ PASS
Step 3: Pretend to allocate and check safety
New Available = (3,3,2) โ (1,0,2) = (2, 3, 0)
New Allocation[1] = (2,0,0) + (1,0,2) = (3, 0, 2)
New Need[1] = (1,2,2) โ (1,0,2) = (0, 2, 0)
Now run Safety Algorithm with updated state:
Safety Check โ Iteration 1
Work = [2, 3, 0]
โข P0: Need = [7,4,3] โ 7 โค 2? NO โ Skip
โข P1: Need = [0,2,0] โ 0 โค 2? YES, 2 โค 3? YES, 0 โค 0? YES โ โ P1 runs!
โ Work = [2,3,0] + [3,0,2] = [5, 3, 2]. Finish[1] = T
Safety Check โ Iteration 2
Work = [5, 3, 2]
โข P0: Need = [7,4,3] โ 7 โค 5? NO โ Skip
โข P3: Need = [0,1,1] โ 0 โค 5? YES, 1 โค 3? YES, 1 โค 2? YES โ โ P3 runs!
โ Work = [5,3,2] + [2,1,1] = [7, 4, 3]. Finish[3] = T
Safety Check โ Iterations 3, 4, 5
Work = [7,4,3] โ P0 can run (Need [7,4,3] โค [7,4,3]) โ Work = [7,5,3] โ P2 can run โ Work = [10,5,5] โ P4 can run โ Work = [10,5,7]
All Finish = T โ SAFE!
โ RESULT: Request (1, 0, 2) by P1 can be GRANTED
Safe sequence after granting: < P1, P3, P0, P2, P4 >
The system remains in a safe state, so the OS should grant P1's request.
๐ WORKED EXAMPLE 3 โ Detecting an UNSAFE State
Given: 4 processes (P0โP3), 3 resource types (A, B, C)
Step 1: Compute Need Matrix
Step 2: Run Safety Algorithm
Initialize: Work = [1, 1, 0], Finish = [F, F, F, F]
๐ Iteration 1 โ Scan all processes
โข P0: Need = [7,4,3], Work = [1,1,0] โ 7 โค 1? NO โ Skip
โข P1: Need = [1,2,2], Work = [1,1,0] โ 1 โค 1? YES, 2 โค 1? NO โ Skip
โข P2: Need = [6,0,0], Work = [1,1,0] โ 6 โค 1? NO โ Skip
โข P3: Need = [2,1,1], Work = [1,1,0] โ 2 โค 1? NO โ Skip
โ No process can be selected! None of them have Need โค Work.
Result: Go to Step 4
Finish = [F, F, F, F] โ NOT all true.
โ โ ๏ธ System is in UNSAFE STATE
โ RESULT: System is in UNSAFE STATE
No safe sequence exists. If all processes were to request their maximum remaining resources right now, the OS cannot guarantee all will finish. Deadlock is possible (but not certain โ processes might not actually request their maximum).
7. Deadlock Detection
Instead of preventing or avoiding deadlocks, some systems allow them to happen and then detect & recover.
Wait-For Graph (WFG)
For single-instance resources, simplify the RAG by removing resource nodes. If Pi is waiting for a resource held by Pj, draw edge Pi โ Pj. A cycle in WFG = deadlock.
Wait-For Graph Original RAG: P1 โ R1 โ P2 โ R2 โ P3 โ R3 โ P1 Wait-For Graph (remove resource nodes): P1 โ P2 โ P3 โ P1 โ CYCLE = DEADLOCK!
Detection Algorithm (Multiple Instance Resources)
Works almost identically to the Safety Algorithm, but uses Request matrix instead of Need matrix (current pending requests, not maximum need).
Pseudocode Step 1: Work = Available For each Pi: if Allocation[i] โ 0, then Finish[i] = false else Finish[i] = true // process with no resources can't be deadlocked Step 2: Find Pi such that Finish[i] == false AND Request[i] <= Work If not found โ go to Step 4 Step 3: Work = Work + Allocation[i] Finish[i] = true Go to Step 2 Step 4: If any Finish[i] == false โ Pi is DEADLOCKED
When to Invoke Detection?
| Option | Frequency | Overhead | Response Time |
|---|---|---|---|
| Every resource request | Very frequent | Very high (O(m ร nยฒ) each time) | Immediate detection |
| Periodically (e.g., every 5 min) | Medium | Medium | Delayed detection |
| When CPU utilization drops below threshold | Adaptive | Low | May detect late |
8. Deadlock Recovery
Once deadlock is detected, the OS must break it. Two main approaches:
A. Process Termination
| Method | How It Works | Pros | Cons |
|---|---|---|---|
| Abort ALL deadlocked processes | Kill every process in the deadlock cycle | Guaranteed to break deadlock immediately | Expensive โ may lose lots of computation; some processes may have been running for hours |
| Abort ONE at a time | Kill one process, re-run detection; repeat if still deadlocked | Minimal damage โ only kill the minimum necessary | Slower โ must run detection algorithm after each kill; which one to kill? |
Which process to kill first? Factors to consider:
- Priority of the process (kill low-priority first)
- How long the process has been running (kill the one with least computation done)
- Resources held by the process (kill the one holding most resources to free maximum)
- How many more resources does it need (kill the one that needs a lot more)
- Is it interactive or batch? (prefer killing batch processes)
B. Resource Preemption
| Concept | Meaning |
|---|---|
| Selecting a victim | Choose which process's resources to preempt (take away) โ based on minimum cost |
| Rollback | Roll the victim process back to a safe checkpoint and restart it |
| Starvation | Same process might always be chosen as victim โ fix by adding a "rollback count" limit |
9. Starvation vs Deadlock โ Comparison
| Parameter | Deadlock | Starvation |
|---|---|---|
| Definition | Processes waiting for each other's resources in a cycle โ NONE can proceed | A process waits indefinitely because other higher-priority processes keep getting served first |
| Blocking | All processes in the cycle are blocked | Only the starving process is blocked; others proceed normally |
| Cause | Circular wait + hold & wait + no preemption + mutual exclusion | Unfair scheduling (e.g., priority scheduling without aging) |
| Resources | Involved โ processes hold resources others need | May not involve resource holding โ just scheduling unfairness |
| Solution | Prevention, avoidance, detection + recovery | Aging (gradually increase priority of waiting processes) |
| Analogy | Two trucks stuck on a single-lane bridge โ neither can move | A person waiting at a government counter where "VIP" token holders always get served first โ the person with a "general" token waits forever |
| Self-resolving? | No โ deadlock never resolves on its own | Possible โ if the higher-priority processes eventually finish |
Learn by Doing โ 3-Tier Lab Structure
๐ข Tier 1 โ GUIDED: Banker's Algorithm Simulator in Python
Complete Python Code โ Banker's Algorithm
Python def bankers_algorithm(processes, resources, allocation, max_matrix, available): """ Banker's Algorithm Implementation Returns: (is_safe, safe_sequence) """ n = len(processes) # number of processes m = len(resources) # number of resource types # Step 1: Calculate Need matrix need = [] for i in range(n): need.append([max_matrix[i][j] - allocation[i][j] for j in range(m)]) print("Need Matrix:") for i in range(n): print(f" {processes[i]}: {need[i]}") print() # Step 2: Initialize Work and Finish work = available[:] finish = [False] * n safe_sequence = [] # Step 3: Find safe sequence for _ in range(n): found = False for i in range(n): if not finish[i]: # Check if Need[i] <= Work if all(need[i][j] <= work[j] for j in range(m)): print(f" {processes[i]}: Need {need[i]} <= Work {work[j] for j in range(m)}") # Process can finish โ release its resources work = [work[j] + allocation[i][j] for j in range(m)] finish[i] = True safe_sequence.append(processes[i]) print(f" โ {processes[i]} finishes. Work = {work}") found = True break if not found: break # Step 4: Check if all finished is_safe = all(finish) return is_safe, safe_sequence # โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # TEST: Worked Example 1 from textbook # โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ processes = ["P0", "P1", "P2", "P3", "P4"] resources = ["A", "B", "C"] allocation = [ [0, 1, 0], # P0 [2, 0, 0], # P1 [3, 0, 2], # P2 [2, 1, 1], # P3 [0, 0, 2], # P4 ] max_matrix = [ [7, 5, 3], # P0 [3, 2, 2], # P1 [9, 0, 2], # P2 [2, 2, 2], # P3 [4, 3, 3], # P4 ] available = [3, 3, 2] print("โ" * 50) print(" BANKER'S ALGORITHM โ SAFETY CHECK") print("โ" * 50) print(f"Available: {available}\n") is_safe, sequence = bankers_algorithm(processes, resources, allocation, max_matrix, available) print(f"\n{'โ SAFE STATE' if is_safe else 'โ UNSAFE STATE'}") if is_safe: print(f"Safe Sequence: < {', '.join(sequence)} >")
Expected Output
๐ก Tier 2 โ SEMI-GUIDED: Add Resource Request Handling
Your Mission:
Extend the Tier 1 code to handle resource requests. Implement the resource_request() function.
Hints:
- Function signature:
resource_request(process_id, request, allocation, max_matrix, available) - Step 1: Check if
request[j] <= need[process_id][j]for all j. If not โ error - Step 2: Check if
request[j] <= available[j]for all j. If not โ process must wait - Step 3: Create COPIES of allocation, available, and need. Apply the request to the copies. Run
bankers_algorithm()on the copies. - If safe: Apply the request to the originals, print "Request GRANTED"
- If unsafe: Don't modify originals, print "Request DENIED"
๐ด Tier 3 โ OPEN CHALLENGE: RAG Visualizer with NetworkX
The Brief:
Build a Resource Allocation Graph Visualizer using Python's NetworkX and Matplotlib:
- Input: Number of processes, resources, allocation matrix, request matrix
- Output: A visual graph showing process nodes (circles), resource nodes (squares), request edges (dashed), assignment edges (solid)
- Cycle Detection: Highlight cycles in red and print "DEADLOCK DETECTED" or "NO DEADLOCK"
- Libraries:
pip install networkx matplotlib - Hint: Use
nx.DiGraph(),nx.find_cycle(), andnx.draw()
Industry Spotlight โ A Day in the Life
๐จโ๐ป Rohan Gupta, 30 โ Site Reliability Engineer (SRE) at Google India, Bangalore
Background: B.Tech from NIT Trichy. Joined Google as an SDE-1 in 2018. Moved to SRE role in 2020 after debugging a production deadlock in Google Cloud's internal task scheduler that affected 50,000+ VMs.
A Typical Day:
9:00 AM โ Review overnight alerts. Check if any distributed locks in Spanner (Google's global database) timed out โ a sign of potential deadlock.
10:30 AM โ Investigate a latency spike in a microservice. Root cause: two gRPC services holding mutexes and waiting for each other's response. Classic distributed deadlock. Fix: implement a lock ordering protocol.
1:00 PM โ Lunch at Google's Bangalore campus cafeteria. Discuss with the team about adding deadlock detection timeouts to a new service.
2:30 PM โ Write a postmortem (incident report) for last week's Bigtable deadlock. Include root cause analysis, timeline, and prevention measures. This is shared company-wide.
4:00 PM โ Design review for a new resource allocation system. Apply Banker's Algorithm concepts to ensure the system never over-commits GPU resources for ML training jobs.
5:30 PM โ Mentor a junior engineer on lock ordering and deadlock prevention patterns in Go.
| Detail | Info |
|---|---|
| Tools Used Daily | Go, Python, Borgmon (monitoring), Spanner, Bigtable, gRPC, Mutex/Locks |
| Entry Salary (SDE-1) | โน18โ25 LPA + benefits |
| Mid-Level SRE (3โ5 yrs) | โน30โ50 LPA |
| Senior SRE (7+ yrs) | โน50โ80 LPA |
| Companies Hiring SREs | Google, Microsoft, Amazon, Flipkart, Razorpay, PhonePe, Uber India, Atlassian |
Earn With It โ Freelance & Income Roadmap
๐ฐ Your Earning Path After This Chapter
Portfolio Piece: "Banker's Algorithm Simulator" โ a Python tool that takes allocation/max/available matrices as input and outputs safe sequence + handles resource requests. Host on GitHub with a clean README.
Beginner Gig Ideas:
โข Deadlock detection consulting for startups using microservices โ โน5,000โโน15,000 per audit
โข OS assignment solving for engineering students (Banker's Algorithm numericals) โ โน200โโน500 per problem
โข Database deadlock analysis for small e-commerce companies โ โน8,000โโน25,000 per project
โข Building RAG visualization tools for CS professors โ โน3,000โโน8,000 per tool
| Platform | Best For | Typical Rate |
|---|---|---|
| GitHub + LinkedIn | Showcasing OS projects, attracting recruiters | Indirect โ leads to โน6โ10 LPA jobs |
| Chegg / Course Hero | Solving OS numericals for students worldwide | $3โ$8 per question (โน250โโน650) |
| Upwork | Database deadlock analysis, system optimization | $20โ$50/hour |
| Internshala | OS/Systems internships at Indian startups | โน8,000โโน25,000/month |
| Topmate / Preplaced | 1:1 OS tutoring for placement prep | โน500โโน1,500 per session |
โฑ๏ธ Time to First Earning: 1โ2 weeks (if you complete the Banker's Algorithm lab and list on Chegg/GitHub)
MCQ Assessment Bank โ 30 Questions (Bloom's Mapped)
Remember / Identify (Q1โQ5)
Which of the following is NOT one of the four necessary conditions for deadlock?
- Mutual Exclusion
- Hold and Wait
- Aging
- Circular Wait
In a Resource Allocation Graph, a request edge goes from:
- Resource to Process
- Process to Resource
- Process to Process
- Resource to Resource
The Banker's Algorithm is used for deadlock:
- Prevention
- Detection
- Avoidance
- Recovery
In Banker's Algorithm, the Need matrix is calculated as:
- Need = Max + Allocation
- Need = Allocation โ Max
- Need = Max โ Allocation
- Need = Available โ Max
Which deadlock handling strategy does most desktop operating systems (Windows, Linux) use?
- Prevention
- Avoidance
- Detection and Recovery
- Ignore the problem (Ostrich Algorithm)
Understand / Explain (Q6โQ10)
Why does a cycle in a Resource Allocation Graph NOT always indicate deadlock when resources have multiple instances?
- Because cycles can't exist with multiple instances
- Because a process in the cycle might get resources from another available instance, breaking the wait
- Because multiple instances automatically prevent deadlock
- Because the OS ignores cycles with multiple instances
What is the relationship between safe state and deadlock?
- Safe state = no deadlock; Unsafe state = deadlock
- Safe state = no deadlock guaranteed; Unsafe state = deadlock possible but not certain
- Safe and unsafe states have no relation to deadlock
- Unsafe state = deadlock guaranteed
Why is negating Mutual Exclusion impractical for most deadlock prevention?
- Because it's too expensive
- Because most resources (printers, tape drives) are inherently non-sharable
- Because the OS doesn't support it
- Because it causes starvation
In the context of deadlock recovery, what does "rollback" mean?
- Restarting the entire operating system
- Returning a preempted process to a previously saved safe checkpoint state
- Rolling back the clock to before the deadlock
- Removing the process from memory permanently
How does a Wait-For Graph differ from a Resource Allocation Graph?
- WFG includes resource nodes; RAG doesn't
- WFG removes resource nodes and draws direct edges between processes; RAG includes resource nodes
- They are identical
- WFG is used for avoidance; RAG for detection
Apply / Solve (Q11โQ18) โ Numerical Problems
Given: Allocation = [[1,0],[0,1]], Max = [[2,1],[1,2]], Available = [0,1]. What is the Need matrix?
- [[1,1],[1,1]]
- [[3,1],[1,3]]
- [[1,0],[0,1]]
- [[2,1],[1,2]]
For the system in Q11, is the system in a safe state?
- Yes, safe sequence is <P0, P1>
- Yes, safe sequence is <P1, P0>
- No, the system is unsafe
- Cannot be determined
Given 3 processes, 2 resource types: Allocation = [[1,0],[0,1],[1,1]], Max = [[2,1],[1,2],[2,2]], Available = [1,0]. What is the Need matrix?
- [[1,1],[1,1],[1,1]]
- [[1,0],[0,1],[1,1]]
- [[3,1],[1,3],[3,3]]
- [[2,1],[1,2],[2,2]]
Using the textbook Worked Example 1 (5 processes, Available = [3,3,2]), which process is selected FIRST in the safety algorithm?
- P0
- P1
- P3
- P4
In Worked Example 1, after P1 and P3 finish, what is the Work vector?
- [5, 3, 2]
- [7, 4, 3]
- [3, 3, 2]
- [10, 5, 5]
In Worked Example 2, after P1 requests (1,0,2), the new Available vector becomes:
- [3, 3, 2]
- [2, 3, 0]
- [4, 3, 4]
- [1, 0, 2]
Given: 3 processes, 1 resource type with 5 instances. Allocation = [1, 2, 1], Max = [3, 4, 2]. Available = [1]. Can P2 finish first?
- Yes โ P2 needs 1 more, and 1 is available
- No โ P2 needs 2 more resources
- Yes โ P2 needs 0 more resources
- Cannot be determined
In a system with 4 processes and 3 resource types, the total resources are [10, 5, 7]. Current allocation sums to [7, 2, 5]. What is the Available vector?
- [10, 5, 7]
- [7, 2, 5]
- [3, 3, 2]
- [17, 7, 12]
Analyze / Compare (Q19โQ23)
A RAG has 3 processes and 3 single-instance resources. P1 holds R1 and requests R2. P2 holds R2 and requests R3. P3 holds R3 and requests R1. Is there a deadlock?
- No โ no cycle exists
- Yes โ cycle P1โR2โP2โR3โP3โR1โP1 exists with single-instance resources
- Maybe โ depends on the scheduling algorithm
- No โ because all resources have single instances
Compare: Which deadlock prevention method has the LEAST resource utilization?
- Negating Mutual Exclusion
- Negating Hold & Wait (request all at once)
- Negating No Preemption
- Negating Circular Wait
In a distributed system with 10 microservices, which deadlock handling strategy is most practical?
- Prevention โ force all services to request resources in order
- Avoidance โ run Banker's Algorithm across all services
- Detection + Recovery โ use timeouts and retry logic
- Ignore โ distributed deadlocks never happen
Which of the following is true about the Banker's Algorithm?
- It works without knowing the maximum resource needs of processes
- It requires processes to declare their maximum needs in advance
- It can only handle single-instance resources
- It detects deadlocks after they occur
What is the time complexity of the Safety Algorithm in Banker's Algorithm with n processes and m resource types?
- O(n)
- O(n ร m)
- O(nยฒ ร m)
- O(n!)
Evaluate / Judge (Q24โQ27)
A company's database experiences deadlocks 3 times per year, each causing 10 minutes of downtime. The cost of implementing deadlock prevention is โน50 lakhs + ongoing 15% performance overhead. Should they implement prevention?
- Yes โ any deadlock is unacceptable
- No โ the cost of prevention far exceeds the cost of 30 minutes/year downtime; use detection + recovery instead
- Yes โ but only the Banker's Algorithm
- No โ just ignore deadlocks entirely
In a nuclear power plant's control system, which deadlock handling strategy should be used?
- Ostrich Algorithm โ deadlocks are rare
- Detection and Recovery โ fix after it happens
- Prevention โ ensure deadlocks can NEVER occur
- Avoidance โ use Banker's Algorithm
A system uses "abort one process at a time" for deadlock recovery. Which factor should be given HIGHEST priority when selecting which process to terminate?
- The process with the shortest name
- The process holding the most resources (freeing maximum resources)
- The process that started most recently
- A random process
Banker's Algorithm has a known limitation. Which scenario makes it IMPRACTICAL?
- When the number of processes is small
- When processes know their maximum needs in advance
- When the number of processes and resources changes dynamically (cloud computing)
- When resources are single-instance
Create / Design (Q28โQ30)
You're designing an OS for an ATM network where each ATM can lock a customer's account and a central ledger simultaneously. To prevent deadlock, which strategy is BEST?
- Always lock the customer account before the central ledger (resource ordering)
- Let each ATM lock in any order and use detection
- Require ATMs to lock both simultaneously (all-or-nothing)
- Don't use locks at all
Design a deadlock-free dining philosophers solution. Which approach works?
- All philosophers pick up left chopstick first
- All philosophers pick up right chopstick first
- One philosopher picks up right first while all others pick up left first (asymmetric solution)
- All philosophers pick up both chopsticks at the same time
You need to implement a deadlock detector for a database with 10,000 concurrent transactions. Which data structure would you use for the wait-for graph?
- Array
- Adjacency list with hash map for O(1) edge lookup
- Linked list
- Binary tree
Short Answer Questions (5 Questions)
Q1: Explain the four necessary conditions for deadlock with real-world analogies. (8 marks)
Model Answer:
1. Mutual Exclusion: A resource can only be used by one process at a time. Analogy: A single-seat Indian Railways toilet โ only one passenger can use it at a time.
2. Hold & Wait: A process holds at least one resource while waiting for another. Analogy: You hold your phone while waiting for the charger โ you won't put the phone down just because the charger isn't available.
3. No Preemption: Resources can't be forcibly taken from a process. Analogy: In a thali restaurant, once a katori is served, the waiter can't snatch it back mid-meal.
4. Circular Wait: A circular chain of processes exists where each waits for a resource held by the next. Analogy: Four friends at a Chinese dinner, each holding one chopstick and waiting for the neighbor's.
Key point: ALL four must hold simultaneously for deadlock. Preventing even one breaks the possibility of deadlock.
Q2: Differentiate between safe state and unsafe state. Give an example of each. (6 marks)
Model Answer:
Safe State: A state where at least one safe sequence of process execution exists โ every process can eventually get all its needed resources and complete. Example: Available = [3,3,2], and processes' needs can be satisfied one after another in some order.
Unsafe State: A state where NO safe sequence exists โ there's no ordering of processes that guarantees all can complete. Example: Available = [1,1,0] with all processes needing more than [1,1,0] โ nobody can proceed.
Critical distinction: Safe โ No deadlock (guaranteed). Unsafe โ Deadlock is POSSIBLE but NOT certain (processes might not actually request their maximum).
Q3: Describe the Resource Request Algorithm of Banker's Algorithm. (8 marks)
Model Answer:
When process Pi makes a request Request[i]:
Step 1: Check Request[i] โค Need[i] (process hasn't exceeded its declared maximum). If violated โ Error.
Step 2: Check Request[i] โค Available (resources are actually available). If not โ Process must wait.
Step 3: Pretend to grant the request: Available -= Request[i]; Allocation[i] += Request[i]; Need[i] -= Request[i]. Run the Safety Algorithm on this hypothetical state.
Step 4: If safe โ Grant the request (make changes permanent). If unsafe โ Deny the request, restore old values, process must wait.
Q4: Compare deadlock prevention, avoidance, and detection. Which is best for database systems? (8 marks)
Model Answer:
| Aspect | Prevention | Avoidance | Detection |
|---|---|---|---|
| When | Before deadlock | Before deadlock | After deadlock |
| Method | Negate one condition | Dynamic safety check | Graph/algorithm |
| Overhead | Low runtime, restrictive | Medium (O(nยฒm) per request) | Low until detection runs |
| Info needed | Resource ordering | Max claims of all processes | Current allocations |
Best for databases: Detection + Recovery. Databases handle many short transactions; deadlocks are rare but possible. Detection via wait-for graph is cheap, and recovery (aborting younger transaction) has minimal impact since transactions can be retried.
Q5: Explain the difference between starvation and deadlock with examples. (6 marks)
Model Answer:
Deadlock: Multiple processes are stuck waiting for each other's resources in a cycle. NONE can proceed. Example: P1 holds Printer, needs Scanner. P2 holds Scanner, needs Printer. Both wait forever.
Starvation: One process waits indefinitely because others keep getting served first. It's NOT blocked by resource dependencies โ just by scheduling unfairness. Example: Priority scheduling where low-priority process never gets CPU because high-priority processes keep arriving.
Key differences: (1) Deadlock involves multiple processes; starvation can affect just one. (2) Deadlock never resolves itself; starvation might resolve if high-priority processes finish. (3) Deadlock involves resource dependencies; starvation is about scheduling fairness. (4) Fix for starvation: aging (gradually increase waiting process's priority).
Case Studies โ Real-World Deadlocks
๐ฆ Case Study 1: Database Deadlock at Flipkart During Big Billion Days
Background
During Flipkart's Big Billion Days (BBD) sale in October 2023, the platform handled 10,000+ orders per second at peak. Their MySQL database clusters ran InnoDB engine with row-level locking to ensure data consistency.
The Deadlock Scenario
Two types of transactions ran concurrently:
- Transaction A (Order Placement): Lock the
inventorytable row for the product โ update stock โ then lock theorderstable to insert the order - Transaction B (Order Cancellation): Lock the
orderstable row for the order โ update status โ then lock theinventorytable to restore stock
Deadlock: Transaction A held a lock on inventory row and waited for orders row. Transaction B held a lock on orders row and waited for inventory row. Classic circular wait!
Impact
Over 2,000 orders got stuck for 47 minutes. Estimated revenue loss: โน2.3 crore. Customer complaints spiked 300% on social media.
Solution Implemented
- Detection: Enabled MySQL's
innodb_deadlock_detect = ONwithinnodb_lock_wait_timeout = 5seconds - Prevention: Changed both transactions to always lock tables in the same order:
inventoryfirst, thenorders(resource ordering = circular wait prevention) - Recovery: The aborted transaction is automatically retried with exponential backoff
Questions for Discussion
- Which of the 4 necessary conditions was violated by the fix? (Answer: Circular Wait)
- Why was detection + timeout chosen over Banker's Algorithm? (Answer: Database transactions are short-lived and don't declare maximum needs in advance)
- Could this deadlock have been prevented by using NoSQL instead of MySQL? Why or why not?
๐ฆ Case Study 2: RBI Core Banking System โ Deadlock Prevention Architecture
Background
The Reserve Bank of India (RBI) operates the Structured Financial Messaging System (SFMS) and the Real-Time Gross Settlement (RTGS) system, processing inter-bank transfers worth โน6 lakh crore daily. A deadlock in this system could freeze India's financial backbone.
Potential Deadlock Scenario
Consider two simultaneous fund transfers:
- Transfer 1: SBI sends โน100 crore to HDFC โ Lock SBI's account first, then lock HDFC's account
- Transfer 2: HDFC sends โน50 crore to SBI โ Lock HDFC's account first, then lock SBI's account
If both lock their source account simultaneously:
RBI's Prevention Architecture
- Global Account Ordering: All bank accounts are assigned a unique numerical ID. Transfers ALWAYS lock the account with the lower ID first. Since SBI (ID: 001) < HDFC (ID: 047), both transfers lock SBI first โ no circular wait possible.
- Queue-Based Processing: Instead of direct locking, transfers are placed in a FIFO queue. A single-threaded processor handles them sequentially โ eliminating Hold & Wait.
- Heartbeat Monitoring: If a transaction doesn't complete within 30 seconds, it's automatically aborted and re-queued (timeout-based detection).
- Redundancy: The system runs on dual active-active data centers (Mumbai and Hyderabad) โ if one deadlocks, the other takes over.
Why Banker's Algorithm Isn't Used Here
RTGS processes 500,000+ transactions daily. Banker's Algorithm requires knowing all processes' maximum needs and running a safety check for each request (O(nยฒm)). With 500K daily transactions involving 200+ banks, this would be computationally prohibitive. Resource ordering + timeout is simpler and proven reliable.
Questions for Discussion
- Which prevention strategy does the "global account ordering" implement? (Answer: Circular Wait prevention via total resource ordering)
- Why is the queue-based approach effective against Hold & Wait?
- If RBI wanted to switch to Banker's Algorithm, what information would each bank transaction need to declare in advance?
Chapter Summary
๐ง Unit 5 โ Deadlock: Key Takeaways
1. Deadlock Definition: A set of processes where each waits for a resource held by another in the set โ nobody progresses.
2. Four Necessary Conditions: ALL must hold simultaneously โ Mutual Exclusion, Hold & Wait, No Preemption, Circular Wait. Break ANY one โ no deadlock.
3. Resource Allocation Graph: Process nodes (circles), resource nodes (rectangles with dots). Request edge: PโR. Assignment edge: RโP. Cycle with single-instance resources = deadlock. Cycle with multi-instance = maybe deadlock.
4. Prevention: Negate one condition. Most practical: resource ordering (circular wait prevention).
5. Avoidance (Banker's Algorithm): Need = Max โ Allocation. Safety Algorithm finds if a safe sequence exists. Resource Request Algorithm checks if granting a request keeps system safe. O(nยฒm) complexity.
6. Detection: Wait-for graph (single instance) or detection algorithm (multi-instance). Invoked periodically or on CPU utilization drop.
7. Recovery: Process termination (abort all or one-by-one) or resource preemption (rollback + victim selection). Watch for starvation of victims.
8. Starvation โ Deadlock: Starvation = one process waits indefinitely due to unfair scheduling. Deadlock = multiple processes in a cycle, all blocked. Fix starvation with aging.
9. Real-world: Desktop OS = ignore (Ostrich). Databases = detection + recovery. Safety-critical systems = prevention. Cloud = timeout-based detection.
Earning Checkpoint โ Skills vs Earnings Tracker
| Skill Learned | Tool/Method | Portfolio Artifact | Earning Ready? |
|---|---|---|---|
| 4 Necessary Conditions | Conceptual | โ | โ Yes โ viva & interview ready |
| Resource Allocation Graphs | Pen & Paper / ASCII | RAG diagrams for 3+ scenarios | โ Yes โ tutoring & assignment help |
| Banker's Algorithm | Python simulator | Complete Python Banker's Algorithm | โ Yes โ โน200โโน500/problem on Chegg |
| Safe Sequence Finding | Step-by-step numerical | Worked examples with full steps | โ Yes โ exam prep tutoring |
| Deadlock Detection | Wait-For Graph, Algorithm | Detection algorithm implementation | โ Yes โ database consulting |
| Prevention Strategies | Resource ordering design | Case study analysis documents | โ Yes โ system design interviews |
| RAG Visualizer | Python + NetworkX | GitHub project with README | โ Yes โ portfolio for SRE/Systems jobs |
โ Unit 5 complete. Numerical problems: 5+. MCQs: 30. Ready for Unit 6!
[QR: Link to EduArtha video tutorial โ Deadlock & Banker's Algorithm]