Computer Organization & Architecture

Unit 5: Input-Output Organization

From peripheral devices to DMA controllers — master how computers communicate with the outside world, handle interrupts, and transfer data at blazing speed.

⏱️ 5 hrs theory + 3 hrs lab | 🎯 GATE ~2 marks | 🖥️ Aadhaar Biometric I/O

💼 Jobs this unlocks: Embedded Systems Engineer (₹5–10 LPA) | Hardware Design Engineer (₹6–12 LPA) | IoT Developer (₹4–8 LPA)

Section A

Opening Hook — The Fingerprint That Feeds 80 Crore Indians

🖐️ How Aadhaar's Fingerprint Scanner Bypasses the CPU

Walk into any Indian ration shop, place your thumb on the biometric scanner, and within 2 seconds your identity is verified against a database of 1.4 billion records. But here's the engineering marvel most people miss: when that fingerprint scanner captures your print, the image data doesn't pass through the CPU.

The biometric device uses Direct Memory Access (DMA): a hardware technique where the scanner's data is written directly into memory without the CPU copying each byte. Why? Because the CPU is busy running the operating system, managing the display, and handling network packets. If the CPU had to personally move every byte of fingerprint image data, verification would be noticeably slower.

This isn't just theory — it's the I/O architecture that powers India's largest digital identity system. The same DMA principle is used in your phone (camera sensor → memory), your laptop (SSD → RAM), and every ATM you've ever used. This chapter teaches you exactly how all of this works.

🇮🇳 UIDAI (Aadhaar) | 🇮🇳 Texas Instruments | 🇮🇳 ISRO | 🇮🇳 DRDO | 🇮🇳 Qualcomm India | 🇮🇳 Intel India

India's Aadhaar system is the world's largest biometric database. It processes over 100 million authentication requests per day. Each biometric scanner performs I/O operations using DMA, interrupt-driven transfers, and serial communication (UART) — the exact three techniques you'll learn in this chapter. The entire authentication pipeline completes in under 200 milliseconds.

Section B

Learning Outcomes — Bloom's Taxonomy Mapped

Bloom's Level	Learning Outcome
🔵 Remember	List the three modes of data transfer (Programmed, Interrupt-driven, DMA) and define each
🔵 Remember	Recall the difference between Memory-mapped I/O and Isolated (I/O-mapped) I/O
🟢 Understand	Explain how DMA transfers data without CPU intervention and describe cycle stealing
🟢 Understand	Describe the handshaking protocol in asynchronous data transfer with timing diagrams
🟡 Apply	Calculate DMA transfer rates, bus bandwidth, and interrupt latency for given configurations
🟡 Apply	Draw the UART frame format for a given character with correct start, data, parity, and stop bits
🟠 Analyze	Compare daisy chain vs parallel priority interrupt structures with trade-offs
🟠 Analyze	Analyze why certain I/O devices (keyboard vs disk) use different transfer modes
🔴 Evaluate	Evaluate which data transfer mode is optimal for a given real-world I/O scenario (sensor, camera, network card)
🔴 Evaluate	Assess the performance impact of DMA burst mode vs cycle stealing on CPU utilization
🟣 Create	Design a priority interrupt system for a given set of devices with different priority levels
🟣 Create	Design a complete I/O interface block diagram for an embedded system with multiple peripherals

Section C

Concept Explanation — I/O Organization from Scratch

1. Peripheral Devices

A computer processor by itself is just a number-crunching machine. It becomes useful only when it can communicate with the outside world — keyboards, monitors, printers, scanners, network cards, and sensors. These external devices are called peripheral devices (or simply peripherals).

🖥️ Classification of Peripheral Devices

Input Devices: Send data INTO the computer — Keyboard, mouse, scanner, microphone, fingerprint sensor, barcode reader, webcam

Output Devices: Receive data FROM the computer — Monitor, printer, speaker, LED display, motor controller

Input/Output (I/O) Devices: Both send and receive — Hard disk, SSD, USB flash drive, network card (NIC), touchscreen, modem

Every UPI payment terminal in India uses multiple peripherals simultaneously: a touchscreen (input/output), a QR scanner (input), a thermal receipt printer (output), and a network module (I/O). The I/O organization determines how all these devices share the processor and memory without conflicts.

The fundamental challenge: peripherals operate at vastly different speeds. A keyboard generates ~10 bytes/second, a mouse ~100 bytes/second, but an NVMe SSD can deliver 7 GB/second. The CPU runs at GHz speeds. How do you connect a snail to a bullet train? That's what I/O organization solves.

Speed Hierarchy
  Device             Speed              Analogy
  ─────────────────  ─────────────────  ──────────────────
  Keyboard           ~10 B/s            🚶 Walking
  Mouse              ~100 B/s           🚲 Cycling
  Printer            ~100 KB/s          🚗 Car
  USB 2.0            ~60 MB/s           🚆 Train
  Gigabit Ethernet   ~125 MB/s          ✈️ Airplane
  SATA SSD           ~600 MB/s          🚀 Rocket
  NVMe SSD           ~7 GB/s            ⚡ Lightning
  CPU ↔ RAM          ~50 GB/s           💫 Speed of Light
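To get a feel for the mismatch, the rough sketch below converts the rates in the table into CPU clock cycles per byte. The 3 GHz clock is an assumed figure chosen only for illustration, and the function name is hypothetical.

```python
# Rough illustration: CPU cycles that elapse while a device delivers one byte.
# CPU_HZ is an assumed clock rate, not a figure taken from the table.
CPU_HZ = 3_000_000_000  # assumed 3 GHz CPU

# Approximate rates from the speed hierarchy table, in bytes per second.
DEVICE_RATES = {
    "Keyboard": 10,
    "Mouse": 100,
    "Printer": 100_000,
    "USB 2.0": 60_000_000,
    "Gigabit Ethernet": 125_000_000,
    "SATA SSD": 600_000_000,
    "NVMe SSD": 7_000_000_000,
}


def cycles_per_byte(rate_bytes_per_sec, cpu_hz=CPU_HZ):
    """CPU clock cycles that pass while the device delivers one byte."""
    return cpu_hz / rate_bytes_per_sec


for name, rate in DEVICE_RATES.items():
    print(f"{name:<18} {cycles_per_byte(rate):>16,.2f} cycles per byte")
```

Under this assumption the keyboard leaves hundreds of millions of cycles between bytes, so waiting on it in a loop wastes almost all of them, while the NVMe SSD delivers more than one byte per cycle, so the CPU could not keep up by copying byte by byte.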

2. I/O Interface

Peripherals can't talk directly to the CPU — they speak different "languages" (voltage levels, data formats, speeds). An I/O Interface acts as a translator that sits between the CPU/memory bus and the peripheral device.

Analogy: Think of a post office in a small town. The CPU is the District Collector (DC) who writes orders. The I/O Interface is the post office that receives the DC's order, translates it into the right format (Hindi to local language), and delivers it to the right person (device). The post office also handles incoming letters (data from devices) and routes them to the DC.

🔌 CPU → I/O Interface → Device Architecture

  ┌──────────┐                                     ┌──────────────┐
  │          │    Address Bus                       │              │
  │          │══════════════════╗                   │   Peripheral │
  │   CPU    │    Data Bus      ║   ┌────────────┐ │    Device    │
  │          │══════════════════╬══▶│  I/O       │ │  (Printer,   │
  │          │    Control Bus   ║   │ Interface  │═══▶ Scanner,   │
  │          │══════════════════╝   │ (Port/     │ │   Disk...)   │
  │          │                      │  Controller│ │              │
  └──────────┘                      └────────────┘ └──────────────┘
       │                                 │
       │         ┌──────────┐            │
       └════════▶│  Memory  │◀═══════════┘
                 │  (RAM)   │
                 └──────────┘

  Functions of I/O Interface:
  ├── Data Format Conversion (serial ↔ parallel)
  ├── Speed Matching (fast CPU ↔ slow device via buffers)
  ├── Device Selection (address decoding)
  ├── Status Monitoring (ready / busy / error flags)
  └── Control Signal Generation (read, write, strobe)

Memory-Mapped I/O vs Isolated I/O

There are two ways the CPU can address I/O devices:

Memory-Mapped I/O
  ┌──────────────────────────────────────┐
  │     Single Address Space             │
  │  ┌─────────────────────────────┐     │
  │  │  0x0000 ─── RAM             │     │
  │  │  ...                        │     │
  │  │  0x7FFF ─── RAM (ends)      │     │
  │  │  0x8000 ─── Keyboard Port   │ ◀── I/O addresses are
  │  │  0x8004 ─── Display Port    │     part of memory map
  │  │  0x8008 ─── Printer Port    │     │
  │  └─────────────────────────────┘     │
  │  CPU uses: MOV, LOAD, STORE          │
  │  Same instructions for I/O & memory  │
  └──────────────────────────────────────┘

Isolated I/O (I/O-Mapped)
  ┌────────────────────┐   ┌────────────────────┐
  │  Memory Space      │   │  I/O Space          │
  │  0x0000 ─── RAM    │   │  Port 0 ─ Keyboard  │
  │  ...               │   │  Port 1 ─ Display   │
  │  0xFFFF ─── RAM    │   │  Port 2 ─ Printer   │
  │                    │   │                     │
  │  SEPARATE spaces   │   │  SEPARATE spaces    │
  └────────────────────┘   └─────────────────────┘
  CPU uses: IN / OUT (special I/O instructions)
  IO/M̄ control line distinguishes the two spaces

Feature	Memory-Mapped I/O	Isolated I/O (I/O-Mapped)
Address Space	Shared with memory	Separate I/O address space
Instructions	MOV, ADD, any ALU instruction	Special IN/OUT instructions only
Control Line	No special I/O line needed	IO/M̄ line needed
Address Bits	Reduces available memory addresses	Full memory space preserved
Flexibility	Can use any instruction on I/O	Limited to IN/OUT
Hardware	Simpler	Needs extra decoder for I/O space
Used By	ARM (Raspberry Pi), MIPS, RISC-V	Intel x86 (port I/O via IN/OUT; x86 also supports memory-mapped I/O)
Indian Example	ARM-based Aadhaar biometric devices	Old railway booking terminals (x86)

Students confuse "Memory-Mapped I/O" with "DMA." Memory-mapped I/O is about addressing — how the CPU refers to an I/O port (as a memory address). DMA is about data transfer — moving data without CPU intervention. They are completely different concepts that can coexist in the same system.
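The two addressing schemes can be sketched as a toy model. The addresses and port numbers follow the diagrams above; the class and method names are hypothetical and do not correspond to any real hardware API.

```python
# Toy model of the two addressing schemes (illustrative names only).

class MemoryMappedBus:
    """One address space: 0x0000-0x7FFF is RAM, 0x8000 and above are device ports."""
    IO_BASE = 0x8000

    def __init__(self):
        self.ram = [0] * self.IO_BASE
        self.ports = {0x8000: "keyboard", 0x8004: "display", 0x8008: "printer"}
        self.port_data = {addr: 0 for addr in self.ports}

    def store(self, addr, value):
        # The same STORE reaches RAM or a device; only address decoding differs.
        if addr >= self.IO_BASE:
            self.port_data[addr] = value
            return f"{self.ports[addr]} port"
        self.ram[addr] = value
        return "RAM"


class IsolatedBus:
    """Two address spaces, selected by the instruction used (the IO/M line)."""

    def __init__(self):
        self.ram = [0] * 0x10000          # the full 64 K stays available for memory
        self.ports = {0: "keyboard", 1: "display", 2: "printer"}
        self.port_data = {port: 0 for port in self.ports}

    def store(self, addr, value):         # ordinary memory write (IO/M selects memory)
        self.ram[addr] = value
        return "RAM"

    def out(self, port, value):           # OUT instruction (IO/M selects I/O)
        self.port_data[port] = value
        return f"{self.ports[port]} port"


mm = MemoryMappedBus()
print(mm.store(0x0100, 0x4D))   # lands in RAM
print(mm.store(0x8004, 0x4D))   # same instruction, lands in the display port

iso = IsolatedBus()
print(iso.store(0x0001, 0x4D))  # memory address 1
print(iso.out(1, 0x4D))         # I/O port 1, a different space
```

Note that neither class moves data on its own: both describe only how the CPU names a port, which is why addressing and DMA are independent choices.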

3. Data Transfer Modes — How Data Moves Between CPU and Peripherals

There are three fundamental ways data can move between I/O devices and the CPU/memory. Think of them as three levels of postal service:

📦 The Three Modes of Data Transfer

Mode 1 — Programmed I/O (Polling): The CPU keeps asking the device "Are you ready?" in a loop. Like standing at the door waiting for a courier — you do nothing else until the parcel arrives.

Mode 2 — Interrupt-Driven I/O: The device sends a signal (interrupt) when it's ready. Like giving your phone number to the courier — "Call me when you arrive, I'll be working on other things."

Mode 3 — Direct Memory Access (DMA): A special hardware controller transfers data directly between device and memory, without the CPU handling each word. Like hiring a personal assistant to receive all parcels and put them in the right room; you only brief the assistant beforehand and hear back once everything has been delivered.

Programmed I/O (Polling)
  CPU                          Device
   │                             │
   │──── Read Status ──────────▶ │
   │◀─── Status: NOT READY ──── │
   │                             │
   │──── Read Status ──────────▶ │    ← CPU stuck in
   │◀─── Status: NOT READY ──── │      busy-wait loop
   │                             │      (wasting time!)
   │──── Read Status ──────────▶ │
   │◀─── Status: READY ──────── │    ← Finally ready!
   │                             │
   │──── Read Data ────────────▶ │
   │◀─── Data: 0x4D ─────────── │    ← Data transferred
   │                             │
   ▼                             ▼

Interrupt-Driven I/O
  CPU                          Device
   │                             │
   │ ← doing other work...       │ ← preparing data...
   │    (executing programs)     │
   │                             │
   │◀════ INTERRUPT! ═══════════ │  ← Device signals CPU
   │                             │
   │──── Save context ────┐     │
   │     (push registers) │     │
   │                      ▼     │
   │──── Read Data ────────────▶ │
   │◀─── Data: 0x4D ─────────── │  ← Data transferred
   │                             │
   │──── Restore context ──┐    │
   │     (pop registers)   │    │
   │◀──────────────────────┘    │
   │ ← resume original work     │
   ▼                             ▼

Indian Analogy — Three Types of Bank Transfers:
Programmed I/O = Going to the bank counter — you stand in line, wait your turn, hand over the cheque, wait for processing, collect receipt. You're stuck at the bank the whole time.
Interrupt I/O = Getting an emergency call during a meeting — you're busy working, the bank calls "Your cheque is cleared!", you briefly handle it and go back to your meeting.
DMA = NEFT/RTGS transfer — you set it up once, then the bank transfers money directly from account to account without you being involved at all. You don't even need to be awake!

Feature	Programmed I/O	Interrupt-Driven I/O	DMA
CPU Involvement	100% — CPU stuck in loop	Partial — CPU handles interrupt	Minimal — CPU only initiates
CPU Efficiency	❌ Very poor (busy waiting)	✅ Good	✅✅ Excellent
Speed	Slow	Medium	Fast
Hardware Cost	💰 Cheapest	💰💰 Moderate	💰💰💰 Expensive (DMA controller)
Data Path	Device → CPU → Memory	Device → CPU → Memory	Device → Memory (bypasses CPU)
Best For	Slow devices (keyboard)	Medium devices (mouse, UART)	Fast devices (disk, camera, NIC)
GATE Favourite	Comparison questions	ISR, vector table Qs	DMA transfer time numericals

GATE Exam Shortcut: If a question asks "which mode is most efficient for high-speed devices?", the expected answer is DMA. If it asks "which mode wastes CPU cycles?", the answer is Programmed I/O. These typically appear as 1-mark questions.
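The difference in CPU efficiency between the first two modes can be illustrated with a toy cycle count. This is a sketch under stated assumptions: the `Device` class, the `ready_at` value, and the 20-cycle `isr_overhead` are all hypothetical numbers picked for the example.

```python
# Toy cycle-level comparison of programmed I/O and interrupt-driven I/O.
# Device, ready_at and isr_overhead are illustrative assumptions.

class Device:
    """A device whose data becomes available at a fixed cycle number."""

    def __init__(self, ready_at):
        self.ready_at = ready_at

    def is_ready(self, cycle):
        return cycle >= self.ready_at


def programmed_io(device):
    """CPU re-reads the status register every cycle until READY."""
    cycle = 0
    busy_wait = 0
    while not device.is_ready(cycle):
        busy_wait += 1          # a status read that returned NOT READY
        cycle += 1
    return {"useful_cycles": 0, "busy_wait_cycles": busy_wait}


def interrupt_driven_io(device, isr_overhead=20):
    """CPU runs other code until the interrupt, then pays save/restore cost."""
    cycle = 0
    useful = 0
    while not device.is_ready(cycle):
        useful += 1             # an instruction of the interrupted program
        cycle += 1
    return {"useful_cycles": useful, "overhead_cycles": isr_overhead}


keyboard = Device(ready_at=1000)
print("Programmed I/O  :", programmed_io(keyboard))
print("Interrupt-driven:", interrupt_driven_io(keyboard))
```

In both cases the data arrives at the same moment; what changes is whether the cycles before it arrives are spent on useful work or on repeated status reads.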

4. Direct Memory Access (DMA) — The Star of Unit 5

DMA is the most important topic in this unit. It appears in GATE every 2–3 years, and understanding it deeply will help you crack both exams and interviews.

How DMA Works — Step by Step

⚡ DMA Controller Block Diagram

                    ┌─────────────────────────────────────────┐
                    │          DMA CONTROLLER                 │
                    │                                         │
  CPU ◀──────────── │  ┌──────────────┐  ┌──────────────┐    │
   │   Bus Request  │  │ Address Reg  │  │  Word Count  │    │
   │   (BR/HRQ)     │  │  (AR)        │  │  Register    │    │
   │                │  │ Starting     │  │  (WC)        │    │
   │ ──────────────▶│  │ address in   │  │  Number of   │    │
   │   Bus Grant    │  │ memory       │  │  words to    │    │
   │   (BG/HLDA)    │  └──────┬───────┘  │  transfer    │    │
   │                │         │          └──────┬───────┘    │
   │                │  ┌──────┴───────┐         │            │
  ═══ System Bus ═══╬══│ Control      │◀────────┘            │
   │                │  │ Logic        │                      │
   │                │  │ (R/W̄, DMA   │  ┌──────────────┐    │
   │                │  │  Request,    │  │  Data Reg    │    │──▶ I/O Device
   │                │  │  DMA Ack)    │──│  (DR)        │    │
   │                │  └──────────────┘  │  Buffer for  │    │
   │                │                    │  data byte   │    │
   │                │                    └──────────────┘    │
  ┌─┐               └─────────────────────────────────────────┘
  │M│
  │E│   DMA Registers:
  │M│   ├── AR  = Starting memory address
  │O│   ├── WC  = Word count (decrements each transfer)
  │R│   ├── DR  = Data register (holds one word in transit)
  │Y│   └── Control = Direction (Read/Write), DMA mode
  └─┘

DMA Transfer Sequence

DMA Handshake
  Step   CPU                    DMA Controller              Device
  ────   ────────────────────   ─────────────────────────   ──────────
   1     CPU initializes DMA:
         → Loads AR (start addr)
         → Loads WC (word count)
         → Sets direction (R/W)
         → Enables DMA

   2     CPU continues its        DMA waits for device       Device
         normal work...           request...                 prepares data

   3                               Device sends DRQ ◀──────── DRQ
                                   (DMA Request)

   4                               DMA sends BR ────────────▶ CPU
                                   (Bus Request / HRQ)

   5     CPU finishes current
         bus cycle, then sends
         BG (Bus Grant / HLDA) ──▶ DMA takes over bus

   6     CPU floats its                                      
         address/data lines        DMA puts address on bus
         (tri-state)               DMA transfers 1 word:
                                   Device ◀──▶ Memory

   7                               WC = WC - 1
                                   AR = AR + 1
                                   If WC ≠ 0: repeat from 3
                                   If WC = 0: send interrupt
                                   to CPU (transfer complete!)

   8     CPU gets interrupt,
         resumes normal bus
         ownership

DMA Transfer Modes

Mode	How It Works	CPU Impact	Use Case
Burst Mode	DMA holds the bus for entire block transfer	CPU blocked for duration of transfer	Disk read (large sequential block)
Cycle Stealing	DMA takes bus for 1 word, returns it, then takes again	CPU slowed down slightly, but not blocked	Network card, audio streaming
Transparent / Interleaved	DMA uses bus only when CPU is not using it	Zero CPU impact (uses idle bus cycles)	Refresh, background transfers

Cycle Stealing — Timing Diagram
  Bus Ownership Timeline:
  
  Time →  T1   T2   T3   T4   T5   T6   T7   T8   T9   T10  T11  T12
         ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
  Owner: │CPU │CPU │DMA │CPU │CPU │DMA │CPU │CPU │DMA │CPU │CPU │CPU │
         │    │    │ ↑  │    │    │ ↑  │    │    │ ↑  │    │    │    │
         └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
                    │              │              │
                   steal          steal          steal
                  1 word         1 word         1 word

  DMA "steals" one bus cycle, transfers one word,
  then gives bus back. CPU barely notices the slowdown.
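The timeline above can be reproduced with a short sketch that also contrasts it with burst mode. The function names and the choice of "every third slot" are assumptions taken from the diagram, not properties of any particular controller.

```python
# Sketch of bus ownership per time slot for cycle stealing and burst mode.

def cycle_stealing_timeline(total_slots, steal_every, words):
    """DMA takes every `steal_every`-th slot until `words` words have moved."""
    owners = []
    remaining = words
    for slot in range(1, total_slots + 1):
        if remaining > 0 and slot % steal_every == 0:
            owners.append("DMA")
            remaining -= 1
        else:
            owners.append("CPU")
    return owners


def burst_timeline(total_slots, start_slot, words):
    """DMA holds the bus for `words` consecutive slots starting at `start_slot`."""
    return ["DMA" if start_slot <= slot < start_slot + words else "CPU"
            for slot in range(1, total_slots + 1)]


def longest_cpu_stall(owners):
    """Longest run of consecutive slots in which the CPU is kept off the bus."""
    longest = run = 0
    for owner in owners:
        run = run + 1 if owner == "DMA" else 0
        longest = max(longest, run)
    return longest


stealing = cycle_stealing_timeline(total_slots=12, steal_every=3, words=3)
burst = burst_timeline(total_slots=12, start_slot=3, words=3)
print("Cycle stealing:", " ".join(stealing), "| longest stall:", longest_cpu_stall(stealing))
print("Burst         :", " ".join(burst), "| longest stall:", longest_cpu_stall(burst))
```

Both modes use the same number of bus slots in total; the difference is that cycle stealing never keeps the CPU off the bus for more than one slot at a time.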

Your laptop uses DMA right now! When you copy a file from USB to SSD, the DMA controller moves gigabytes of data directly between the USB controller and SSD controller through memory, leaving your CPU free to play YouTube at the same time. Without DMA, the CPU would have to copy every word of a 4 GB file itself, leaving little time for anything else.

Students think DMA completely eliminates CPU involvement. Not true! The CPU must initialize the DMA controller (set AR, WC, direction) and respond to the completion interrupt. DMA eliminates CPU involvement only during the actual data transfer. Think of it as delegating — the boss (CPU) assigns the task and checks the result, but doesn't do the grunt work.

5. Input-Output Processor (IOP)

While DMA handles simple block transfers, complex I/O operations need a dedicated processor — the I/O Processor (IOP), also called an I/O Channel.

Analogy: DMA is like a delivery boy who can carry packages from A to B. An IOP is like a logistics manager who can plan routes, handle returns, repackage items, manage multiple deliveries simultaneously, and make decisions without calling the CEO (CPU).

🔄 IOP Architecture

  ┌──────────┐         ┌──────────────────┐
  │          │  CMD    │    I/O Processor  │
  │   CPU    │────────▶│    (Channel)      │
  │ (Main    │         │                   │
  │  Proc.)  │         │  ┌─────────────┐  │       ┌──────────┐
  │          │◀────────│  │ Channel Cmd │  │──────▶│ Device 1 │
  │          │  STATUS │  │ Word (CCW)  │  │       └──────────┘
  └──────────┘         │  └─────────────┘  │       ┌──────────┐
       │               │                   │──────▶│ Device 2 │
       │               │  Own registers,   │       └──────────┘
  ┌────▼─────┐         │  can execute      │       ┌──────────┐
  │  Main    │◀═══════▶│  channel programs │──────▶│ Device 3 │
  │  Memory  │         │                   │       └──────────┘
  └──────────┘         └──────────────────┘

Channel Command Word (CCW): The CPU sends a high-level command to the IOP (e.g., "Read 1000 records from disk into memory starting at address 0x5000"). The IOP breaks this into individual I/O operations, executes them independently, and notifies the CPU only when the entire operation is complete (or if an error occurs).

Types of I/O Channels

Channel Type	Description	Use Case
Multiplexer Channel	Handles multiple slow/medium devices simultaneously by interleaving bytes	Multiple printers, card readers, terminals
Selector Channel	Handles one high-speed device at a time; dedicated until transfer completes	Magnetic tape, disk drive
Block Multiplexer Channel	Combines both: handles multiple high-speed devices by interleaving blocks	Multiple disks (mainframe environments)

India's banking core systems (Infosys Finacle, TCS BaNCS) run on IBM mainframes that use I/O channels extensively. When SBI processes millions of transactions daily, the IOP manages simultaneous disk reads/writes, printer operations, and network I/O — all without burdening the main CPU.

6. Priority Interrupt System

When multiple devices request CPU attention simultaneously, who goes first? The priority interrupt system decides. It's like a hospital emergency room — a heart attack patient gets treated before someone with a headache, even if the headache patient arrived first.

Daisy Chain Priority

Daisy Chain Priority Interrupt
                                    Priority: Highest ──────────▶ Lowest
  ┌──────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
  │      │ PI  │ Device 0 │ PO  │ Device 1 │ PO  │ Device 2 │
  │      │────▶│ (Highest │────▶│ (Medium  │────▶│ (Lowest  │
  │ CPU  │     │ Priority)│     │ Priority)│     │ Priority)│
  │      │     │          │     │          │     │          │
  │      │◀════╪══════════╪═════╪══════════╪═════╪══════════╪═══ INT Line
  │      │     │          │     │          │     │          │    (common
  └──────┘     └──────────┘     └──────────┘     └──────────┘    wired-OR)

  INT = Interrupt Request (active when ANY device requests)
  PI  = Priority In (from CPU or previous device)
  PO  = Priority Out (to next device)

  Rule: A device can respond to the acknowledge signal
        ONLY if its PI input is active (1).
        If it has an interrupt pending, it blocks PO (sends 0).
        Otherwise, it passes PI through as PO.

  Example: If Device 1 and Device 2 both request simultaneously,
           Device 1 gets serviced first (it blocks PO to Device 2).
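The PI/PO rule can be expressed as a short sketch that walks the chain from the CPU outward. The function name and return shape are hypothetical; the logic follows the rule stated in the diagram.

```python
# Sketch of acknowledge propagation through a daisy chain.

def daisy_chain_acknowledge(requests):
    """requests[i] is True if device i has an interrupt pending.

    Returns (serviced_device_or_None, list_of_PO_values).
    Device 0 is closest to the CPU and therefore has the highest priority.
    """
    pi = 1                      # the CPU drives PI of the first device
    serviced = None
    po_values = []
    for device, pending in enumerate(requests):
        if pi == 1 and pending:
            serviced = device   # this device takes the acknowledge
            po = 0              # and blocks everything further down the chain
        else:
            po = pi             # no request (or PI already 0): pass PI through
        po_values.append(po)
        pi = po                 # PO of this device is PI of the next
    return serviced, po_values


# Devices 1 and 2 request at the same time, as in the example above.
print(daisy_chain_acknowledge([False, True, True]))
```

Once a device forces PO to 0, every later device sees PI = 0 and cannot respond, even if its own request is pending.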

Parallel Priority Interrupt (Hardware)

Parallel Priority — Using Priority Encoder
  ┌──────────┐     ┌───────────────────────┐     ┌──────┐
  │ Device 0 │────▶│                       │     │      │
  │ Device 1 │────▶│   Priority Encoder    │────▶│ CPU  │
  │ Device 2 │────▶│   (Combinational      │     │      │
  │ Device 3 │────▶│    Logic Circuit)     │     │ Gets │
  │ Device 4 │────▶│                       │────▶│vector│
  │ Device 5 │────▶│  Outputs: highest     │     │number│
  │ Device 6 │────▶│  priority device ID   │     │      │
  │ Device 7 │────▶│  as binary code       │     │      │
  └──────────┘     └───────────────────────┘     └──────┘

  8 devices → 3-bit output (2³ = 8 combinations)
  ISR = Interrupt Service Routine (fetched from vector table)

  Advantage: Fastest — all devices checked in ONE clock cycle
  Disadvantage: More hardware (encoder, mask register, etc.)
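A priority encoder with a mask register can be modelled in a few lines. This is a behavioural sketch, not a hardware description: it assumes Device 0 has the highest priority and that a mask bit of 1 enables the corresponding request; real encoders may use the opposite conventions.

```python
# Behavioural sketch of an 8-input priority encoder with an interrupt mask.
# Assumptions: bit i = Device i, Device 0 = highest priority, mask bit 1 = enabled.

def priority_encoder(requests, mask=0xFF):
    """Return (int_asserted, vector) for 8-bit `requests` and `mask` values."""
    active = requests & mask & 0xFF
    for device in range(8):
        if active & (1 << device):
            return True, device      # 3-bit vector of the winning device
    return False, None               # no enabled request: INT stays low


print(priority_encoder(0b00100100))                   # devices 2 and 5 request
print(priority_encoder(0b00100100, mask=0b11111011))  # device 2 masked off
print(priority_encoder(0b00000000))                   # nothing pending
```

The loop is only a software stand-in; in hardware all eight inputs are evaluated at once by combinational logic.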

Software Poll

Instead of hardware priority, the CPU checks each device's status register in software, in a fixed order. The first device found with an interrupt pending gets serviced.

Assembly-style Pseudocode
; Software Polling Routine
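POLL:   IN    STATUS_DEV0     ; Read Device 0 status (checked first = highest priority)
        BNZ   ISR_DEV0        ; If interrupt pending, jump to its ISR
        IN    STATUS_DEV1     ; Read Device 1 status
        BNZ   ISR_DEV1        ; Reached only if Device 0 had nothing pending
        IN    STATUS_DEV2     ; Read Device 2 status
        BNZ   ISR_DEV2        ; Reached only if Devices 0 and 1 had nothing pending
        IN    STATUS_DEV3     ; Read Device 3 status (checked last = lowest priority)
        BNZ   ISR_DEV3
        JMP   POLL            ; No device pending, keep polling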

Method	Speed	Hardware Cost	Flexibility	Best For
Daisy Chain	Medium	Low (just PI/PO wires)	Fixed priority	Small systems, microcontrollers
Parallel Priority	Fast (1 cycle)	High (encoder, mask reg)	Programmable via mask	High-performance CPUs, servers
Software Poll	Slow (sequential)	Minimal (no extra HW)	Fully flexible in software	Simple embedded systems

GATE often asks: "In a daisy chain with 4 devices, if device 2 and device 3 raise interrupts simultaneously, which one gets serviced first?" Answer: Device 2 (closer to CPU = higher priority). The key insight: priority is determined by physical position in the chain.

7. Asynchronous Data Transfer — Handshaking

In synchronous transfer, both sender and receiver are controlled by the same clock. But what if they run at different speeds? Asynchronous transfer uses a handshaking protocol — a conversation of control signals that says "I'm sending" / "I received it" / "Send the next one."

Analogy: Think of handing a heavy box to someone. You don't just throw it — you say "Ready?", they say "Ready!", you hand it over, they say "Got it!", then you pick up the next box. That's handshaking.

🤝 Source-Initiated Strobe Handshaking

  Source (Sender)                     Destination (Receiver)
       │                                     │
       │── places data on bus ──────────────▶│
       │                                     │
       │── sends STROBE pulse (↓) ─────────▶│
       │                                     │── reads data from bus
       │                                     │── sends ACK (↑)
       │◀── ACK received ───────────────────│
       │── removes data from bus             │
       │                                     │── removes ACK (↓)
       │                                     │
  
  Timing:
       Data ─────┬═══════════════════┬─────────────
                 │   Valid Data      │
  STROBE ────────┘                   └──────────────
                                          │
  ACK ───────────────────────────────┐    │
                                     └────┘

🤝 Full Handshaking (4-Phase)

  Phase  Source                      Destination         Signals
  ─────  ──────────────────────      ──────────────      ─────────
    1    Places data on bus          —                   DATA valid
         Sets DATA_VALID = 1        —                   DV ↑

    2    —                           Reads data          
         —                           Sets DATA_ACK = 1   DA ↑

    3    Sees ACK, removes data     —                   DV ↓
         Sets DATA_VALID = 0        —

    4    —                           Sees DV↓            DA ↓
         —                           Sets DATA_ACK = 0

  Timing Diagram:
  DATA       ─────┬═══════════════════┬────────────────
                  ┌───────────────────┐
  DATA_VALID ─────┘                   └────────────────
                         ┌───────────────────┐
  DATA_ACK   ────────────┘                   └─────────
  Phase           1      2            3      4
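The four phases can be traced with a small sketch that records DATA_VALID and DATA_ACK after each step. The function name and the tuple layout of the trace are hypothetical; the ordering of signal changes follows the phase table above.

```python
# Sketch of the four-phase handshake: records (phase, DATA_VALID, DATA_ACK).

def four_phase_transfer(words):
    """Transfer each word with a full handshake; return (received, trace)."""
    received = []
    trace = []
    data_valid = 0
    data_ack = 0
    for word in words:
        bus = word                  # phase 1: source places data on the bus
        data_valid = 1              #          and raises DATA_VALID
        trace.append((1, data_valid, data_ack))

        received.append(bus)        # phase 2: destination reads the data
        data_ack = 1                #          and raises DATA_ACK
        trace.append((2, data_valid, data_ack))

        data_valid = 0              # phase 3: source sees ACK, drops DATA_VALID
        trace.append((3, data_valid, data_ack))

        data_ack = 0                # phase 4: destination sees DV low, drops ACK
        trace.append((4, data_valid, data_ack))
    return received, trace


received, trace = four_phase_transfer([0x4D])
print(received)
for phase, dv, da in trace:
    print(f"phase {phase}: DATA_VALID={dv} DATA_ACK={da}")
```

Both signals return to 0 at the end of phase 4, so the next word can start from the same idle state regardless of how fast either side runs.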

India's FASTag (toll plaza RFID) follows the same request-and-acknowledge pattern. The RFID reader sends a signal (strobe) to the tag, the tag responds with vehicle data (ACK), and the barrier lifts. If the handshake fails (wrong tag, insufficient balance), the system retries, all within 2–3 seconds at 30 km/h.

8. UART — Universal Asynchronous Receiver-Transmitter

UART is the most common serial communication protocol. Every Arduino project, GPS module, Bluetooth module (HC-05), GSM module (SIM800), and even your computer's serial port uses UART.

Analogy: Think of UART like sending a telegram. You send one letter at a time (serial), with special "start" and "stop" markers so the receiver knows when a message begins and ends. You also include a check digit (parity) so the receiver can detect errors.

📡 UART Frame Format

  UART Idle State: Line held HIGH (1)

  ┌────┬───┬───┬───┬───┬───┬───┬───┬───┬────────┬──────┐
  │ ST │ D0│ D1│ D2│ D3│ D4│ D5│ D6│ D7│ Parity │ STOP │
  │ 0  │ b │ b │ b │ b │ b │ b │ b │ b │  bit   │  1   │
  └────┴───┴───┴───┴───┴───┴───┴───┴───┴────────┴──────┘
   ↑     ↑                                   ↑       ↑
   │     │                                   │       │
  Start  8 Data Bits                      Parity   Stop
  Bit    (LSB first)                      (Even/   Bit
  (0)                                      Odd)    (1)

  Total frame = 1 + 8 + 1 + 1 = 11 bits per character

  Example: Sending ASCII 'A' (0x41 = 0100 0001) with Even Parity

  ┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
  │ 0 │ 1 │ 0 │ 0 │ 0 │ 0 │ 0 │ 1 │ 0 │ 0 │ 1 │
  │ S │D0 │D1 │D2 │D3 │D4 │D5 │D6 │D7 │ P │ SP│
  └───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘
       LSB ────────────────────▶ MSB
  
  D0–D7 = 1,0,0,0,0,0,1,0 (binary of 0x41, LSB first)
  Parity = 0 (even parity: total 1s in data = 2, already even)
  
  Baud Rate = bits per second
  Common: 9600, 115200
  At 9600 baud: 1 character = 11 bits / 9600 = 1.146 ms

UART Parameter	Common Value	Notes
Baud Rate	9600, 115200	Bits per second (both sides must match)
Data Bits	8	5, 6, 7, or 8 bits; 8 is standard
Parity	None / Even / Odd	Error detection; optional
Stop Bits	1 or 2	Marks end of frame
Flow Control	None / RTS-CTS	Hardware handshaking for fast transfers

For interviews and labs: "8N1" means 8 data bits, No parity, 1 stop bit — the most common UART config. If someone says "9600 8N1", they mean baud rate 9600, 8 data bits, no parity, 1 stop bit. Total frame = 10 bits (1 start + 8 data + 0 parity + 1 stop).
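The frame-length arithmetic above generalises to any UART configuration. The helper below is a small sketch with a hypothetical name; it assumes exactly one start bit, as in the frame format shown earlier.

```python
# Sketch: frame length, character rate and payload efficiency of a UART setup.

def uart_throughput(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Return (frame_bits, characters_per_second, payload_efficiency)."""
    frame_bits = 1 + data_bits + parity_bits + stop_bits   # 1 start bit assumed
    chars_per_sec = baud / frame_bits
    efficiency = data_bits / frame_bits
    return frame_bits, chars_per_sec, efficiency


for label, parity_bits in (("8N1", 0), ("8E1", 1)):
    bits, cps, eff = uart_throughput(9600, parity_bits=parity_bits)
    print(f"9600 {label}: {bits} bits/frame, {cps:.1f} chars/s, {eff:.0%} payload")
```

At the same baud rate, adding a parity bit lowers the character rate because every frame carries one more non-data bit.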

Nearly every Arduino project in Indian engineering colleges uses UART. When you call Serial.begin(9600) in Arduino, you're configuring the UART module at 9600 baud. The receiving UART hardware strips the start/stop/parity bits, and the Serial Monitor shows you the resulting ASCII characters.

Section D

Learn by Doing — 3-Tier Lab Structure

🟢 Tier 1 — GUIDED: Simulate UART Transmission in Python

⏱️ 45–60 minutes | Beginner | Zero prior knowledge assumed

Objective:

Write a Python program that takes an ASCII character, converts it to a UART frame (Start + 8 Data + Parity + Stop), and displays the bit stream.

Step 1: Get the ASCII value

Python
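char = input("Enter a character: ")[:1] or "A"  # keep one character; fall back to 'A' if empty
ascii_val = ord(char)
if ascii_val > 0xFF:  # the frame below has only 8 data bits
    raise ValueError("Character does not fit in 8 data bits")
print(f"ASCII value of '{char}' = {ascii_val} (decimal) = {bin(ascii_val)} (binary)")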

Step 2: Build the UART frame

Python
# Convert to 8-bit binary (LSB first for UART)
data_bits = format(ascii_val, '08b')[::-1]  # Reverse for LSB first

# Calculate even parity
ones_count = data_bits.count('1')
parity = '0' if ones_count % 2 == 0 else '1'

# Build frame: Start(0) + Data(8) + Parity(1) + Stop(1)
frame = '0' + data_bits + parity + '1'

print(f"\nUART Frame for '{char}':")
print(f"Start | Data (LSB→MSB) | Parity | Stop")
print(f"  {frame[0]}   | {frame[1:9]}         |   {parity}    |  {frame[-1]}")
print(f"Complete frame: {frame} ({len(frame)} bits)")

Step 3: Calculate transmission time

Python
baud = 9600
bit_time = 1 / baud  # seconds per bit
frame_time = len(frame) * bit_time * 1000  # milliseconds
print(f"\nAt {baud} baud:")
print(f"Time per bit  = {bit_time*1000:.4f} ms")
print(f"Time per frame = {frame_time:.4f} ms")
print(f"Max characters/sec = {baud // len(frame)}")
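To check a frame produced by the steps above, a receiver-side sketch can reverse the process. The function name is hypothetical, and it assumes the same 11-bit layout used in this lab (1 start, 8 data bits LSB first, 1 parity, 1 stop).

```python
# Sketch of a UART receiver for the 11-bit frame built in this lab.

def decode_uart_frame(frame, parity="even"):
    """Return the character carried by an 11-bit frame string, or raise ValueError."""
    if len(frame) != 11 or set(frame) - {"0", "1"}:
        raise ValueError("expected 11 characters, each '0' or '1'")
    start, data, parity_bit, stop = frame[0], frame[1:9], frame[9], frame[10]
    if start != "0":
        raise ValueError("framing error: start bit must be 0")
    if stop != "1":
        raise ValueError("framing error: stop bit must be 1")
    ones = data.count("1") + int(parity_bit)
    expected_remainder = 0 if parity == "even" else 1
    if ones % 2 != expected_remainder:
        raise ValueError("parity error")
    return chr(int(data[::-1], 2))   # data arrives LSB first, so reverse it


print(decode_uart_frame("01000001001"))                 # frame for 'A', even parity
print(decode_uart_frame("01011001011", parity="odd"))   # frame for 'M', odd parity
```

Flipping any single data bit in a frame makes the parity check fail, which is exactly the error a single parity bit is able to detect.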

🟡 Tier 2 — SEMI-GUIDED: DMA Transfer Simulator in C

⏱️ 60–90 minutes | Intermediate | Hints provided, you fill the gaps

Your Mission:

Simulate a DMA controller in C that transfers a block of data from a "device buffer" array to a "memory" array, word by word, while counting cycles.

Hints:

Data Structures: Create struct DMA { int AR; int WC; int DR; int mode; };
Device Buffer: int device_buffer[256] filled with random data
Memory: int memory[1024] initialized to zero
Transfer Loop: On each "cycle", copy device_buffer[i] → DMA.DR → memory[DMA.AR], then AR++, WC--
Mode Switch: In burst mode, complete all transfers in one go. In cycle-stealing mode, interleave with "CPU work" prints
Completion: When WC == 0, print "DMA INTERRUPT: Transfer Complete!"

Stretch Goal: Add a timer to measure actual execution time for burst vs cycle-stealing modes. Which is faster in total? Which gives better CPU availability?

🔴 Tier 3 — OPEN CHALLENGE: Priority Interrupt Controller in Verilog/VHDL

⏱️ 2–3 hours | Advanced | No instructions, real-world design

The Brief:

Design a priority interrupt controller for 8 devices using a priority encoder, interrupt mask register, and interrupt acknowledge logic. Implement in Verilog (or use Logisim for a visual approach).

Inputs: 8 interrupt request lines (IR0–IR7)
Outputs: 3-bit interrupt vector, INT signal to CPU
Mask Register: 8-bit register to enable/disable individual interrupts
Priority Logic: IR0 = highest priority, IR7 = lowest
Acknowledge: When CPU sends INTA, latch the current vector and clear the serviced interrupt
Test: Simulate simultaneous interrupts on IR2 and IR5 — verify IR2 is serviced first

This project is portfolio gold. A working priority interrupt controller on your GitHub (Verilog + testbench + waveforms) demonstrates hardware design skills valued in entry-level roles paying ₹6–12 LPA at companies like Texas Instruments, Qualcomm, Intel, and Samsung India.

Section E

Practice Problems — Diagrams, Numericals, Industry & GATE

📐 Diagram-Based Problems (D1–D3)

Draw the complete block diagram of a DMA controller showing AR, WC, DR, Control Logic, and its connections to CPU, Memory, and I/O device. Label all bus signals (BR, BG, Address Bus, Data Bus).

Apply | Intermediate

✅ Refer to the DMA block diagram in Section C.4. Key signals: BR (Bus Request) from DMA to CPU, BG (Bus Grant) from CPU to DMA, bidirectional Data Bus, unidirectional Address Bus (DMA → Memory), and DRQ/DACK between DMA and device.

Draw the daisy chain priority interrupt structure for 4 devices. Show the INT (wired-OR), PI, and PO connections. Indicate which device gets priority if Device 1 and Device 3 raise interrupts simultaneously.

Apply | Intermediate

✅ Device 1 gets serviced first. In daisy chain, PI propagates from CPU → Device 0 → Device 1 → Device 2 → Device 3. Device 1 will see PI=1, block PO to Device 2/3, and respond to the interrupt acknowledge. Device 3 waits until Device 1's ISR completes.
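The PI/PO propagation described above can be traced with a few lines of Python. This is a minimal sketch, assuming each device computes PO = PI AND (NOT request); the function name `daisy_chain` is made up for illustration.

```python
def daisy_chain(requests):
    """Trace the acknowledge signal through a daisy chain.

    requests[i] is True if device i has raised an interrupt; index 0 is the
    device closest to the CPU. Returns (winner, po) where winner is the index
    of the device that captures the acknowledge (or None) and po[i] is the
    PO output of device i.
    """
    pi = 1                  # CPU drives acknowledge into the first device
    winner = None
    po = []
    for i, requesting in enumerate(requests):
        if pi == 1 and requesting:
            winner = i      # this device claims the acknowledge
            out = 0         # and blocks it from going further down the chain
        else:
            out = pi        # pass PI through unchanged (0 stays 0)
        po.append(out)
        pi = out
    return winner, po


if __name__ == "__main__":
    # Device 1 and Device 3 request at the same time.
    print(daisy_chain([False, True, False, True]))   # (1, [1, 0, 0, 0])
```

Device 3 sees PI = 0, so it keeps its request pending until Device 1 has been serviced.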

Draw a complete UART frame for transmitting the character 'M' (ASCII 77 = 0x4D = 01001101) with odd parity, 8 data bits, and 1 stop bit. Label each bit.

Apply | Beginner

✅ Binary of 'M' = 01001101. LSB first: 1,0,1,1,0,0,1,0. Number of 1s in data = 4 (even). For odd parity, parity bit = 1 (to make total 1s odd = 5). Frame: [0] [1 0 1 1 0 0 1 0] [1] [1] = 0-10110010-1-1 (11 bits total).
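To check a hand-drawn frame, the same steps (LSB first, count the 1s, pick the parity bit) can be scripted. The helper name `uart_frame` and its parameters are assumptions for this sketch.

```python
def uart_frame(ch, parity="odd", data_bits=8):
    """Build a UART frame as a list of bits in transmission order.

    Layout: start bit (0), data bits LSB first, parity bit, stop bit (1).
    parity is "odd", "even", or "none" (no parity bit is sent).
    """
    code = ord(ch)
    data = [(code >> i) & 1 for i in range(data_bits)]   # LSB first
    ones = sum(data)
    if parity == "even":
        parity_bits = [ones % 2]        # make the total number of 1s even
    elif parity == "odd":
        parity_bits = [1 - ones % 2]    # make the total number of 1s odd
    elif parity == "none":
        parity_bits = []
    else:
        raise ValueError("parity must be 'odd', 'even', or 'none'")
    return [0] + data + parity_bits + [1]


if __name__ == "__main__":
    print(uart_frame("M", parity="odd"))
    # [0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
```

The printed list reads start bit, eight data bits (LSB first), parity bit, stop bit, matching the 11-bit frame worked out above.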

🔢 Numerical Problems (N1–N6)

A DMA controller transfers data from a disk to memory. The disk transfer rate is 2 MB/s, memory cycle time is 40 ns, and the DMA uses cycle stealing. Calculate: (a) Number of memory cycles stolen per second, (b) Percentage of memory cycles stolen if CPU also uses memory at 20 million accesses/sec.

Apply | GATE Level

✅ (a) Disk rate = 2 MB/s = 2 × 10⁶ bytes/s. If word size = 1 byte, cycles stolen = 2 × 10⁶/s. (b) Total cycles available = 1/40ns = 25 × 10⁶/s. CPU needs 20M, DMA needs 2M, total = 22M. % stolen = (2M / 25M) × 100 = 8%. CPU is barely affected!
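The same arithmetic works for any device rate and memory cycle time. Here is a small calculator sketch; the function name `stolen_share` and the one-word-per-stolen-cycle assumption are illustrative choices, not from a standard.

```python
def stolen_share(device_bytes_per_s, bytes_per_word, memory_cycle_s):
    """Return (cycles stolen per second, total cycles per second, percent stolen).

    Assumes the DMA steals exactly one memory cycle per word transferred.
    """
    stolen_per_s = device_bytes_per_s / bytes_per_word
    total_per_s = 1.0 / memory_cycle_s
    percent = 100.0 * stolen_per_s / total_per_s
    return stolen_per_s, total_per_s, percent


if __name__ == "__main__":
    stolen, total, pct = stolen_share(2e6, 1, 40e-9)
    print(f"stolen = {stolen:.0f}/s, total = {total:.0f}/s, share = {pct:.1f}%")
```

Try a 4-byte word with the same disk: the DMA then steals a quarter as many cycles.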

A UART operates at 115200 baud with 8 data bits, even parity, and 1 stop bit. Calculate: (a) Time to transmit one character, (b) Maximum characters per second, (c) Effective data rate in bits/second (only data bits, not overhead).

Apply | Intermediate

✅ (a) Frame = 1+8+1+1 = 11 bits. Time = 11/115200 = 95.49 μs. (b) Max chars/sec = 115200/11 = 10,472 chars/sec. (c) Effective data rate = 10472 × 8 = 83,776 bps ≈ 83.8 kbps (only 72.7% of baud rate is useful data!).
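These three numbers all follow from the frame length, so they are easy to script for other configurations. The function `uart_timing` and its return layout are assumptions for this sketch.

```python
def uart_timing(baud, data_bits=8, parity=True, stop_bits=1):
    """Return (frame_bits, frame_time_us, chars_per_s, data_bps).

    frame_bits counts 1 start bit + data bits + optional parity + stop bits.
    chars_per_s is rounded down to whole characters; data_bps counts only
    the data bits of those characters.
    """
    frame_bits = 1 + data_bits + (1 if parity else 0) + stop_bits
    frame_time_us = frame_bits / baud * 1e6
    chars_per_s = baud // frame_bits
    data_bps = chars_per_s * data_bits
    return frame_bits, frame_time_us, chars_per_s, data_bps


if __name__ == "__main__":
    bits, t_us, cps, bps = uart_timing(115200, parity=True)
    print(f"{bits} bits/frame, {t_us:.2f} us/char, {cps} chars/s, {bps} data bps")
```

Calling `uart_timing(9600, parity=False)` gives the familiar 8N1 figure of 960 characters per second.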

A system has a 32-bit data bus operating at 100 MHz. DMA transfers a 1 MB file from disk. Calculate: (a) Bus bandwidth in MB/s, (b) Minimum time for DMA burst transfer, (c) Number of bus cycles needed.

Apply | GATE Level

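✅ (a) Bandwidth = 32 bits × 100 MHz = 3200 Mbps = 400 MB/s. (b) Time = 1 MB / 400 MB/s = 2.5 ms taking 1 MB = 10⁶ bytes (about 2.62 ms if 1 MB = 2²⁰ bytes). (c) Each cycle transfers 4 bytes (32 bits). Cycles = 2²⁰ B / 4 B = 262,144 cycles (250,000 cycles if 1 MB = 10⁶ bytes).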
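Because "1 MB" can mean 10⁶ or 2²⁰ bytes, it is worth computing the answer both ways. A minimal sketch, assuming one bus transfer per clock cycle (the function name `burst_transfer` is hypothetical):

```python
def burst_transfer(bus_bits, clock_hz, size_bytes):
    """Return (bandwidth_bytes_per_s, bus_cycles, time_s) for a burst transfer.

    Assumes one full-width bus transfer per clock cycle and no arbitration
    or setup overhead.
    """
    bytes_per_cycle = bus_bits // 8
    bandwidth = bytes_per_cycle * clock_hz
    cycles = -(-size_bytes // bytes_per_cycle)   # ceiling division
    time_s = cycles / clock_hz
    return bandwidth, cycles, time_s


if __name__ == "__main__":
    for label, size in (("1 MB = 10^6 B", 10**6), ("1 MB = 2^20 B", 2**20)):
        bw, cycles, t = burst_transfer(32, 100_000_000, size)
        print(f"{label}: {bw / 1e6:.0f} MB/s, {cycles} cycles, {t * 1e3:.2f} ms")
```

Whichever convention you pick, use it for both the time and the cycle count so the two answers agree.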

In a priority interrupt system with 4 devices, the ISR execution times are: Dev0 = 25 μs, Dev1 = 40 μs, Dev2 = 15 μs, Dev3 = 30 μs. If all four devices raise interrupts at t=0, and Dev0 has highest priority, what is the response time for each device?

Analyze | Intermediate

✅ Service order: Dev0 → Dev1 → Dev2 → Dev3. Response times (waiting time before each ISR starts): Dev0 = 0 μs (immediate), Dev1 = 25 μs (after Dev0), Dev2 = 65 μs (after Dev0+Dev1), Dev3 = 80 μs (after Dev0+Dev1+Dev2). All four ISRs finish at t = 25 + 40 + 15 + 30 = 110 μs.
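Each device simply waits for the ISRs of every higher-priority device, which is a running sum. A short sketch, assuming all requests arrive at t = 0 and ISRs run back to back in priority order (`start_times` is a made-up helper name):

```python
from itertools import accumulate


def start_times(isr_times_us):
    """Given ISR durations in priority order (highest first), return the
    time at which each ISR starts when all requests arrive at t = 0."""
    finish = list(accumulate(isr_times_us))   # finish time of each ISR
    return [0] + finish[:-1]                  # each ISR starts when the previous ends


if __name__ == "__main__":
    isr = [25, 40, 15, 30]                    # Dev0..Dev3
    print(start_times(isr))                   # [0, 25, 65, 80]
    print(sum(isr))                           # 110
```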

A DMA controller has a 16-bit address register and 12-bit word count register. What is: (a) Maximum addressable memory, (b) Maximum block size per DMA transfer?

Apply | Beginner

✅ (a) 16-bit AR → 2¹⁶ = 65,536 addresses (64 KB if memory is byte-addressable). (b) 12-bit WC → 2¹² = 4096 words maximum per transfer. If word = 2 bytes, max block = 8 KB.

A system uses cycle-stealing DMA. The CPU clock is 500 MHz, and the DMA steals 1 cycle every 10 μs. What is the percentage slowdown of the CPU?

Apply | GATE Level

✅ CPU cycle time = 1/500MHz = 2 ns. In 10 μs = 10,000 ns, CPU executes 5000 cycles. DMA steals 1 cycle. Slowdown = 1/5000 × 100 = 0.02%. Negligible! This is why cycle stealing is preferred.

🏭 Industry Application Problems (I1–I3)

Aadhaar Biometric Authentication: An Aadhaar-enabled POS device captures a 300 KB fingerprint image. The biometric scanner uses DMA at a transfer rate of 5 MB/s. The authentication server responds in 150 ms via the network. Calculate the total time from scan to verification, assuming CPU processing takes 20 ms.

Analyze | Industry

✅ DMA transfer time = 300 KB / 5 MB/s = 60 ms. Network round-trip = 150 ms. CPU processing = 20 ms. Total ≈ 60 + 20 + 150 = 230 ms. Real Aadhaar authentication targets ~200 ms — engineers optimize each stage.

ISRO Satellite Data: Suppose an ISRO lunar orbiter, such as the Chandrayaan-2 orbiter, sends scientific data at 8 Mbps. The onboard computer uses DMA with a 64-bit data bus at 50 MHz. Is the bus bandwidth sufficient? What percentage of bus capacity does the data link use?

Evaluate | Industry

✅ Bus bandwidth = 64 bits × 50 MHz = 3200 Mbps = 3.2 Gbps. Data link = 8 Mbps. Usage = 8/3200 × 100 = 0.25%. The bus is vastly over-provisioned for this link, so bus bandwidth is not the bottleneck.

UPI Payment Terminal: A UPI terminal has: touchscreen (interrupt-driven), QR scanner (DMA), receipt printer (programmed I/O), and network module (DMA). Design the priority interrupt assignment and justify which device gets highest priority.

Create | Industry

✅ Priority (highest first): 1. Network module (time-critical for transaction), 2. QR scanner (DMA for image data), 3. Touchscreen (user input, moderate priority), 4. Printer (slowest, can wait). Justification: Network must complete transactions within timeout limits; printer is non-critical and can buffer.

🎯 GATE-Style Problems (G1–G5)

G1 GATE

A device with a transfer rate of 10 KB/s is connected to a CPU via DMA. The CPU clock rate is 1 GHz, and one bus cycle takes 4 clock cycles. What fraction of CPU time is consumed by the DMA in cycle-stealing mode? [GATE 2015 Style]

Apply | 2 marks

✅ Transfer rate = 10 KB/s = 10,240 bytes/s. Assuming one byte per DMA transfer, each transfer steals 1 bus cycle = 4 clock cycles. Cycles stolen/sec = 10,240 × 4 = 40,960. CPU total cycles/sec = 10⁹. Fraction = 40,960/10⁹ = 4.096 × 10⁻⁵ ≈ 0.004%. Answer: about 4.1 × 10⁻⁵ of CPU time, a negligible fraction.

G2 GATE

Consider a system with memory-mapped I/O. The address space is 16 bits. If 1 KB is reserved for I/O ports, how many bytes of memory are addressable? [GATE 2018 Style]

Apply | 1 mark

✅ Total address space = 2¹⁶ = 64 KB. I/O reserved = 1 KB. Memory addressable = 64 - 1 = 63 KB = 64,512 bytes. In isolated I/O, the full 64 KB would be available for memory (separate I/O space).

G3 GATE

In a system with daisy-chained interrupt, which of the following is TRUE?
(A) The device closest to CPU has the lowest priority
(B) The device farthest from CPU has the highest priority
(C) Priority is determined by the physical position in the chain
(D) All devices have equal priority

Remember | 1 mark

✅ Answer: (C) — In daisy chain, the device closest to the CPU has the highest priority. Priority is inherently determined by physical position. The acknowledge signal passes through each device sequentially.

G4 GATE

A DMA controller is transferring a 64 KB block. The word size is 4 bytes, and each word transfer takes 100 ns (cycle stealing). What is the total DMA transfer time? [GATE 2020 Style]

Apply | 2 marks

✅ Block = 64 KB = 65,536 bytes. Words = 65,536 / 4 = 16,384 words. Each word takes 100 ns. Total time = 16,384 × 100 ns = 1,638,400 ns = 1.638 ms ≈ 1.64 ms.

G5 GATE

A computer has 4 I/O devices. Device interrupt response times must satisfy: Dev A ≤ 50μs, Dev B ≤ 100μs, Dev C ≤ 200μs, Dev D ≤ 500μs. ISR times: A=30μs, B=60μs, C=40μs, D=100μs. Determine a valid priority ordering. Can all deadlines be met?

Evaluate | 2 marks

✅ Assign priority by deadline, shortest deadline first (deadline-monotonic ordering): A (highest) → B → C → D (lowest). Worst-case completion times when all four interrupts arrive together: A=0+30=30μs ≤ 50 ✅, B=30+60=90μs ≤ 100 ✅, C=90+40=130μs ≤ 200 ✅, D=130+100=230μs ≤ 500 ✅. All deadlines met!
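The deadline check above can be automated for any device set. The sketch below assumes the worst case where all devices interrupt at the same instant and ISRs run to completion in priority order; `check_deadlines` and the tuple layout are illustrative choices.

```python
def check_deadlines(devices):
    """devices: list of (name, deadline_us, isr_us).

    Orders devices by deadline (shortest first), then returns a list of
    (name, completion_us, met) in that priority order.
    """
    ordered = sorted(devices, key=lambda d: d[1])
    t = 0
    report = []
    for name, deadline, isr in ordered:
        t += isr                      # this ISR finishes after all earlier ones
        report.append((name, t, t <= deadline))
    return report


if __name__ == "__main__":
    devs = [("A", 50, 30), ("B", 100, 60), ("C", 200, 40), ("D", 500, 100)]
    for name, done, ok in check_deadlines(devs):
        print(name, done, "met" if ok else "MISSED")
```

Try shrinking B's deadline to 80 μs: the report then shows B missing its deadline, so no fixed ordering starting with A can satisfy it.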

Section F

MCQ Assessment Bank — 30 Questions (Bloom's Mapped)

Remember / Identify (Q1–Q6)

DMA stands for:

Direct Memory Allocation
Direct Memory Access
Dynamic Memory Addressing
Dual Mode Architecture

Remember

✅ Answer: (B) Direct Memory Access — A technique where data is transferred directly between I/O devices and memory without CPU intervention.

In isolated I/O, the CPU uses _____ instructions to communicate with I/O devices.

MOV and ADD
IN and OUT
LOAD and STORE
PUSH and POP

Remember

✅ Answer: (B) IN and OUT — Isolated I/O uses separate I/O instructions. Memory-mapped I/O uses regular memory instructions like MOV.

The start bit in a UART frame is always:

Logic 1
Logic 0
Same as parity bit
Depends on data

Remember

✅ Answer: (B) Logic 0 — The idle state is logic 1, so a transition to 0 marks the start of a new frame. The stop bit returns to logic 1.

Which register in the DMA controller holds the starting memory address for data transfer?

Word Count Register (WC)
Data Register (DR)
Address Register (AR)
Status Register (SR)

Remember

✅ Answer: (C) Address Register (AR) — AR holds the memory address where data will be written to (or read from). It increments after each word transfer.

In programmed I/O, the CPU checks device status using:

Interrupts
DMA
Polling (busy-wait loop)
Channel commands

Remember

✅ Answer: (C) Polling — The CPU repeatedly reads the device's status register in a loop until the device is ready. This wastes CPU cycles.

UART stands for:

Universal Asynchronous Receiver-Transmitter
Unified Address Register Transfer
Universal Analog Relay Terminal
Uniform Access Resource Table

Remember

✅ Answer: (A) Universal Asynchronous Receiver-Transmitter — It converts parallel data to serial for transmission and vice versa.

Understand / Explain (Q7–Q12)

Why does DMA improve system performance compared to programmed I/O?

DMA uses faster memory chips
DMA bypasses the CPU during data transfer, freeing it for other tasks
DMA increases the clock speed of the processor
DMA eliminates the need for I/O devices

Understand

✅ Answer: (B) — DMA transfers data directly between device and memory. The CPU only initiates and handles the completion interrupt, leaving it free to execute other instructions during the transfer.

In memory-mapped I/O, I/O devices are treated as:

Separate I/O ports with special instructions
Memory locations accessible through regular memory instructions
CPU registers
External interrupt sources only

Understand

✅ Answer: (B) — I/O ports occupy addresses in the same address space as memory. Any instruction that can access memory (MOV, ADD, etc.) can also access I/O ports.

What is the purpose of the parity bit in a UART frame?

To increase data transfer speed
To detect single-bit transmission errors
To encrypt the data
To synchronize sender and receiver clocks

Understand

✅ Answer: (B) — Parity adds a bit to make the total number of 1s even (even parity) or odd (odd parity). If the received parity doesn't match, an error is detected.

Q10

In cycle stealing DMA, the CPU is:

Completely halted during the entire transfer
Slowed down slightly as DMA steals one bus cycle at a time
Not affected at all
Powered off to save energy

Understand

✅ Answer: (B) — DMA takes the bus for one word transfer, returns it, then steals again. The CPU experiences slight slowdown but is never fully blocked.

Q11

Why is handshaking necessary in asynchronous data transfer?

To encrypt data between devices
To ensure the receiver is ready before the sender transmits
To increase the clock speed
To convert serial data to parallel

Understand

✅ Answer: (B) — Without a common clock, sender and receiver use control signals (strobe, ACK) to coordinate. Handshaking prevents data loss when devices operate at different speeds.

Q12

In a daisy chain priority interrupt, the device closest to the CPU:

Has the lowest priority
Has the highest priority
Has no priority advantage
Cannot generate interrupts

Understand

✅ Answer: (B) — The acknowledge signal passes through each device sequentially. The first device in the chain intercepts it, giving it highest priority.

Apply / Calculate (Q13–Q18)

Q13

A UART with 9600 baud, 8 data bits, no parity, and 1 stop bit can transmit at most _____ characters per second.

9600
960
1200
800

Apply

✅ Answer: (B) 960 — Frame = 1 start + 8 data + 0 parity + 1 stop = 10 bits. Characters/sec = 9600/10 = 960.

Q14

A DMA controller with a 16-bit word count register can transfer a maximum of:

32 KB
64 KB words
65,536 words
16 words

Apply

✅ Answer: (C) 65,536 words — 2¹⁶ = 65,536. The actual byte count depends on word size (e.g., 65,536 × 4 = 256 KB for 32-bit words).

Q15

If a disk transfers at 4 MB/s and DMA cycle stealing steals 1 cycle per word (word = 4 bytes), how many cycles are stolen per second?

4,000,000
1,000,000
16,000,000
500,000

Apply

✅ Answer: (B) 1,000,000 — Words/sec = 4 MB/s ÷ 4 B/word = 1,000,000 words/sec = 1M cycles stolen/sec.

Q16

In a system with 20-bit address space and memory-mapped I/O, 256 addresses are reserved for I/O. How much memory is available?

1 MB
1 MB − 256 bytes
1,048,320 bytes
Both B and C

Apply

✅ Answer: (D) — 2²⁰ = 1,048,576 bytes total. Memory available = 1,048,576 − 256 = 1,048,320 bytes = 1 MB − 256 bytes. Both B and C express the same value.

Q17

A priority encoder with 8 inputs requires how many output bits to represent the highest-priority active input?

Apply

✅ Answer: (B) 3 — ⌈log₂(8)⌉ = 3 bits needed to encode 8 different device IDs (000 to 111).

Q18

A DMA burst transfer of 256 words on a bus with 50 ns cycle time takes:

12.8 μs
5.12 μs
256 μs
0.05 μs

Apply

✅ Answer: (A) 12.8 μs — 256 words × 50 ns/word = 12,800 ns = 12.8 μs.

Analyze / Compare (Q19–Q24)

Q19

Which data transfer mode would be most appropriate for a high-definition video camera streaming at 300 MB/s?

Programmed I/O
Interrupt-driven I/O
DMA with burst mode
Software polling

Analyze

✅ Answer: (C) — At 300 MB/s, neither programmed I/O nor interrupt-driven can keep up (CPU overhead too high). DMA burst mode transfers large video frames directly to memory at full bus speed.

Q20

Compared to daisy chain, parallel priority interrupt has:

Lower hardware cost but slower response
Higher hardware cost but faster response
Same cost and same speed
Lower cost and faster speed

Analyze

✅ Answer: (B) — Parallel priority uses a hardware encoder (more gates, flip-flops, mask register) but resolves priority in one clock cycle. Daisy chain is cheaper but the acknowledge must propagate through the chain sequentially.

Q21

Memory-mapped I/O is preferred over isolated I/O when:

The system needs maximum memory address space
The system needs to perform arithmetic operations on I/O data
The system has very few I/O devices
Both B and C

Analyze

✅ Answer: (D) — Memory-mapped I/O allows any ALU instruction on I/O data (not just IN/OUT). It's preferred when few I/O ports are needed (so memory loss is minimal) and when flexible I/O manipulation is required.

Q22

A selector channel differs from a multiplexer channel in that:

Selector handles one device at a time; multiplexer handles many simultaneously
Selector handles many devices; multiplexer handles one
Both handle the same number of devices
Selector is slower than multiplexer

Analyze

✅ Answer: (A) — A selector channel is dedicated to one high-speed device until the transfer completes. A multiplexer channel interleaves bytes from multiple low/medium-speed devices.

Q23

In DMA cycle stealing vs burst mode, which statement is TRUE?

Burst mode gives better CPU utilization
Cycle stealing gives better CPU utilization but slower total transfer
Both give equal CPU utilization
Cycle stealing blocks the CPU completely

Analyze

✅ Answer: (B) — Cycle stealing lets the CPU use the bus between stolen cycles, so CPU utilization is better. But the total transfer takes longer because of the interleaving overhead.

Q24

Strobe-based handshaking vs full handshaking: which is more reliable and why?

Strobe — because it uses fewer signals
Full handshaking — because both sides confirm every phase of transfer
Both are equally reliable
Neither is reliable for high-speed transfers

Analyze

✅ Answer: (B) — Full handshaking has 4 phases with mutual confirmation. Strobe only sends a pulse — if the receiver misses it, data is lost. Full handshaking guarantees delivery at the cost of slightly more time.

Evaluate & Create (Q25–Q30)

Q25

A system designer must choose between DMA burst mode and cycle stealing for a network card receiving 100 Mbps Ethernet. Which is more suitable and why?

Burst mode — to transfer each Ethernet frame in one shot
Cycle stealing — to avoid blocking CPU during continuous network traffic
Programmed I/O — network speeds are manageable
No DMA needed — interrupt-driven is sufficient

Evaluate

✅ Answer: (B) — Network traffic is continuous and bursty. Cycle stealing allows the CPU to handle TCP/IP processing between DMA transfers. Burst mode would block the CPU during each frame, causing unacceptable latency for real-time networking.

Q26

An embedded system has a temperature sensor (1 reading/sec), a motor controller (100 commands/sec), and an SD card logger (1 MB/min). Assign optimal data transfer modes.

All three use DMA
Sensor: Polling, Motor: Interrupt, SD card: DMA
All three use interrupts
Sensor: DMA, Motor: Polling, SD card: Interrupt

Evaluate

✅ Answer: (B) — Sensor at 1/sec is so slow that polling is fine. Motor at 100/sec benefits from interrupts (timely response without polling waste). SD card at 1 MB/min involves large block transfers best suited for DMA.

Q27

If you're designing a UART for an IoT device that must work at extreme distances (200m+), which modification would improve reliability?

Increase baud rate to 1 Mbps
Use differential signaling (RS-485) instead of single-ended UART
Remove parity bit to reduce overhead
Reduce data bits to 5

Evaluate

✅ Answer: (B) — RS-485 uses differential signaling (two wires, comparing voltage difference) which is resistant to noise over long distances. Single-ended serial links such as RS-232 are typically limited to about 15 m.

Q28

Design a priority scheme for 4 devices: Keyboard (slow), Disk (fast, bulk), Network (medium, real-time), Timer (critical). What is the optimal priority order?

Timer > Network > Disk > Keyboard
Keyboard > Timer > Network > Disk
Disk > Network > Timer > Keyboard
Network > Disk > Keyboard > Timer

Create

✅ Answer: (A) Timer > Network > Disk > Keyboard — Timer interrupts are critical for OS scheduling and must never be missed. Network needs real-time response for packet processing. Disk is fast but can buffer. Keyboard is the slowest and most tolerant of delay.

Q29

You're building an Arduino-based weather station. Which I/O concepts from this chapter would you use for: (1) reading a temperature sensor every 5 seconds, (2) logging data to an SD card, (3) sending data over Bluetooth (HC-05)?

All polling
(1) Timer interrupt, (2) SPI with DMA, (3) UART interrupt
All DMA
(1) DMA, (2) Polling, (3) DMA

Create

✅ Answer: (B) — Timer interrupt triggers sensor reading at precise intervals. SD card uses SPI protocol, with DMA for efficient block writes on boards whose microcontroller includes a DMA controller. Bluetooth HC-05 communicates via UART with interrupt-driven reception for incoming commands.

Q30

A company needs an I/O system for an ATM machine: card reader, keypad, receipt printer, screen, network module, and cash dispenser motor. Which combination of transfer modes and priority interrupts would you recommend?

All devices on programmed I/O with round-robin polling
Network & card reader on DMA; keypad & screen on interrupt; printer & motor on programmed I/O; Priority: Network > Card > Keypad > Motor > Screen > Printer
All devices on DMA with daisy chain
All devices on interrupt with no priority

Create

✅ Answer: (B) — Network requires fast, reliable data transfer (DMA). Card reader transfers magnetic stripe/chip data blocks (DMA). Keypad and screen are event-driven (interrupt). Printer and motor are slow, sequential operations (programmed I/O). Priority reflects criticality: network transaction > card security > user input > physical actuators.

Section G

Short Answer Questions (8 Questions)

SA1

Differentiate between Memory-Mapped I/O and Isolated I/O with one example each. (4 marks)

Memory-Mapped I/O: I/O ports share the same address space as memory. CPU uses regular instructions (MOV, ADD) to access devices. Example: ARM-based Raspberry Pi — GPIO registers are mapped to memory addresses like 0x3F200000. Advantage: Any instruction works. Disadvantage: Reduces available memory addresses.

Isolated I/O: I/O ports have a separate address space. CPU uses special IN/OUT instructions. Example: Intel x86 PCs — keyboard port at I/O address 0x60. Advantage: Full memory space preserved. Disadvantage: Limited to IN/OUT instructions only.

SA2

Explain the handshaking mechanism in asynchronous data transfer with a timing diagram. (5 marks)

Handshaking is a protocol where sender and receiver exchange control signals to synchronize data transfer without a common clock. The 4-phase handshake: (1) Source places data and raises DATA_VALID, (2) Destination reads data and raises DATA_ACK, (3) Source sees ACK, removes data, lowers DATA_VALID, (4) Destination sees DV low, lowers DATA_ACK. Each phase waits for the other side's response, ensuring reliable transfer regardless of speed mismatch. See timing diagram in Section C.7.
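The four phases can be traced in software to see how the two control lines change. This is a sequential toy model, not a timing-accurate simulation: both sides are stepped in one loop, and the function name `four_phase_handshake` is an assumption.

```python
def four_phase_handshake(words):
    """Step through the 4-phase handshake once per word.

    Returns (received, trace) where trace lists the
    (DATA_VALID, DATA_ACK) pair after each phase.
    """
    data_valid = 0
    data_ack = 0
    bus = None
    received = []
    trace = []
    for word in words:
        # Phase 1: source places data on the bus and raises DATA_VALID
        bus, data_valid = word, 1
        trace.append((data_valid, data_ack))
        # Phase 2: destination reads the data and raises DATA_ACK
        received.append(bus)
        data_ack = 1
        trace.append((data_valid, data_ack))
        # Phase 3: source sees ACK, removes data, lowers DATA_VALID
        bus, data_valid = None, 0
        trace.append((data_valid, data_ack))
        # Phase 4: destination sees DATA_VALID low and lowers DATA_ACK
        data_ack = 0
        trace.append((data_valid, data_ack))
    return received, trace


if __name__ == "__main__":
    got, trace = four_phase_handshake([0x4D])
    print(got, trace)   # [77] [(1, 0), (1, 1), (0, 1), (0, 0)]
```

The trace for one word is exactly the signal sequence you would draw in the timing diagram.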

SA3

What are the three modes of DMA transfer? Compare them briefly. (4 marks)

Burst Mode: DMA holds bus for entire block. CPU fully blocked. Fastest total transfer. Used for disk reads.

Cycle Stealing: DMA takes one bus cycle per word, then returns bus to CPU. CPU slightly slowed. Best for streaming devices like network cards.

Transparent/Interleaved: DMA uses bus only during CPU idle cycles. Zero CPU impact. Slowest overall but no CPU interference.

SA4

Draw and explain the UART frame format for transmitting 'Z' (ASCII 90) with even parity. (5 marks)

ASCII 'Z' = 90 = 0x5A = 01011010 binary. LSB first: 0,1,0,1,1,0,1,0. Count of 1s = 4 (even). For even parity, parity bit = 0 (already even). Frame: Start(0) + Data(01011010 LSB first) + Parity(0) + Stop(1) = 0-01011010-0-1 = 11 bits. At 9600 baud, transmission time = 11/9600 = 1.146 ms.

SA5

List the registers in a DMA controller and state the function of each. (4 marks)

Address Register (AR): Holds the memory address where data is to be transferred. Incremented after each word transfer.

Word Count Register (WC): Holds the number of words to transfer. Decremented after each transfer. When WC=0, transfer complete.

Data Register (DR): Temporary buffer holding one word during transit between device and memory.

Control Register: Specifies direction (read/write), DMA mode (burst/cycle stealing), enable/disable, and interrupt enable.

SA6

Compare daisy chain and parallel priority interrupt mechanisms. Which is faster? (4 marks)

Daisy Chain: Devices connected in series. Acknowledge signal cascades through chain. Priority = physical position (closest to CPU = highest). Simple hardware (just PI/PO wires). Fixed priority. Slower for many devices.

Parallel Priority: All devices connect to a priority encoder simultaneously. Resolves priority in one clock cycle using combinational logic. Requires more hardware (encoder, mask register). Programmable priority via mask. Much faster — O(1) vs O(n) for daisy chain.

SA7

Explain the concept of cycle stealing in DMA with a timing example. (4 marks)

In cycle stealing, the DMA controller "steals" one bus cycle from the CPU to transfer one word, then returns bus control. Example: CPU clock = 100 MHz (10 ns cycle). DMA needs to transfer 1 word every 1 μs (device rate). In 1 μs = 100 CPU cycles, DMA steals 1 cycle. CPU slowdown = 1/100 = 1%. The CPU continues executing instructions during the 99 unstolen cycles, making cycle stealing nearly transparent.

SA8

What is an I/O Processor (IOP)? How does it differ from a DMA controller? (5 marks)

IOP is a dedicated processor that handles all I/O operations independently. It has its own instruction set (channel commands), can execute I/O programs, manage multiple devices, perform error handling, and make decisions — all without CPU intervention.

Key differences: DMA transfers data blocks passively (no decision-making). IOP can execute programs, handle errors, manage device queues, and perform data formatting. DMA is simpler and cheaper; IOP is a full processor dedicated to I/O. DMA needs CPU to initialize each transfer; IOP needs CPU only for high-level commands.

Section H

Long Answer Questions (3 Questions)

LA1

Explain the complete working of a DMA controller with a neat block diagram. Describe the DMA transfer sequence (initialization, request, acknowledge, transfer, completion). Discuss burst mode vs cycle stealing with timing diagrams and calculate the CPU slowdown for a given configuration: disk at 10 MB/s, 32-bit bus, 200 MHz clock. (15 marks)

Block Diagram: DMA controller contains AR, WC, DR, Control Logic. Connected to CPU via BR/BG lines, to system bus (Address + Data + Control), and to device via DRQ/DACK. See Section C.4 for complete diagram.

Transfer Sequence: (1) CPU loads AR, WC, direction into DMA registers. (2) Device raises DRQ when data ready. (3) DMA raises BR (Bus Request) to CPU. (4) CPU completes current cycle, sends BG (Bus Grant). (5) DMA places AR on address bus, transfers data between device and memory. (6) AR incremented, WC decremented. (7) Repeat until WC=0. (8) DMA sends interrupt to CPU — transfer complete.

Burst vs Cycle Stealing: Burst holds bus for all N transfers continuously. Cycle stealing takes bus for 1 transfer, returns, repeats. Burst is faster total but blocks CPU. Cycle stealing has better CPU utilization.

Calculation: Disk = 10 MB/s = 10×10⁶ bytes/s. Bus width = 32 bits = 4 bytes. Words/sec = 10M/4 = 2.5M. CPU clock = 200 MHz = 200M cycles/s. If each DMA transfer = 1 bus cycle, cycles stolen = 2.5M/s. Slowdown = 2.5M/200M = 1.25%.

LA2

Compare and contrast the three methods of data transfer: Programmed I/O, Interrupt-driven I/O, and DMA. For each method, draw the data flow path, explain the algorithm, discuss advantages and disadvantages, and provide real-world examples. Include a comprehensive comparison table. (15 marks)

Programmed I/O: Data path: Device → CPU → Memory. Algorithm: CPU reads status in loop → reads data → stores to memory. Advantage: Simple hardware. Disadvantage: CPU stuck in polling loop. Example: Old dot-matrix printer, simple embedded keypad.

Interrupt-Driven I/O: Data path: Device → CPU (via ISR) → Memory. Algorithm: Device raises interrupt → CPU saves context → executes ISR → reads data → stores to memory → restores context. Advantage: CPU free between interrupts. Disadvantage: Context switch overhead. Example: Mouse, UART serial port, keyboard.

DMA: Data path: Device → Memory (bypasses CPU). Algorithm: CPU initializes DMA → DMA requests bus → transfers data → interrupts CPU on completion. Advantage: Highest throughput, minimal CPU load. Disadvantage: Complex hardware, DMA controller cost. Example: Disk, SSD, network card, video capture.

See comparison table in Section C.3 for detailed feature-by-feature comparison.

LA3

Explain the priority interrupt system in detail. Describe (a) Daisy Chain priority with diagram, (b) Parallel priority using priority encoder, (c) Software polling method. Compare all three methods and design a priority interrupt system for a computer with 6 peripheral devices. (15 marks)

(a) Daisy Chain: Devices connected in series via PI/PO lines. Common INT line (wired-OR) goes to CPU. When CPU sends acknowledge, it passes through chain. First device with pending interrupt captures it, blocks propagation. Priority = position. Diagram shows CPU → Dev0(PI→PO) → Dev1(PI→PO) → Dev2 → ... with common INT line.

(b) Parallel Priority: All 6 devices connect to a priority encoder. Encoder outputs 3-bit vector of highest-priority active device. Mask register enables/disables individual interrupts. Interrupt status register stores pending interrupts. Resolution time = 1 clock cycle. Hardware: 8-to-3 encoder, 8-bit mask register, AND gates, comparator.

(c) Software Polling: CPU sequentially reads each device's status register. First device found with interrupt flag set is serviced. Priority = order of polling. No extra hardware. Slowest method.

Design for 6 devices: Use parallel priority encoder (8-input, 6 used). Assign priorities based on urgency: Timer(0) > Network(1) > Disk(2) > Serial(3) > Keyboard(4) > LED Display(5). Mask register allows runtime priority changes.
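Method (c), software polling, is the easiest of the three to express in code. The sketch below reuses the six device names from the design above as the polling order; the status flags are modelled as a plain dictionary, which is an assumption standing in for real status registers.

```python
POLLING_ORDER = ["Timer", "Network", "Disk", "Serial", "Keyboard", "LED Display"]


def poll_for_interrupt(status_flags, order=POLLING_ORDER):
    """Read each device's interrupt flag in priority order.

    status_flags maps a device name to True if its interrupt flag is set.
    Returns (device, reads): the first flagged device (or None) and how
    many status reads were needed to find it.
    """
    reads = 0
    for device in order:
        reads += 1
        if status_flags.get(device, False):
            return device, reads
    return None, reads


if __name__ == "__main__":
    flags = {"Disk": True, "Keyboard": True}
    print(poll_for_interrupt(flags))   # ('Disk', 3)
```

The number of status reads grows with the position of the requesting device in the polling order, which is why this method is the slowest of the three.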

Section I

Industry Spotlight — A Day in the Life

👩‍💻 Meera Nair, 27 — Embedded Systems Engineer at Texas Instruments, Bangalore

Background: B.Tech ECE from NIT Calicut. Fascinated by microcontrollers in 3rd year. Built an IoT weather station using Arduino and ESP32 as her final-year project. Got placed at TI through campus recruitment after acing the hardware design round.

A Typical Day:

8:30 AM — Morning stand-up with the MSP430 microcontroller team. Review yesterday's Verilog simulation results for a new DMA controller IP block.

9:30 AM — Write RTL (Register Transfer Level) code in Verilog for a priority interrupt controller that supports 16 interrupt sources with programmable priority and nested interrupts.

11:00 AM — Run gate-level simulations. Check timing analysis — does the DMA controller meet the 100 MHz clock target? Fix a setup time violation on the address register path.

12:30 PM — Lunch at TI's campus cafeteria in Outer Ring Road, Bangalore. Chat about upcoming tapeout deadline.

1:30 PM — Debug a UART module that's dropping characters at 921600 baud. Root cause: FIFO buffer overflow. Fix: increase FIFO depth from 16 to 64 bytes.

3:30 PM — Code review with senior architect. Discuss trade-offs between cycle-stealing DMA and burst DMA for the new industrial sensor interface.

5:00 PM — Write test vectors for the I/O subsystem. Simulate multiple devices raising interrupts simultaneously — verify priority encoding works correctly.

6:00 PM — Learning hour: study the AXI protocol (Advanced eXtensible Interface, the on-chip interconnect from ARM's AMBA family used in modern SoCs).

Detail	Info
Tools Used Daily	Verilog/SystemVerilog, Synopsys VCS, Cadence Genus, Logic Analyzers, Oscilloscopes
Entry Salary (2025)	₹8–12 LPA + benefits (TI is among the best-paying for fresh grads)
Mid-Level (3–5 yrs)	₹15–25 LPA
Senior (7+ yrs)	₹30–50 LPA
Key Skills	Verilog, VHDL, I/O architecture, DMA design, interrupt handling, UART/SPI/I2C protocols
Companies Hiring	Texas Instruments, Qualcomm, Intel, Samsung, MediaTek, ISRO, DRDO, Analog Devices, Microchip, NXP

Meera's advice to students: "I/O Organization was the most boring chapter in my textbook. But when I actually designed a DMA controller in my first week at TI, I realised it's the most practical chapter in all of COA. Every chip we design has I/O ports, interrupts, and DMA. Learn the concepts, then build something — even a blinking LED with interrupts on Arduino counts."

Section J

Earn With It — IoT & Embedded Projects

💰 Your Earning Path After This Chapter

Portfolio Pieces You Can Build:

• UART Communication Logger — Arduino reads sensors, sends data via UART to a PC dashboard — ₹2,000–₹5,000 freelance projects

• DMA Transfer Simulator — Python/C visualization of DMA with performance graphs — Portfolio showcase for TI/Qualcomm interviews

• Priority Interrupt Controller — Verilog project on GitHub — ₹6–12 LPA job applications in VLSI design

Project Idea	Skills Needed	Earning Potential
IoT Sensor Station	Arduino + UART + I2C + WiFi	₹3,000–₹10,000/project on Freelancer.in
Home Automation System	ESP32 + GPIO interrupts + relay control	₹5,000–₹15,000/installation
Industrial Data Logger	RS-485 (UART variant) + SD card (DMA) + sensors	₹10,000–₹30,000/project for factories
Verilog IP Blocks on OpenCores	Verilog/VHDL + I/O controller design	Resume gold for ₹8–12 LPA VLSI jobs
PCB Design with I/O	KiCad + UART/SPI/I2C headers + DMA-capable MCU	₹5,000–₹20,000/board design on Fiverr

India's IoT market is expected to reach $15 billion by 2027 (NASSCOM). Nearly every IoT device relies on some combination of UART, DMA, and interrupts. If you can build a working sensor-to-cloud pipeline using an ESP32 and demonstrate it on YouTube/LinkedIn, you become a strong candidate for IoT roles at companies such as Bosch India, Honeywell, ABB, or L&T Technology Services.

⏱️ Time to First Earning: 3–4 weeks (build an IoT project + create a portfolio on GitHub + apply on Internshala for embedded systems internships)

Section K

Chapter Summary

📋 Key Takeaways — Unit 5: I/O Organization

Peripheral Devices communicate with the CPU through I/O interfaces that handle speed matching, format conversion, and device selection.
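I/O Interface sits between the CPU bus and the device. Two addressing schemes: Memory-Mapped I/O (devices share the memory address space, so any memory-reference instruction can access them) and Isolated I/O (devices get a separate address space, reached only through dedicated IN/OUT instructions).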
Three Data Transfer Modes:
- Programmed I/O — CPU polls device status in a loop (simplest, least efficient)
- Interrupt-Driven I/O — Device signals CPU when ready (better efficiency)
- DMA — Hardware controller transfers data directly to memory (best for high-speed devices)
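DMA Controller has four registers: Address Register (AR), Word Count (WC), Data Register (DR), and Control. It supports burst mode (fastest transfer, CPU blocked for the whole block), cycle stealing (one word per stolen bus cycle, CPU slightly slowed), and transparent mode (uses only the cycles in which the CPU leaves the bus idle).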
I/O Processor (IOP) is a dedicated processor for complex I/O operations. Types: Multiplexer Channel, Selector Channel, Block Multiplexer Channel.
Priority Interrupts resolve simultaneous device requests. Methods: Daisy Chain (hardware, simple, fixed priority), Parallel Priority (fast, encoder-based), Software Poll (flexible, slow).
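Asynchronous Transfer uses handshaking for devices without a common clock: the sender asserts a strobe (data valid) and the receiver answers with an acknowledge (ACK) before the next word is sent.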
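UART transmits serial data as a frame: 1 start bit, 5–8 data bits (usually 8, sent LSB first), an optional parity bit, and 1 or 2 stop bits. Common config: 9600 baud, 8N1 (8 data bits, no parity, 1 stop bit), which puts 10 bits on the wire for every byte.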
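
The UART framing in the last takeaway can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than a driver: the function names `uart_frame` and `bytes_per_second` are hypothetical, the data width is fixed at 8 bits, and the line is assumed to idle high, so the start bit is 0 and the stop bits are 1.

```python
def uart_frame(byte, parity=None, stop_bits=1):
    """Return the bits of one UART character in transmission order.

    Order on the wire: start bit (0), 8 data bits LSB first,
    optional parity bit, then stop bit(s) (1).
    parity may be None, "even", or "odd".
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    data = [(byte >> i) & 1 for i in range(8)]  # LSB first
    bits = [0] + data                           # start bit, then data
    ones = sum(data)
    if parity == "even":
        bits.append(ones % 2)        # makes the total count of 1s even
    elif parity == "odd":
        bits.append(1 - ones % 2)    # makes the total count of 1s odd
    elif parity is not None:
        raise ValueError("parity must be None, 'even', or 'odd'")
    bits += [1] * stop_bits                     # stop bit(s)
    return bits


def bytes_per_second(baud, frame_bits):
    """Payload throughput when frames are sent back to back."""
    return baud / frame_bits


frame_8n1 = uart_frame(0x41)  # ASCII 'A' in 8N1
print(frame_8n1)                                # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(bytes_per_second(9600, len(frame_8n1)))   # 960.0
```

At 9600 baud, an 8N1 frame carries 8 payload bits out of 10, so the best-case throughput is 960 bytes per second. Adding a parity bit stretches the frame to 11 bits and lowers that figure accordingly.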

Everything in this chapter powers India's digital infrastructure: Aadhaar biometric scanners (DMA), UPI payment terminals (UART + interrupts), ISRO satellites (priority interrupts), Vande Bharat train control systems (real-time I/O), and the hundreds of thousands of ATMs across India (DMA + interrupt-driven I/O).

Section L

Earning Checkpoint

Skill Learned	Tool / Method	Portfolio Artifact	Can You Earn?
I/O Addressing (Mem-mapped vs Isolated)	Conceptual + ARM/x86 comparison	—	✅ Yes — interview topic for embedded jobs
Data Transfer Modes	Conceptual comparison	Comparison chart / notes	✅ Yes — GATE + interview essential
DMA Architecture	Block diagrams + numericals	DMA Simulator (C/Python)	✅ Yes — portfolio for TI/Qualcomm roles
UART Protocol	Python simulation + Arduino	UART Frame Generator program	✅ Yes — IoT freelance projects
Priority Interrupts	Verilog / Logisim	Interrupt Controller on GitHub	✅ Yes — VLSI design jobs ₹8–12 LPA
Handshaking Protocol	Timing diagrams	—	✅ Yes — embedded systems interviews
IOP / Channel Architecture	Conceptual (mainframe focus)	—	⬜ Niche — relevant for IBM/HCL mainframe roles
IoT Integration	Arduino/ESP32 + sensors + UART	IoT Weather Station project	✅ Yes — ₹3,000–₹15,000/project
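
As a possible starting point for the DMA Simulator artifact listed in the table, here is a minimal Python sketch of a DMA controller's AR and WC registers together with a bus timeline for burst mode and cycle stealing. It is a simplified model under loud assumptions: the names `DMAController` and `simulate` are hypothetical, each transfer takes exactly one bus cycle, burst mode assumes the device already has every word buffered, and cycle stealing assumes the device delivers one word every `device_period` bus cycles.

```python
class DMAController:
    """Toy register-level model of a DMA controller writing to memory."""

    def __init__(self, memory):
        self.memory = memory
        self.ar = 0        # Address Register: next memory address to write
        self.wc = 0        # Word Count: words still to transfer
        self.dr = None     # Data Register: word currently being moved
        self.done = False  # stands in for the end-of-transfer interrupt

    def program(self, start_address, word_count):
        """What the CPU does before handing the transfer over to DMA."""
        if word_count <= 0 or start_address < 0:
            raise ValueError("invalid start address or word count")
        if start_address + word_count > len(self.memory):
            raise ValueError("transfer would run past the end of memory")
        self.ar = start_address
        self.wc = word_count
        self.done = False

    def transfer_word(self, word):
        """One DMA bus cycle: store a word, advance AR, decrement WC."""
        if self.wc == 0:
            raise RuntimeError("controller is not programmed")
        self.dr = word
        self.memory[self.ar] = self.dr
        self.ar += 1
        self.wc -= 1
        if self.wc == 0:
            self.done = True  # real hardware would interrupt the CPU here


def simulate(words, start_address, memory_size, mode, device_period=4):
    """Run one block transfer; return (memory, controller, bus timeline).

    The timeline lists who owns the bus in each cycle: "CPU" or "DMA".
    """
    if mode not in ("burst", "cycle_stealing"):
        raise ValueError("mode must be 'burst' or 'cycle_stealing'")
    memory = [0] * memory_size
    dma = DMAController(memory)
    dma.program(start_address, len(words))
    timeline = []
    for word in words:
        if mode == "cycle_stealing":
            # CPU keeps the bus while the device prepares the next word
            timeline += ["CPU"] * (device_period - 1)
        dma.transfer_word(word)
        timeline.append("DMA")
    return memory, dma, timeline


memory, dma, timeline = simulate([10, 20, 30], 2, 8, "cycle_stealing")
print(memory)                                   # [0, 0, 10, 20, 30, 0, 0, 0]
print(timeline.count("DMA") / len(timeline))    # 0.25
```

In this model the burst timeline is three consecutive DMA cycles with the CPU locked out, while cycle stealing spreads the same three transfers across twelve cycles and takes only a quarter of them from the CPU. Plotting the timelines for different `device_period` values is one way to produce the performance graphs mentioned for the portfolio piece.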

Minimum Viable Earning Setup after this chapter: An Arduino/ESP32 IoT project on GitHub + a UART simulator in Python + an Internshala profile listing "Embedded Systems / IoT" = you can earn ₹3,000–₹10,000/month from IoT freelance projects while still in college. For VLSI roles, add a Verilog interrupt controller and apply to TI, Qualcomm, Intel campus placements.

✅ Unit 5 complete. You've mastered I/O Organization!

[QR: Link to EduArtha video tutorial — I/O Organization]