Phase 2 • EduArtha

Programming & Software Engineering

Python is the dominant language for AI work, but knowing the syntax is not enough: you must also understand how to write efficient, scalable code. This book covers Python mastery, scientific computing, software engineering practices, and hardware fundamentals.

⏱ 2–4 months  |  13 Chapters  |  50+ Exercises

Part I

Python Mastery

Core language skills every AI engineer needs

Chapter 1

Data Structures & Algorithms

Learning Objectives

  • Choose the right data structure for each problem (lists, dicts, sets, tuples)
  • Implement stacks, queues, and linked lists
  • Understand Big-O notation and analyze algorithm complexity
  • Implement binary search, merge sort, and quicksort

Built-in Data Structures

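Structure | Ordered | Mutable | Duplicates | Lookup | Best For
List      | ✓       | ✓       | ✓          | O(n)   | Ordered collections
Tuple     | ✓       | ✗       | ✓          | O(n)   | Immutable records
Set       | ✗       | ✓       | ✗          | O(1)   | Membership testing
Dict      | ✓*      | ✓       | Keys: ✗    | O(1)   | Key-value mapping

* Dicts preserve insertion order (guaranteed since Python 3.7). "Lookup" means finding a value (list, tuple, set) or a key (dict); indexing a list or tuple by position is O(1).
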
Python
# Performance comparison: why choosing the right structure matters
import time

data_list = list(range(1_000_000))
data_set = set(data_list)

# Searching for an element (worst case for the list: the last item)
target = 999_999

start = time.perf_counter()
_ = target in data_list   # O(n): scans elements until it finds a match
print(f"List: {time.perf_counter()-start:.6f}s")

start = time.perf_counter()
_ = target in data_set    # O(1) on average: hash lookup
print(f"Set:  {time.perf_counter()-start:.6f}s")
# The set lookup is typically orders of magnitude faster at this size;
# the exact ratio depends on the machine and on where the target sits.

Big-O Notation

Notation   | Name         | Example       | Operations for 1M items
O(1)       | Constant     | Dict lookup   | 1
O(log n)   | Logarithmic  | Binary search | ~20
O(n)       | Linear       | List scan     | 1M
O(n log n) | Linearithmic | Merge sort    | ~20M
O(n²)      | Quadratic    | Bubble sort   | 1T
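
The operation counts in the table can be reproduced with a few lines of arithmetic. This is a rough sketch: it treats one "operation" as one basic step and ignores constant factors, so the numbers are orders of magnitude rather than measurements.

```python
import math

n = 1_000_000

# Approximate step counts for each growth class at n = 1,000,000.
# Constant factors are ignored, so these are orders of magnitude only.
growth = {
    "O(1)": 1,
    "O(log n)": math.ceil(math.log2(n)),
    "O(n)": n,
    "O(n log n)": n * math.ceil(math.log2(n)),
    "O(n^2)": n ** 2,
}

for name, steps in growth.items():
    print(f"{name:<11} {steps:>20,}")
```

The jump from the O(n log n) row to the O(n^2) row is the one that usually decides whether an algorithm is usable at this scale.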

Searching & Sorting

Python
# Binary search: O(log n), requires a sorted list
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# Quicksort: O(n log n) on average, O(n^2) in the worst case
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([38, 27, 43, 3, 9, 82, 10]))

# Stack implementation
class Stack:
    def __init__(self): self.items = []
    def push(self, item): self.items.append(item)
    def pop(self): return self.items.pop()
    def peek(self): return self.items[-1]
    def is_empty(self): return len(self.items) == 0

Project: Task Scheduler with Priority Queue

Python
import heapq

class TaskScheduler:
    def __init__(self):
        self.heap = []
        self.counter = 0

    def add_task(self, task, priority):
        heapq.heappush(self.heap, (priority, self.counter, task))
        self.counter += 1

    def get_next(self):
        if self.heap:
            priority, _, task = heapq.heappop(self.heap)
            return task
        return None

scheduler = TaskScheduler()
scheduler.add_task("Fix critical bug", 1)
scheduler.add_task("Write docs", 5)
scheduler.add_task("Deploy to prod", 2)
scheduler.add_task("Code review", 3)

while (task := scheduler.get_next()):
    print(f"Executing: {task}")

Exercises

Exercise 1.1: Implement merge sort and explain its time complexity
def merge_sort(arr):
    if len(arr) <= 1: return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:]); result.extend(right[j:])
    return result

Time: O(n log n) always. Space: O(n). Divides array in half each time (log n levels), merges n elements at each level.
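
The "log n levels, n elements per level" argument can be checked empirically. The sketch below is an instrumented variant of the solution above (the name merge_sort_counted and the stats dictionary are illustrative additions): it records the deepest recursion level and the total number of elements written by merges.

```python
def merge_sort_counted(arr, depth=0, stats=None):
    """Merge sort that also records recursion depth and merge work."""
    if stats is None:
        stats = {"max_depth": 0, "merged": 0}
    stats["max_depth"] = max(stats["max_depth"], depth)
    if len(arr) <= 1:
        return arr, stats
    mid = len(arr) // 2
    left, _ = merge_sort_counted(arr[:mid], depth + 1, stats)
    right, _ = merge_sort_counted(arr[mid:], depth + 1, stats)

    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    stats["merged"] += len(merged)   # elements written by this merge
    return merged, stats

data = list(range(1024, 0, -1))          # 1024 items in reverse order
result, stats = merge_sort_counted(data)
print(stats["max_depth"])                # 10, i.e. log2(1024) levels of halving
print(stats["merged"])                   # 10240, i.e. 1024 elements on each of 10 levels
```

For n = 1024 the total merge work is n * log2(n) = 10240, which matches the O(n log n) bound.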

Exercise 1.2: When would you use a dict over a list?

Use dict when you need fast O(1) key-based lookup, counting occurrences, or mapping relationships. Use list when you need ordered elements, indexed access, or iteration in sequence. Example: counting word frequencies → dict. Storing sorted scores → list.
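
Both examples from the answer can be sketched with the standard library. Counter is a dict subclass, and bisect.insort is one possible way to keep a list sorted on insert; the sample text and scores are made up for illustration.

```python
from collections import Counter
import bisect

text = "the quick brown fox jumps over the lazy dog the end"

# Counting occurrences: a dict-like Counter maps each word to its count.
freq = Counter(text.split())
print(freq["the"])        # 3

# Sorted scores: a list keeps order, and bisect inserts in sorted position.
scores = [55, 68, 72, 85]
bisect.insort(scores, 70)
print(scores)             # [55, 68, 70, 72, 85]
print(scores[-1])         # 85, the highest score via indexed access
```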

Exercise 1.3: Implement a queue using two stacks
class QueueFromStacks:
    def __init__(self):
        self.in_stack = []
        self.out_stack = []
    def enqueue(self, item):
        self.in_stack.append(item)
    def dequeue(self):
        if not self.out_stack:
            while self.in_stack:
                self.out_stack.append(self.in_stack.pop())
        return self.out_stack.pop()
Exercise 1.4: What is the time complexity of checking if an element exists in a list vs a set?

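List: O(n), because it must scan linearly. Set: O(1) on average, because it uses a hash table (elements must be hashable). For 1M elements, a list needs up to ~1M comparisons in the worst case, while a set typically needs about 1. Prefer sets when you test membership repeatedly on a large collection; for a handful of items or a single lookup, building the set can cost more than it saves.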
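
Instead of timing, the difference can be shown by counting equality checks. The Counted class below is a hypothetical helper written for this sketch; the counts in the comments describe CPython's behaviour and may differ on other interpreters.

```python
class Counted:
    """Wraps a value and counts how many equality checks are performed."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.eq_calls += 1
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)

n = 10_000
items = [Counted(i) for i in range(n)]
item_set = set(items)
target = Counted(n - 1)          # equal to the last element, but a new object

Counted.eq_calls = 0
found_in_list = target in items
list_checks = Counted.eq_calls   # one check per element scanned

Counted.eq_calls = 0
found_in_set = target in item_set
set_checks = Counted.eq_calls    # only entries with a matching hash are compared

print(found_in_list, list_checks)
print(found_in_set, set_checks)
```

The list performs one comparison per element until it reaches the match, while the set jumps to the right bucket via the hash and compares only there.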

Chapter Summary

  • Choose data structures by access pattern: O(1) lookup → dict/set, ordered → list
  • Binary search (O(log n)) requires sorted data; merge sort is O(n log n) in every case, quicksort on average (O(n²) in the worst case)
  • Big-O describes an upper bound on how running time grows with input size, which is crucial for scalable code
  • Stacks (LIFO) and queues (FIFO) solve specific ordering problems
Chapter 2

Object-Oriented Programming

Learning Objectives

  • Design classes with encapsulation, inheritance, and polymorphism
  • Implement magic/dunder methods for Pythonic objects
  • Apply abstract classes and design patterns
  • Build a complete OOP project

Classes & Objects

Python
class NeuralLayer:
    def __init__(self, input_size, output_size, activation='relu'):
        self.weights = [[0.0] * input_size for _ in range(output_size)]
        self.bias = [0.0] * output_size
        self.activation = activation
        self._name = f"Layer({input_size}→{output_size})"  # private by convention (leading underscore)

    def __repr__(self):
        return f"NeuralLayer({self._name}, act={self.activation})"

    def __len__(self):
        return len(self.weights)

    def param_count(self):
        return len(self.weights) * len(self.weights[0]) + len(self.bias)

layer = NeuralLayer(128, 64)
print(layer)              # NeuralLayer(Layer(128→64), act=relu)
print(len(layer))         # 64
print(layer.param_count()) # 8256

Inheritance & Polymorphism

Python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self): pass

    @abstractmethod
    def perimeter(self): pass

    def describe(self):
        return f"{self.__class__.__name__}: area={self.area():.2f}"

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius
    def area(self): return 3.14159 * self.radius ** 2
    def perimeter(self): return 2 * 3.14159 * self.radius

class Rectangle(Shape):
    def __init__(self, w, h):
        self.w, self.h = w, h
    def area(self): return self.w * self.h
    def perimeter(self): return 2 * (self.w + self.h)

# Polymorphism: same interface, different behavior
shapes = [Circle(5), Rectangle(4, 6), Circle(3)]
for s in shapes:
    print(s.describe())

Design Patterns

Python
# Singleton: every call returns the same instance
class DatabaseConnection:
    _instance = None
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

# Placeholder model classes so the factory below runs (illustrative only)
class LinearModel: pass
class TreeModel: pass

# Factory: create objects without naming the exact class at the call site
class ModelFactory:
    @staticmethod
    def create(model_type):
        models = {'linear': LinearModel, 'tree': TreeModel}
        return models[model_type]()

Project: Library Management System

Python
from datetime import datetime, timedelta

class Book:
    def __init__(self, title, author, isbn):
        self.title, self.author, self.isbn = title, author, isbn
        self.is_available = True
    def __str__(self): return f"'{self.title}' by {self.author}"

class Member:
    def __init__(self, name, member_id):
        self.name, self.id = name, member_id
        self.borrowed = []

class Library:
    def __init__(self, name):
        self.name = name
        self.books, self.members, self.loans = {}, {}, []

    def add_book(self, book):
        self.books[book.isbn] = book

    def borrow(self, isbn, member_id):
        book = self.books.get(isbn)
        member = self.members.get(member_id)
        if book and member and book.is_available:
            book.is_available = False
            due = datetime.now() + timedelta(days=14)
            self.loans.append({'book': book, 'member': member, 'due': due})
            member.borrowed.append(book)
            print(f"✓ {member.name} borrowed {book}. Due: {due:%Y-%m-%d}")

    def return_book(self, isbn):
        book = self.books.get(isbn)
        if book and not book.is_available:
            # Close the loan record and update the member's borrowed list
            for loan in [l for l in self.loans if l['book'] is book]:
                loan['member'].borrowed.remove(book)
                self.loans.remove(loan)
            book.is_available = True
            print(f"✓ {book} returned")

lib = Library("City Library")
lib.add_book(Book("Deep Learning", "Goodfellow", "978-0"))
lib.members["M1"] = Member("Alice", "M1")
lib.borrow("978-0", "M1")

Exercises

Exercise 2.1: Implement __add__ and __eq__ for a Vector2D class
class Vector2D:
    def __init__(self, x, y): self.x, self.y = x, y
    def __add__(self, other): return Vector2D(self.x+other.x, self.y+other.y)
    def __eq__(self, other): return self.x==other.x and self.y==other.y
    def __repr__(self): return f"Vec({self.x},{self.y})"

v = Vector2D(1,2) + Vector2D(3,4)  # Vec(4,6)
Exercise 2.2: What is the difference between @staticmethod and @classmethod?

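@staticmethod: No access to the class or instance unless you pass one in explicitly; it is just a function namespaced inside the class. Its signature has no implicit first argument: def method(): rather than def method(self):.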

@classmethod: Receives the class as first arg (cls). Can access/modify class state. Used for alternative constructors like Date.from_string("2024-01-15").
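
A minimal sketch of both decorators on one class. Date.from_string comes from the answer above; the is_leap_year helper and the Deadline subclass are illustrative additions that show why receiving cls matters.

```python
class Date:
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @classmethod
    def from_string(cls, text):
        """Alternative constructor: cls is the class the method was called on."""
        year, month, day = map(int, text.split("-"))
        return cls(year, month, day)

    @staticmethod
    def is_leap_year(year):
        """Plain helper: receives neither the instance nor the class."""
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def __repr__(self):
        return f"{type(self).__name__}({self.year}, {self.month}, {self.day})"


class Deadline(Date):
    pass

print(Date.from_string("2024-01-15"))      # Date(2024, 1, 15)
print(Deadline.from_string("2024-01-15"))  # Deadline(2024, 1, 15)
print(Date.is_leap_year(2024))             # True
```

Because from_string builds the object with cls(...) rather than Date(...), subclasses inherit a constructor that returns their own type.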

Exercise 2.3: Why is composition often preferred over inheritance?

"Favor composition over inheritance": instead of making Car a subclass of Engine (a car IS an engine? No!), give Car an Engine attribute (a car HAS an engine). Composition is more flexible: you can swap components at runtime, avoid deep inheritance hierarchies, and keep each class focused on a single responsibility.

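The has-a relationship from Exercise 2.3 might look like the sketch below. The class names PetrolEngine and ElectricMotor are hypothetical; the point is only that Car delegates to whatever engine object it holds.

```python
class PetrolEngine:
    def start(self):
        return "petrol engine started"

class ElectricMotor:
    def start(self):
        return "electric motor started"

class Car:
    """A car has an engine; it does not inherit from one."""
    def __init__(self, engine):
        self.engine = engine

    def start(self):
        return f"Car: {self.engine.start()}"

car = Car(PetrolEngine())
print(car.start())            # Car: petrol engine started
car.engine = ElectricMotor()  # swap the component at runtime
print(car.start())            # Car: electric motor started
```
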
Chapter Summary

  • Classes encapsulate data + behavior; __init__ initializes state
  • Inheritance enables code reuse; ABC enforces interfaces
  • Dunder methods (__str__, __add__, __len__) make objects behave like built-ins
  • Design patterns (Singleton, Factory) solve recurring OOP problems
Chapter 3

Functional Programming Concepts

Learning Objectives

  • Use higher-order functions: map, filter, reduce, and lambdas
  • Build closures and understand their practical applications
  • Leverage functools for memoization and partial application
  • Write clean data processing pipelines in functional style

First-Class Functions & Lambdas

Python
# Functions are objects: they can be passed, returned, and stored
def apply_twice(func, value):
    return func(func(value))

print(apply_twice(lambda x: x * 2, 3))  # 12
print(apply_twice(lambda x: x + 10, 5)) # 25

# map, filter, reduce
nums = [1, 2, 3, 4, 5, 6, 7, 8]
squares = list(map(lambda x: x**2, nums))         # [1,4,9,16,25,36,49,64]
evens = list(filter(lambda x: x%2==0, nums))     # [2,4,6,8]

from functools import reduce
total = reduce(lambda a, b: a + b, nums)           # 36

# List comprehension alternative (more Pythonic)
squares = [x**2 for x in nums]
evens = [x for x in nums if x % 2 == 0]

Closures & functools

Python
# Closure: a function that remembers its enclosing scope
def make_multiplier(factor):
    def multiply(x):
        return x * factor  # 'factor' is captured from outer scope
    return multiply

double = make_multiplier(2)
triple = make_multiplier(3)
print(double(5))  # 10
print(triple(5))  # 15

# Memoization with lru_cache
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    if n < 2: return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(100))  # Instant with the cache; the naive recursion would need ~10^21 calls

# Partial application
from functools import partial
import json

pretty_json = partial(json.dumps, indent=2, sort_keys=True)
print(pretty_json({"name": "AI", "type": "ML"}))

Project: Functional Data Pipeline

Python
from functools import reduce

# Pipeline: compose functions left-to-right
def pipeline(*funcs):
    return lambda x: reduce(lambda v, f: f(v), funcs, x)

# Data processing steps
clean = lambda data: [s.strip().lower() for s in data]
remove_empty = lambda data: [s for s in data if s]
remove_dupes = lambda data: list(dict.fromkeys(data))
sort_alpha = lambda data: sorted(data)

process = pipeline(clean, remove_empty, remove_dupes, sort_alpha)

raw = ["  Python ", "  java", "", "PYTHON", "  Go  ", "java"]
result = process(raw)
print(result)  # ['go', 'java', 'python']

Exercises

Exercise 3.1: Rewrite this list comprehension using map and filter: [x**2 for x in range(20) if x % 3 == 0]
result = list(map(lambda x: x**2, filter(lambda x: x%3==0, range(20))))
# [0, 9, 36, 81, 144, 225, 324]
# List comprehension is more Pythonic, but map/filter is more composable
Exercise 3.2: Implement a memoized factorial using lru_cache
from functools import lru_cache

@lru_cache(maxsize=None)
def factorial(n):
    if n <= 1: return 1
    return n * factorial(n - 1)
print(factorial(100))  # Computed instantly with caching
Exercise 3.3: What is the advantage of closures over global variables?

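Closures encapsulate state without polluting the global scope. Each call to the outer function creates a fresh enclosing scope, so every closure has its own captured variables (captured by reference, not copied). That makes closures easier to test (no hidden dependencies) and to compose. They are not automatically thread-safe: a closure that mutates captured state from several threads still needs a lock. Global variables, by contrast, create hidden coupling and make debugging harder.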
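
A small sketch of closure-held state. Each call to make_counter (a name chosen for this example) creates a separate enclosing scope, so the two counters below do not interfere with each other and nothing is stored at module level.

```python
def make_counter():
    count = 0                 # state lives in the enclosing scope
    def increment():
        nonlocal count        # rebind the captured variable, not a global
        count += 1
        return count
    return increment

counter_a = make_counter()
counter_b = make_counter()

print(counter_a())  # 1
print(counter_a())  # 2
print(counter_b())  # 1 (independent state)
```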

Chapter Summary

  • Functions are first-class objects: pass, return, and store them
  • map/filter/reduce enable declarative data transformation
  • Closures capture their enclosing scope, which makes them useful for factories and callbacks
  • lru_cache provides free memoization for recursive functions
Chapter 4

Memory Management & Generators

Learning Objectives

  • Understand Python's memory model: reference counting and garbage collection
  • Build generators with yield for memory-efficient data processing
  • Use itertools for powerful lazy iteration patterns
  • Optimize memory with __slots__ and generator expressions

Generators: Lazy Evaluation

Python
import sys

# List vs generator: memory comparison
big_list = [x**2 for x in range(1_000_000)]
big_gen = (x**2 for x in range(1_000_000))

# getsizeof counts only the container itself, not the int objects it refers to
print(f"List:      {sys.getsizeof(big_list):>10,} bytes")  # roughly 8 MB
print(f"Generator: {sys.getsizeof(big_gen):>10,} bytes")   # a few hundred bytes at most

# Custom generator with yield
def read_large_file(filepath):
    with open(filepath) as f:
        for line in f:
            yield line.strip()  # One line at a time, not whole file

# itertools: lazy iteration toolkit
from itertools import chain, islice, count, cycle

# Chain multiple iterables
combined = chain([1,2], [3,4], [5,6])  # 1,2,3,4,5,6

# Take first N from infinite generator
first_10_evens = list(islice((x for x in count() if x%2==0), 10))

# __slots__: reduce memory per instance
class PointSlots:
    __slots__ = ['x', 'y']
    def __init__(self, x, y): self.x, self.y = x, y
# Often cited as ~40% less memory than a regular class with __dict__;
# the actual saving depends on the Python version and attribute count

Project: Process 1M-Row CSV with Generators

Python
import csv

def read_csv_rows(filepath):
    with open(filepath, newline='') as f:  # newline='' is what the csv module recommends
        reader = csv.DictReader(f)
        for row in reader:
            yield row

def filter_active(rows):
    for row in rows:
        if row.get('status') == 'active':
            yield row

def extract_emails(rows):
    for row in rows:
        yield row['email']

# Pipeline: processes 1M rows using constant memory
# rows = read_csv_rows('users_1M.csv')
# active = filter_active(rows)
# emails = extract_emails(active)
# for email in emails:
#     send_newsletter(email)

Exercises

Exercise 4.1: Write a generator that yields Fibonacci numbers infinitely
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

from itertools import islice
print(list(islice(fibonacci(), 10)))  # [0,1,1,2,3,5,8,13,21,34]
Exercise 4.2: When would you use __slots__?

Use __slots__ when creating millions of instances of the same class (e.g., nodes in a graph, particles in simulation). It eliminates per-instance __dict__, saving ~40% memory. Don't use it for classes with few instances or when you need dynamic attributes.
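
The trade-off can be seen directly: a slotted instance has no per-instance __dict__, which is where the memory saving comes from, and for the same reason it rejects attributes that were not declared. PointDict is a hypothetical counterpart added for comparison.

```python
class PointDict:
    def __init__(self, x, y):
        self.x, self.y = x, y

class PointSlots:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x, self.y = x, y

p, q = PointDict(1, 2), PointSlots(1, 2)

print(hasattr(p, "__dict__"))  # True
print(hasattr(q, "__dict__"))  # False

p.z = 3                        # fine: stored in the instance __dict__
try:
    q.z = 3                    # no __dict__, and 'z' is not a declared slot
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```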

Exercise 4.3: What is the difference between yield and return?

return exits the function permanently and returns a value. yield pauses the function, hands back a value, and remembers its state: the next call to next() resumes from where it paused. yield turns a function into a generator that produces values lazily, one at a time.
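
The pause-and-resume behaviour is easiest to see with print statements between the yields. The two functions below are illustrative only.

```python
def with_return():
    return 1
    return 2          # never reached: the first return ends the call

def with_yield():
    print("started")
    yield 1           # pause here and hand back 1
    print("resumed")
    yield 2           # pause again and hand back 2

print(with_return())  # 1

gen = with_yield()    # nothing runs yet; this only creates the generator
print(next(gen))      # prints "started", then 1
print(next(gen))      # prints "resumed", then 2
```

A third next(gen) would raise StopIteration, which is how for loops know the generator is exhausted.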

Chapter Summary

  • Generators use yield for lazy evaluation, so memory use stays roughly constant regardless of data size
  • Generator expressions (x for x in ...) are memory-efficient alternatives to list comprehensions
  • itertools provides powerful lazy tools: chain, islice, product, combinations
  • __slots__ reduces memory for classes with many instances
Chapter 5

Decorators & Context Managers

Learning Objectives

  • Build decorators from scratch and understand the decorator pattern
  • Use @property, @staticmethod, @classmethod effectively
  • Create context managers with __enter__/__exit__ and contextlib
  • Build a reusable timing and logging framework

Building Decorators

Python
import time
from functools import wraps

# Timer decorator
def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"⏱ {func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

# Retry decorator with arguments
def retry(max_attempts=3, delay=1):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
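            last_error = None
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    last_error = e
                    print(f"Attempt {attempt+1} failed: {e}")
                    if attempt < max_attempts - 1:
                        time.sleep(delay)   # no need to wait after the final attempt
            # Chain the last error so the original traceback is not lost
            raise RuntimeError(f"Failed after {max_attempts} attempts") from last_error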
        return wrapper
    return decorator

@timer
@retry(max_attempts=3)
def fetch_data(url):
    print(f"Fetching {url}...")
    return {"status": "ok"}

Context Managers

Python
# Class-based context manager
class DatabaseTransaction:
    def __enter__(self):
        print("BEGIN TRANSACTION")
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type:
            print(f"ROLLBACK - Error: {exc_val}")
        else:
            print("COMMIT")
        return False  # False means the exception, if any, propagates to the caller

# contextlib: simpler decorator-based approach
from contextlib import contextmanager

@contextmanager
def temp_directory():
    import tempfile, shutil
    dirpath = tempfile.mkdtemp()
    try:
        yield dirpath
    finally:
        shutil.rmtree(dirpath)

with temp_directory() as tmpdir:
    print(f"Working in {tmpdir}")
# Directory is automatically cleaned up

Project: Timing & Logging Framework

Python
import time, logging
from functools import wraps

logging.basicConfig(level=logging.INFO)

def log_and_time(logger=None):
    def decorator(func):
        log = logger or logging.getLogger(func.__module__)
        @wraps(func)
        def wrapper(*args, **kwargs):
            log.info(f"▶ {func.__name__} started")
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
                elapsed = time.perf_counter() - start
                log.info(f"✓ {func.__name__} completed in {elapsed:.3f}s")
                return result
            except Exception as e:
                elapsed = time.perf_counter() - start
                log.error(f"✗ {func.__name__} failed after {elapsed:.3f}s: {e}")
                raise
        return wrapper
    return decorator

@log_and_time()
def train_model(epochs):
    time.sleep(0.5)
    return {"accuracy": 0.95}

train_model(10)

Exercises

Exercise 5.1: Write a decorator that caches results in a dictionary (manual memoize)
from functools import wraps

def memoize(func):
    cache = {}
    @wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    wrapper.cache = cache
    return wrapper
Exercise 5.2: What does functools.wraps do and why is it important?

Without @wraps, the decorated function loses its original __name__, __doc__, and __module__. help(func) would show "wrapper" instead of the real function name. @wraps copies these attributes from the original function to the wrapper, preserving introspection and debugging ability.
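
The effect is easy to verify by decorating two functions, one with a wrapper that uses @wraps and one without. The decorator names plain and preserving are made up for this sketch.

```python
from functools import wraps

def plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def preserving(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def load():
    """Load the dataset."""

@preserving
def save():
    """Save the dataset."""

print(load.__name__, load.__doc__)   # wrapper None
print(save.__name__, save.__doc__)   # save Save the dataset.
print(hasattr(save, "__wrapped__"))  # True: wraps also keeps a reference to the original
```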

Exercise 5.3: Build a context manager that suppresses specific exceptions
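from contextlib import contextmanager

# The standard library already provides contextlib.suppress; this is the exercise version
@contextmanager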
def suppress(*exceptions):
    try:
        yield
    except exceptions:
        pass

with suppress(FileNotFoundError):
    open('nonexistent.txt')  # Silently ignored

Chapter Summary

  • Decorators wrap functions to add behavior: timing, logging, caching, retrying
  • Always use @functools.wraps to preserve function metadata
  • Context managers (with statement) ensure cleanup: files, connections, locks
  • contextlib.contextmanager simplifies context manager creation with yield
Part II

Scientific Computing

The data science toolkit

Chapter 6

NumPy: Array Operations

Learning Objectives

  • Create and manipulate NumPy arrays efficiently
  • Master broadcasting, vectorization, and boolean masking
  • Perform linear algebra operations for ML
  • Benchmark NumPy vs pure Python performance

Array Creation & Indexing

Python
import numpy as np

# Creation
a = np.array([1,2,3,4,5])
zeros = np.zeros((3,4))
rand = np.random.randn(3,3)
grid = np.arange(0, 100, 5)
space = np.linspace(0, 1, 50)

# Boolean masking: powerful filtering
data = np.random.randn(1000)
positives = data[data > 0]        # All positive values
outliers = data[np.abs(data) > 2]  # More than 2 standard deviations from 0

# Vectorized operations: typically far faster than Python loops
import time
size = 1_000_000
a = np.random.randn(size)

start = time.perf_counter()
result_loop = [x**2 + 2*x + 1 for x in a]
print(f"Loop:  {time.perf_counter()-start:.3f}s")

start = time.perf_counter()
result_np = a**2 + 2*a + 1
print(f"NumPy: {time.perf_counter()-start:.3f}s")  # often ~50-100x faster, machine-dependent

Broadcasting & Linear Algebra

Python
# Broadcasting: (3,3) + (3,1) → the (3,1) column is stretched to (3,3)
matrix = np.ones((3,3))
col_vec = np.array([[10],[20],[30]])
result = matrix + col_vec  # Each row gets different offset

# Linear algebra for ML
X = np.random.randn(100, 3)
w = np.random.randn(3)
y = X @ w  # Matrix-vector multiplication (predictions)

# Normal equation: w = (XᵀX)⁻¹Xᵀy
X_b = np.c_[np.ones((100,1)), X]  # Add bias column
# Solving the linear system is more stable than forming the inverse explicitly
w_optimal = np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

Project: Linear Regression from Scratch with NumPy

Python
import numpy as np

# Generate data: y = 3x₁ + 5x₂ + 7 + noise
np.random.seed(42)
X = np.random.randn(200, 2)
y = 3*X[:,0] + 5*X[:,1] + 7 + np.random.randn(200)*0.5

# Add bias, split data
X_b = np.c_[np.ones((200,1)), X]
X_train, X_test = X_b[:160], X_b[160:]
y_train, y_test = y[:160], y[160:]

# Gradient descent
w = np.zeros(3)
lr, epochs = 0.01, 1000
for i in range(epochs):
    preds = X_train @ w
    error = preds - y_train
    gradient = (2/len(y_train)) * X_train.T @ error
    w -= lr * gradient

print(f"Learned weights: bias={w[0]:.2f}, w1={w[1]:.2f}, w2={w[2]:.2f}")
# Expected: ~7, ~3, ~5
rmse = np.sqrt(np.mean((X_test @ w - y_test)**2))
print(f"Test RMSE: {rmse:.4f}")

Exercises

Exercise 6.1: Normalize a matrix so each column has mean=0 and std=1
X = np.random.randn(100, 5) * 10 + 50
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_norm.mean(axis=0).round(10))  # ~[0, 0, 0, 0, 0]
print(X_norm.std(axis=0).round(10))   # ~[1, 1, 1, 1, 1]
Exercise 6.2: Why is np.dot(a,b) faster than sum(a[i]*b[i] for i in range(n))?

NumPy uses compiled C/Fortran code (BLAS libraries) that operates on contiguous memory blocks with CPU SIMD instructions. Python loops have interpreter overhead per iteration, dynamic type checking, and poor cache utilization. For 1M elements, NumPy can be 100-500x faster.

Exercise 6.3: Explain broadcasting rules with an example of incompatible shapes

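Rules: shapes are compared dimension by dimension from right to left, and missing leading dimensions are treated as 1. Each pair must be equal, or one of them must be 1. (3,4) + (4,) works: (4,) is treated as (1,4), giving (3,4). (3,4) + (3,) fails: the trailing dimensions are 4 and 3, which differ, and neither is 1. Fix: reshape the second array to (3,1) so that it broadcasts across columns.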
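
The three cases from the answer can be run as written. This sketch assumes NumPy is installed; the array contents are arbitrary because only the shapes matter.

```python
import numpy as np

a = np.ones((3, 4))
row = np.arange(4)        # shape (4,)
col = np.arange(3)        # shape (3,)

print((a + row).shape)    # (3, 4): (4,) is treated as (1, 4)

try:
    a + col               # trailing dimensions 4 and 3 do not match
except ValueError as exc:
    print(f"ValueError: {exc}")

print((a + col.reshape(3, 1)).shape)  # (3, 4): (3, 1) stretches across columns
```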

Chapter Summary

  • NumPy's vectorized operations are often 50-100x faster than equivalent Python loops over lists
  • Broadcasting auto-expands dimensions for element-wise operations
  • Boolean masking enables powerful data filtering without loops
  • Linear algebra functions (dot, inv, eig) are essential for ML implementations
Chapter 7

Pandas: Data Wrangling

Learning Objectives

  • Create, explore, and manipulate DataFrames
  • Handle missing data, merge datasets, and use groupby
  • Apply transformations with apply(), map(), and lambda
  • Perform a complete data analysis project
Python
import pandas as pd
import numpy as np

# Create from dict
df = pd.DataFrame({
    'name': ['Alice','Bob','Charlie','Diana','Eve'],
    'dept': ['ML','Web','ML','Data','Web'],
    'salary': [95000,82000,105000,78000,88000],
    'exp_years': [5, 3, 8, 2, 4]
})

# Selecting & filtering
ml_team = df[df['dept'] == 'ML']
senior = df.query('exp_years >= 5 and salary > 90000')

# GroupBy: split-apply-combine
dept_stats = df.groupby('dept').agg(
    avg_salary=('salary', 'mean'),
    headcount=('name', 'count'),
    max_exp=('exp_years', 'max')
)

# Missing data handling
df.loc[1, 'salary'] = np.nan
df['salary'] = df['salary'].fillna(df['salary'].median())

# Apply custom function
df['tax_bracket'] = df['salary'].apply(
    lambda s: 'High' if s > 90000 else 'Standard'
)

Project: Titanic Survival Analysis

Python
df = pd.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv')

# Exploration
print(df.shape)                  # (891, 12)
print(df['Survived'].value_counts())
print(df.isnull().sum())          # Age: 177, Cabin: 687 missing

# Survival rate by class and gender
survival = df.groupby(['Pclass', 'Sex'])['Survived'].mean()
print(survival.round(2))
# 1st class females: 97% survived, 3rd class males: 14%

# Feature engineering
df['Age'] = df['Age'].fillna(df['Age'].median())
df['FamilySize'] = df['SibSp'] + df['Parch'] + 1
df['IsAlone'] = (df['FamilySize'] == 1).astype(int)

Exercises

Exercise 7.1: Merge two DataFrames on a common column
orders = pd.DataFrame({'id':[1,2,3], 'product':['A','B','A']})
prices = pd.DataFrame({'product':['A','B'], 'price':[100,200]})
merged = orders.merge(prices, on='product')
print(merged)
Exercise 7.2: Find the top 3 departments by average salary using groupby
top3 = df.groupby('dept')['salary'].mean().nlargest(3)
Exercise 7.3: What is the difference between loc and iloc?

loc: label-based indexing. df.loc[0:5, 'name':'salary'] includes both endpoints (rows labeled 0 through 5, columns name through salary). iloc: integer position-based indexing. df.iloc[0:5, 0:3] excludes the endpoint, like Python slicing. Use loc with row labels and column names, iloc with row and column positions.
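
The endpoint difference shows up in the shapes of the results. This sketch assumes pandas is installed and uses a small made-up DataFrame with the default integer index.

```python
import pandas as pd

df = pd.DataFrame({
    "name": list("abcdef"),
    "dept": list("xyzxyz"),
    "salary": list(range(6)),
})

by_label = df.loc[0:2, "name":"dept"]   # labels 0, 1, 2 and both named columns
by_position = df.iloc[0:2, 0:2]         # positions 0 and 1 only

print(by_label.shape)      # (3, 2)
print(by_position.shape)   # (2, 2)
```

With the default index the labels happen to equal the positions, which is exactly when the two are easiest to confuse.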

Chapter Summary

  • Pandas DataFrames are the standard for tabular data manipulation in Python
  • GroupBy enables split-apply-combine analysis patterns
  • Handle missing data with dropna/fillna before modeling
  • merge/join combines datasets; apply transforms columns with custom logic
Chapter 8

Matplotlib & Seaborn: Visualization

Learning Objectives

  • Create publication-quality plots with Matplotlib
  • Build statistical visualizations with Seaborn
  • Design multi-panel dashboards with subplots
  • Customize colors, labels, annotations, and themes
Python
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Multi-panel dashboard
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# 1. Line plot
x = np.linspace(0, 10, 100)
axes[0,0].plot(x, np.sin(x), '#6366f1', linewidth=2, label='sin')
axes[0,0].plot(x, np.cos(x), '#f59e0b', linewidth=2, label='cos')
axes[0,0].legend(); axes[0,0].set_title('Trigonometric Functions')

# 2. Histogram
data = np.random.normal(0, 1, 1000)
axes[0,1].hist(data, bins=30, color='#10b981', edgecolor='white')
axes[0,1].set_title('Normal Distribution')

# 3. Scatter
x = np.random.randn(100)
y = 2*x + np.random.randn(100)*0.5
axes[1,0].scatter(x, y, alpha=0.7, c='#ec4899')
axes[1,0].set_title('Correlation')

# 4. Bar chart
categories = ['Python', 'JS', 'Java', 'C++']
values = [85, 72, 68, 55]
axes[1,1].barh(categories, values, color='#6366f1')
axes[1,1].set_title('Language Popularity')

plt.tight_layout()
plt.savefig('dashboard.png', dpi=150)

# Seaborn: statistical plots
sns.set_theme(style='whitegrid')
tips = sns.load_dataset('tips')
sns.boxplot(data=tips, x='day', y='total_bill', hue='time')
plt.title('Bill Distribution by Day')

Exercises

Exercise 8.1: Create a heatmap of a correlation matrix using Seaborn
df = sns.load_dataset('iris')
corr = df.select_dtypes(include='number').corr()
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0)
Exercise 8.2: When would you use a violin plot instead of a box plot?

Violin plots show the full distribution shape (kernel density), while box plots only show quartiles. Use violin when you care about multimodality (e.g., bimodal distributions that box plots miss). Box plots are better for comparing medians across many groups quickly.

Exercise 8.3: How do you add annotations to highlight key data points?
plt.annotate('Peak', xy=(5, 100), xytext=(6, 120),
            arrowprops=dict(arrowstyle='->', color='red'),
            fontsize=12, color='red')

Chapter Summary

  • Matplotlib gives full control with figure/axes architecture
  • Seaborn provides high-level statistical plots with beautiful defaults
  • Always label axes, add titles, and use tight_layout for clean plots
  • Save with dpi=150 or higher for crisp output (print publications commonly ask for 300)
Chapter 9

SciPy & Jupyter Notebooks

Learning Objectives

  • Use scipy.optimize for function minimization and curve fitting
  • Perform statistical tests with scipy.stats
  • Master Jupyter Notebook best practices and magic commands
  • Conduct an A/B test analysis
Python
from scipy import optimize, stats
import numpy as np

# Curve fitting: find the best parameters
def model(x, a, b, c):
    return a * np.exp(-b * x) + c

x_data = np.linspace(0, 4, 50)
y_data = model(x_data, 2.5, 1.3, 0.5) + np.random.normal(0, 0.1, 50)

params, cov = optimize.curve_fit(model, x_data, y_data)
print(f"Fitted: a={params[0]:.2f}, b={params[1]:.2f}, c={params[2]:.2f}")

# T-test: are two groups statistically different?
group_a = np.random.normal(100, 10, 50)
group_b = np.random.normal(105, 10, 50)
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t={t_stat:.3f}, p={p_value:.4f}")
print("Significant!" if p_value < 0.05 else "Not significant")

Jupyter Magic Commands

  • %timeit: benchmark code execution time
  • %matplotlib inline: show plots in the notebook
  • %%writefile: save the cell's contents to a file
  • %who: list variables
  • !pip install: the ! prefix runs a shell command (%pip install is the notebook-aware alternative)
  • %load_ext autoreload, followed by %autoreload 2: automatically reload changed modules

Project: A/B Test Analysis

Python
import numpy as np
from scipy import stats

# Control: old button, Treatment: new button
np.random.seed(42)
control_clicks = np.random.binomial(1, 0.12, 1000)   # 12% CTR
treatment_clicks = np.random.binomial(1, 0.15, 1000) # 15% CTR

print(f"Control CTR:   {control_clicks.mean():.1%}")
print(f"Treatment CTR: {treatment_clicks.mean():.1%}")

# Chi-squared test for proportions
from scipy.stats import chi2_contingency
table = np.array([[control_clicks.sum(), 1000-control_clicks.sum()],
                  [treatment_clicks.sum(), 1000-treatment_clicks.sum()]])
chi2, p, dof, expected = chi2_contingency(table)
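print(f"\nChi² = {chi2:.3f}, p-value = {p:.4f}")
# Significance alone is not enough: the treatment must also be the better variant
better = treatment_clicks.mean() > control_clicks.mean()
print("→ Deploy new button!" if p < 0.05 and better else "→ Keep old button")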

Exercises

Exercise 9.1: Minimize f(x) = (x-3)² + 2 using scipy.optimize
result = optimize.minimize(lambda x: (x-3)**2+2, x0=0)
print(f"Minimum at x={result.x[0]:.4f}, f(x)={result.fun:.4f}")
Exercise 9.2: What does a p-value of 0.03 mean?

A p-value of 0.03 means there is a 3% probability of observing results at least this extreme if the null hypothesis (no difference) were true. It is not the probability that the null hypothesis is true. Since 0.03 < 0.05, we reject the null hypothesis at the 5% level and call the difference statistically significant. Note: the p-value does NOT tell you the magnitude of the effect; use an effect size for that.
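
One widely used effect size for comparing two group means is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below is an illustration; the helper name cohens_d and the simulated groups are assumptions, not part of the exercise.

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (b.mean() - a.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(0)            # seed chosen only for reproducibility
group_a = rng.normal(100, 10, 50)
group_b = rng.normal(105, 10, 50)
print(f"Cohen's d = {cohens_d(group_a, group_b):.2f}")
```

A common rule of thumb reads d near 0.2 as a small effect, 0.5 as medium, and 0.8 as large, although what counts as meaningful depends on the domain.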

Chapter Summary

  • SciPy extends NumPy with optimization, statistics, and signal processing
  • curve_fit finds optimal parameters for any model function
  • t-tests and chi-squared tests determine statistical significance
  • Jupyter notebooks are the standard interactive environment for data science
Part III

Software Engineering

Building production-ready code

Chapter 10

Version Control with Git

Learning Objectives

  • Master core Git commands: init, add, commit, branch, merge
  • Work with remote repositories: clone, push, pull
  • Handle merge conflicts and use feature branch workflows
  • Write good commit messages and maintain .gitignore
Bash
# Initialize & first commit
git init
git add .
git commit -m "Initial commit: project structure"

# Branching workflow
git checkout -b feature/add-login      # Create & switch
# ... make changes ...
git add -A
git commit -m "feat: add user login endpoint"
git checkout main
git merge feature/add-login            # Merge feature into main
git branch -d feature/add-login        # Clean up branch

# Working with remotes
git remote add origin https://github.com/user/repo.git
git push -u origin main
git pull origin main                   # Fetch + merge

# Undo mistakes
git stash                              # Save uncommitted changes
git reset --soft HEAD~1                # Undo last commit, keep changes
git log --oneline --graph -10          # Visual history

Commit Message Convention

feat: new feature, fix: bug fix, docs: documentation, refactor: code restructuring, test: adding tests, chore: maintenance. Example: feat: add batch prediction endpoint with caching
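
A convention like this can be checked automatically, for example from a commit-msg hook. The sketch below is one possible check, not an official tool: the regular expression and the function name is_conventional are assumptions that cover only the six prefixes listed above.

```python
import re

# Prefixes come from the convention above; the pattern itself is an illustrative assumption.
COMMIT_PATTERN = re.compile(r"^(feat|fix|docs|refactor|test|chore): \S.*")

def is_conventional(message: str) -> bool:
    """Return True if the first line of the message follows `type: description`."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None

print(is_conventional("feat: add batch prediction endpoint with caching"))  # True
print(is_conventional("fixed stuff"))                                        # False
```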

Exercises

Exercise 10.1: What is the difference between git merge and git rebase?

Merge creates a new "merge commit" with two parents, preserving the full history. Rebase replays your commits on top of the target branch, creating a linear history. Rebase is cleaner but rewrites history, so never rebase branches that others have already pulled. Use merge for team branches, rebase for local cleanup.

Exercise 10.2: How do you resolve a merge conflict?

Git marks each conflict with <<<<<<< HEAD, =======, and >>>>>>> branch-name markers. Open the file, choose the correct code (or combine both sides), remove the markers, then run git add and git commit. Use git diff to review the result, and run the tests before committing.
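
Leftover conflict markers are easy to miss in a large file. Git writes each marker as seven repeated characters at the start of a line, so a rough scan can catch them before committing. The helper find_conflict_markers below is a hypothetical sketch, not a Git feature.

```python
# Git writes markers at column 0: seven '<', '=', or '>' characters.
CONFLICT_PREFIXES = ("<<<<<<< ", "=======", ">>>>>>> ")

def find_conflict_markers(text: str) -> list[int]:
    """Return 1-based line numbers that still contain a conflict marker."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith(CONFLICT_PREFIXES)
    ]

sample = """def greet():
<<<<<<< HEAD
    return "hello"
=======
    return "hi"
>>>>>>> feature/add-login
"""
print(find_conflict_markers(sample))  # [2, 4, 6]
```

This is deliberately simple: a line of equals signs used as a text underline would also be flagged, so treat a hit as a prompt to look, not as proof of a conflict.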

Exercise 10.3: Write a .gitignore for a Python ML project
__pycache__/
*.pyc
.env
*.sqlite
node_modules/
dist/
*.egg-info/
.ipynb_checkpoints/
data/*.csv
models/*.pkl
wandb/

Chapter Summary

  • Git records every committed change, so most mistakes can be undone
  • Feature branches isolate work; merge integrates it
  • Good commit messages document project evolution
  • .gitignore prevents secrets and large files from being tracked
Chapter 11

Testing & Debugging

Learning Objectives

  • Write unit tests with pytest and understand test types
  • Use fixtures, parametrize, and measure code coverage
  • Debug with pdb, breakpoints, and logging
  • Build a fully tested module
Python
# calculator.py
def add(a, b): return a + b
def divide(a, b):
    if b == 0: raise ValueError("Cannot divide by zero")
    return a / b

# test_calculator.py
import pytest
from calculator import add, divide

def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
    assert add(0, 0) == 0

def test_divide_by_zero():
    with pytest.raises(ValueError):
        divide(10, 0)

# Parametrize: run the same test with multiple inputs
@pytest.mark.parametrize("a,b,expected", [
    (10, 2, 5), (9, 3, 3), (7, 2, 3.5)
])
def test_divide(a, b, expected):
    assert divide(a, b) == expected

# Fixtures: shared setup
@pytest.fixture
def sample_data():
    return [1, 2, 3, 4, 5]

def test_sum(sample_data):
    assert sum(sample_data) == 15
Bash
# Run tests with coverage
pytest test_calculator.py -v --cov=calculator --cov-report=term-missing

Debugging & Logging

Python
import logging

logging.basicConfig(level=logging.INFO,
    format='%(asctime)s %(levelname)s %(message)s')
logger = logging.getLogger(__name__)

def train_model(data):
    logger.info("Training on %d samples", len(data))
    try:
        # Training logic goes here
        logger.info("Training complete")
    except Exception:
        logger.exception("Training failed")  # Logs the message plus the traceback
        raise

# Debug with breakpoint(): drops into pdb
def buggy_function(x):
    result = x * 2
    breakpoint()  # Execution pauses here; inspect variables, then type c to continue
    return result + 1

Exercises

Exercise 11.1: What is the testing pyramid (unit vs integration vs E2E)?

Unit tests (base, most tests): Test individual functions in isolation. Fast, cheap. Integration tests (middle): Test components working together (API + database). E2E tests (top, fewest): Test full user workflows. Slow, expensive. The pyramid shape means: write many unit tests, fewer integration, fewest E2E.
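
The difference between the bottom two layers is easier to see side by side. In the sketch below, normalize_email and save_user are hypothetical functions invented for the example; the first is covered by a unit test, the second by an integration test against an in-memory SQLite database.

```python
import sqlite3

def normalize_email(email: str) -> str:
    """Pure function: easy to unit test in isolation."""
    return email.strip().lower()

def save_user(conn: sqlite3.Connection, email: str) -> None:
    """Touches a database, so it needs an integration test."""
    conn.execute("INSERT INTO users (email) VALUES (?)", (normalize_email(email),))

# Unit test: no I/O, one function, one behavior
def test_normalize_email():
    assert normalize_email("  Ada@Example.COM ") == "ada@example.com"

# Integration test: the function and a real (in-memory) database working together
def test_save_user_roundtrip():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT)")
    save_user(conn, "  Ada@Example.COM ")
    rows = conn.execute("SELECT email FROM users").fetchall()
    assert rows == [("ada@example.com",)]
    conn.close()
```

Both tests run under pytest unchanged. An E2E test would go one step further and drive the whole application, for example through its HTTP API.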

Exercise 11.2: Write a test for a function that reads a file (using tmp_path fixture)
def read_file(path):
    return path.read_text()

def test_read_file(tmp_path):
    f = tmp_path / "test.txt"      # tmp_path is a per-test temporary directory
    f.write_text("hello world")
    assert read_file(f) == "hello world"
Exercise 11.3: Why use logging instead of print statements?

Logging provides: severity levels (DEBUG/INFO/WARNING/ERROR), timestamps, configurable output (file/console/remote), can be disabled in production without removing code, and supports structured formatting. Print statements must be manually removed and provide no filtering or context.
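
Level-based filtering is the feature print cannot offer. The sketch below sends log output to an in-memory stream so the effect is visible; the logger name "demo" and the messages are placeholders.

```python
import io
import logging

stream = io.StringIO()                     # stand-in for a file or console
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("demo")         # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False                   # keep the demo output in `stream` only

logger.setLevel(logging.DEBUG)             # development: show everything
logger.debug("loaded %d rows", 100)

logger.setLevel(logging.WARNING)           # production: hide DEBUG and INFO
logger.debug("loaded %d rows", 200)        # filtered out, no code removed
logger.warning("missing values in %d rows", 3)

print(stream.getvalue())
```

Only the first debug message and the warning appear in the output. The second debug call stays in the code but produces nothing once the level is raised.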

Chapter Summary

  • pytest is the de facto standard Python testing framework: simple, powerful, extensible
  • Parametrize tests multiple inputs; fixtures share setup code
  • Aim for 80%+ code coverage; test edge cases and error paths
  • Use logging over print; use breakpoint() for interactive debugging
Chapter 12

Code Modularization, REST APIs & Docker

Learning Objectives

  • Structure Python projects with modules and packages
  • Build REST APIs with Flask and FastAPI
  • Containerize applications with Docker
  • Deploy an ML prediction API

Clean Code & Modules

Python
# Project structure
# ml_project/
# ├── src/
# │   ├── __init__.py
# │   ├── model.py
# │   ├── preprocess.py
# │   └── api.py
# ├── tests/
# ├── Dockerfile
# └── requirements.txt

FastAPI: Modern Python API

Python
from fastapi import FastAPI
from pydantic import BaseModel
import pickle
import numpy as np

app = FastAPI(title="ML Prediction API")

class PredictionInput(BaseModel):
    features: list[float]

class PredictionOutput(BaseModel):
    prediction: str
    confidence: float

# Load model on startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.post("/predict", response_model=PredictionOutput)
async def predict(data: PredictionInput):
    X = np.array(data.features).reshape(1, -1)
    pred = model.predict(X)[0]
    proba = model.predict_proba(X).max()
    return PredictionOutput(prediction=str(pred), confidence=float(proba))

@app.get("/health")
async def health():
    return {"status": "healthy"}

Docker: Containerization

Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "src.api:app", "--host", "0.0.0.0", "--port", "8000"]
Bash
# Build and run
docker build -t ml-api .
docker run -p 8000:8000 ml-api

# docker-compose for multi-service setups
# docker-compose.yml:
# services:
#   api:
#     build: .
#     ports: ["8000:8000"]
#   db:
#     image: postgres:15

Exercises

Exercise 12.1: What is the difference between Flask and FastAPI?

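Flask: WSGI-based and synchronous by default, mature, huge ecosystem, minimal by design. FastAPI: async-first (ASGI), auto-generates OpenAPI docs, uses Pydantic for validation, and often handles more requests per second on I/O-bound workloads, though the gain depends on the application. Use FastAPI for new APIs; Flask for existing projects or simpler needs.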

Exercise 12.2: Why containerize ML models with Docker?

Docker turns "it works on my machine" into "it works everywhere." It packages the Python version, libraries, system dependencies, and model files into one portable image. This eliminates version conflicts, enables horizontal scaling, and works with Kubernetes for orchestration.

Exercise 12.3: Explain SOLID principles with one-line examples

Single Responsibility: One class = one job. Open/Closed: Extend via inheritance, don't modify existing code. Liskov Substitution: Subclass should be usable wherever parent is. Interface Segregation: Many specific interfaces > one fat interface. Dependency Inversion: Depend on abstractions, not concrete classes.
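
Several of these principles can be seen in one small design. The sketch below is only an illustration; every class name in it (Predictor, MeanPredictor, MaxPredictor, PredictionService) is hypothetical.

```python
from typing import Protocol

class Predictor(Protocol):
    """Abstraction the service depends on (Dependency Inversion)."""
    def predict(self, features: list[float]) -> float: ...

class MeanPredictor:
    """One concrete implementation whose only job is predicting (Single Responsibility)."""
    def predict(self, features: list[float]) -> float:
        return sum(features) / len(features)

class MaxPredictor:
    """Added later without editing any existing class (Open/Closed)."""
    def predict(self, features: list[float]) -> float:
        return max(features)

class PredictionService:
    def __init__(self, predictor: Predictor) -> None:
        self.predictor = predictor      # any object with a matching predict() is substitutable

    def run(self, features: list[float]) -> str:
        return f"prediction={self.predictor.predict(features):.2f}"

print(PredictionService(MeanPredictor()).run([1.0, 2.0, 3.0]))  # prediction=2.00
print(PredictionService(MaxPredictor()).run([1.0, 2.0, 3.0]))   # prediction=3.00
```

Because PredictionService only knows about the Predictor abstraction, swapping in a trained model later requires no change to the service itself.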

Chapter Summary

  • Structure code into modules/packages for maintainability
  • FastAPI provides modern, fast, auto-documented REST APIs
  • Docker containerizes your app with all dependencies for portable deployment
  • SOLID principles guide clean, maintainable architecture
Part IV

Hardware & Computing

Understanding the machine beneath the code

Chapter 13

GPU vs CPU, CUDA & Distributed Computing

Learning Objectives

  • Understand CPU vs GPU architecture and when GPUs win
  • Learn CUDA programming basics and GPU-accelerated Python
  • Grasp distributed computing concepts: data and model parallelism
  • Understand memory bandwidth bottlenecks and mixed precision training

CPU vs GPU Architecture

Feature | CPU | GPU
Cores | 4-64 complex cores | 1000-16000 simple cores
Clock Speed | 3-5 GHz | 1-2 GHz
Strength | Sequential, complex logic | Massively parallel, simple ops
Memory | System RAM (64-512 GB) | VRAM (8-80 GB)
Best For | Control flow, I/O, OS | Matrix math, convolutions

A GPU's thousands of cores can execute the same operation on thousands of data elements simultaneously. This is the SIMD (Single Instruction, Multiple Data) idea; NVIDIA calls its thread-based variant SIMT (Single Instruction, Multiple Threads). Matrix multiplication, the core of neural networks, is highly parallel: each output element is an independent dot product.
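
The independence of the output elements can be checked directly on the CPU. The sketch below computes each entry as a separate dot product and compares the result with NumPy's built-in matrix product; the matrix shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))

# Each C[i, j] depends only on row i of A and column j of B,
# so all 4 * 5 = 20 dot products could run at the same time.
C = np.empty((4, 5))
for i in range(4):
    for j in range(5):
        C[i, j] = A[i, :] @ B[:, j]

print(np.allclose(C, A @ B))  # True
```

No iteration of the loop reads a value written by another iteration, which is exactly the property a GPU exploits by assigning output elements to separate threads.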

GPU-Accelerated Python

Python
# CuPy: NumPy-compatible arrays on the GPU (near drop-in replacement)
import cupy as cp
import numpy as np
import time

size = 10000

# CPU (NumPy)
a_cpu = np.random.randn(size, size).astype(np.float32)
b_cpu = np.random.randn(size, size).astype(np.float32)
start = time.time()
c_cpu = a_cpu @ b_cpu
print(f"CPU: {time.time()-start:.3f}s")

# GPU (CuPy)
a_gpu = cp.array(a_cpu)
b_gpu = cp.array(b_cpu)
start = time.time()
c_gpu = a_gpu @ b_gpu
cp.cuda.Stream.null.synchronize()
print(f"GPU: {time.time()-start:.3f}s")
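# GPU is often 10-50x faster for large matrix multiplications, depending on hardware;
# the first GPU call also includes one-time warm-up cost, so time a second run for a fair number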

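# Numba: JIT-compile Python functions into GPU kernels
from numba import cuda
import math

@cuda.jit
def vector_add_gpu(a, b, result):
    idx = cuda.grid(1)          # Global index of this thread
    if idx < a.size:            # Guard: the grid may be larger than the array
        result[idx] = a[idx] + b[idx]

# Launch with enough 256-thread blocks to cover every element
x = np.arange(1_000_000, dtype=np.float32)
out = np.zeros_like(x)
threads_per_block = 256
blocks = math.ceil(x.size / threads_per_block)
vector_add_gpu[blocks, threads_per_block](x, x, out)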

CUDA Concepts

CUDA organizes parallel execution into a hierarchy:

Level | Description | Analogy
Thread | Single execution unit | One worker
Block | Group of threads (up to 1024) | One team
Grid | Group of blocks | Entire workforce
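
The hierarchy determines how each thread finds its own piece of data. The sketch below simulates the 1-D index arithmetic in plain Python, with no GPU required; the function names launch_config and global_index are assumptions chosen for the example.

```python
import math

def launch_config(n_elements: int, threads_per_block: int = 256) -> int:
    """Number of blocks needed so that every element gets one thread."""
    return math.ceil(n_elements / threads_per_block)

def global_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """1-D global thread index: blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx

n = 1000
blocks = launch_config(n)                       # 4 blocks of 256 threads = 1024 threads
indices = [global_index(b, 256, t) for b in range(blocks) for t in range(256)]
active = [i for i in indices if i < n]          # bounds guard, as in a real kernel

print(blocks, len(indices), len(active))        # 4 1024 1000
```

The grid usually overshoots the data size because blocks have a fixed width, which is why kernels check the index against the array length before touching memory.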

Distributed Computing

Python
# PyTorch DataParallel: split each batch across GPUs
import torch
import torch.nn as nn

model = nn.Linear(1000, 100)

# Data Parallelism: same model, split data across GPUs
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
if torch.cuda.is_available():          # Avoid an error on CPU-only machines
    model = model.cuda()

# Model Parallelism: split model layers across GPUs
# layer1 → GPU0, layer2 → GPU1 (for models too large for one GPU)

Memory Bottlenecks & Mixed Precision

Memory Bandwidth is the Real Bottleneck

GPUs can often compute faster than memory can feed data to them. Key optimizations:

  • Mixed precision (FP16): halves the memory per value and can roughly double throughput on GPUs with tensor cores
  • Gradient checkpointing: recompute activations during the backward pass instead of storing them
  • Data prefetching: load the next batch while computing the current one
  • Model quantization: INT8 inference uses a quarter of the FP32 memory and can be several times faster on edge devices
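
The memory savings follow directly from the width of each data type. The sketch below measures them with NumPy arrays; the parameter count is a made-up figure for illustration.

```python
import numpy as np

n_params = 1_000_000                      # hypothetical model size
sizes = {}
for dtype in (np.float32, np.float16, np.int8):
    weights = np.zeros(n_params, dtype=dtype)
    sizes[np.dtype(dtype).name] = weights.nbytes
    print(f"{np.dtype(dtype).name:>8}: {weights.nbytes / 1e6:.1f} MB")
```

The same ratios apply to activations and gradients, which is why lower precision also reduces the amount of data moved between memory and compute units.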

Python
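# Mixed Precision Training with PyTorch
# Assumes model, optimizer, criterion, and dataloader are already defined
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

for inputs, labels in dataloader:
    optimizer.zero_grad()
    with autocast():                  # Forward pass runs eligible ops in FP16
        output = model(inputs)
        loss = criterion(output, labels)
    scaler.scale(loss).backward()     # Scale the loss so small gradients survive FP16
    scaler.step(optimizer)            # Unscales gradients, then steps the optimizer
    scaler.update()                   # Adjusts the scale factor for the next iteration
# Often up to ~2x faster training with substantially less memory on tensor-core GPUs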

Project: Benchmark CPU vs GPU Matrix Operations

Python
import numpy as np
import time

sizes = [100, 500, 1000, 2000, 5000]
results = []

for n in sizes:
    A = np.random.randn(n, n).astype(np.float32)
    B = np.random.randn(n, n).astype(np.float32)

    start = time.perf_counter()       # Monotonic, high-resolution timer
    C = A @ B
    cpu_time = time.perf_counter() - start

    gflops = (2 * n**3) / cpu_time / 1e9   # A matmul needs about 2*n^3 floating-point ops
    results.append((n, cpu_time, gflops))
    print(f"Size {n:>5}×{n:<5} → {cpu_time:.3f}s ({gflops:.1f} GFLOPS)")

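# With GPU (CuPy), timing the largest size from the loop above:
# try:
#     import cupy as cp
#     A_gpu = cp.array(A); B_gpu = cp.array(B)
#     start = time.time()
#     C_gpu = A_gpu @ B_gpu
#     cp.cuda.Stream.null.synchronize()   # Wait for the GPU to finish before stopping the clock
#     gpu_time = time.time() - start
#     print(f"GPU: {gpu_time:.3f}s → {cpu_time/gpu_time:.1f}x speedup")
# except ImportError:
#     print("CuPy is not installed; skipping the GPU benchmark")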

Exercises

Exercise 13.1: Why can't GPUs replace CPUs entirely?

GPUs excel at data parallelism: the same operation on many elements. But they are poor at: (1) complex branching/control flow, (2) sequential algorithms, (3) operating system tasks, (4) low-latency single-thread operations, (5) irregular memory access patterns. The CPU handles orchestration while the GPU handles computation.

Exercise 13.2: What is the difference between data parallelism and model parallelism?

Data parallelism: The same model is replicated across GPUs, and each replica processes a different data batch. Gradients are averaged. Works for most models. Model parallelism: Different parts of the model live on different GPUs. Required when the model doesn't fit in one GPU's memory (e.g., large language models with hundreds of billions of parameters). More complex to implement.

Exercise 13.3: Why does mixed precision training work without losing accuracy?

FP16 has less precision, but neural networks are robust to small rounding errors. The trick: (1) The forward and backward passes run in FP16 (fast, small). (2) Loss scaling prevents tiny gradients from rounding to zero. (3) Weight updates are applied to an FP32 master copy, because the update step is where precision matters most. Result: nearly identical accuracy, often with up to about 2x speed and roughly half the activation memory.
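
The underflow that loss scaling prevents can be reproduced with NumPy scalars. In the sketch below, the gradient value and the scale factor of 2**16 are illustrative choices; real frameworks adjust the scale dynamically.

```python
import numpy as np

grad = np.float32(1e-8)                   # a tiny but nonzero gradient value
scale = np.float32(65536.0)               # illustrative loss-scale factor (2**16)

unscaled_fp16 = np.float16(grad)          # underflows: below FP16's smallest subnormal
scaled_fp16 = np.float16(grad * scale)    # representable after scaling
recovered = np.float32(scaled_fp16) / scale   # unscale in FP32 before the weight update

print(unscaled_fp16)                                      # 0.0
print(scaled_fp16 != 0)                                   # True
print(np.isclose(recovered, grad, rtol=1e-3, atol=0.0))   # True
```

Without scaling, the gradient is lost entirely. With scaling, it survives the trip through FP16 and is recovered to within a fraction of a percent.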

Exercise 13.4: What is MapReduce and how does it relate to distributed ML?

Map: Apply a function to each data chunk independently (parallel). Reduce: Combine results into a single output. In ML: Map = compute gradients on each GPU's data batch. Reduce = average gradients across all GPUs (AllReduce). This is the foundation of distributed SGD used by PyTorch DDP and Horovod.
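
The map and reduce steps can be simulated on one machine with NumPy. The sketch below uses a toy linear model and four equal-sized shards standing in for four GPUs; the data, the shard count, and the helper mse_gradient are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))           # 8 samples, 3 features (toy data)
y = rng.standard_normal(8)
w = np.zeros(3)

def mse_gradient(X, y, w):
    """Gradient of mean squared error for a linear model X @ w."""
    return 2 * X.T @ (X @ w - y) / len(y)

# Map: each simulated "GPU" computes a gradient on its own equal-sized shard
shards = zip(np.array_split(X, 4), np.array_split(y, 4))
local_grads = [mse_gradient(X_s, y_s, w) for X_s, y_s in shards]

# Reduce: average the local gradients (the result AllReduce would produce)
avg_grad = np.mean(local_grads, axis=0)

print(np.allclose(avg_grad, mse_gradient(X, y, w)))  # True
```

With equal shard sizes, the averaged gradient matches the full-batch gradient, so every replica can apply the same update and stay in sync.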

Chapter Summary

  • GPUs have thousands of simple cores ideal for parallel matrix operations
  • CuPy and Numba bring GPU acceleration to Python with minimal code changes
  • Data parallelism splits batches across GPUs; model parallelism splits the model
  • Mixed precision (FP16) can roughly double throughput on supporting hardware with minimal accuracy loss
  • Memory bandwidth, not compute, is often the real bottleneck

🎓 Congratulations!

You've completed Programming & Software Engineering. You now have the skills to write clean, efficient, production-ready Python code, from algorithms and OOP to APIs and GPU computing.

© 2025 EduArtha - Programming & Software Engineering Complete Guide