Operating Systems, Visualized
Schedule. Page. Lock. Crash.
The OS is hiding under every line of code you ship. Twelve hands-on sections to make the kernel obvious — and pass the interview round while you're at it.
An operating system is the software that owns your hardware. It decides which program gets the CPU next, where each byte of memory lives, when a disk read returns, and which process is allowed to send packets. Every keystroke, every API call, every database query you ship in production travels through the OS — usually thousands of times per second.
Most of that machinery is invisible until something goes wrong. Then suddenly you're reading a stack trace pinned in futex(FUTEX_WAIT), staring at vmstat output where si/so is climbing, or watching p99 latency double after a deploy you can't correlate to anything obvious. The OS is showing through.
This page is built for engineers who've shipped real systems and want their OS knowledge tight enough for a senior interview — and tight enough to debug production. You won't read about schedulers; you'll race them. You won't learn deadlock from a diagram; you'll build one with your hands.
New to all of this? Start with The Basics below — plain prose, no jargon, gets you the vocabulary you need for everything after.
Start Here — The Basics
Everything below assumes you know what a process, a thread, and memory are. If those words feel fuzzy, this section gets you there from scratch — patient, no jargon dumps.
Already comfortable? Skip straight to The Kernel Boundary.
1. What's actually inside a computer?
Forget software for a moment. The physical machine in front of you has three things that matter for our story: a CPU, some RAM, and a disk. Everything an operating system does is ultimately about juggling those three.
Does the actual thinking — adds, compares, copies bytes around. Your laptop CPU does this billions of times per second.
Where the CPU keeps things it's working on right now. Big, fast, but forgets everything when power is cut.
Where files live permanently. Survives power loss. Much slower than RAM — that's why we copy things into RAM to use them.
A simple way to picture it: the CPU is the brain, RAM is the desk where it spreads out the papers it's currently working on, and the disk is the filing cabinet down the hall. Anything the CPU wants to use, it first pulls onto the desk. That's why opening a big file feels slow — it's being walked over from the cabinet to the desk.
2. Bits, bytes, and how computers count
Every wire inside the CPU is either carrying voltage or it isn't. We call those two states 1 and 0. One such state is a bit — the tiniest possible piece of information.
Bits alone aren't very useful, so we group them. Eight bits make a byte. One byte can hold a number from 0 to 255, or one ASCII character like ‘A’. Almost everything you measure on a computer is a count of bytes:
- 1 KB ≈ 1,000 bytes — a paragraph of text
- 1 MB ≈ 1 million bytes — a phone photo
- 1 GB ≈ 1 billion bytes — a movie
- 1 TB ≈ 1 trillion bytes — your laptop's disk
When someone says your laptop has “16 gigs of RAM,” they mean the CPU's desk can hold roughly 16 billion bytes at once. That's a lot of mailboxes (more on those in section 10).
3. How a CPU actually works
A CPU only knows how to do very small things: add two numbers, compare two numbers, copy a byte from here to there, jump to a different instruction. Each one of those is called an instruction. Your entire program — Chrome, Python, this web page — is just millions of those little instructions strung together.
The CPU runs them one at a time, very quickly. The clock is what keeps the beat. A 3 GHz CPU ticks 3 billion times per second, and roughly one instruction happens per tick. That's how a chip the size of a stamp ends up running an entire video game.
Most CPUs have multiple cores. Each core is a complete little CPU on its own. An “8-core” chip can be running 8 instructions at the same instant — eight different programs, or eight parts of the same program. This is real hardware parallelism, not a trick.
4. Registers — the CPU's pockets
Inside each core there are about 16–32 tiny storage slots called registers. Each one holds one number. That's it — a few dozen numbers total. But they're the fastest memory that exists in the whole computer, because they're wired directly to the part of the CPU doing the math.
When the CPU adds two numbers, it doesn't reach into RAM. It loads both into registers, adds them in a register, then (eventually) writes the result back to RAM. Think of registers as the CPU's pockets: tiny, but right there, no walking required.
5. Why CPUs have caches (L1, L2, L3)
Here's the catch: RAM is dramatically slower than the CPU. The CPU does an instruction in ~0.3 nanoseconds. Reading from RAM takes ~100 nanoseconds. If the CPU had to wait on RAM for every instruction, it would spend 99% of its time doing nothing.
So chip designers slip in caches — small, fast memories that sit between the CPU and RAM. They're arranged in a hierarchy:
Each level is bigger but slower than the one above. The CPU checks the closest one first; if the data isn't there it tries the next level down. The whole point of writing “cache-friendly” code — the kind that walks an array in order — is to keep the data the CPU needs near the top of this stack.
6. Interrupts — how the CPU notices things
Picture this: the CPU is grinding through a million instructions a millisecond. You press a key. How does the CPU even know?
The keyboard chip raises a signal called an interrupt. That signal yanks the CPU's attention: it stops what it's doing, saves where it was, jumps to a small piece of OS code that handles the keypress, and then resumes. The whole detour takes microseconds.
Almost every external thing you can think of arrives as an interrupt: keyboard, mouse, network packet, disk read finished, hardware timer ticking, USB device plugged in. The OS spends a huge chunk of its life servicing interrupts. Without them, the CPU would have to constantly poll every device — exhausting and wasteful.
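You can't write a hardware interrupt handler from user space, but POSIX signals are the closest user-level cousin — an asynchronous tap on the shoulder that pauses your code, runs a handler, and resumes. A minimal sketch (POSIX only; SIGUSR1 doesn't exist on Windows):

```python
import os
import signal

caught = []

def on_sigusr1(signum, frame):
    # Runs "out of band", like an interrupt handler: normal execution
    # is paused, this runs, then execution resumes where it left off.
    caught.append(signum)

signal.signal(signal.SIGUSR1, on_sigusr1)
os.kill(os.getpid(), signal.SIGUSR1)   # raise the "interrupt" at ourselves
assert caught == [signal.SIGUSR1]      # the handler ran, then we resumed
```

The detour is invisible to the interrupted code, which is exactly the property hardware interrupts have.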
7. 32-bit vs 64-bit — what those numbers mean
When someone calls a CPU “64-bit,” they mean two things: its registers are 64 bits wide, and it can address up to 2⁶⁴ bytes of memory.
That second part is why we moved from 32-bit to 64-bit. A 32-bit CPU could only address 2³² bytes — about 4 GB. The moment laptops started shipping with more than 4 GB of RAM (around 2005–2008), 32-bit was done. A 64-bit CPU can theoretically address 16 exabytes — 16 billion gigabytes. We'll be fine for a while.
One side effect: 64-bit pointers are twice as big, so 64-bit programs use slightly more memory. The trade-off is worth it because the alternative is being stuck under 4 GB of RAM per process.
8. CPU architecture families
Not all CPUs speak the same language. The set of instructions a CPU understands is called its Instruction Set Architecture (ISA). Code compiled for one ISA generally won't run on another — this is why an iPhone app and a Windows app aren't interchangeable, even when they do the same thing.
Examples: Intel Core, AMD Ryzen
Where: Most desktops, laptops, servers
Style: CISC — many complex instructions
Examples: Apple Silicon (M1/M2/M3), Snapdragon, AWS Graviton
Where: Phones, tablets, recent Macs, cloud servers
Style: RISC — fewer simple instructions, lower power
Examples: SiFive, Western Digital chips
Where: Embedded, growing into laptops/servers
Style: RISC, open-source ISA — anyone can build a chip
There are a handful of others (PowerPC, MIPS, SPARC) but they're mostly historical or niche. For day-to-day work you'll see x86-64 and ARM the most.
9. Intel vs AMD vs Apple Silicon
Inside the x86-64 family, two companies make almost every chip: Intel (Core i5/i7/i9, Xeon) and AMD (Ryzen, EPYC). Their chips speak the same instruction set, so a compiled binary runs on either — you don't recompile your Linux server when you swap an Intel box for an AMD box. They compete on speed, power efficiency, and price; AMD has been particularly aggressive on core counts in the last few years.
Apple Silicon (M1, M2, M3, M4) is different. It's ARM, not x86. When Apple switched their Macs from Intel to their own ARM chips in 2020, every macOS app had to be recompiled — or run through a translator called Rosetta 2. The payoff was huge: the M1 was faster than Intel chips at a fraction of the power, partly because of ARM's simpler instruction set.
That “simpler instruction set” is the famous RISC vs CISC argument. CISC (Complex Instruction Set Computing — like x86) has lots of fancy instructions that do a lot per tick. RISC (Reduced Instruction Set — like ARM, RISC-V) has fewer, simpler instructions and trusts the compiler to combine them. Modern chips on both sides have borrowed from each other so heavily that the line is fuzzier than it used to be — but RISC generally wins on power efficiency, which is why ARM rules phones and is now eating data centers.
10. What memory really is
Picture a long street with billions of identical mailboxes, numbered 0, 1, 2, 3, … and so on. Each mailbox holds exactly one byte. That's RAM.
Each mailbox number is called an address. When your code does x = 42, what really happens is: the compiler picks an address for x, and your CPU writes the byte 42 into that mailbox. Reading it back is the same in reverse — give the CPU an address, get the byte at that mailbox.
Numbers bigger than 255 take more than one byte. A 32-bit integer occupies 4 mailboxes in a row; a 64-bit double occupies 8. The address you read or write from is the address of the first mailbox in the run.
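You can see the mailboxes directly with Python's struct module, which converts numbers into the raw bytes the CPU would store:

```python
import struct

# A 32-bit integer occupies 4 consecutive "mailboxes" (bytes);
# a 64-bit double occupies 8.
assert len(struct.pack("<i", 42)) == 4
assert len(struct.pack("<d", 3.14)) == 8

# On a little-endian machine (x86, ARM in practice), the low-order
# byte lives at the first address in the run.
assert struct.pack("<i", 1) == b"\x01\x00\x00\x00"
```

The `<` prefix forces little-endian byte order, which is what you'd see in the RAM of virtually every laptop and phone today.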
11. How a program uses memory
When your program runs, the OS hands it a chunk of those mailboxes and the program organises them into four named regions:
Your program's instructions. Read-only at runtime.
Variables declared outside any function. Live the whole time.
Memory you ask for at runtime — malloc, new, lists in Python. Grows up.
Local variables and function call info. Grows down. Auto-cleaned on return.
When you write int x = 10 inside a function, x lives on the stack and disappears the moment the function returns. When you write malloc(1000) in C or new Foo() in Java, the bytes come from the heap and stay until something explicitly frees them (or, in Java/Python, until the garbage collector decides to). Every memory bug you'll ever debug is a confusion between these regions.
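Python hides the stack/heap split, but the same lifetime rules leak through. A sketch of the analogy — call frames behave like the stack, objects live on the heap:

```python
def f():
    x = 10        # lives in f's frame (the "stack"): gone when f returns
    box = [10]    # the list object lives on the heap; `box` is just a name
    return box    # returning the reference keeps the heap object alive

result = f()      # f's frame is destroyed here...
assert result == [10]   # ...but the heap object survives, because
                        # something still points at it
```

In C the distinction is explicit (`int x` vs `malloc`); in garbage-collected languages the heap object lives until nothing references it.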
12. From program to running code
What happens between “double-click an icon” and “the app is running”? Roughly five steps:
- 1. The OS finds the executable file on disk.
- 2. It allocates a chunk of RAM and copies the code section from disk into it.
- 3. It sets up the four regions from the previous section (code, globals, heap, stack).
- 4. It records this whole bundle as a new process with a unique ID (the PID).
- 5. It tells the CPU: “start running instructions at the entry point of this code.”
That's it — that's how every running thing on your computer got there. The terminal command ps on Linux/macOS (or Task Manager on Windows) shows you all the processes currently in step 5.
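A two-line check of steps 4–5: spawn a child process and compare PIDs (`sys.executable` just re-runs the same Python interpreter as a fresh process):

```python
import os
import subprocess
import sys

# This process got a PID in step 4; a freshly spawned child goes
# through the same load-and-run steps and gets its own.
me = os.getpid()
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True, text=True,
)
child_pid = int(child.stdout)
assert child_pid != me   # same program file, two processes, two PIDs
```

Run `ps` while the child is alive and you'd see both entries — two rows, two PIDs, both in "step 5".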
13. What a process is, really
A process is a running program plus everything that program needs to actually run: its own private memory, its open files, its current state (which instruction it's on), and a unique number called the PID.
Crucially, every process gets its own private memory. Process A and Process B can both have a variable at memory address 0x1000, and they don't interfere — the OS makes sure those two “0x1000”s point to different bytes of physical RAM. (We'll explain how that magic works in section 16.)
If you open Chrome twice, you get two Chrome processes — same program, two independent runtimes, two PIDs. Modern Chrome actually splits itself into many processes (one per tab and extension) for safety: if a malicious page crashes its process, the other tabs keep working.
14. What a thread is, really
Sometimes one process needs to do several things at once. Imagine a video player: it has to decode video, play audio, and respond to your clicks. If those happened one after the other, the UI would freeze every time a frame decoded.
A thread is one stream of execution inside a process. A process can have many threads, all sharing the same memory and files but each running its own code in parallel (or taking turns on the CPU if there are more threads than cores).
Concrete example: when you open this web page, the browser spins up a UI thread that handles your scrolling, a network thread that fetches the page, and a JavaScript thread that runs the page's code. They share the page's data because they're all in the same process. That sharing is what makes threads powerful — and also what makes them dangerous (see the “Race Conditions” section below).
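A minimal sketch of several threads sharing one process's memory — every thread sees (and mutates) the same dict. The lock is there for exactly the reasons the Race Conditions section covers:

```python
import threading

shared = {"hits": 0}        # one object, visible to every thread
lock = threading.Lock()

def worker():
    for _ in range(1000):
        with lock:          # without this, updates can be lost
            shared["hits"] += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert shared["hits"] == 4000   # all four threads wrote the same memory
```

Two *processes* doing this would each have their own private `shared` — that's the whole difference.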
15. Process vs thread — when to use which
The cheat-sheet:
- · you want strong isolation (a crash in one shouldn't kill the others)
- · the work is fundamentally a different program
- · security boundaries matter (browser tabs, sandboxes)
- · you need to do several things inside one program
- · they'll share a lot of data and copying is expensive
- · you want low startup cost — threads spin up much faster than processes
Real-world examples: Chrome chooses a process per tab for safety. nginx runs a handful of single-threaded worker processes that each juggle thousands of connections; PostgreSQL forks one process per client connection; JVM servers like Tomcat handle each request on a thread from a pool. There's no universally right answer — it's a trade-off between isolation and performance.
16. Why memory needs to be managed
Now back to the problem we hinted at: your laptop has 16 GB of RAM, and right now there are probably 200+ processes running. They can't all just freely use whatever addresses they want. Two problems:
- Conflict: If process A writes to address 0x1000 and process B also writes to address 0x1000, somebody's data gets stomped.
- Security: Without isolation, any program could read your password manager's memory.
The OS solves both with virtual memory. Every process gets its own private “map” of addresses. Process A's address 0x1000 and Process B's address 0x1000 secretly point to different bytes of physical RAM — the CPU translates between them on every access. Each program thinks it has the whole memory to itself; the OS makes the illusion work.
That's the foundation of everything in the “Walk a Virtual Address” section below. The boring details (page tables, the TLB, page faults) are all in service of this one trick.
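On POSIX systems you can watch the illusion directly: after fork(), parent and child hold the very same virtual addresses but private physical pages, so a write in one is invisible to the other. A sketch (won't run on Windows, which has no fork):

```python
import os

value = [42]
pid = os.fork()
if pid == 0:
    # Child: same virtual address for `value`, private physical page.
    value[0] = 99                 # the parent will never see this
    os._exit(value[0] % 256)      # report what the child saw

_, status = os.waitpid(pid, 0)
assert os.waitstatus_to_exitcode(status) == 99   # child saw its write...
assert value[0] == 42                            # ...the parent never did
```

Both processes used the same address; the OS quietly pointed those addresses at different bytes of RAM the moment the child wrote.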
17. How the OS gets running — the boot process
When you press the power button, none of the OS is loaded yet. So how does it get there? Roughly:
- 1. The CPU starts running instructions at a hard-coded address that points to a small program built into the motherboard, called BIOS (or, on newer machines, UEFI).
- 2. That firmware does a self-test, finds the disks, and looks for a bootloader — usually GRUB on Linux, the Windows Boot Manager on Windows, or boot.efi on macOS.
- 3. The bootloader loads the kernel into RAM and hands control to it.
- 4. The kernel initialises hardware (CPU caches, RAM, disks, network), mounts the root filesystem, and starts the first user-space process — init or systemd on Linux.
- 5. That first process spawns everything else — login screens, background services, your desktop. Eventually you see a login prompt.
The whole sequence usually takes 2–10 seconds on a modern laptop. On servers it can be longer because firmware does more hardware checks.
18. Types of operating systems
Not all OSes do the same job. The constraints they're built for shape almost every design choice:
Run interactive apps for one user. Make the screen, keyboard, and mouse feel snappy.
Run thousands of background processes. No display needed. Optimised for throughput, uptime, and remote management.
Battery-aware. Sandboxed apps. Touch-first UI. Aggressive about killing background work to save power.
Runs on small chips inside microwaves, routers, smartwatches, car ECUs. Tiny memory, no GUI.
Guarantees a task finishes within a deadline. Used in pacemakers, drones, factory robots, jet engines — anywhere late = disaster.
Treats many machines as one. Rare in practice — modern clouds use Linux + Kubernetes instead.
Many of these share roots. Android's kernel is Linux. iOS shares a lot of code with macOS. The differences are mostly in what runs on top of the kernel — the UI, the app model, the security policies.
19. What the OS actually does — the layered picture
Now we can put it all together. Every line of code you ship sits on top of a stack like this. The closer to the bottom, the more privileged — and the more dangerous when it goes wrong.
Run in user mode. Can't touch hardware directly.
Wrap raw kernel features into nicer functions like malloc, printf, pthread_create.
The privileged part of the OS. Full hardware access.
Translators between the kernel and the actual hardware chips.
Metal and silicon. Doesn't know or care about your program.
When you call malloc() from C, you're calling libc, which asks the kernel for memory. The kernel updates page tables. The hardware MMU does the translation. All of that happens for one line of code. That's the cost of the illusion — and the value of it.
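You can make that request yourself: Python's mmap module is a thin wrapper over the same kernel call that malloc eventually reaches for when it needs more pages. A sketch:

```python
import mmap

# malloc() ultimately asks the kernel for whole pages; mmap is that
# request made directly. -1 means "anonymous" memory, not a file.
region = mmap.mmap(-1, 4096)               # one 4 KB page
assert region[:4] == b"\x00\x00\x00\x00"   # the kernel hands out zeroed pages
region[0:5] = b"hello"
snapshot = bytes(region[:5])
region.close()
assert snapshot == b"hello"
```

The zero-fill isn't a courtesy — it's a security requirement, so you can never read another process's leftover bytes.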
20. Mini glossary
All the terms you'll need for the rest of this page, defined once.
- Bit
- A single 0 or 1. The smallest unit of data.
- Byte
- A group of 8 bits. The standard chunk of memory. One ASCII character is 1 byte.
- CPU
- Central Processing Unit. The chip that runs your code, one instruction at a time per core.
- Core
- A complete CPU inside the chip. A 'quad-core' CPU has 4 of them, running in parallel.
- Register
- A tiny storage slot inside the CPU itself. The fastest memory there is.
- Cache
- Small fast memory between the CPU and RAM. Holds recently-used data so the CPU doesn't keep waiting on RAM.
- RAM
- Random-Access Memory. The 'desk' the CPU works on. Loses everything on power-off.
- Address
- A number that identifies one byte in memory. Like a house number on a street.
- Program
- A file on disk containing instructions. Doesn't do anything until you run it.
- Process
- A running instance of a program. Has its own memory and a unique PID.
- Thread
- One stream of execution inside a process. Threads in the same process share memory.
- Kernel
- The privileged core of the OS. Has full access to hardware. Other code asks the kernel for help.
- OS
- Operating System. The software that runs the hardware and shares it among programs.
- ISA
- Instruction Set Architecture. The 'language' a CPU understands — x86, ARM, RISC-V are all ISAs.
- x86
- The Intel/AMD CPU family. Runs most desktops and servers.
- ARM
- A different CPU family. Runs phones, recent Macs (Apple Silicon), and increasingly servers.
The Kernel Boundary
Every program runs in user mode (ring 3). To do almost anything useful — open a file, allocate memory, send a packet — it has to ask the kernel. Click a syscall and watch the CPU mode-switch.
Why does the boundary even exist?
Without a kernel/user split, any program could write to disk sectors directly, reprogram the network card, or read another process's memory. Multics in 1965 introduced ring-based protection precisely because researchers had spent the decade watching one buggy job crash the whole machine.
The cost is real but small. A bare syscall on modern x86-64 — entered via SYSCALL, returned via SYSRET — costs around 100–300 cycles, about 30–100 ns. After Spectre/Meltdown, the kernel page-table isolation (KPTI) mitigation roughly doubles that. This is why io_uring exists, and why high-frequency traders bypass the kernel with DPDK — every cycle of boundary-crossing matters when you're doing millions of ops per second.
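You can feel the boundary from any language. Each os.write below is one user→kernel→user round trip; Python's own overhead dominates the number you'll see, but the crossing is in there:

```python
import os
import time

# Each os.write is a thin wrapper over the write(2) syscall:
# user mode -> kernel mode -> back, once per call.
fd = os.open(os.devnull, os.O_WRONLY)
N = 10_000
t0 = time.perf_counter_ns()
for _ in range(N):
    os.write(fd, b"x")          # one mode switch per iteration
elapsed = (time.perf_counter_ns() - t0) / N
os.close(fd)

# Sanity bound only -- the exact figure depends on machine and
# mitigations, but each call is well under a millisecond.
assert elapsed < 1_000_000
print(f"~{elapsed:.0f} ns per write() call, Python overhead included")
```

Compare the same loop writing to an in-memory buffer (no syscall) and the gap is the boundary tax.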
Processes vs Threads
Both run code. Both can be scheduled. The difference is what they share. Click fork() or pthread_create() and watch the tree grow.
What gets shared?
Hover the segments to see how each is treated by fork vs threads.
After fork(), every page is shared and read-only. The first write to any page triggers a page fault, allocates a new physical page, and copies. Memory-heavy parents that write after fork can OOM unexpectedly.
The cheapest fork() in the world
fork() looks expensive in theory — duplicate an entire address space — but Linux (and every Unix since 4.0BSD) cheats. It copies only the page tables and marks every page read-only with a copy-on-write bit. The actual page contents are shared until somebody writes, which triggers a fault, allocates a new physical page, and copies on demand.
That's why Postgres can fork a worker per connection, why Redis BGSAVE forks the entire dataset for snapshotting, and why Chrome spawns dozens of renderer processes. Without CoW, none of those architectures would be viable. (It's also why a memory-heavy parent that writes after fork can suddenly OOM — CoW failure is real and silent.)
Race the Schedulers
Same workload. Four scheduling algorithms. Watch the Gantt chart and metrics shift as you switch. The interview answer is never “just use SJF” — it's knowing which algorithm loses on which workload.
What real schedulers actually do
FCFS and SJF are teaching tools, not products. No production OS uses them — FCFS suffers the convoy effect catastrophically, and SJF requires knowing each burst length in advance, which you can't. Round Robin is closer to reality, but its fairness comes at the cost of latency: a 100 ms quantum is too coarse for an interactive desktop, and a 1 ms quantum spends most of its time on context-switch overhead.
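The convoy effect is easy to put numbers on. A toy calculation, assuming all jobs arrive at time 0 and run to completion:

```python
def avg_wait(bursts):
    """Average waiting time when jobs run to completion in list order."""
    total_wait, clock = 0, 0
    for burst in bursts:
        total_wait += clock   # this job waited for everything before it
        clock += burst
    return total_wait / len(bursts)

jobs = [24, 3, 3]                     # one long CPU burst arrives first
assert avg_wait(jobs) == 17.0         # FCFS: the convoy effect
assert avg_wait(sorted(jobs)) == 3.0  # SJF: short jobs jump the queue
```

Same three jobs, nearly a 6× difference in average wait — which is exactly why SJF is optimal on paper and why FCFS is indefensible in practice.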
Linux ran CFS (Completely Fair Scheduler) from 2007 to October 2023. Each runnable task got a virtual runtime; the scheduler always picked the task with the lowest vruntime. CFS used a red-black tree, so picking the next task was O(log n).
In Linux 6.6 (October 2023), CFS was replaced by EEVDF (Earliest Eligible Virtual Deadline First). EEVDF still gives each task a fair share, but it also assigns a deadline — interactive tasks with short slice lengths preempt CPU-bound tasks more aggressively. The first major scheduler change in 16 years.
Walk a Virtual Address
On x86-64, every memory access goes through a 4-level page-table walk — unless the TLB caches the translation. Type or pick an address; watch the walk light up.
Why paging beat segmentation
Early systems used segmentation: each process had a base + limit, and the CPU added the base to every address. Simple. But external fragmentation killed it — once you'd allocated and freed enough segments, you had unusable holes everywhere.
Paging fixed that with one small idea: chop memory into fixed-size pages (4 KB on x86, 16 KB on Apple Silicon). Now every page is the same size, so any free page can hold any virtual page — no external fragmentation. The price is per-process page tables and the TLB to cache recent translations. On x86-64 with 4-level paging, a TLB miss can cost up to 4 extra memory accesses just to translate one address. That's why huge pages (2 MB, 1 GB) exist — fewer entries, more hits.
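The 4-level split is pure bit-slicing: 9 bits of index per table level, 12 bits of page offset. A sketch of the decomposition (the index names are the x86 ones; the function is illustrative, not a real MMU):

```python
def split_x86_64(vaddr):
    """Split a 48-bit virtual address into the four 9-bit table indices
    plus the 12-bit page offset used by x86-64 4-level paging."""
    offset = vaddr & 0xFFF            # 4 KB pages -> low 12 bits
    pt     = (vaddr >> 12) & 0x1FF    # page-table index
    pd     = (vaddr >> 21) & 0x1FF    # page-directory index
    pdpt   = (vaddr >> 30) & 0x1FF    # page-directory-pointer index
    pml4   = (vaddr >> 39) & 0x1FF    # top-level index
    return pml4, pdpt, pd, pt, offset

# Top of the canonical user-space range: every index maxed out.
assert split_x86_64(0x0000_7FFF_FFFF_F000) == (255, 511, 511, 511, 0)
# A small address: only the PT index and offset are nonzero.
assert split_x86_64(0x1234) == (0, 0, 0, 1, 0x234)
```

9 + 9 + 9 + 9 + 12 = 48 bits — which is why "48-bit virtual addresses" and "4-level paging" are the same statement.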
Page Replacement Battle
Pick an access pattern. Set RAM size (frames). All four algorithms run on the same trace simultaneously. The winner is the highest hit ratio. Spoiler: Bélády's Optimal always wins because it cheats by knowing the future.
FIFO
| ref | 7 | 0 | 1 | 2 | 0 | 3 | 0 | 4 | 2 | 3 | 0 | 3 | 2 | 1 | 2 | 0 | 1 | 7 | 0 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| f0 | 7 | 7 | 7 | 2 | 2 | 2 | 2 | 4 | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 7 |
| f1 | · | 0 | 0 | 0 | 0 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
| f2 | · | · | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 1 |
LRU
| ref | 7 | 0 | 1 | 2 | 0 | 3 | 0 | 4 | 2 | 3 | 0 | 3 | 2 | 1 | 2 | 0 | 1 | 7 | 0 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| f0 | 7 | 7 | 7 | 2 | 2 | 2 | 2 | 4 | 4 | 4 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| f1 | · | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 3 | 3 | 3 | 3 | 0 | 0 | 0 | 0 | 0 |
| f2 | · | · | 1 | 1 | 1 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 7 | 7 | 7 |
Clock (second chance)
| ref | 7 | 0 | 1 | 2 | 0 | 3 | 0 | 4 | 2 | 3 | 0 | 3 | 2 | 1 | 2 | 0 | 1 | 7 | 0 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| f0 | 7 | 7 | 7 | 2 | 2 | 2 | 2 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 0 | 0 | 0 | 0 | 0 |
| f1 | · | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 7 | 7 | 7 |
| f2 | · | · | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 3 | 0 | 0 | 0 | 0 | 2 | 2 | 2 | 2 | 2 | 1 |
Bélády's Optimal
| ref | 7 | 0 | 1 | 2 | 0 | 3 | 0 | 4 | 2 | 3 | 0 | 3 | 2 | 1 | 2 | 0 | 1 | 7 | 0 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| f0 | 7 | 7 | 7 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 7 | 7 | 7 |
| f1 | · | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| f2 | · | · | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Why nobody implements LRU exactly
True LRU requires updating a recency timestamp on every memory access. That's billions of writes per second on a busy server — the bookkeeping costs more than the eviction it optimizes. So real systems approximate.
Linux uses a two-list variant of Clock-Pro: an active list and an inactive list, with reference bits set by the MMU on access and cleared by a periodic scan. Pages survive on the active list only if they're touched between scans. macOS does FIFO with reactivation. Windows tracks a per-process working set. Different bets, all approximating Bélády's unreachable optimum.
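FIFO is the simplest of the four to write down — a sketch you can check against the tables above (with 3 frames it takes 15 faults on that reference string):

```python
def fifo_faults(trace, frames):
    """Count page faults under FIFO replacement with `frames` slots."""
    resident, order, faults = set(), [], 0
    for page in trace:
        if page not in resident:
            faults += 1
            if len(resident) == frames:
                resident.discard(order.pop(0))  # evict the oldest arrival
            resident.add(page)
            order.append(page)
    return faults

trace = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
assert fifo_faults(trace, 3) == 15
assert fifo_faults(trace, 4) == 10   # more RAM, fewer faults (usually --
                                     # FIFO can famously violate this)
```

Swapping the eviction line for "least recently used" or "farthest future use" gives you LRU and Optimal; the bookkeeping, not the idea, is what separates them in practice.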
Race Conditions in 60 Seconds
Spawn N threads. Each increments a shared counter (TARGET / N) times. Without a lock, two threads can read the same value, increment, and write back — one update is silently lost.
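Here's that lost update in slow motion — the two "threads" interleaved by hand so the bug is deterministic (real races are timing-dependent, which is what makes them maddening):

```python
counter = 0

# Thread 1 and Thread 2 both run `counter += 1`, which is really
# three steps: read, increment, write. Interleave them badly:
t1_read = counter        # T1 reads 0
t2_read = counter        # T2 reads 0, before T1 writes back
counter = t1_read + 1    # T1 writes 1
counter = t2_read + 1    # T2 writes 1 -- T1's increment is gone

assert counter == 1      # two increments happened; counter moved by one
```

A lock around the read-increment-write makes the three steps atomic; so does a hardware fetch_add, which is the next point.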
fetch_add stays in the CPU cache and costs ~5–15 ns even under contention. That's why high-perf counters (Prometheus internals, Linux per-CPU stats) use atomics, not mutexes.
Build a Deadlock
Each resource is a single-instance lock. A process can hold at most one of each. Click a cell to toggle. Watch the wait-for graph — when a cycle appears, you've built a deadlock.
| process | R1 DB Lock | R2 Cache Lock | R3 File Lock | R4 Net Lock |
|---|---|---|---|---|
| P1 | | | | |
| P2 | | | | |
| P3 | | | | |
| P4 | | | | |
Production deadlocks: how the real world handles them
Most production systems don't prevent deadlock — they detect it and break it. PostgreSQL maintains a wait-for graph between transactions; if a cycle appears it picks the youngest transaction and aborts it with deadlock_detected. MySQL/InnoDB does the same. Java's ThreadMXBean exposes findDeadlockedThreads() for the same purpose.
Prevention happens in code review: enforce a global lock ordering. If every codepath always acquires lock A before lock B, the circular-wait condition can't happen. The Linux kernel's lockdep checker enforces this at runtime — it builds a graph of every lock-acquisition order seen and screams when a new path violates it.
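Lock ordering is a one-liner to enforce if you pick an arbitrary but global key — here, the object's id(). A sketch (sorting by id is one possible convention, not what any particular kernel does):

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def with_both(l1, l2):
    # Always acquire in a fixed global order (here: by object id),
    # regardless of the order the caller passed them in. Two codepaths
    # can then never hold-and-wait on each other in a cycle.
    first, second = sorted((l1, l2), key=id)
    with first, second:
        return True

# Opposite argument orders, same acquisition order -> no deadlock.
assert with_both(lock_a, lock_b)
assert with_both(lock_b, lock_a)
```

Without the sort, two threads calling `with_both(a, b)` and `with_both(b, a)` concurrently could each grab their first lock and wait forever for the second — the textbook circular wait.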
Filesystem Internals
Reading /etc/passwd looks atomic but is actually three lookups: directory entry → inode → data blocks. Watch the path light up. Then crash the disk and see why journaling exists.
- · name → inode #
- · lookup in /etc/
- · result: inode 1234
- · size, perms, owner
- · atime / mtime / ctime
- · block pointers ↓
- · 28 KB / 4 KB blocks
- · → 7 direct blocks
- · DMA fetches into page cache
- · copy_to_user()
- · fd offset advances
- · syscall returns
mode: 100644 (regular, rw-r--r--)
uid / gid: 0 / 0
size: 28672 bytes
blocks: 7 × 4 KB
atime: 2026-04-25T12:30:00Z
mtime: 2026-04-20T09:14:22Z
direct[0..11]: → blk 0x4a01 ... 0x4a07
indirect: (unused — file fits)
double-ind: (unused)
triple-ind: (unused)
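Python's os.stat surfaces the same inode fields — a quick sketch against a temp file (the values you'll see are whatever your system assigns, not the example numbers above):

```python
import os
import stat
import tempfile

# Create a small file, then read its inode metadata back with stat().
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 1000)
    path = f.name

info = os.stat(path)
assert info.st_size == 1000          # size, straight from the inode
assert stat.S_ISREG(info.st_mode)    # a regular file (the 100 in 100644)
assert info.st_ino > 0               # the inode number itself
os.unlink(path)
```

`ls -i` shows the same st_ino — the number the directory entry maps the filename to.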
Pick the Right IPC
Six classic IPC mechanisms, six different jobs. Tell me what you're trying to do; I'll show you what to reach for and what it costs.
One-way, byte stream, related processes. Kernel-managed circular buffer. Blocks on full.
ls | grep foo — bash creates a pipe, fork()s twice, dup2()s the fds.
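The same mechanism from Python, minus the shell sugar — a kernel-managed byte stream with a write end and a read end:

```python
import os

# os.pipe() returns two file descriptors into one kernel buffer.
r, w = os.pipe()
os.write(w, b"ls | ")
os.write(w, b"grep foo")      # byte stream: no message boundaries
os.close(w)                   # closing the write end signals EOF

data = os.read(r, 64)
os.close(r)
assert data == b"ls | grep foo"   # bytes arrive in order, concatenated
```

In the real `ls | grep foo`, bash does exactly this, then fork()s twice and dup2()s the pipe ends onto the children's stdout and stdin.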
How Real Operating Systems Differ
Same problems, different bets. Click an OS to highlight its choices — and notice how often FreeBSD invented something Linux later cloned.
| property | Linux | Windows | macOS | FreeBSD |
|---|---|---|---|---|
| Kernel architecture | Monolithic (modular) | Hybrid (NT kernel) | Hybrid (XNU = Mach + BSD) | Monolithic |
| Scheduler | EEVDF (since 6.6, Oct 2023). Replaced CFS. | Multi-level priority + boosting (NT scheduler) | Mach thread policies + BSD priority + Grand Central Dispatch | ULE — interactivity + load-balancing aware |
| Default scheduling unit | Thread (task_struct) | Thread | Mach thread; processes are containers | Thread |
| Page replacement | Active/inactive LRU lists + Clock-Pro inspired | Working-set manager + Modified Page List | FIFO + reactivation (LRU approximation) | Two-handed clock |
| Default page size | 4 KB (huge pages: 2 MB / 1 GB) | 4 KB (large pages: 2 MB) | 16 KB on Apple Silicon, 4 KB on Intel | 4 KB |
| Filesystem (default) | ext4 / btrfs / xfs | NTFS (ReFS for servers) | APFS (since 10.13, 2017) | UFS2 / ZFS |
| Async I/O | io_uring (since 5.1, 2019) — kernel rings | IOCP (I/O Completion Ports) | kqueue / GCD / dispatch_io | kqueue (origin of the design) |
| IPC standout | eBPF, futex, io_uring | ALPC (Asynchronous Local Procedure Call) | Mach ports — everything is a port | Capsicum (capability-mode sandbox) |
| Container primitive | namespaces + cgroups | Server Silos / Job Objects | Sandbox profiles (macOS Seatbelt) | Jails (the original, 2000) |
EEVDF (Earliest Eligible Virtual Deadline First) replaced CFS in kernel 6.6 (Oct 2023). It's the first major scheduler change in 16 years.
Windows boosts thread priority when a thread receives keyboard/mouse input — that's why your foreground app feels snappy even under load.
Apple Silicon Macs use 16 KB pages — 4× the standard size. This reduces TLB pressure but makes per-page memory waste 4× worse.
FreeBSD invented kqueue (1999) and Jails (2000). Linux's epoll and Docker are both descendants of FreeBSD ideas.
OS Interview Rapid-Fire
15 questions across all topics. No timer — just answer. Get a scorecard at the end with the topics to revise. Share if you survive.
Frequently Asked Questions
What is an operating system, in one sentence?
What's the difference between a process and a thread?
Why does Linux use a scheduler instead of just running one program at a time?
What is virtual memory, really?
What is a page fault, and is it always bad?
Why does deadlock happen, and how do real systems prevent it?
What's the difference between a mutex and a spinlock?
What changed in Linux 6.6's scheduler?
Why is fork() so fast, even for huge processes?
What is io_uring and why is everyone excited about it?
How much should an experienced engineer actually know about OS internals?
Where should I go to learn more after this page?
Sources & References
Operating Systems: Three Easy Pieces — Remzi & Andrea Arpaci-Dusseau (ostep.org)
Brendan Gregg — Systems Performance, 2nd ed. + brendangregg.com
Linux Weekly News (lwn.net) — EEVDF scheduler, io_uring, BPF
Linux 6.6 release notes — kernel.org/doc/html/v6.6/admin-guide/
Discord Engineering — How Discord Stores Trillions of Messages (discord.com/blog)
Microsoft Docs — Windows NT scheduler, IOCP, ALPC
Apple — Kernel Programming Guide (XNU + Mach)
FreeBSD Architecture Handbook — kqueue, ULE scheduler, Jails
Jeff Dean / Peter Norvig — Latency Numbers Every Programmer Should Know