Processes, Threads, and Concurrency: The Mental Model That Actually Holds Up

2026, March 3rd - by Ricardo Guzman


Most developers have a rough idea of what a process or a thread is. Rough enough to get by, at least.

But “getting by” means you’ll hit a wall the moment you need to debug a race condition, understand why your server is choking under load, or explain to someone why async doesn’t mean parallel.

So let’s build the model properly.

The Hierarchy, Cleanly

These concepts live at different layers. Mixing them up is where the confusion starts.

Program

A program is just code sitting on disk. A .jar file, an executable binary, a Python script. It’s not doing anything. It’s not running. It’s a blueprint.

Process

A process is what happens when you actually run that program. The OS loads it, sets up an isolated environment for it, and hands it the resources it needs:

Its own virtual address space (it thinks it owns all the memory)
Open files, sockets, handles
Security and execution context
At least one thread

The key insight: a process is a container and isolation boundary. It’s not the thing the CPU directly runs. It’s the environment where execution happens.

Thread

A thread is the actual unit of execution inside a process. Threads in the same process share the same memory space and code, but each one has its own:

Instruction pointer (where it is in the code)
Register state
Stack

The CPU executes threads, not processes. When we say “a process is running,” we mean one or more of its threads are running on a CPU core.

This is the most important distinction in this entire post. Everything else builds on it.
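To make the split concrete, here is a minimal Python sketch. The names (shared_log, worker) are made up for illustration. Three threads run inside one process: each keeps a private local variable on its own stack, and all of them write into the same list because they share the process’s memory.

```python
import os
import threading

shared_log = []                 # lives in the process's memory, visible to every thread
log_lock = threading.Lock()     # guards shared_log against concurrent appends


def worker() -> None:
    # `local_total` is a local variable on this thread's own stack;
    # no other thread can see it.
    local_total = sum(range(1000))
    with log_lock:
        shared_log.append((threading.current_thread().name, os.getpid(), local_total))


threads = [threading.Thread(target=worker, name=f"worker-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

pids = {pid for _, pid, _ in shared_log}
print(f"{len(shared_log)} threads reported, process id(s): {pids}")
```

Every entry carries the same process id, because the threads are execution paths inside one container rather than separate containers.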

Task

A task is not an OS primitive. It’s a unit of work - something you want to get done:

Handle an incoming HTTP request
Download a file
Sort a chunk of an array
Process a message from a queue

A task might be executed by one thread, split across several, or managed by an async runtime without a dedicated OS thread at all. It’s a higher-level concept.

The clean mental model:

Process = container
Thread = execution path
Task = work to be done
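A small sketch of the task/thread split, using Python’s standard thread pool. The task count, the pool size, and the handle_task function are arbitrary choices for illustration. Twelve tasks are submitted, but only three threads exist to run them.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_task(task_id: int) -> tuple[int, str]:
    # Report which pool thread happened to pick up this task.
    return task_id, threading.current_thread().name


# 12 units of work, at most 3 threads to execute them.
with ThreadPoolExecutor(max_workers=3, thread_name_prefix="worker") as pool:
    results = list(pool.map(handle_task, range(12)))

threads_used = {thread_name for _, thread_name in results}
print(f"{len(results)} tasks ran on {len(threads_used)} thread(s)")
```

Which thread ends up with which task is not something the code controls. The pool hands out work as threads become free.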

The House Analogy

This clicked for me:

Program = blueprint of a house
Process = the house built from that blueprint
Threads = people moving through the house, doing things
Tasks = the chores those people are doing

The house doesn’t “do” anything by itself. The people inside it do. And the chores are what actually needs to get done.

What the CPU Does vs. What the OS Does

People often blur these two together. They’re doing very different things.

The CPU

The CPU fetches instructions, decodes them, and executes them. That’s it. It’s the engine.

What it does not do is decide which process or thread should run next. It just executes whatever it’s been given.

The OS

The OS is the manager. It:

Creates and destroys processes and threads
Tracks their state (running, waiting, blocked, runnable)
Manages virtual memory and resources
Performs context switches
Schedules runnable threads onto CPU cores

The CPU executes instructions. The OS decides whose instructions.

One thing worth clarifying: the OS runs on the CPU too.

When your program makes a system call - opens a file, creates a thread, asks the OS for more memory - execution switches from your user-space code into kernel code.

The OS isn’t floating above the CPU. The CPU is executing OS kernel code just like it executes your application code.
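As a rough illustration, the functions in Python’s os module are thin wrappers around operating system calls. On Unix-like systems the calls below map closely onto the pipe, write, read, and close system calls. The exact mapping on other platforms is an assumption I’m not making here.

```python
import os

# Ask the kernel for a pair of connected file descriptors.
read_fd, write_fd = os.pipe()

# Each call below hands control to kernel code and then returns to our code.
bytes_written = os.write(write_fd, b"hello from user space")
os.close(write_fd)
data = os.read(read_fd, 1024)
os.close(read_fd)

print(bytes_written, data)
```

Between each of those lines, the same CPU that runs this script is briefly running kernel code on the script’s behalf.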

What Actually Happens When You Open a Program

1. The OS loads the program from disk and creates a process
2. That process gets its own virtual address space, resources, and handles
3. The process starts with at least one thread - the initial thread
4. The CPU begins executing that thread
5. The running thread can request more memory, files, network connections, or more threads
6. Any new threads become runnable
7. The OS scheduler picks which runnable thread gets CPU time next

Notice step 7.

The program doesn’t command the CPU: “run this thread right now”.

It creates a thread, the thread becomes runnable, and then the scheduler decides when it actually gets to run.

That distinction matters a lot when you’re debugging scheduling or latency issues.
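Here is a hedged sketch of that gap between “created” and “running”. The program asks for eight threads in a fixed order and records the order in which they actually got to run. The names are illustrative, and on many machines the two orders will happen to match. The point is that nothing in the code guarantees it.

```python
import threading

start_order = []                 # order in which threads actually got to run
order_lock = threading.Lock()


def announce(index: int) -> None:
    with order_lock:
        start_order.append(index)


# start() only asks for the thread to be created and made runnable;
# when it first gets CPU time is the scheduler's decision.
threads = [threading.Thread(target=announce, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # we can wait for a thread to finish, but not dictate when it runs

print("requested order:", list(range(8)))
print("observed order: ", start_order)
```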

Can Only the Main Thread Create More Threads?

No. In practice the main thread often creates the others, but that’s a convention, not a rule.

Any running thread can request the creation of additional threads. It depends on how the program is designed.

You might have a thread pool manager that spins up workers, or a worker thread that spawns subtasks. The OS doesn’t care which thread made the request.
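A minimal sketch of a non-main thread creating another thread. The thread names and the created_by dictionary are made up for this example.

```python
import threading

created_by = {}   # child thread name -> name of the thread that created it
created_by_lock = threading.Lock()


def child(parent_name: str) -> None:
    with created_by_lock:
        created_by[threading.current_thread().name] = parent_name


def worker() -> None:
    # This code runs on a non-main thread and creates another thread itself.
    me = threading.current_thread().name
    sub = threading.Thread(target=child, args=(me,), name=f"{me}-child")
    sub.start()
    sub.join()


w = threading.Thread(target=worker, name="worker-1")
w.start()
w.join()

print(created_by)   # {'worker-1-child': 'worker-1'}
```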

Context Switching

On a single CPU core, only one thread can run at a time (setting aside hardware features like simultaneous multithreading). But the OS can run dozens of threads “at the same time” from the user’s perspective - by rapidly switching between them.

When the OS stops one thread and starts another, it performs a context switch:

Saves the current thread’s state: registers, instruction pointer, stack pointer
Loads the state of the next thread
The CPU resumes the new thread as if it never stopped

This is what makes single-core concurrency possible. The threads aren’t actually running simultaneously - they’re taking very fast turns.
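The turn-taking can be modeled with a toy scheduler. This is a deliberately simplified sketch, not how a kernel does it. Python generators stand in for threads, because a paused generator remembers where it stopped and the values of its local variables, much like saved registers and an instruction pointer. Each yield stands in for a timer interrupt. Real operating systems preempt threads without their cooperation.

```python
from collections import deque


def thread_body(name, steps, trace):
    # The generator's paused frame plays the role of the saved state:
    # it remembers where it stopped and the current value of `i`.
    for i in range(steps):
        trace.append(f"{name}:{i}")
        yield  # simulated interrupt: hand the core back to the scheduler


def run_on_one_core(bodies):
    run_queue = deque(bodies)
    while run_queue:
        current = run_queue.popleft()      # "load" the next thread's state
        try:
            next(current)                  # run it until its next switch point
            run_queue.append(current)      # "save" it and send it to the back of the line
        except StopIteration:
            pass                           # this thread has finished; drop it


trace = []
run_on_one_core([thread_body("A", 3, trace), thread_body("B", 2, trace)])
print(trace)   # ['A:0', 'B:0', 'A:1', 'B:1', 'A:2']
```

Only one body runs at any moment, yet both A and B make progress over the same stretch of time.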

Concurrency vs. Parallelism

These two are not the same thing, and conflating them will eventually get you into trouble.

Concurrency

Multiple units of work are making progress during the same period of time.

This does not require multiple cores. On a single core, threads A and B are concurrent if the OS switches between them - both make progress over time, even though only one is running at any given instant.

Parallelism

Multiple things are literally executing at the same instant.

This requires multiple hardware execution units - multiple CPU cores, or similar. Only then can thread A and thread B both be running at the exact same moment.

	Single Core	Multiple Cores
Concurrency	✅ (via context switching)	✅
Parallelism	❌	✅

Concurrency is about structure - how you organize work. Parallelism is about execution - whether things literally happen at the same time.

And concurrency is broader than you might think. It’s not just “one process with many threads.” It includes:

Multiple threads in one process
Multiple separate processes
Async tasks and event loops
Coroutines
Green threads / goroutines

Async Is Concurrency Without Extra Threads

Thread-based concurrency is the most common mental model, but it’s not the only one.

In async/coroutine-based concurrency, a single thread can manage many tasks. The trick is that tasks yield when they’re waiting for something:

Task A starts a network request → yields
Task B runs
Task A’s data arrives → Task A resumes

No extra OS threads needed. The single thread is doing multiple things over time - which is still concurrency, just not the kind that comes from new Thread().

This is why runtimes and libraries like Node.js, Python’s asyncio, and Kotlin coroutines can handle thousands of concurrent requests without spawning thousands of threads. A task is a higher-level concept than a thread. Not every task needs its own OS thread.
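The Task A / Task B sequence above can be sketched with Python’s asyncio. The fetch function and the delays are stand-ins. asyncio.sleep plays the role of a network wait, and no real request is made.

```python
import asyncio
import threading

events = []


async def fetch(name: str, delay: float) -> None:
    events.append((f"{name} start", threading.get_ident()))
    await asyncio.sleep(delay)   # stands in for a network wait; yields to the event loop
    events.append((f"{name} done", threading.get_ident()))


async def main() -> None:
    # Task A waits longer, so Task B runs and finishes while A is suspended.
    await asyncio.gather(fetch("A", 0.05), fetch("B", 0.01))


asyncio.run(main())

labels = [label for label, _ in events]
thread_ids = {ident for _, ident in events}
print(labels, f"on {len(thread_ids)} thread")
```

Every event is recorded from the same thread. The interleaving comes from tasks giving up control at await, not from the OS switching threads.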

Process vs. Thread: The Practical Trade-off

	Process	Thread
Memory space	Isolated	Shared with other threads
Fault isolation	Strong (crash doesn’t take others down)	Weak (a crash can kill the whole process)
Creation cost	Higher	Lower
Communication	IPC (sockets, pipes, shared memory)	Direct shared memory
Risk	Safer	Race conditions, deadlocks

Use processes when isolation matters. Use threads when you need cheap creation and fast communication through shared memory, and are willing to manage that shared state carefully.
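A small sketch of the memory and communication rows of that table. The state dictionary and the one-line child program are made up for illustration. A thread changes shared data directly. A child process is a separate interpreter, so the value has to go in and come back out through pipes.

```python
import subprocess
import sys
import threading

state = {"counter": 0}


def bump() -> None:
    state["counter"] += 1   # a thread writes straight into the shared dict


t = threading.Thread(target=bump)
t.start()
t.join()
after_thread = state["counter"]

# A child process has its own memory and cannot see `state`. The value goes in
# through a pipe (stdin) and comes back through another pipe (stdout): a small
# instance of IPC.
child_code = "import sys; print(int(sys.stdin.read()) + 1)"
completed = subprocess.run(
    [sys.executable, "-c", child_code],
    input=str(state["counter"]),
    capture_output=True,
    text=True,
    check=True,
)
from_child = int(completed.stdout)

print(after_thread, from_child, state["counter"])   # 1 2 1
```

The child computed 2, but the parent’s counter is still 1. Nothing the child does can reach into the parent’s memory unless the parent explicitly reads the result back.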

A Concrete Example: Web Server

A web server makes this concrete:

1 program on disk
1 running process
8 worker threads
10,000 incoming requests (tasks)

On a 4-core machine, at most 4 of those threads can literally be running in parallel at any instant. The rest are either runnable and waiting for a core, or blocked on something like I/O. The 10,000 requests get distributed across threads - some waiting, some being processed, some already done.

Process = the environment the server lives in
Threads = the workers handling requests
Requests = the tasks
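The arithmetic behind the core count, as a tiny sketch. The function name is made up, and it assumes each core runs one thread at a time, which ignores simultaneous multithreading.

```python
def max_running_in_parallel(runnable_threads: int, cores: int) -> int:
    # With one thread per core, parallelism is capped by whichever number is smaller.
    return min(runnable_threads, cores)


worker_threads = 8
cores = 4

running = max_running_in_parallel(worker_threads, cores)
not_running = worker_threads - running
print(f"up to {running} running, at least {not_running} waiting or blocked")
```

Adding more worker threads beyond the core count raises concurrency, not parallelism. That still helps when threads spend most of their time blocked on I/O.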

The Model in One Line

A process owns resources, a thread executes instructions, a task is the work being done, the OS scheduler chooses which runnable thread gets CPU time, and the CPU executes that thread.

Internalize this and a lot of things that felt vague - concurrency bugs, async behavior, multi-core scaling - will start making a lot more sense.

Now go write something concurrent.