
Introduction to Goroutines

This section covers the theoretical concepts of goroutines in Go and concurrency in general. The next section will focus on hands-on practice.


Main topics:

  • Goroutines
  • Concurrency
  • IO-bound vs CPU-bound concurrency
  • Threads vs goroutines vs asyncio
  • Coroutines vs goroutines
  • Preemptive vs cooperative scheduling

1. What is a Goroutine?

A goroutine is an independent function that runs concurrently with other functions in Go. For those familiar with threads, you can think of a goroutine as a much lighter-weight thread; the differences are detailed in Section 3 below.

For example, consider a browser app that opens three tabs, all running concurrently:

```mermaid
flowchart TB

    subgraph Goroutine3["Tab 3 (Download)"]
        direction TB
        G3S1["Step 1: Start Download"]
        G3S2["Step 2: Monitor Progress"]
        G3S3["Step 3: Save File"]
        G3S1 --> G3S2 --> G3S3
    end

    subgraph Goroutine2["Tab 2 (Video)"]
        direction TB
        G2S1["Step 1: Connect Stream"]
        G2S2["Step 2: Buffer Video"]
        G2S3["Step 3: Play Video"]
        G2S1 --> G2S2 --> G2S3
    end

    subgraph Goroutine1["Tab 1 (News)"]
        direction TB
        G1S1["Step 1: Send Request"]
        G1S2["Step 2: Wait for HTML"]
        G1S3["Step 3: Display Page"]
        G1S1 --> G1S2 --> G1S3
    end
```
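To make the diagram concrete, here is a minimal sketch of the three tabs as goroutines. The tab names and steps mirror the diagram; `runTabs` is a made-up helper for illustration only:

```go
package main

import (
	"fmt"
	"sync"
)

// runTabs launches one goroutine per browser tab from the diagram
// and returns how many steps completed across all tabs.
func runTabs() int {
	steps := map[string][]string{
		"Tab 1 (News)":     {"Send Request", "Wait for HTML", "Display Page"},
		"Tab 2 (Video)":    {"Connect Stream", "Buffer Video", "Play Video"},
		"Tab 3 (Download)": {"Start Download", "Monitor Progress", "Save File"},
	}
	done := make(chan string, 9) // buffered: 3 tabs x 3 steps
	var wg sync.WaitGroup
	for name, list := range steps {
		wg.Add(1)
		go func(name string, list []string) { // one goroutine per tab
			defer wg.Done()
			for _, s := range list {
				done <- name + ": " + s // steps within a tab stay in order
			}
		}(name, list)
	}
	wg.Wait()
	close(done)
	n := 0
	for range done {
		n++
	}
	return n
}

func main() {
	fmt.Println(runTabs(), "steps finished concurrently") // 9 steps
}
```

Each tab runs its own steps sequentially, while the tabs themselves interleave freely, exactly as the diagram suggests.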

2. main() is a Goroutine

In Go, even the main() function is executed as a goroutine — called the main goroutine. That means every Go program always starts with at least one goroutine.

From there:

  • We can create additional goroutines using the go keyword.
  • All goroutines, including the main one, run concurrently.
  • When the main goroutine ends, the whole program terminates, even if other goroutines are still running.

So, every Go application is composed of at least one goroutine (main), and we can scale concurrency by adding more goroutines when needed.
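A minimal sketch of these points, using a made-up `runWorkers` helper: extra goroutines are created with the `go` keyword, and a `sync.WaitGroup` keeps the main goroutine alive until they finish:

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts n extra goroutines with the go keyword and waits
// for all of them; it returns how many actually ran.
func runWorkers(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	// Without this wait, main could return first and terminate the
	// whole program before the workers ever run.
	wg.Wait()
	return count
}

func main() {
	fmt.Println(runWorkers(3)) // prints 3: all workers finished before main exited
}
```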

3. Threads vs Goroutines

3.1 Threads in OS

  • Modern hardware (microprocessors) often provides multiple cores, sometimes in the dozens or even hundreds. Each core acts as an independent processing unit capable of executing instructions.
  • At the operating system level, these cores are utilized through the concept of threads. A thread is the smallest unit of execution, and typically, the OS scheduler maps threads to cores. Multiple threads can be run concurrently across cores.

Therefore, multithreading depends on both the OS kernel features (such as thread scheduling, context switching, etc.) and the number of available CPU cores.

  • The Go programming language offers a higher-level concurrency model through goroutines, which are lightweight functions that can run independently. Goroutines are not tied 1:1 to OS threads. Instead, the Go runtime uses a scheduler that maps many goroutines onto a smaller number of threads.

  • This is known as an M:N scheduling model: M goroutines are multiplexed onto N OS threads, which in turn are scheduled on physical CPU cores. This enables Go to support massive concurrency with relatively low overhead.

3.2 M:N scheduling model

  • The diagram below illustrates the M:N scheduling model where multiple goroutines (GR1 to GR6) are multiplexed onto fewer OS threads (OS Thread-1 and OS Thread-2).
  • These OS threads, in turn, are managed by the OS kernel, enabling efficient concurrency by allowing many goroutines to run on a smaller number of threads.
```mermaid
graph TD
    OSK[OS Kernel]

    OSK --> T1["OS Thread-1"]
    OSK --> T2["OS Thread-2"]

    T1 --> GR1[GR1]
    T1 --> GR2[GR2]
    T1 --> GR3[GR3]

    T2 --> GR4[GR4]
    T2 --> GR5[GR5]
    T2 --> GR6[GR6]
```

3.3 Goroutines compared to threads

  • Goroutines are lighter than threads (thousands or millions can run on a few OS threads).
  • They are managed by the Go runtime, not the operating system.
  • They have smaller stack sizes (grow/shrink dynamically), unlike fixed-size thread stacks.

3.4 Comparison table

| Aspect | OS Threads (Kernel-Managed) | Goroutines (Go Runtime-Managed) |
|---|---|---|
| Management | Managed by the operating system kernel | Managed by the Go runtime scheduler (user space) |
| Stack Size | Fixed, usually 1–2 MB reserved per thread | Starts small (~2 KB) and grows/shrinks dynamically |
| Creation Cost | Expensive (system calls required) | Very cheap (a function call plus small runtime setup) |
| Context Switch | Handled by the OS; involves saving/restoring registers and stacks | Handled in user space by the Go scheduler; much cheaper |
| Scheduling | Kernel schedules all system threads across processes | Go runtime uses M:N scheduling (many goroutines on few OS threads) |
| Scalability | Limited (hundreds or thousands; memory-heavy) | Extremely scalable (hundreds of thousands or millions) |
| Communication | Usually needs locks, mutexes, or shared memory | Done via channels (safe, blocking, typed communication) |
| Overhead | Higher memory use and kernel involvement | Low memory use and efficient runtime management |

4. How many threads a system can support

On Linux, threads are basically lightweight processes (tasks) managed by the kernel.

Each thread requires:

  • Stack memory (typically 1–2 MB per thread by default)
  • Kernel data structures (task_struct, file descriptors, etc.)

In practice, Linux can often handle several thousand threads per process.

Example: On an Intel PC

  • Suppose we have 16 GB RAM
  • Default stack size per thread is 2 MB

    Max threads ≈ 16 GB / 2 MB ≈ 8,000
    


4.1 How many goroutines a system can support

Goroutines are much lighter:

Example: On the same Intel PC

  • Suppose we have 16 GB RAM.
  • Initial stack is ~2 KB (grows dynamically)

    Max goroutines ≈ 16 GB / 2 KB ≈ 8,000,000
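The two estimates above can be checked with a few lines of Go. Nothing here queries the actual machine; the 16 GB figure and the stack sizes are the assumptions from the examples:

```go
package main

import "fmt"

func main() {
	const KB = 1 << 10
	const MB = 1 << 20
	const GB = 1 << 30

	ram := int64(16 * GB)
	// 2 MB reserved per OS thread stack vs ~2 KB initial goroutine stack.
	fmt.Println("max threads    ≈", ram/(2*MB)) // 8192, i.e. ~8,000
	fmt.Println("max goroutines ≈", ram/(2*KB)) // 8388608, i.e. ~8,000,000
}
```

The three orders of magnitude between the two numbers come entirely from the stack-size difference.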

5. Processor vs IO concurrency

Concurrency can be broadly categorized as:

  • Processor-bound (CPU-bound) concurrency
  • I/O-bound concurrency

| Feature | Processor-bound (CPU-bound) | I/O-bound |
|---|---|---|
| Definition | Tasks limited by CPU computation | Tasks limited by waiting for I/O |
| Characteristics | Heavy computations, minimal waiting | Mostly waiting for external resources |
| Concurrency model | OS threads or goroutines on multiple cores | Lightweight threads, goroutines, or async/event loop |
| Parallelism | True parallelism improves throughput | Parallelism mostly logical (many tasks waiting concurrently) |
| CPU usage | High | Low |
| Example tasks | Image/video processing, simulations, math computations | Web servers, network requests, database queries |
| Optimization focus | Maximize CPU core usage | Maximize concurrency, minimize blocking |

```mermaid
graph LR
  subgraph IO_bound
    LOOP["Event Loop (manages IO)<br/>(Runs as a Thread)"]
    LOOP --> IO-Task-1["IO-Task-1"]
    LOOP --> IO-Task-2["IO-Task-2"]
    LOOP --> IO-Task-3["IO-Task-3"]
  end

  subgraph "CPU-bound (SMT Core)"
    MCPU["SMT Core"]
    MCPU --> MTh1["Thread-1"]
    MCPU --> MTh2["Thread-2"]
  end

  subgraph "CPU-bound"
    CPU["Simple Core-1"]
    CPU --> Th1["OS Thread-1"]
    CPU2["Simple Core-2"]
    CPU2 --> Th2["OS Thread-2"]
  end
```

Note:

  • CPU-bound: threads or goroutines doing heavy computations, mapped directly to cores
  • A simple core runs one thread at a time; a Simultaneous Multithreading (SMT) core can run more than one.
  • I/O-bound: async tasks multiplexed over a few OS threads or a single event loop
  • A goroutine can be pinned to a single OS thread when performance demands it; alternatively, many goroutines (even thousands) can share one thread when raw speed matters less than handling many concurrent I/O operations.
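Go exposes this thread binding directly through `runtime.LockOSThread`. A small sketch, where `pinnedInfo` is a made-up helper name:

```go
package main

import (
	"fmt"
	"runtime"
)

// pinnedInfo pins the calling goroutine to its current OS thread
// (as C libraries or graphics APIs with thread affinity require) and
// reports how many threads may run Go code at once and how many
// logical cores the machine exposes.
func pinnedInfo() (gomaxprocs, numCPU int) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	return runtime.GOMAXPROCS(0), runtime.NumCPU()
}

func main() {
	p, c := pinnedInfo()
	fmt.Println("GOMAXPROCS:", p, "NumCPU:", c)
}
```

`GOMAXPROCS(0)` only reads the current setting; by default it equals the number of logical cores.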

5.1 Event loop (in IO-bound)

An event loop is a programming construct that repeatedly checks for and dispatches events or tasks. It allows a program to handle many I/O-bound operations concurrently on a single thread by switching between tasks whenever one is waiting for I/O.

```mermaid
graph TD
  EventLoop["Event Loop<br/>(Single Thread)"]

  EventLoop --> Task1["Task1"]
  EventLoop --> Task2["Task2"]
  EventLoop --> Task3["Task3"]

  Task1 --> Wait1["Waiting for I/O"]
  Task2 --> Wait2["Waiting for I/O"]
  Task3 --> Wait3["Waiting for I/O"]

  Wait1 --> Poll["Event Loop<br/>polls repeatedly"]
  Wait2 --> Poll
  Wait3 --> Poll
```

  • The Event Loop is a single thread that repeatedly checks tasks.
  • Each task that is waiting for I/O yields control.
  • The loop continues to poll and schedule ready tasks, allowing high concurrency without multiple threads.
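In Go, the same pattern can be sketched with a single loop that selects over channels. Here `ioTask` and `runLoop` are made-up helpers, and the sleeps stand in for real I/O waits:

```go
package main

import (
	"fmt"
	"time"
)

// ioTask simulates an I/O operation that completes after d and
// delivers its name on a channel.
func ioTask(name string, d time.Duration) <-chan string {
	ch := make(chan string, 1)
	go func() {
		time.Sleep(d) // stand-in for waiting on a socket, disk, etc.
		ch <- name
	}()
	return ch
}

// runLoop acts like a tiny event loop: one goroutine repeatedly
// selects whichever task has completed, never blocking on just one.
func runLoop(t1, t2, t3 <-chan string) []string {
	var order []string
	for i := 0; i < 3; i++ {
		select {
		case n := <-t1:
			order = append(order, n)
		case n := <-t2:
			order = append(order, n)
		case n := <-t3:
			order = append(order, n)
		}
	}
	return order
}

func main() {
	order := runLoop(
		ioTask("Task1", 30*time.Millisecond),
		ioTask("Task2", 10*time.Millisecond),
		ioTask("Task3", 20*time.Millisecond),
	)
	fmt.Println(order) // tasks with shorter waits tend to finish first
}
```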

6. Where do Goroutines fit?

Goroutines are general-purpose lightweight concurrency units.

  • They are excellent for I/O-bound workloads,
  • but they can also handle CPU-bound workloads up to the number of available CPU cores.

6.1 I/O-bound

  • Goroutines are more suited here.
  • We can run thousands of concurrent I/O operations (network calls, disk reads, etc.) very efficiently.
  • This is similar to asyncio or Node.js’s event loop, but with a simpler model (you just write normal-looking code with goroutines + channels).
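A sketch of the I/O-bound pattern: `fetchAll` is a made-up helper, and the hostnames and 5 ms sleep are placeholders for real network calls:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetchAll launches one goroutine per "request" and collects the
// results over a channel; all requests wait concurrently.
func fetchAll(urls []string) []string {
	results := make(chan string, len(urls)) // buffered so senders never block
	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			time.Sleep(5 * time.Millisecond) // stand-in for network latency
			results <- "fetched " + u
		}(u)
	}
	wg.Wait()
	close(results)
	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	out := fetchAll([]string{"a.example", "b.example", "c.example"})
	fmt.Println(len(out), "responses") // all three waits overlap
}
```

Because the waits overlap, total time stays near the slowest single request rather than the sum of all of them.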

6.2 CPU-bound

  • Goroutines also work well for processor-bound tasks (the kind traditionally handled by threads),
  • However, they’re limited by the number of CPU cores. The Go runtime maps goroutines onto OS threads, and the OS threads onto cores.
  • So, if you have an 8-core machine, you’ll get at most 8 CPU-heavy goroutines truly running in parallel.
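A CPU-bound sketch of that limit: `parallelSum` is an illustrative helper that caps the number of worker goroutines at `runtime.NumCPU()`, since extra CPU-heavy goroutines beyond the core count add no real parallelism:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelSum splits a CPU-bound summation across at most one
// goroutine per logical core.
func parallelSum(nums []int) int {
	workers := runtime.NumCPU()
	if workers > len(nums) {
		workers = len(nums)
	}
	if workers == 0 {
		return 0
	}
	chunk := (len(nums) + workers - 1) / workers // ceiling division
	partial := make([]int, workers)              // one slot per worker: no locking needed
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if lo > len(nums) {
			lo = len(nums)
		}
		if hi > len(nums) {
			hi = len(nums)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			for _, v := range nums[lo:hi] { // each core sums its own slice
				partial[w] += v
			}
		}(w, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	nums := make([]int, 1000)
	for i := range nums {
		nums[i] = i + 1
	}
	fmt.Println(parallelSum(nums)) // 500500
}
```

Each worker writes to its own slot in `partial`, so no mutex is needed during the hot loop.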

6.3 Asyncio event loop

  • asyncio is mainly for I/O-bound concurrency (single thread can juggle many tasks, but CPU-heavy work will block it).

  • Goroutines, being mapped onto OS threads, can do both I/O and CPU-bound concurrency.

6.4 Comparison of concurrency techniques

| Technique | Best For | Characteristics | Example Scenarios |
|---|---|---|---|
| OS Threads | CPU-heavy parallelism | Heavyweight, limited in number, managed by OS | Video rendering, physics simulations, game engines |
| Goroutines (Go) | Mixed CPU + I/O, scalable tasks | Lightweight, multiplexed on OS threads, simple API | Web servers, microservices, proxies, IoT collectors |
| Asyncio / Event Loop | I/O-heavy concurrency (low CPU) | Single-threaded, cooperative multitasking | Chat servers, web scrapers, GUI event handling, lightweight web apps |


7. Coroutines vs goroutines

A coroutine is a lightweight function that can pause (yield) and resume later, allowing cooperative multitasking within a program.

  • Coroutines and goroutines are both ways to run multiple tasks seemingly at the same time.

  • Coroutines are functions that the programmer can control. They can pause themselves using something like yield or await, and then resume later.

  • This is called cooperative scheduling, because the coroutine has to decide when to give up control.

  • Goroutines, on the other hand, are managed by the Go runtime, which automatically decides when to pause and resume them. This is called preemptive scheduling.

  • So, coroutines give programmers more manual control, but goroutines make concurrency easier by handling all the scheduling under the hood.

| Feature | Coroutines | Goroutines |
|---|---|---|
| Scheduling type | Cooperative (controlled by the programmer) | Preemptive (controlled by the Go runtime) |
| Pause/resume | Manual, using yield, await, etc. | Automatic by the runtime; no need to yield manually |
| Blocking risk | Yes, if a coroutine forgets to yield | No, Go handles blocking transparently |
| Ease of use | Requires discipline to manage yielding correctly | Very easy: just use go myFunction() |
| Syntax required | Needs special keywords (yield, await, suspend) | No special syntax; a normal function with the go keyword |
| Managed by | Language/runtime or libraries (e.g., asyncio, trio) | Go's built-in runtime scheduler |
| Flexibility | More control for advanced use cases | Simpler, less error-prone for general concurrency |
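Go has no yield keyword, but a blocking channel send inside a goroutine behaves much like one: the producer pauses until the consumer is ready. A generator-style sketch, with `counter` as a made-up helper:

```go
package main

import "fmt"

// counter imitates a coroutine-style generator using a channel: each
// send is the analog of a yield, blocking until the consumer resumes it.
func counter(n int) <-chan int {
	out := make(chan int) // unbuffered: producer and consumer alternate
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i // "yield" i; this goroutine parks until the value is received
		}
	}()
	return out
}

func main() {
	for v := range counter(3) {
		fmt.Println(v) // 1, 2, 3
	}
}
```

The difference from a true coroutine is that the Go scheduler, not the programmer, decides when each side runs; the channel only enforces the hand-off.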