Go Internals - Memory
Introduction
In Go Internals - Scheduler we saw that every goroutine (G) carries its own stack. This article covers where that stack lives, how it grows, and what happens when a value cannot stay there.
We work from Go 1.23 source — mainly src/runtime/stack.go and the escape pass in cmd/compile/internal/escape/ — and tie everyday code (locals, pointers, closures, go statements) to concrete layout decisions. The subject is the stack–heap boundary: who owns a value's lifetime, and how the toolchain decides.
Stack vs heap: two lifetimes
Go programs use two primary places for object storage:
| | Stack (per goroutine) | Heap (process-wide) |
|---|---|---|
| Lifetime | Tied to the calling goroutine's stack frame | Until no live pointer references it (GC reclaims) |
| Cost to use | Bump pointer in the frame; no GC scan for the allocation itself | Allocator + eventual GC work |
| Who decides placement | Compiler escape analysis (mostly) | Compiler when value escapes |
| Concurrency | Each G has its own stack; no sharing | Shared; requires synchronization or immutability |
| Typical size | Kilobytes per goroutine, grows on demand | Limited by virtual memory and GC policy |
flowchart TB
subgraph G["Goroutine G"]
SF1["stack frame: f()"]
SF2["stack frame: g()"]
SF1 --> SF2
end
subgraph heap["Process heap"]
H1["escaped *T"]
H2["closure env"]
H3["slice backing array"]
end
SF2 -->|"pointer escape"| H1
SF1 -->|"go / closure"| H2
SF2 -->|"append grows beyond stack"| H3
Taking the address of a local does not automatically heap-allocate it. Escape analysis asks whether that address can outlive the frame. If the pointer never leaves the function, the value can stay on the stack even with &x.
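Counting allocations is one way to check this. The sketch below is illustrative only: `point`, `bump`, `localAddr`, `leakedAddr`, and `sink` are made-up names, and `testing.AllocsPerRun` is called from an ordinary program purely for convenience. With the standard gc toolchain, the first function is expected to allocate nothing and the second to allocate once per call.

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// sink is package-level; a pointer stored here can outlive any frame.
var sink *point

// bump uses its pointer argument but neither stores nor returns it.
//
//go:noinline
func bump(p *point) { p.x++ }

// localAddr takes &p, but the address never outlives this frame.
func localAddr() int {
	var p point
	bump(&p)
	return p.x
}

// leakedAddr stores &p in a global, so p has to live on the heap.
func leakedAddr() int {
	p := point{x: 1}
	sink = &p
	return p.x
}

func main() {
	fmt.Println("localAddr allocs/call: ", testing.AllocsPerRun(1000, func() { _ = localAddr() }))
	fmt.Println("leakedAddr allocs/call:", testing.AllocsPerRun(1000, func() { _ = leakedAddr() }))
}
```

The `//go:noinline` directive is there so that `bump` stays a real call; the result should be the same without it, but the example is easier to reason about.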
Neither side wins on speed alone. Stack use trades off against allocator pressure and GC tracing; heap use trades off against stack copying on growth and per-goroutine memory footprint.
Goroutine stacks in Go 1.23
Every runnable G has a stack segment the runtime manages in stack.go. New goroutines start small — on the order of 2 KiB on 64-bit platforms in recent Go versions — which is why spawning huge numbers of goroutines is viable. Most never need a megabyte-sized stack.
Since Go 1.3, growth copies the entire stack to a larger contiguous region instead of linking stack segments. That simplifies GC and stack scanning; the cost shows up when growth happens. Stacks have a maximum size (1 GiB on 64-bit); exceeding it triggers a fatal stack overflow, not silent corruption. The prologue of most functions compares the stack pointer against a guard value stored in the G; crossing the guard triggers growth or overflow handling before adjacent memory is clobbered.
The scheduler article's go worker() path allocates or reuses a G with this initial mapping. User code runs on that stack until the goroutine exits and the runtime recycles the G. In the 1.23 sources a stack that still has the default starting size stays attached to the recycled G, while a stack that has grown is freed rather than kept.
func main() {
go func() {
var buf [128]byte // likely on this goroutine's stack
_ = buf
}()
}
Deep call chains and large frame locals increase stack usage and can trigger morestack more often. For short-lived data that is still cheaper than heap allocation plus GC work, but it is not free.
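A rough way to see the per-goroutine footprint is to park many goroutines and compare `runtime.MemStats.StackInuse` before and after. This is a sketch under stated assumptions: `stackBytesPerGoroutine` is a hypothetical helper, the figure is an average that includes allocator rounding, and it will differ across Go versions and platforms.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stackBytesPerGoroutine parks n goroutines and reports the average growth
// of MemStats.StackInuse per goroutine. The result is approximate.
func stackBytesPerGoroutine(n int) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	stop := make(chan struct{})
	var started, finished sync.WaitGroup
	started.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			started.Done()
			<-stop // park with a shallow stack
		}()
	}
	started.Wait()
	runtime.ReadMemStats(&after)

	close(stop)
	finished.Wait()

	if after.StackInuse <= before.StackInuse {
		return 0 // stacks were served from memory already counted as in use
	}
	return (after.StackInuse - before.StackInuse) / uint64(n)
}

func main() {
	fmt.Println("approx stack bytes per goroutine:", stackBytesPerGoroutine(10000))
}
```

On a fresh process the number is usually a few kilobytes, which is consistent with the small starting stack described above.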
How stack growth works
When a function needs more stack space than the current segment allows, the compiled code does not call malloc. It hits a stack check inserted by the compiler; on failure it calls into the runtime's morestack path, which eventually runs newstack in stack.go.
sequenceDiagram
participant Fn as compiled function
participant MS as morestack
participant NS as newstack
participant Sched as scheduler
Fn->>Fn: SP near stackguard
Fn->>MS: stack check fails
MS->>NS: allocate larger stack
NS->>NS: copy frames + fix pointers
NS->>Fn: resume on new stack
Note over Sched: Gs on other Ms keep running during the copy
If stack frames contain pointers to other stack slots, the runtime rewrites those pointers after relocation so they refer to the new stack; pointers to heap objects are copied unchanged because the heap does not move. Stack growth is runtime code for that reason, not a simple realloc.
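The relocation can be observed from user code. In this sketch (all names are hypothetical), a pointer to a local is passed down a recursion deep enough to outgrow the initial stack, and the callee writes through it at the bottom. The `unsafe` conversion is used only to compare addresses; whether the address actually changes depends on the toolchain keeping `x` on the stack, which is an assumption here.

```go
package main

import (
	"fmt"
	"unsafe"
)

// descend burns stack: every frame holds a small array, and the recursion
// is deep enough that the starting stack cannot hold all frames.
//
//go:noinline
func descend(depth int, p *int) int {
	var pad [64]byte
	pad[depth%len(pad)] = 1
	if depth == 0 {
		*p = 99 // write through a pointer to a local in the outermost frame
		return int(pad[0])
	}
	return descend(depth-1, p) + int(pad[depth%len(pad)])
}

// growAndCheck returns the value written through the pointer and whether
// the local's address differed before and after the deep call.
func growAndCheck() (value int, moved bool) {
	x := 1
	before := uintptr(unsafe.Pointer(&x)) // address as a number, for comparison only
	descend(100_000, &x)
	after := uintptr(unsafe.Pointer(&x))
	return x, before != after
}

func main() {
	value, moved := growAndCheck()
	fmt.Println("value:", value, "address changed:", moved)
}
```

The write lands in `x` regardless of how many times the stack was copied, because the runtime fixed up `p` in every frame that held it.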
Growth typically doubles stack size until the needed frame fits (implementation details vary; the invariant is amortized O(1) push/pop for bounded depth). When a goroutine returns from deep recursion and stays idle, Go does not always shrink the stack immediately. The runtime may retain the larger mapping for reuse on that G, trading memory for fewer copy cycles.
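The amortized claim can be checked with simple arithmetic. The model below is a simplification, not the runtime's code: `doublings` is a hypothetical helper that assumes every growth step doubles the size and copies the entire old stack, which overstates the real copy cost.

```go
package main

import "fmt"

// doublings models growth by doubling from start until the stack holds at
// least need bytes. It returns the number of copies and an upper bound on
// the bytes copied, assuming each step copies the whole old stack.
func doublings(start, need int) (copies, bytesCopied int) {
	for size := start; size < need; size *= 2 {
		bytesCopied += size
		copies++
	}
	return copies, bytesCopied
}

func main() {
	copies, copied := doublings(2<<10, 1<<20) // 2 KiB up to 1 MiB
	fmt.Println(copies, copied)               // prints: 9 1046528
}
```

Nine copies move just under 1 MiB in total, so the copying work stays below the final stack size. That is the geometric-series bound behind the amortized cost.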
flowchart TD
A["function entry"] --> B{"SP < stackguard?"}
B -->|no| C["run function body"]
B -->|yes| D["morestack"]
D --> E{"needed size > limit?"}
E -->|yes| F["fatal: stack overflow"]
E -->|no| G["newstack: alloc + copy"]
G --> C
C --> H["return / shrink hooks"]
If you profile an app doing heavy recursion, time in runtime.newstack is stack growth, not heap allocation.
Escape analysis: the compiler decides
Whether a variable lives on the stack or the heap is decided at compile time by escape analysis in cmd/compile/internal/escape/. The runtime does not promote a stack variable to the heap mid-execution; either the compiler emitted a heap allocation, or it did not.
The analysis builds an escape graph: assignments, calls, returns, and closures add edges. If a pointer to a local can reach a point after the frame dies — return to caller, store in global, send on channel, capture by goroutine — the value escapes and the compiler inserts a heap allocation.
go build -gcflags="-m" ./...
-m prints escape decisions; -m -m adds more detail. Run it on a small package while learning — output gets noisy on large modules.
Stays on the stack
func sum(a, b int) int {
x := a + b
return x
}
No pointer leaves the frame; x is stack-only.
func swap(p, q *int) {
*p, *q = *q, *p
}
Pointers come in from the caller; locals that are not leaked stay on the stack.
Escapes to the heap
Returning a pointer to a local is the textbook case:
func newInt() *int {
n := 42
return &n // escapes: caller holds pointer after return
}
The compiler allocates n on the heap because the returned *int outlives newInt's frame.
Storing a concrete value in an interface{} often forces heap allocation — the interface word may point at data whose lifetime the compiler cannot bound:
func printAny(v interface{}) {
fmt.Println(v)
}
func main() {
printAny(3) // interface paths are common escape hot spots
}
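The boxing cost can be measured directly. This is a sketch with hypothetical names (`storeBoxed`, `storeTyped`, `boxed`, `typed`). It uses a run-time value of 100000 because the toolchain can box constants and very small integers without allocating, which would hide the effect.

```go
package main

import (
	"fmt"
	"testing"
)

var (
	boxed any // interface-typed destination
	typed int // concrete destination
)

// storeBoxed converts its int argument to an interface value.
//
//go:noinline
func storeBoxed(v int) { boxed = v }

// storeTyped keeps the value concrete by using a type parameter.
//
//go:noinline
func storeTyped[T any](dst *T, v T) { *dst = v }

func main() {
	n := 100_000
	fmt.Println("boxed allocs/call:", testing.AllocsPerRun(1000, func() { storeBoxed(n) }))
	fmt.Println("typed allocs/call:", testing.AllocsPerRun(1000, func() { storeTyped(&typed, n) }))
}
```

The generic version is only one option; a plain `func(int)` API avoids the conversion just as well when the set of types is known.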
Closures started with go typically heap-allocate captured variables because the new G can outlive the creating frame:
func main() {
x := 10
go func() {
fmt.Println(x) // x escapes
}()
}
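Maps and slices returned from a function follow the same rule. A map created with make(map...) normally keeps its buckets on the heap (the compiler can keep a small map on the stack only when it provably does not escape); returning the map forces heap storage and keeps it alive: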
func cache() map[string]int {
m := make(map[string]int)
m["k"] = 1
return m
}
Branch-dependent cases
func maybeEscape(debug bool) *[1]int {
var arr [1]int
if debug {
return &arr
}
return nil
}
Escape analysis is flow-insensitive: it does not track which branch actually runs. maybeEscape heap-allocates arr on every call because one branch returns its address, even when debug is false.
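The effect of the branch that is never taken can be observed by counting allocations. This sketch repeats `maybeEscape` and adds `//go:noinline`, an assumption made so the caller cannot inline the function and decide placement afresh in its own frame.

```go
package main

import (
	"fmt"
	"testing"
)

// maybeEscape returns the address of arr only when debug is true.
//
//go:noinline
func maybeEscape(debug bool) *[1]int {
	var arr [1]int
	if debug {
		return &arr
	}
	return nil
}

func main() {
	// debug is false on every call, yet arr is still expected to be heap-allocated.
	allocs := testing.AllocsPerRun(1000, func() { _ = maybeEscape(false) })
	fmt.Println("allocs per call with debug=false:", allocs)
}
```

Splitting the debug path into a separate function is one way to keep the common path allocation-free.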
Large stack objects — arrays above an internal threshold — may move to the heap even without a pointer leak, to avoid huge frames and expensive stack copies. Thresholds change between releases; -gcflags="-m" is authoritative for your toolchain.
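A size-driven move to the heap looks like this in practice. The sketch uses `make` with constant sizes rather than array declarations, and the sizes (1 KiB and 1 MiB) are assumptions picked to sit clearly on either side of whatever threshold your toolchain applies; `smallBuf` and `largeBuf` are hypothetical names.

```go
package main

import (
	"fmt"
	"testing"
)

// smallBuf uses a 1 KiB buffer that never leaves the frame.
//
//go:noinline
func smallBuf() byte {
	buf := make([]byte, 1<<10)
	buf[len(buf)-1] = 7
	return buf[len(buf)-1]
}

// largeBuf uses a 1 MiB buffer that also never leaves the frame.
//
//go:noinline
func largeBuf() byte {
	buf := make([]byte, 1<<20)
	buf[len(buf)-1] = 7
	return buf[len(buf)-1]
}

func main() {
	fmt.Println("1 KiB allocs/call:", testing.AllocsPerRun(100, func() { _ = smallBuf() }))
	fmt.Println("1 MiB allocs/call:", testing.AllocsPerRun(100, func() { _ = largeBuf() }))
}
```

Neither buffer leaks a pointer, so any difference in the two counts comes from size alone.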
Reading -m output
On a minimal example:
package main
func leak() *int {
n := 7
return &n
}
func main() {
_ = leak()
}
go build -gcflags="-m" -o /dev/null .
# ./main.go:5:2: moved to heap: n
# ./main.go:5:9: &n escapes to heap
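Each line gives a source position, the decision (moved to heap / escapes to heap), and the variable or expression involved. Plain -m does not name the sink; -m -m adds the flow that leads to it (return, channel send, global store). Exact wording and positions vary between toolchain versions, and recent releases also print inlining decisions alongside. When optimizing, fix the leak path (return by value, pass buffers in, reach for sync.Pool only after you have measured allocation) rather than removing *T at random.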
| Symptom in -m | Likely fix direction |
|---|---|
| escapes to heap via return | Return value type instead of pointer |
| escapes to heap via go func | Reduce capture; pass args by value |
| moved to heap: ... large array | Split buffer; use compile-time-sized stack array |
| ... flows to heap via interface{} | Concrete API or generics to avoid boxing |
If the compiler cannot prove a value is stack-safe, it allocates. Correctness comes before stack placement.
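Two of the fix directions above, returning a value and passing storage in, can be compared side by side. The `header` type and the three functions are hypothetical, and `//go:noinline` is an assumption that keeps the calls real so the counts reflect each signature rather than inlining.

```go
package main

import (
	"fmt"
	"testing"
)

type header struct {
	id   uint32
	size uint32
}

// newHeader returns a pointer, so each non-inlined call heap-allocates.
//
//go:noinline
func newHeader(id uint32) *header { return &header{id: id, size: 8} }

// makeHeader returns the struct by value; the caller decides where it lives.
//
//go:noinline
func makeHeader(id uint32) header { return header{id: id, size: 8} }

// fillHeader writes into storage the caller already owns.
//
//go:noinline
func fillHeader(dst *header, id uint32) { dst.id, dst.size = id, 8 }

func main() {
	fmt.Println("pointer return:", testing.AllocsPerRun(1000, func() { _ = newHeader(1) }))
	fmt.Println("value return:  ", testing.AllocsPerRun(1000, func() { _ = makeHeader(1) }))
	fmt.Println("caller storage:", testing.AllocsPerRun(1000, func() {
		var h header
		fillHeader(&h, 1)
	}))
}
```

For a small struct like this one, the by-value form is usually the simplest fix; the pass-in form matters more for large buffers.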
Stack maps and GC roots
When the garbage collector runs, it must find all live pointers, including those stored in goroutine stacks. The compiler emits stack maps, which are metadata describing which stack slots hold pointers at each safepoint. The runtime consults them when the GC scans goroutine stacks (the scanning code lives in mgcmark.go) and when stack.go relocates frames during growth.
flowchart LR
subgraph compile["Compile time"]
SSA["SSA + escape"]
SM["stack maps"]
SSA --> SM
end
subgraph runtime["Runtime GC"]
STK["scan goroutine stacks"]
HEAP["trace heap objects"]
SM --> STK
STK --> HEAP
end
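A pointer sitting in a stack local keeps the referenced heap object alive for as long as that variable is live. Liveness is computed by the compiler and can end at the last use rather than at function return, which is why runtime.KeepAlive exists. A long-running function that keeps using a *BigStruct local late in its body extends GC retention even if it rarely dereferences the pointer.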
Cooperative and async preemption (covered in the Scheduler article) stops goroutines at safepoints where stack maps are valid. Corrupt the stack layout and the GC's assumptions break with it.
Slice, string, and map headers
Several built-in types are small headers pointing at heap data. Escape analysis usually cares about the backing store, not the header words that may sit on the stack.
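A slice is three words (pointer, len, cap) that can sit on the stack. Where the initial backing array lives is an escape-analysis decision; any append that exceeds capacity allocates a new, larger backing array on the heap at run time: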
func grow() []int {
s := make([]int, 0, 2)
for i := 0; i < 100; i++ {
s = append(s, i)
}
return s
}
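To see how often the backing array is replaced, the loop can be instrumented by watching `cap`. This is a sketch: `growTracked` is a hypothetical variant of `grow`, and the exact capacities depend on the runtime's growth policy, so none are asserted here.

```go
package main

import "fmt"

// growTracked appends n ints starting from capacity 2 and counts how often
// append had to move the data to a new, larger backing array.
func growTracked(n int) (s []int, reallocs int) {
	s = make([]int, 0, 2)
	for i := 0; i < n; i++ {
		oldCap := cap(s)
		s = append(s, i)
		if cap(s) != oldCap {
			reallocs++ // capacity changed: append allocated a new array
		}
	}
	return s, reallocs
}

func main() {
	s, reallocs := growTracked(100)
	fmt.Println("len:", len(s), "cap:", cap(s), "reallocations:", reallocs)
}
```

Sizing the slice up front with `make([]int, 0, 100)` would bring the reallocation count to zero for this loop.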
A string is two words (pointer, length); the bytes live on the heap or in read-only static data for literals. A map header can live on the stack, and for a small non-escaping map the compiler may keep the first bucket there too, but additional buckets and overflow chains are heap-allocated as the map grows.
That is why "I only have a slice on the stack" can still mean megabytes on the heap. The Built-in Types article will go into slice, string, and hmap layouts; for now, the relevant part is that escape analysis and stack maps apply to the pointer words the GC must follow.
Stacks and the scheduler
Memory and scheduling share the G abstraction. A go statement creates a runnable G with an initial stack mapping. While the G is blocked, that stack stays mapped. When the G exits, the runtime recycles the G struct and its stack; any heap objects the goroutine referenced still need the GC.
Preemption at safepoints also depends on valid stack maps for root scanning — the same G ties scheduling events to memory metadata.
A goroutine leak leaks scheduler state and stack memory, plus whatever escaped heap graph the goroutine holds. That is run-queue pressure and virtual memory use, not only a GC problem:
func leakG() {
for {
go func() {
select {} // G never dies; stack stays mapped
}()
}
}
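The usual remedy is to give every goroutine a way out. The sketch below uses a `done` channel and a `sync.WaitGroup`; `startWorkers` and `parkedWhileRunning` are hypothetical names, and `runtime.NumGoroutine` is used only as a coarse observation of how many Gs are alive.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// startWorkers launches n goroutines that block until done is closed.
// The returned WaitGroup lets the caller wait for all of them to finish.
func startWorkers(n int, done <-chan struct{}) *sync.WaitGroup {
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			<-done // unlike select {}, this has a way out
		}()
	}
	return &wg
}

// parkedWhileRunning reports how many extra goroutines exist while the
// workers are blocked, then shuts them all down.
func parkedWhileRunning(n int) int {
	base := runtime.NumGoroutine()
	done := make(chan struct{})
	wg := startWorkers(n, done)
	extra := runtime.NumGoroutine() - base
	close(done) // release every worker
	wg.Wait()
	return extra
}

func main() {
	fmt.Println("goroutines parked before shutdown:", parkedWhileRunning(100))
}
```

A `context.Context` serves the same purpose when cancellation has to cross API boundaries.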
Heap allocation: when it is correct, when it is accidental
Heap allocation is the right choice for returned builders, shared caches, data that outlives the request goroutine, and maps or slices shared across goroutines. The mistake is accidental escape: returning *T for a small struct on every call, logging through interface{} at hot call sites, spawning go in a tight loop with captured variables, or keeping large [N]byte locals that the compiler moves to the heap or that force repeated stack growth.
sync.Pool can reduce allocation rate but does not change escape proofs — the pool still stores interface{} values or pointers.
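For reference, a minimal pooling sketch looks like the following. `bufPool` and `render` are hypothetical names; the pooled `*bytes.Buffer` values are still heap objects, and the pool only reduces how often new ones have to be created.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out *bytes.Buffer values. The pool stores pointers, so the
// buffers live on the heap either way.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render formats a greeting using a pooled buffer.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a recycled buffer may hold old contents
	defer bufPool.Put(buf) // hand it back for reuse
	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies, so the result is safe after Put
}

func main() {
	fmt.Println(render("gopher")) // prints: hello, gopher
}
```

Whether this helps depends on allocation rate and buffer size, which is why measuring first matters.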
Observing memory behavior
Start with escape reports on the smallest reproducer you can write:
go build -gcflags="-m -m" ./mypackage
For hot paths, confirm zero allocations in benchmarks:
func BenchmarkNoEscape(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_ = sum(1, 2)
}
}
Heap profiles show where allocations happen — often mallocgc — while -m shows why the compiler emitted them:
go test -memprofile=mem.prof -bench .
go tool pprof -http=:8080 mem.prof
runtime/trace (from the Scheduler article) helps correlate long-lived Gs with retained stacks. GODEBUG=gctrace=1 does not explain escape decisions, but it shows whether the live heap graph is growing in step with escaped allocations — a useful signal before the Garbage Collector article.
Conclusion
Stack storage is per goroutine and frame-scoped; the runtime grows it by copying in stack.go. Heap storage is shared, GC-managed, and chosen at compile time by escape analysis. The scheduler runs your code on stacks; the compiler emits stack maps so the GC can find pointers wherever they live.
Next in the series is Allocator — how mallocgc routes sizes through mcache, mcentral, and mheap.