I’ve been reviewing some performance-critical code lately, and I keep coming back to this pattern:
for {
    // tight polling loop
    if condition {
        break
    }
}
versus
go func() {
    for {
        // background processing
    }
}()
On the surface, these look similar: both use infinite loops. But the runtime implications are fascinating, and I think we don’t talk about this enough.
The performance rabbit hole
Here’s what got me thinking. I was optimizing a real-time data processor that needed sub-millisecond response times. The naive approach was throwing everything into goroutines:
go func() {
    for {
        data := <-inputChan
        result := process(data)
        outputChan <- result
    }
}()
Standard Go idiom, right? But the scheduler overhead was killing us: context switches, goroutine parking and unparking, all tiny costs that add up when you’re processing millions of events per second.
The solution? Sometimes a plain for {} in the main goroutine actually performs better:
func main() {
    for {
        select {
        case data := <-inputChan:
            result := process(data)
            outputChan <- result
        default:
            // yield briefly to the scheduler
            runtime.Gosched()
        }
    }
}
No goroutine creation overhead, no parking and unparking, just a tight loop that stays hot. The trade-off is that the default branch busy-spins, so this pattern deliberately dedicates a core to the loop.
When goroutines become overhead
This isn’t about goroutines being bad. They’re one of Go’s best features. But like any abstraction, they have costs. For most applications, those costs are negligible. For some applications, they matter.
I’ve seen codebases where developers reflexively wrap every loop in a goroutine because “concurrency is good”. But if you’re not actually doing concurrent work, you’re just adding overhead:
// Unnecessary overhead
go func() {
    for i := 0; i < len(data); i++ {
        process(data[i]) // sequential work anyway
    }
}()

// Just do the work
for i := 0; i < len(data); i++ {
    process(data[i])
}
The goroutine version doesn’t make this faster. It makes it slower. You’ve added scheduling overhead for no concurrent benefit.
How the Go Scheduler Changes Everything
To understand why this choice matters, you need to know how Go’s scheduler actually works. This isn’t just theory; it directly impacts your performance profile.
Go uses an M:N scheduler: many goroutines are multiplexed onto a smaller number of OS threads. The key insight is that goroutines are primarily cooperatively scheduled. They yield control at specific points:
- Channel operations
- Blocking system calls
- Memory allocation (which can trigger the garbage collector)
- Explicit runtime.Gosched() calls
- Function calls (the compiler inserts preemption checks at function prologues)
Here’s the critical difference: a tight for {} loop that hits none of these yield points will monopolize its OS thread until the runtime’s asynchronous preemption (added in Go 1.14) kicks in, roughly every 10ms.
Sometimes that’s exactly what you want: uninterrupted CPU time for a hot loop.
But when you wrap that same loop in a goroutine, it competes with other goroutines for scheduling time. Each time the scheduler runs (which happens frequently), there’s overhead:
// This might get preempted constantly
go func() {
    for {
        // tight computation
        result := expensiveCalculation()
        if result > threshold {
            break
        }
    }
}()

// This runs uninterrupted until natural yield points
for {
    result := expensiveCalculation()
    if result > threshold {
        break
    }
}
The scheduler overhead includes context switching, stack management, and the coordination between the scheduler and your goroutines. For most code, this is negligible. For tight loops processing millions of operations, it’s measurable.
The GMP Model in Practice
Go’s scheduler uses a GMP model: Goroutines (G) run on Machine threads (M) via Processors (P). Each P has a local run queue of goroutines, plus there’s a global run queue. When you create a goroutine, it gets queued for scheduling.
The scheduler’s work-stealing algorithm means goroutines can migrate between threads, which adds coordination overhead. For a single hot loop that doesn’t need concurrency, this is pure cost with no benefit.
I’ve started thinking about this choice through the lens of scheduler pressure:
- High-frequency tight loops: Run on main goroutine to avoid scheduling overhead
- Background/periodic work: Use goroutines for natural yielding and fairness
- I/O bound operations: Definitely goroutines (blocking syscalls trigger scheduler naturally)
- CPU-bound work that can be parallelized: Multiple goroutines, but be mindful of coordination costs
Cooperative Scheduling
Under cooperative scheduling, a goroutine holds the processor until it voluntarily yields at a known yield point, like a channel operation or a function call.
Async Preemption
But what about goroutines that never yield? Since Go 1.14, the sysmon background thread detects goroutines running longer than ~10ms and forces preemption via a SIGURG signal.
Real-World Patterns
In practice, I see three main patterns where this distinction matters:
1. Event loops in performance-critical paths:
// Main processing thread
for {
    select {
    case event := <-events:
        handleCriticalPath(event)
    case <-shutdown:
        return
    }
}
2. Background workers:
// Background cleanup, metrics, etc.
go func() {
    ticker := time.NewTicker(time.Minute)
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            cleanup()
        case <-ctx.Done():
            return
        }
    }
}()
3. Hybrid approaches:
// Main thread handles hot path
go backgroundWorker() // Cold path in goroutine

for {
    select {
    case urgent := <-urgentChan:
        handleUrgent(urgent) // Zero-copy, minimal overhead
    case routine := <-routineChan:
        routineWorkChan <- routine // Delegate to background
    }
}
Choosing the Right Pattern
The decision between a plain loop and a goroutine comes down to what you’re optimizing for. This decision tree captures the key branching points.
The nuance most don’t talk about
The real insight isn’t “loops vs goroutines”. It’s understanding when the scheduler helps you and when it gets in your way. Most Go education focuses on the happy path where goroutines solve everything. But production systems often need more surgical approaches.
I’ve seen systems gain 30% throughput just by moving one critical loop out of a goroutine. I’ve also seen systems become unresponsive because someone removed goroutines that were providing necessary yielding points.
The trick is knowing which scenario you’re in.
When does this actually matter?
To be clear, this level of optimization matters for maybe 5% of Go applications. If you’re building typical web services, CRUD apps, or data pipelines, just use goroutines everywhere and call it a day. The scheduler overhead is negligible compared to I/O, database calls, and network latency.
But if you’re building at scale, then these micro-optimizations can make the difference between meeting your SLAs and missing them.
The takeaway
Go’s runtime gives you both tools for a reason. Goroutines for most things, plain loops when you need maximum control. The art is recognizing which is which.
It’s not about being clever. It’s about understanding your performance profile and choosing the right abstraction for the job.