
Go Tool Pprof

Profiling is a common concept across programming languages; it helps inspect a program's state, such as CPU and memory usage. You may assume that only languages with a runtime can be profiled, but that's not true. Languages without a runtime, like C and C++, can be profiled as well through diverse approaches: for example, the compiler can insert instrumentation instructions, or the profiler can interrupt the program periodically to record its state.

This post is half done, as I don't have enough knowledge about Go runtime management.

In this blog, I focus on the profiler in the Go language, which is called pprof. It consists of two parts:

  • the profiler, which, at the code level, writes the runtime profiling data in the format expected by the pprof visualization tool
  • the visualization tool, which analyzes that output data

This blog is not a manual, since manuals can easily be found across the internet; instead, it outlines the concepts and examines the implementation.

Pprof Usages

This section briefly reviews the APIs exported by the pprof package; it overlaps with the pprof godoc and only takes some outline notes.

  • The go test command natively supports profiling, so we can pass the profiling flags directly:

    go test -cpuprofile cpu.prof -memprofile mem.prof -bench .
    

  • A normal Go program needs to integrate the profiling code manually (see the sketch after this list).

  • The net/http/pprof package is more convenient to use.
  • Use go tool pprof to visualize the results.
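
For the manual-integration case, a minimal sketch could look like the following; the file name cpu.prof and the placement of the workload are placeholders, not a prescribed layout.

package main

import (
    "log"
    "os"
    "runtime/pprof"
)

func main() {
    // Create the file that will receive the CPU profile.
    f, err := os.Create("cpu.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    // Start sampling; data is streamed to f until StopCPUProfile is called.
    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    defer pprof.StopCPUProfile()

    // ... the actual workload goes here ...
}

After the program exits, the result can be inspected with go tool pprof cpu.prof.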

Profile

The Profile struct is a collection of stack traces showing the call sequences that led to instances of a particular event, such as allocation.

A stack trace here means the call stack captured from the function that records into the profile, and a particular event refers either to predefined events such as memory allocation and function calls, or to custom, user-defined events.

The most common use of custom profiles is tracking resources that must be explicitly closed, such as files or network connections. Users can create their own profiles and then record stack traces into them via the Add API.

Built-in Profiles

There are global profiles created by the pprof library so that it can track goroutines, thread creation, the heap, allocations, blocking, and mutex contention.

profiles.m = map[string]*Profile{
  "goroutine":    goroutineProfile,
  "threadcreate": threadcreateProfile,
  "heap":         heapProfile,
  "allocs":       allocsProfile,
  "block":        blockProfile,
  "mutex":        mutexProfile,
}

They are stored inside a global map and can be retrieved by name with the Lookup function. The CPU profile is not available as a Profile; it has a special API, the StartCPUProfile and StopCPUProfile functions, because it streams output to a writer during profiling.
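
For instance, a built-in profile can be looked up by name and queried for how many stacks it currently holds; a minimal sketch:

package main

import (
    "fmt"
    "runtime/pprof"
)

func main() {
    // Retrieve the built-in goroutine profile from the global map by its name.
    p := pprof.Lookup("goroutine")
    // Count reports the number of execution stacks currently in the profile.
    fmt.Println(p.Name(), p.Count())
}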

One of the common uses is the net/http/pprof package, which exposes these profilers over an HTTP server. Since this blog focuses on how things work rather than usage details, the introduction stops here.
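
As a quick illustration, registering the HTTP handlers only takes a blank import; the address localhost:6060 is just the conventional example value.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
    // The profiles are then served under http://localhost:6060/debug/pprof/.
    log.Fatal(http.ListenAndServe("localhost:6060", nil))
}

go tool pprof can then read a profile straight from a URL, for example go tool pprof http://localhost:6060/debug/pprof/heap.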

Custom Profiles

package main

import (
    "os"
    "runtime/pprof"
)

func prof() {
    // Create a custom profile named "hello" and register it in the global map.
    profile := pprof.NewProfile("hello")
    // Record the current call stack, keyed by the value 1, skipping 1 frame.
    profile.Add(1, 1)
    // Write the profile to stdout in the legacy text format (debug=1).
    profile.WriteTo(os.Stdout, 1)
}

func main() {
    prof()
}

The output looks like this:

hello profile: total 1
1 @ 0x100469da4 0x100469dfc 0x1003f2b5c 0x1004220b4
#       0x100469da3     main.prof+0x43          /Users/yuchen.xie/GolandProjects/awesomeProject/main.go:10
#       0x100469dfb     main.main+0x1b          /Users/yuchen.xie/GolandProjects/awesomeProject/main.go:15
#       0x1003f2b5b     runtime.main+0x28b      /Users/yuchen.xie/workspace/go/src/runtime/proc.go:271

Pprof Profiler

The profiler resides under src/runtime/pprof as it is heavily coupled with the runtime. That is natural: if the runtime's internal state could easily be retrieved from outside, why would we need another mechanism to collect it?

Built-in Profilers

Each built-in profiler is defined by a count function and a write function, like this:

var xxxProfile = &Profile{
    name:  "xxx",
    count: countXxx,
    write: writeXxx,
}

This format distinguishes the built-in profilers from custom profilers, which allow you to call the Add and Remove methods; the built-in profilers panic when Add or Remove is called.

As I lack knowledge about Go runtime scheduling and memory management, I cannot yet explain how the built-in profilers work.

TODO: I will finish this part in the future if I have time.

Custom Profiler

The pprof package exports the Profile type and its methods to ease integration. Generally, it is just a thin wrapper around runtime.Callers plus formatting.

NewProfile

All profiles are managed inside the pprof package's global variable profiles, so every profile created by NewProfile is automatically added to it. The identifier is the name, and no duplication is allowed.
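
The logic is essentially: lock the global map, panic on an empty or duplicate name, allocate a Profile, and register it under its name. A rough sketch, not a verbatim copy of the runtime source:

func NewProfile(name string) *Profile {
    lockProfiles()
    defer unlockProfiles()
    if name == "" {
        panic("pprof: NewProfile with empty name")
    }
    if profiles.m[name] != nil {
        panic("pprof: NewProfile name already in use: " + name)
    }
    p := &Profile{
        name: name,
        m:    map[any][]uintptr{},
    }
    profiles.m[name] = p
    return p
}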

Add and Remove

Add adds the current execution stack to the profile, associated with value. It first checks whether the profile is a built-in one by looking at the write field, and panics if so. Then it calls runtime.Callers to retrieve the caller stack and stores it in the profile's map. There is a fallback for when skip is too large and no stack trace is captured at all:

    stk := make([]uintptr, 32)
    n := runtime.Callers(skip+1, stk[:])
    stk = stk[:n]
    if len(stk) == 0 {
        // The value for skip is too large, and there's no stack trace to record.
        stk = []uintptr{abi.FuncPCABIInternal(lostProfileEvent)}
    }

The test case TestEmptyCallStack exercises this fallback by calling:

p.Add("foo", 47674)

Remove is simple: it just deletes the entry from the profile's map by key.
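
To make the Add/Remove pair concrete, here is a hedged sketch of the resource-tracking use case mentioned earlier; the profile name example.open.files and the trackedFile wrapper are made up for illustration.

package main

import (
    "os"
    "runtime/pprof"
)

// openFiles is a custom profile that records which call sites opened files.
var openFiles = pprof.NewProfile("example.open.files")

type trackedFile struct {
    *os.File
}

func open(name string) (*trackedFile, error) {
    f, err := os.Open(name)
    if err != nil {
        return nil, err
    }
    t := &trackedFile{f}
    // Record the opening call stack, keyed by the wrapper value; skip=1 hides open itself.
    openFiles.Add(t, 1)
    return t, nil
}

func (t *trackedFile) Close() error {
    // Drop the entry once the resource is released, so only still-open files remain.
    openFiles.Remove(t)
    return t.File.Close()
}

func main() {
    if f, err := open("/etc/hosts"); err == nil {
        f.Close()
    }
    // Any file still open at this point would show up in the dump below.
    openFiles.WriteTo(os.Stdout, 1)
}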

Write

WriteTo writes a pprof-formatted snapshot of the profile to a writer. It supports both the gzip-compressed protocol buffer format and the old legacy text format. The code logic is straightforward: it only flushes the collected data into the expected output format.
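
As a quick illustration of the debug parameter, here is a small sketch that dumps the built-in goroutine profile in both formats; the output file name is arbitrary.

package main

import (
    "os"
    "runtime/pprof"
)

func main() {
    // debug=0: gzip-compressed protocol buffer, the format go tool pprof expects.
    f, err := os.Create("goroutine.pb.gz")
    if err == nil {
        pprof.Lookup("goroutine").WriteTo(f, 0)
        f.Close()
    }

    // debug=1: the legacy, human-readable text format with symbolized addresses.
    pprof.Lookup("goroutine").WriteTo(os.Stdout, 1)
}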
