Pinpointing CPU Spikes in Node.js Services Using Clinic.js Flame

May 29, 2026

Your Node.js service looks fine in staging. Then production gets a burst of traffic and response times triple. The CPU pegs at 100% for a few seconds, then recovers, and you have no idea what caused it. console.log timestamps won't save you here. You need a flamegraph.

Clinic.js Flame is the fastest way to go from "something is eating CPU" to "this exact function is the bottleneck" without touching your production deploy.

What you'll learn

  • How to install and run Clinic.js Flame against a local or staging Node.js service
  • How to read a flamegraph and identify hot paths
  • How to drive realistic load so the profile actually captures the spike
  • Common traps that produce misleading flamegraphs
  • Concrete next steps once you find the culprit

Prerequisites

You'll need Node.js 16 or later, npm or yarn, and a service you can run locally or in a staging environment. Clinic.js works on Linux, macOS, and WSL2 on Windows. Native Windows support is limited, so if you're on Windows, use WSL2.

You'll also want a load-generation tool. autocannon (from the same Clinic.js ecosystem) pairs well, but wrk or even a simple for loop with curl will work.

Installing Clinic.js

Install Clinic.js globally so you can wrap any service without modifying its package.json:

npm install -g clinic
npm install -g autocannon

Confirm the install:

clinic --version

You should see a version number like 12.x.x. If you see a permission error on macOS or Linux, either use sudo or configure npm to install globals in your home directory.

Running Your First Flame Profile

Clinic.js Flame wraps your Node.js process, attaches a V8 CPU sampler, and produces an interactive HTML report when you stop the process. The command structure is simple:

clinic flame -- node server.js

If your service takes environment variables or flags, pass them the same way you normally would:

NODE_ENV=production clinic flame -- node dist/server.js --port 3000

Once the server is running, send load to it in a second terminal. Using autocannon:

autocannon -c 50 -d 30 http://localhost:3000/api/your-hot-endpoint

This opens 50 concurrent connections and hammers the endpoint for 30 seconds. Adjust -c and -d to match your production traffic pattern. The goal is to reproduce the spike, not just any load.

After the load run finishes, stop the server with Ctrl+C. Clinic.js processes the samples and opens an HTML report in your default browser automatically. It will also save the report file locally so you can share it.
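If you don't have a suitable service to hand, a throwaway one is enough to try the workflow. The sketch below is a hypothetical server.js, not something that ships with Clinic.js: the /api/hot route, the slowFib function, and the --serve flag are all made up for illustration.

```javascript
// server.js: a hypothetical toy service with one deliberately CPU-heavy route.
const http = require('node:http');

// Naive recursive Fibonacci: pure CPU work, so it should dominate the profile.
function slowFib(n) {
  return n < 2 ? n : slowFib(n - 1) + slowFib(n - 2);
}

function handleRequest(req, res) {
  if (req.url === '/api/hot') {
    const body = JSON.stringify({ result: slowFib(25) });
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(body);
    return;
  }
  res.writeHead(404);
  res.end();
}

// Only listen when started with --serve (an invented flag), so the file
// can also be required from a test without opening a port.
if (process.argv.includes('--serve')) {
  http.createServer(handleRequest).listen(3000, () => {
    console.log('listening on http://localhost:3000');
  });
}
```

Start it with clinic flame -- node server.js --serve, point autocannon at http://localhost:3000/api/hot, and slowFib should be the widest block in the resulting report.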

Reading the Flamegraph

A flamegraph looks intimidating the first time. Here's the mental model: the x-axis is not a timeline. A bar's width is the share of CPU samples in which that function was on the stack, so wider bars mean more CPU time, and left-to-right order says nothing about when a call happened. The y-axis is the call stack (bottom is the entry point, top is the deepest call). You're looking for wide bars near the top of the stack: those functions were on the CPU themselves when the samples were taken, rather than waiting on something they called.
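It can help to see where bar widths come from. The sketch below is a simplification with made-up stack samples, not Clinic.js's actual data format: it counts how often each function appears anywhere in a sampled stack (its width) and how often it is the top frame (the time it was really on the CPU). A real flamegraph keeps a separate bar per call path rather than one per function name.

```javascript
// Each sample is the call stack at one sampling tick, entry point first.
// These stacks are invented for illustration.
const samples = [
  ['main', 'handleRequest', 'parseMarkdown'],
  ['main', 'handleRequest', 'parseMarkdown'],
  ['main', 'handleRequest', 'parseMarkdown'],
  ['main', 'handleRequest', 'sendJson'],
  ['main', 'idle'],
];

// total = samples in which the frame appears anywhere (its bar width);
// self  = samples in which the frame is on top of the stack (on the CPU).
function summarize(stacks) {
  const stats = new Map();
  const entry = (name) => {
    if (!stats.has(name)) stats.set(name, { total: 0, self: 0 });
    return stats.get(name);
  };
  for (const stack of stacks) {
    // A Set counts a recursive function once per sample.
    for (const name of new Set(stack)) entry(name).total += 1;
    entry(stack[stack.length - 1]).self += 1;
  }
  return stats;
}

const stats = summarize(samples);
for (const [name, { total, self }] of stats) {
  console.log(`${name}: width ${total}/${samples.length}, on top ${self}/${samples.length}`);
}
```

Here handleRequest is wide but never on top, so it is only a caller. parseMarkdown is both wide and on top, which is the pattern to hunt for.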

The color coding

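Clinic.js Flame uses color to help you triage fast. Check the key in your own report for the exact palette, but the Clinic.js documentation describes two signals:

  • Heat: the brighter the exposed top edge of a block, the more time that function spent at the top of the stack.
  • Code area: the text and outline color tells you whether a block is your application code, a dependency in node_modules, or Node.js core.

Start with the hot blocks in your own application code. A wide, hot bar sitting on top of a narrow stack is doing real CPU work, not just waiting on I/O. That's your target.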

Using the interactive controls

The HTML report lets you click any bar to zoom into that call chain. Use the search box to jump directly to a function name if you already have a suspect. The report can also step you through the hottest frames in order, which is the fastest path to the bottleneck when the graph is dense.

A Concrete Example

Suppose your flamegraph shows a wide orange bar labeled parseMarkdown sitting above an HTTP route handler. Every request to /api/posts is synchronously parsing markdown before returning. The fix is obvious once you see it: cache the parsed result or move the work off the hot path.

Here's what that route might look like before profiling reveals the problem:

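const express = require('express'); // assumed framework; the original snippet left `app` undefined
const fs = require('fs');
const marked = require('marked');

const app = express();

app.get('/api/posts/:id', (req, res) => {
  // Blocks the event loop while the file is read from disk
  const raw = fs.readFileSync(`./posts/${req.params.id}.md`, 'utf8');
  // Synchronous and expensive, and repeated on every request
  const html = marked.parse(raw);
  res.json({ content: html });
});

app.listen(3000);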

Two problems are visible here: readFileSync blocks the event loop, and marked.parse runs on every request. Once the flamegraph confirms where the time goes, you have a clear target. Replace the sync read with an async one and cache the parsed output in memory or Redis.
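One way that fix could look is sketched below, using an in-memory cache. createPostLoader and its options are hypothetical names invented for this example, and the render function is passed in so the same loader works with marked.parse or any other markdown renderer.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Build a loader that reads a post asynchronously and renders it at most once per id.
// `render` turns markdown into HTML (for example marked.parse);
// `readFile` is injectable so the sketch can be exercised without touching disk.
function createPostLoader({ render, postsDir = './posts', readFile = fs.readFile }) {
  const cache = new Map(); // id -> Promise<string>

  return function loadPost(id) {
    if (!cache.has(id)) {
      // basename() stops ids such as "../secret" from escaping postsDir
      const file = path.join(postsDir, `${path.basename(id)}.md`);
      const pending = readFile(file, 'utf8').then(render);
      // Do not keep failures cached, so a later request can retry.
      pending.catch(() => cache.delete(id));
      cache.set(id, pending);
    }
    return cache.get(id);
  };
}

// Possible use inside the route from the example above:
// const loadPost = createPostLoader({ render: marked.parse });
// app.get('/api/posts/:id', async (req, res) => {
//   res.json({ content: await loadPost(req.params.id) });
// });
```

Caching the promise rather than the finished HTML means concurrent requests for the same post share one read and one parse. The cache here is unbounded and never invalidated, so a real service would want a size limit or a way to evict entries when a post changes.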

Driving Realistic Load

The quality of your flamegraph depends entirely on the quality of your load. A profile taken with a single request hitting one endpoint will show you nothing useful about the spike that occurs during batch operations or concurrent authenticated sessions.

Think about what your service does right before the spike. If it's a scheduled job, trigger that job during the profile window. If it's a specific API call combination, write a small script that reproduces that sequence:

# Background auth traffic and the heavy endpoint, running at the same time
autocannon -c 20 -d 60 http://localhost:3000/api/auth &
autocannon -c 100 -d 60 http://localhost:3000/api/heavy-endpoint
wait

If autocannon's scripting isn't flexible enough, use k6 or a custom Node.js script that fires fetch requests in parallel. The goal is to make the problem happen, not to generate the most requests per second.
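A custom script along those lines might look like the sketch below. runLoad and its options are names invented for this example. It relies on the global fetch that is built into Node.js 18 and later, so on Node.js 16 you would need to pass in your own fetch implementation.

```javascript
// Run `total` requests against `url` with at most `concurrency` in flight.
// `fetchImpl` defaults to the global fetch (built into Node.js 18+).
async function runLoad(url, { concurrency = 10, total = 100, fetchImpl = fetch } = {}) {
  const results = { ok: 0, failed: 0 };
  let started = 0;

  // Each worker keeps issuing requests until the shared budget is used up.
  async function worker() {
    while (started < total) {
      started += 1;
      try {
        const res = await fetchImpl(url);
        await res.arrayBuffer(); // drain the body so the connection can be reused
        if (res.ok) results.ok += 1;
        else results.failed += 1;
      } catch {
        results.failed += 1;
      }
    }
  }

  await Promise.all(Array.from({ length: concurrency }, worker));
  return results;
}

// Example (the endpoint is the placeholder used earlier in this article):
// runLoad('http://localhost:3000/api/heavy-endpoint', { concurrency: 50, total: 5000 })
//   .then(console.log);
```

To reproduce a specific sequence, replace the single fetchImpl(url) call with the series of requests that precedes your spike.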

Common Pitfalls

Profiling in development mode

Many frameworks (Express, Fastify, NestJS) have development middleware that does significant extra work: live reload, verbose logging, source map lookups. Always profile with NODE_ENV=production or the closest equivalent. Otherwise you're optimizing code that won't run in production.

Short or idle profile windows

If the CPU spike only happens under sustained load but you profile for five seconds, the sampler won't collect enough data. The flamegraph will look flat and misleading. Aim for at least 20–30 seconds of load that actually stresses the service.

Confusing I/O wait with CPU work

A flamegraph shows CPU samples, not wall-clock time. A function that waits 500ms for a database query will appear very narrow because the CPU is idle during that wait. If your service feels slow but the flamegraph looks flat, your bottleneck is I/O, not CPU. Use clinic doctor instead to check event loop delay and I/O saturation.
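The difference between wall-clock time and CPU time is easy to see for yourself. The sketch below uses process.cpuUsage() and process.hrtime.bigint() from Node.js core; measure, waitLikeIo, and spinCpu are illustrative names, and the timer is only a stand-in for a real database call.

```javascript
// Measure wall-clock time and CPU time consumed by this process while `task` runs.
async function measure(task) {
  const wallStart = process.hrtime.bigint();
  const cpuStart = process.cpuUsage();
  await task();
  const cpu = process.cpuUsage(cpuStart); // microseconds used since cpuStart
  const wallMs = Number(process.hrtime.bigint() - wallStart) / 1e6;
  return { wallMs, cpuMs: (cpu.user + cpu.system) / 1000 };
}

// Stand-in for an I/O wait such as a database query: the process sits idle.
const waitLikeIo = () => new Promise((resolve) => setTimeout(resolve, 200));

// Stand-in for CPU-bound work: spin until 200 ms have passed.
const spinCpu = () => {
  const end = Date.now() + 200;
  while (Date.now() < end) { /* busy loop */ }
};

async function compare() {
  const io = await measure(waitLikeIo);
  const cpu = await measure(spinCpu);
  return { io, cpu };
}

// compare().then(console.log);
```

Both tasks take about the same wall-clock time, but only the busy loop burns a comparable amount of CPU time. A flamegraph would show spinCpu as a wide bar and barely register waitLikeIo.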

Sampling bias on short functions

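V8's CPU sampler takes snapshots at intervals (typically around 1ms). A function that completes in microseconds is rarely caught by any single sample, so a cheap helper can be under-represented in the graph, and the graph never tells you how many times it ran. V8's built-in --prof flag with node --prof-process gives you a text summary of ticks per function, but it is also a sampling profiler, so it has the same blind spot. For exact call counts and per-call timings, you need to instrument the function yourself, for example with perf_hooks or process.hrtime.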
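When you need exact call counts for a function that is too quick to be sampled reliably, one option is to wrap it temporarily. The sketch below is hand-rolled instrumentation, not a Clinic.js or V8 feature; instrument and clamp are hypothetical names.

```javascript
// Wrap a synchronous function so every call is counted and timed.
// This is instrumentation, not sampling: it sees every call,
// but it also adds overhead to every call.
function instrument(fn) {
  const stats = { calls: 0, totalNs: 0n };
  function wrapped(...args) {
    const start = process.hrtime.bigint();
    try {
      return fn.apply(this, args);
    } finally {
      stats.calls += 1;
      stats.totalNs += process.hrtime.bigint() - start;
    }
  }
  wrapped.stats = stats;
  return wrapped;
}

// Hypothetical hot helper: each call is far shorter than one sampling interval.
const clamp = instrument((x) => Math.min(1, Math.max(0, x)));

let sum = 0;
for (let i = 0; i < 100000; i += 1) sum += clamp(i / 50000);
console.log(`${clamp.stats.calls} calls, ${Number(clamp.stats.totalNs) / 1e6} ms total`);
```

The timer calls themselves cost more than a function this small, so treat the total as an upper bound and remove the wrapper once you have your answer.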

Misleading inlining

V8 inlines small functions aggressively. A function you wrote may appear merged into its caller in the flamegraph. If you see a caller taking unexpected CPU time and can't explain it, check whether it calls any small helpers that V8 might have inlined into it.

Sharing and Saving Reports

Clinic.js saves the collected data as a folder named something like 12345.clinic-flame, plus an HTML file with the same prefix. Depending on the Clinic.js version, these land in your working directory or in a .clinic subdirectory. Share the HTML file with teammates directly; it's self-contained. You can also host it on any static file server.

If you're doing this in CI or a staging environment over SSH, use the --no-open flag to prevent Clinic.js from trying to open a browser, then copy the HTML back to your machine:

clinic flame --no-open -- node server.js
# ... run load ...
scp staging-host:/app/12345.clinic-flame.html ./flame-report.html

Going Further with Clinic.js Doctor

Clinic.js ships several tools, including Flame (CPU sampling), Doctor (event loop health check), and Bubbleprof (async operation visualization). If your flamegraph shows healthy CPU usage but the service still feels slow, run clinic doctor next. Doctor measures CPU usage, event loop delay, memory usage, and active handles over time and gives you a written diagnosis. It's a good first pass before you commit to a deeper flame analysis.

clinic doctor -- node server.js

Doctor produces its own HTML report with annotated charts and, often, a plain-English recommendation like "your event loop delay spikes correlate with GC pauses; consider reducing object allocation in your hot path."
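Event loop delay is also something you can measure from inside the process with Node.js's perf_hooks module. The sketch below is a rough in-process check under that approach, not a description of how Doctor itself collects data; blockFor and maxLoopDelayMs are illustrative names.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Block the event loop synchronously, the way a CPU-heavy handler would.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* busy loop */ }
}

// Record event loop delay while `scenario` runs and return the worst delay in ms.
async function maxLoopDelayMs(scenario) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  await sleep(50); // let the monitor take a few baseline readings
  await scenario();
  await sleep(50); // give the monitor a turn to observe the stall
  histogram.disable();
  return histogram.max / 1e6; // histogram values are in nanoseconds
}

// maxLoopDelayMs(() => blockFor(200)).then((ms) => console.log(`worst delay: ${ms.toFixed(1)} ms`));
```

A healthy service should report a worst-case delay of a few milliseconds. A value close to the length of a slow handler suggests synchronous work is holding up every other request.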

Wrapping Up

CPU profiling in Node.js doesn't have to be a guessing game. Clinic.js Flame gives you a clear, visual answer to "where is my CPU time going" in under five minutes of setup. Here are your next steps:

  1. Install Clinic.js globally and run a quick flame profile against your service right now, even if it's not currently spiking. Knowing your baseline makes future spikes much easier to compare.
  2. Write a load script that mimics your worst-case traffic so your profiles reflect real conditions, not ideal ones.
  3. Look for wide orange bars near the top of the flamegraph first. Those are your application-code hotspots and the most actionable findings.
  4. If the flamegraph looks flat but the service is slow, switch to clinic doctor to check for event loop delay and I/O bottlenecks.
  5. Share the HTML report with your team before making any optimization. A shared visual is far more persuasive than a description of what you think you saw.
