Converting Stack Samples to Trace Events

Low-Level DesignCodingOnsitePhoneMachine Learning EngineerSoftware EngineerReported May, 2026High Frequency

Basic Problem

A sampling profiler periodically records the full call stack at a timestamp. You are given a list of samples sorted by timestamp, where each sample contains:

struct Sample {
    double ts;                      // timestamp
    std::vector<std::string> stack; // call stack (outermost → innermost)
};

Your task is to convert these samples into a sequence of events suitable for visualizing a timeline of function calls:

struct Event {
    std::string kind;   // "start" or "end"
    double ts;          // timestamp
    std::string name;   // function name
};

Example

Input samples (in time order):

t=1.0: ["main"]

t=2.5: ["main", "func1"]

t=3.1: ["main"]

Expected output:

start 1.0 main

start 2.5 func1

end 3.1 func1

Requirements

Input is sorted — samples are provided in timestamp order

Between consecutive samples, determine what changed:

End functions that disappeared (deepest-first order)

Start functions that newly appeared (outer-to-inner order)

Do NOT automatically close remaining functions after the last sample

Handle recursion correctly — the same function name at different depths are treated as separate frames

Interface

List<Event> convertSamplesToEvents(List<Sample> samples)

Note for Candidate

The stack is ordered from outermost (e.g., "main") to innermost (e.g., leaf function)

When comparing consecutive stacks, identify the longest common prefix to determine which functions ended and which started

Basic Problem Solution

Key insight: For each consecutive pair of samples, find the longest common prefix between the previous stack and the current stack. Functions beyond the common prefix in the previous stack have ended (emit "end" deepest-first), and functions beyond the common prefix in the current stack are new (emit "start" outer-to-inner).
def convert_samples_to_events(samples):
    events = []
    prev_stack = []

    for sample in samples:
        ts, stack = sample.ts, sample.stack

        # Find longest common prefix between previous and current stack
        common_len = 0
        for i in range(min(len(prev_stack), len(stack))):
            if prev_stack[i] == stack[i]:
                common_len += 1
            else:
                break

        # End functions that disappeared (deepest first)
        for i in range(len(prev_stack) - 1, common_len - 1, -1):
            events.append(Event("end", ts, prev_stack[i]))

        # Start functions that newly appeared (outer to inner)
        for i in range(common_len, len(stack)):
            events.append(Event("start", ts, stack[i]))

        prev_stack = stack

    return events

Walkthrough with Example

Given samples: t=1.0: ["main"], t=2.5: ["main", "func1"], t=3.1: ["main"]

Handling Recursion

Recursion is handled correctly because we compare by position, not by name. For example:

t=1: ["main", "foo", "foo"]

t=2: ["main", "foo"]

Common prefix length is 2 ("main", "foo"). The deeper "foo" at index 2 ends — the shallower "foo" at index 1 is unaffected.

Complexity

Time: O(S × D) where S = number of samples, D = max stack depth

Space: O(D) for storing the previous stack

Follow-up 1: Debouncing by N Consecutive Samples

Sometimes short-lived leaf functions appear only briefly and cause noise in profiling data. Modify your solution to only emit start/end events for functions that appear in N consecutive samples (N is configurable).

Definition of "Consecutive"

A function is considered to appear in N consecutive samples if the same frame (same depth and ancestors) appears in N samples in a row.

Important: If a function ends and later restarts, its streak resets — you cannot count across restarts.

Example (N = 2)

Input samples:

t=1.0: ["main"]

t=2.0: ["main"]

t=3.0: ["main", "foo"]

t=4.0: ["main", "foo"]

Expected output:

start 2.0 main # main confirmed stable after 2 samples

start 4.0 foo # foo confirmed stable after 2 samples

Consecutive Definition - Critical Edge Case

Consider these samples:

Sample(1, ['a', 'b'])

Sample(2, ['a', 'b', 'c'])

Here, a and b are consecutive across both samples.

However, in this case:

Sample(1, ['a', 'b'])

Sample(2, ['c', 'b', 'a'])

a and b do NOT count as consecutive, because at t=2, the frames a and b from t=1 must end first before starting c, then b, then a again. These are different frames even though they have the same names.

Interface

List<Event> convertSamplesToDebouncedEvents(List<Sample> samples, int N)

Note for Candidate

You can choose whether to timestamp the start event at the 1st or Nth occurrence (both are acceptable if consistent)

Focus on maintaining the "consecutive" streak correctly — interruptions reset the counter

Follow-up 1 Solution

Key insight: Extend the basic solution by tracking a streak count and confirmed flag for each frame position. A frame's streak increments when it persists across consecutive samples (same position, same ancestors). When the streak reaches N, emit a "start" event. Only emit "end" events for frames that were previously confirmed. 
def convert_samples_to_debounced_events(samples, N):
    events = []
    prev_stack = []
    streak_counts = []  # how many consecutive samples each frame has appeared
    confirmed = []      # whether "start" has been emitted for each frame

    for sample in samples:
        ts, stack = sample.ts, sample.stack

        # Find longest common prefix
        common_len = 0
        for i in range(min(len(prev_stack), len(stack))):
            if prev_stack[i] == stack[i]:
                common_len += 1
            else:
                break

        # End confirmed functions that disappeared (deepest first)
        for i in range(len(prev_stack) - 1, common_len - 1, -1):
            if confirmed[i]:
                events.append(Event("end", ts, prev_stack[i]))

        # Build new streak counts and confirmed arrays
        new_streaks = []
        new_confirmed = []

        # Common prefix: increment streaks
        for i in range(common_len):
            new_streak = streak_counts[i] + 1
            new_streaks.append(new_streak)
            if new_streak >= N and not confirmed[i]:
                events.append(Event("start", ts, stack[i]))
                new_confirmed.append(True)
            else:
                new_confirmed.append(confirmed[i])

        # New functions: start with streak of 1
        for i in range(common_len, len(stack)):
            new_streaks.append(1)
            if N <= 1:
                events.append(Event("start", ts, stack[i]))
                new_confirmed.append(True)
            else:
                new_confirmed.append(False)

        prev_stack = stack
        streak_counts = new_streaks
        confirmed = new_confirmed

    return events

Walkthrough with Example (N = 2)

Given samples: t=1.0: ["main"], t=2.0: ["main"], t=3.0: ["main", "foo"], t=4.0: ["main", "foo"]

Key Design Decisions

Start event timestamp: Emitted at the Nth occurrence (when confirmed), not the 1st. This means the start timestamp reflects when we are confident the function is stable.

End events: Only emitted for confirmed frames. Short-lived frames that never reach N consecutive appearances are silently ignored — this is the intended denoising behavior.

Streak reset: When the common prefix is shorter than a frame's position, that frame's streak is lost entirely. Even if the same function name reappears later, it starts a fresh streak of 1.

Edge Case: Reordered Stacks

Sample(1, ['a', 'b'])

Sample(2, ['c', 'b', 'a'])

Common prefix length is 0 (since 'a' != 'c'), so all frames from sample 1 end and all frames in sample 2 start fresh with streak 1. The names 'a' and 'b' reappearing at different depths does not carry over any streak — they are treated as entirely new frames.

Complexity

Time: O(S × D) — same as the basic solution

Space: O(D) for streak counts, confirmed flags, and previous stack

Additional Discussion Topics

The following questions are for verbal discussion. The interviewer does not expect you to code solutions, but be prepared to explain your approach and reasoning.

Prefix vs Suffix Comparison

Question: The basic solution uses prefix comparison — finding the longest common prefix between consecutive stacks (matching from the root down). What changes if you use suffix comparison instead (matching from the leaf up), and which is the right choice for a profiler?

Setup: Both approaches need to decide, for each pair of consecutive samples, which frames "are the same frame still executing" and which are new. The difference is which end of the stack they anchor that decision on.

Same leaf change, different answers

Consider the common profiling case where only the leaf changes:

Stack A: [main, handler, parse, foo]

Stack B: [main, handler, parse, bar]

Prefix: common length = 3 → end foo, start bar (1 end + 1 start)

Suffix: common length = 0 (leaves foo vs bar differ) → end all 4 frames in A, start all 4 in B (4 ends + 4 starts)

Prefix is correct here. main → handler → parse was on the stack the whole time and was never popped. Suffix's answer claims those three frames returned and were immediately re-entered, which is not what happened. Prefix matches actual call semantics: outer frames call inner frames, so if outer frames are unchanged across samples, they stayed on the stack.

Where the two approaches actually differ

The interesting case is the opposite shape — same leaf, different roots:

Stack A: [threadA, dispatch, run, work]

Stack B: [threadB, dispatch, run, work]

Prefix: common length = 0 → 4 ends + 4 starts

Suffix: common length = 3 (dispatch, run, work match from the bottom) → end threadA, start threadB

Suffix says "the worker kept running, only the driver swapped." Prefix says "everything unwound and was re-pushed." Which is correct depends on what actually happened between samples — and the samples alone can't tell you. If threadA truly returned and threadB later called the same dispatch → run → work chain, prefix is right. If this is an async/coroutine handoff where the inner frames literally never unwound, suffix is closer to the truth.

Why prefix is the standard choice

Real profilers (Chrome trace, perfetto, pprof, Linux perf) use prefix/root-anchored matching. Reasons:

It matches stack semantics. A native call stack is LIFO — you can't keep an inner frame alive while replacing the outer frame around it. Treating a root change as "same leaf, different parent" describes a state the machine can't actually be in for synchronous code.

Trace viewers are tree-structured. Flame graphs and timeline views render a parent containing its children. Prefix matching produces well-nested events that drop straight into that model; suffix matching can produce intervals where a child outlives its parent, which the viewer has to either reject or paper over.

Suffix's "win" is rare and often wrong. The same-leaf-different-root pattern mostly shows up at coroutine resume points or thread pool reuse — and in those cases the inner frames did unwind between samples; the profiler just can't see the gap.

When suffix-style reasoning is useful

Not for emitting timeline events, but for aggregation: when you want to ask "how much total time was spent in work, regardless of who called it?", you bucket by leaf (or by suffix). That's how flame graphs aggregate self-time and how perf report groups samples. So suffix matching has a place — it's just at the analysis layer, not the event-conversion layer.

Bottom line: Prefix comparison is correct for converting samples into start/end events. Suffix-style grouping belongs in the aggregation step that runs over those events afterward.

Auto-save enabled

Loading editor…

Output

Run your code to see the output here.