GPT-3 Playground - Full-Stack Architecture

System Design · Phone · Software Engineer · Reported Apr 2026

We'll design the high-level architecture of a website together. I'll send you a link to a shared diagramming space that we can optionally use for this.

Assume the image below is a high-fidelity mockup that we, the engineering team, are tasked with implementing. Also assume nothing has been built yet: this is an entirely greenfield project. Assume you will implement everything yourself and have complete control over all technical decisions. Our goal is a high-level architecture of the technologies and components we'll need to accomplish this.

We'll talk about the rough problem this website and mockup are trying to address. I'll then want to discuss how we structure our data on the backend, how we get that data to the frontend, how we handle and display it there, how we handle user interactions, and how we keep state up to date. We can also spend some time discussing layout, theming, and other design elements.

Background Information

This is a site called "The Playground". It's designed to let people play with GPT-3.

For this exercise, all you need to know about GPT-3 is that there's an API that takes a prompt (the unformatted text in the gif above) and returns a surprisingly coherent completion of it (the green-highlighted text).

We want a frontend interface to this API because it gives people a much better way to experiment and get started than calling the API directly.

You can see that we should be able to:

Input any prompt and have our GPT-3 API stream back a response

Adjust parameters to change the behavior of the response

Save prompts that we like (called "Presets")

Load existing presets and run them

Reference solution

#32 GPT-3 Playground — Full-Stack Architecture — Solution

✦ AI-Generated Solution · System Design (greenfield full-stack) · Comprehensive

Build "The Playground": a web UI over the GPT-3 completion API. Stream a completion for any prompt, adjust generation parameters, and save/load Presets. The project is greenfield and you own all technical decisions.

1. Scope & Requirements

Functional

Enter a prompt → stream back the completion, rendering the generated text inline (e.g. highlighted) as it arrives.
Adjust parameters (temperature, max_tokens, top_p, stop, frequency/presence penalty).
Presets: save the current prompt + parameters, list them, load and re-run.

Non-functional

Snappy, real-time streaming feel.
The GPT-3 API key must never reach the browser.
Simple to operate (greenfield, small team).

2. Architecture

Playground architecture

A single-page app talks to a thin API proxy. The proxy holds the secret key, enforces auth/rate limits, and relays the GPT-3 token stream to the browser via SSE.
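One common way to enforce the per-user rate limit mentioned above is a token bucket. The following is a minimal in-memory sketch, not a prescribed design: the names `createRateLimiter`, `capacity`, and `refillPerSec` are hypothetical, and it assumes a single proxy process (several instances would need shared storage for the buckets).

```javascript
// Hypothetical in-memory token bucket, one bucket per user id.
// `now` is injectable so the logic can be exercised without real time passing.
function createRateLimiter({ capacity, refillPerSec, now = () => Date.now() }) {
  const buckets = new Map(); // userId -> { tokens, last }

  return function allow(userId) {
    const t = now();
    const b = buckets.get(userId) ?? { tokens: capacity, last: t };
    // Refill in proportion to elapsed time, never above capacity.
    const elapsedSec = (t - b.last) / 1000;
    b.tokens = Math.min(capacity, b.tokens + elapsedSec * refillPerSec);
    b.last = t;
    const ok = b.tokens >= 1;
    if (ok) b.tokens -= 1; // spend one token per accepted request
    buckets.set(userId, b);
    return ok; // the proxy would answer 429 when this is false
  };
}
```

With `capacity: 2` and `refillPerSec: 1`, a user can send two requests back to back and then roughly one more per second.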

3. Data Flow (the interviewer's checklist)

Backend data shape — two concerns:

Transient run: { prompt, params } in → streamed completion out. Nothing to persist for a run.
Presets: { id, name, prompt, params, created_at }. Small and per-user, so a simple table (Postgres) or even a per-user document is enough; cache the list in memory.

Get data to the frontend — POST /api/complete with stream: true; proxy opens an upstream GPT-3 stream and forwards Server-Sent Events. Presets via plain REST (GET/POST/PUT/DELETE /api/presets).
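The forwarding step could look roughly like the sketch below. It assumes the proxy has already turned the upstream response into an async iterable of text deltas (that adapter is not shown, and the upstream wire format is not specified here); `sseFrame` and `relayAsSse` are hypothetical names. The `{ text }` payload and the `[DONE]` sentinel match what the client code in section 4 expects.

```javascript
// Format one Server-Sent Events frame: a "data:" line followed by a blank line.
// JSON.stringify escapes newlines, so each payload stays on a single line.
function sseFrame(payload) {
  const data = typeof payload === "string" ? payload : JSON.stringify(payload);
  return `data: ${data}\n\n`;
}

// Relay text deltas from an upstream async iterable to a `write` callback.
// In a Node proxy, `write` would typically wrap res.write on a response
// sent with Content-Type: text/event-stream.
async function relayAsSse(upstreamDeltas, write) {
  for await (const text of upstreamDeltas) {
    write(sseFrame({ text }));
  }
  write(sseFrame("[DONE]")); // sentinel telling the client the stream is complete
}
```

Keeping the framing in a small pure function makes it straightforward to test the relay without a network connection.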

Handle data on the frontend — a single store holds { prompt, params, runStatus, completion, presets }. Token deltas append to completion. Use an AbortController for a "Stop" button.
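A framework-agnostic sketch of that store is shown below. The action names, the default parameter values, and the `startRun` helper are all assumptions made for illustration; `runFn` is expected to have the same signature as the `run` function in section 4.

```javascript
const initialState = {
  prompt: "",
  params: { temperature: 0.7, max_tokens: 256 }, // illustrative defaults only
  runStatus: "idle", // "idle" | "streaming" | "done" | "error"
  completion: "",
  presets: [],
};

// Pure reducer: the next state depends only on the current state and the action.
function reducer(state, action) {
  switch (action.type) {
    case "run/started":
      return { ...state, runStatus: "streaming", completion: "" };
    case "run/delta":
      return { ...state, completion: state.completion + action.text };
    case "run/finished":
      return { ...state, runStatus: "done" };
    case "run/failed":
      return { ...state, runStatus: "error" }; // partial completion is kept
    default:
      return state;
  }
}

// Wire a run to the store; returns a function the "Stop" button can call.
function startRun(state, dispatch, runFn) {
  const controller = new AbortController();
  dispatch({ type: "run/started" });
  runFn(state.prompt, state.params,
        (text) => dispatch({ type: "run/delta", text }),
        controller.signal)
    .then(() => dispatch({ type: "run/finished" }))
    .catch((err) =>
      // A user-initiated abort is treated as a normal end of the run.
      dispatch({ type: err.name === "AbortError" ? "run/finished" : "run/failed" }));
  return () => controller.abort();
}
```

Treating an abort as a normal finish keeps whatever text has already streamed on screen, which matches how a "Stop" button is usually expected to behave.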

Display — render prompt as editable text; append the streamed completion as a visually distinct (highlighted) span so the user sees model output vs their input, exactly like the mockup.

User interactions & state — editing params updates params immediately; "Run" starts a stream; "Save Preset" snapshots {prompt, params}; "Load Preset" hydrates the store; keep everything in sync through the single store so the UI is a pure function of state.
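The "Save Preset" and "Load Preset" steps can be sketched as two pure functions. The function names and the way `id` and `createdAt` are supplied are assumptions; the point of the sketch is that a snapshot is a copy, so editing params after saving does not silently change the saved preset.

```javascript
// Params are plain JSON (they are sent to the API as JSON), so a JSON
// round-trip is enough to produce an independent copy.
const clone = (value) => JSON.parse(JSON.stringify(value));

// "Save Preset": snapshot the current prompt and params into a preset record.
function snapshotPreset(state, name, { id, createdAt }) {
  return {
    id,
    name,
    prompt: state.prompt,
    params: clone(state.params),
    created_at: createdAt,
  };
}

// "Load Preset": hydrate the store from a preset and clear the previous run.
function hydrateFromPreset(state, preset) {
  return {
    ...state,
    prompt: preset.prompt,
    params: clone(preset.params),
    completion: "",
    runStatus: "idle",
  };
}
```

Because both functions return new objects instead of mutating their inputs, they fit the "UI is a pure function of state" model described above.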

4. Frontend Architecture

App
├── PromptEditor      (textarea + inline highlighted completion)
├── ParamsPanel       (sliders/inputs: temperature, max_tokens, top_p, stop…)
├── PresetsSidebar    (list, save, load, delete)
└── store             (prompt, params, runStatus, completion, presets)

Streaming render: read the SSE body via fetch + ReadableStream, append deltas to completion, batch DOM updates with requestAnimationFrame for smoothness.
Component state vs global: ephemeral UI (cursor, focus) local; shared data (prompt/params/presets) in the store.

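// Stream a completion from the proxy; onDelta receives each text fragment.
async function run(prompt, params, onDelta, signal) {
  const res = await fetch("/api/complete", {
    method: "POST", signal,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, ...params, stream: true }),
  });
  if (!res.ok || !res.body) throw new Error(`Completion request failed: ${res.status}`);
  const reader = res.body.getReader(), dec = new TextDecoder();
  let buf = "";                                   // holds a partial SSE frame between reads
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buf += dec.decode(value, { stream: true });   // keeps multi-byte characters split across chunks intact
    const frames = buf.split("\n\n");
    buf = frames.pop();                           // last piece may be incomplete; keep it for the next read
    for (const f of frames) {
      if (f.startsWith("data: ") && f.slice(6) !== "[DONE]")
        onDelta(JSON.parse(f.slice(6)).text);     // append to highlighted completion
    }
  }
}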
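The requestAnimationFrame batching mentioned in the streaming-render note can be sketched as a small wrapper around `onDelta`. The name `createDeltaBatcher` and the injectable `schedule` argument are assumptions; in the browser the default scheduler is `requestAnimationFrame`.

```javascript
// Coalesce many small deltas into at most one flush per animation frame.
// `schedule` defaults to requestAnimationFrame in the browser; it is
// injectable so the logic can also run outside one.
function createDeltaBatcher(flush, schedule = (cb) => requestAnimationFrame(cb)) {
  let pending = "";
  let scheduled = false;

  return function push(text) {
    pending += text;
    if (scheduled) return; // a flush is already queued for this frame
    scheduled = true;
    schedule(() => {
      const chunk = pending;
      pending = "";
      scheduled = false;
      flush(chunk); // e.g. append to the highlighted span or dispatch to the store
    });
  };
}
```

Passing the returned `push` function as the `onDelta` argument of `run` means the DOM is touched once per frame even when several tokens arrive within the same frame.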

5. Backend / API Proxy

POST /api/complete → validates params, injects the secret API key, opens the upstream GPT-3 stream, pipes SSE to the client. Stateless and horizontally scalable.
GET/POST/PUT/DELETE /api/presets → CRUD against a small Postgres table keyed by user.
Responsibilities: secret-key custody, auth (session/OIDC), per-user rate limiting & cost guardrails (cap max_tokens), input validation.
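Input validation and the max_tokens cap could be combined in one function that runs before the proxy contacts the upstream API. This is a sketch under stated assumptions: the limits in `LIMITS`, the default values, and the accepted ranges are illustrative and should be checked against the upstream API's documentation.

```javascript
// Hypothetical server-side limits; real values depend on the model and budget.
const LIMITS = { maxTokensCap: 1024, maxPromptChars: 8000 };

const inRange = (v, lo, hi) => typeof v === "number" && Number.isFinite(v) && v >= lo && v <= hi;

// Returns { ok: true, value } with a sanitized request, or { ok: false, errors }.
function validateCompletionRequest(body) {
  const errors = [];
  const { prompt, temperature = 1, top_p = 1, max_tokens = 256 } = body ?? {};

  if (typeof prompt !== "string" || prompt.length === 0) {
    errors.push("prompt must be a non-empty string");
  } else if (prompt.length > LIMITS.maxPromptChars) {
    errors.push("prompt is too long");
  }
  if (!inRange(temperature, 0, 2)) errors.push("temperature out of range"); // assumed range
  if (!inRange(top_p, 0, 1)) errors.push("top_p out of range");             // assumed range
  if (!Number.isInteger(max_tokens) || max_tokens < 1) {
    errors.push("max_tokens must be a positive integer");
  }
  if (errors.length > 0) return { ok: false, errors };

  return {
    ok: true,
    value: {
      prompt,
      temperature,
      top_p,
      max_tokens: Math.min(max_tokens, LIMITS.maxTokensCap), // cost guardrail
      stream: true,
    },
  };
}
```

Clamping max_tokens rather than rejecting the request is one possible policy; returning a 400 for oversized values would be equally defensible.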

6. Presets Storage Options

Quick/offline-friendly: localStorage (per-browser, no login). Good for a demo/greenfield MVP.
Cross-device / shareable: server-side table keyed by user id (recommended once auth exists). Mention starting with localStorage and graduating to the API.
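A localStorage-backed version for the MVP might look like the sketch below. The storage key and the `createPresetStore` name are assumptions; the store accepts any object with the Web Storage `getItem`/`setItem` methods, so `window.localStorage` can be passed in the browser.

```javascript
// Preset store over any object with the Web Storage getItem/setItem shape.
// The key name is an arbitrary choice for this sketch.
function createPresetStore(storage, key = "playground.presets") {
  const readAll = () => {
    try {
      const parsed = JSON.parse(storage.getItem(key) ?? "[]");
      return Array.isArray(parsed) ? parsed : [];
    } catch {
      return []; // corrupted or unreadable data: start from an empty list
    }
  };
  const writeAll = (presets) => storage.setItem(key, JSON.stringify(presets));

  return {
    list: readAll,
    save(preset) {
      // Replace a preset with the same id, otherwise append.
      const rest = readAll().filter((p) => p.id !== preset.id);
      writeAll([...rest, preset]);
    },
    remove(id) {
      writeAll(readAll().filter((p) => p.id !== id));
    },
  };
}
```

Keeping the interface to `list`, `save`, and `remove` should make it easier to swap in an implementation backed by /api/presets later, although that version would be asynchronous.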

7. Layout, Theming, Resilience

Layout mirrors the mockup: large prompt area center, params panel right, presets list left.
Theming via CSS variables / design tokens (light/dark).
Resilience: AbortController to stop generation; backoff on 429/timeout; preserve typed prompt on error; show partial completion if the stream drops.
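The backoff on 429/timeout could be sketched as a generic retry wrapper. Everything configurable here (`retries`, `baseMs`, `maxMs`, `isRetryable`, `sleep`, `random`) is a hypothetical knob, and the default `isRetryable` assumes the thrown error carries a `status` property or is named `TimeoutError`.

```javascript
// Retry `attempt` with exponential backoff and jitter on retryable failures.
async function withBackoff(attempt, {
  retries = 3,
  baseMs = 500,
  maxMs = 8000,
  isRetryable = (err) => err.status === 429 || err.name === "TimeoutError",
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  random = Math.random,
} = {}) {
  for (let i = 0; ; i++) {
    try {
      return await attempt();
    } catch (err) {
      // Give up on non-retryable errors (including user aborts) or when out of retries.
      if (i >= retries || !isRetryable(err)) throw err;
      const ceiling = Math.min(maxMs, baseMs * 2 ** i);
      await sleep(random() * ceiling); // wait a random time up to the current ceiling
    }
  }
}
```

Retrying is only safe before any tokens have been shown; once part of a completion has streamed, a retry would start a new generation, so in that case it is better to keep the partial text and let the user decide whether to run again.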

8. Summary

Concern	Decision
Streaming	SSE via fetch + ReadableStream
Secret key	Held by API proxy, never in browser
State	Single store; UI = f(state); token deltas append
Presets	localStorage MVP → per-user table for cross-device
Rendering	Highlighted completion span, rAF-batched appends
Guardrails	Auth, rate limit, max_tokens cap, validation
