Design a service that solves crossword puzzles. Given a board with word slots (positions, directions, lengths) and a dictionary of ~1 million words, find any valid assignment of words that satisfies all intersection constraints.
The board is medium-sized (~50x50) with ~100 word slots to fill. This is NOT an algorithm optimization problem—the interviewer wants to see you recognize that single-machine brute force is intractable (search space ~10^300) and design a distributed job scheduling system to parallelize the search.
Common mistake: Don't spend too much time optimizing the algorithm itself. The key insight is proving single-machine won't work, then designing how to distribute the search across workers, handle dead ends quickly, and dynamically split/redistribute tasks when some workers finish early or get stuck.
Board size not specified initially; interviewer clarified "medium-sized" (~50x50) with ~100 word slots and 1M word dictionary. Only need to find ANY valid solution.
Proved single-machine brute force is intractable, then proposed a stochastic optimization approach (simulated annealing style). Spent the entire interview explaining the E2E flow—no deep dives. Interviewer said: "First time seeing this approach, need to verify it later."
The 'correct' answer is distributed DFS, but I've never seen anyone answer that. Most people use simulation methods—split tasks and handle dead ends by redistributing.
Treated this as a pure algorithm problem and spent too much time trying to find the "optimal" algorithm. Realized mid-interview it's about distributed systems, not algorithm optimization.
Tried to recover by applying the standard system design template (functional requirements, non-functional requirements, high-level design)—this was also a mistake for this problem type.
This is essentially a job scheduler problem. When the board is large, single-machine brute force won't finish in time. You need to split the problem across workers, handle dead ends quickly, and dynamically redistribute when some tasks are too large or hit dead ends early.
A crossword puzzle solver is a service that takes a puzzle board with empty word slots and a dictionary, then finds a valid assignment of words that satisfies all constraints. The core challenge is solving a constraint satisfaction problem (CSP) at scale—when brute force on a single machine won't finish in time.
This walkthrough follows the Interview Framework. Use it as a guide, not a script—adapt based on interviewer cues.
Solve puzzles — Given a board with word slots and a dictionary, find any valid word assignment
Handle constraints — Words must fit their slots and share letters at intersections
Return solution — Output the solved puzzle with words and their positions/directions
Puzzle generation (creating puzzles from scratch)
Clue matching (we ignore clue semantics; clues may be present but are not used by the solver)
Multiple solutions (finding one valid solution is sufficient)
User interface or interactive editing
The interviewer will likely tell you the board is "medium-sized" (around 50×50). This is intentional—it's large enough that single-machine brute force won't work, forcing you to design a distributed solution.
Why single-machine brute force fails:
100 word slots to fill
Average of ~1,000 words in dictionary matching each slot's length
Naive search space: 1000^100 = 10^300 combinations
Even with pruning, this is intractable on one machine
The insight: While the total search space is astronomical, constraint propagation dramatically prunes it. The challenge is exploring the pruned tree fast enough—which requires distributed computation.
Common mistake: Don't spend too much time optimizing the single-machine algorithm. The interviewer wants to see you recognize when to distribute the problem. Prove single-machine won't work, then move on.
Puzzle
├── id: UUID
├── width: integer
├── height: integer
├── slots: Slot[]
└── status: "pending" | "solving" | "solved" | "failed"
Slot
├── id: integer
├── start_row: integer
├── start_col: integer
├── direction: "across" | "down"
├── length: integer
├── clue: string | null // Optional metadata, ignored by solver
└── assigned_word: string | null
Dictionary
├── words: string[]
└── by_length: Map<integer, string[]> // Pre-indexed by word length
SearchState
├── id: UUID
├── puzzle_id: UUID
├── assignments: Map<slot_id, word> // Current partial solution
├── remaining_slots: slot_id[] // Unassigned slots
├── depth: integer // How deep in the search tree
└── parent_state_id: UUID | null // For backtracking
Task
├── id: UUID
├── puzzle_id: UUID
├── state_id: UUID
├── priority: integer
├── status: "pending" | "running" | "completed" | "dead_end"
└── assigned_worker: string | null
Why track SearchState separately?
Enables work distribution—each state is an independent subtree to explore
Supports resumption after worker failures
Allows dynamic task splitting
Why index dictionary by length?
Slots have fixed lengths; only words of matching length are candidates
Reduces candidate set from 1M to ~1-10K per slot
This is a batch processing system, not a real-time API. We need endpoints for job submission and result retrieval.
POST /puzzles
Request: {
"width": 50,
"height": 50,
"slots": [
{ "start_row": 0, "start_col": 0, "direction": "across", "length": 5, "clue": "A greeting" }, // Optional, ignored
...
],
"dictionary_id": "en-1m" // Reference stored dictionary; inline list only for toy puzzles
}
Response: { "puzzle_id": "abc123", "status": "pending" }
GET /puzzles/{puzzle_id}
Response: {
"puzzle_id": "abc123",
"status": "solving",
"progress": { "explored_states": 150000, "active_workers": 8 }
}
GET /puzzles/{puzzle_id}/solution
Response: {
"puzzle_id": "abc123",
"status": "solved",
"solution": [
{ "slot_id": 1, "word": "HELLO", "direction": "across", "position": "1-Across" },
...
]
}
Storage Layer
Worker Pool
Task Queue
Coordination Layer
API Layer
Client
Submit puzzle
User
API Gateway
Coordinator Service
ZooKeeper
Message Queue Redis
Worker 1
Worker 2
Worker N
State Store PostgreSQL
Dictionary In-Memory
1. Submitting a Puzzle
Task QueueState StoreCoordinatorAPI GatewayUserTask QueueState StoreCoordinatorAPI GatewayUserPOST /puzzlesSubmit puzzleStore puzzle + initial stateEnqueue initial task (full puzzle)puzzle_id{ puzzle_id, status: "pending" }
2. Worker Processing Loop
Each worker runs a continuous loop:
while true:
task = queue.dequeue()
state = load_state(task.state_id)
if is_solved(state):
mark_puzzle_solved(state)
return
if is_dead_end(state):
mark_task_dead_end(task)
continue
slot = pick_most_constrained_slot(state)
candidates = get_valid_words(slot, state)
if len(candidates) == 0:
mark_task_dead_end(task)
continue
if len(candidates) > SPLIT_THRESHOLD:
for word in candidates:
new_state = apply_assignment(state, slot, word)
save_state(new_state)
queue.enqueue(new_task(new_state))
else:
for word in candidates:
new_state = apply_assignment(state, slot, word)
if explore_dfs(new_state):
return # Found solution
The key insight: This is distributed DFS with dynamic task splitting and a shared queue. When a worker finds a state with many candidates, it splits the work. When candidates are few, it explores sequentially to avoid queue overhead.
3. Constraint Propagation
When assigning a word to a slot, propagate constraints to intersecting slots:
After: Slot 1 = HELLO
Slot 1: HELLO ✓
Slot 2 (intersects at 'L') Candidates: ELITE ✓ EMBER ✗, ENTER ✗
Before Assignment
Slot 1 (Across) Candidates: HELLO, HELPS, HELIX
Slot 2 (Down) Candidates: ELITE, EMBER, ENTER
Constraint propagation is crucial—it eliminates invalid candidates early, dramatically pruning the search space.
Why DFS over BFS?
For crossword solving:
We only need any valid solution, not the optimal one
DFS memory is bounded (just the current path)
DFS can find solutions faster by going deep quickly
Distributed DFS Strategy
Parallel Exploration
First Split (3 candidates)
Initial State
Dead end
Dead end
Solution found!
Root: Empty Puzzle
Task 1: Slot1=HELLO
Task 2: Slot1=HELPS
Task 3: Slot1=HELIX
Worker 1 Explores T1 subtree
Worker 2 Explores T2 subtree
Worker 3 Explores T3 subtree
Backtrack
Backtrack
Notify all workers
Key mechanisms:
Task splitting — When a state has many candidates, split into subtasks
Queue-based work distribution — Idle workers pull tasks from the shared queue
Early termination — When any worker finds a solution, broadcast cancellation
Dead-end detection — When no valid candidates exist, backtrack immediately
Problem: Which slot should we fill next? Random ordering leads to wasted exploration.
Solution: Most Constrained Variable (MCV) heuristic—fill the slot with fewest valid candidates first.
function pick_most_constrained_slot(state):
min_candidates = infinity
best_slot = null
for slot in state.remaining_slots:
candidates = count_valid_words(slot, state)
if candidates < min_candidates:
min_candidates = candidates
best_slot = slot
return best_slot
Why MCV works:
Slots with few candidates are "bottlenecks"
Filling them first detects dead ends earlier
Dramatically reduces average tree depth explored
Interview gold: Mention that this is a classic CSP heuristic (also called Minimum Remaining Values or MRV). If the interviewer asks for alternatives, discuss the degree heuristic (choose the slot that constrains the most other slots) or least constraining value (for value ordering).
Problem: Some tasks explore huge subtrees; others hit dead ends instantly. Fixed task sizes cause load imbalance.
Solution: Dynamic splitting based on estimated subtree size.
SPLIT_THRESHOLD = 10 # Tune based on queue latency vs. parallelism
function should_split(candidates, depth):
return len(candidates) > SPLIT_THRESHOLD and depth < MAX_SPLIT_DEPTH
Adaptive splitting:
Track average task completion time
If workers are idle → split more aggressively
If queue is backed up → split less, do more local work
Coordinator Monitors
Adjust Strategy
High idle % → Lower SPLIT_THRESHOLD
Queue backup → Raise SPLIT_THRESHOLD
Queue Depth
Worker Idle %
Problem: Many paths lead to dead ends. How do we detect and abandon them quickly?
Solution: Forward checking before exploring.
Before assigning a word, check if the assignment leaves any valid word for intersecting slots. This is a lightweight consistency check (not full AC-3):
function is_consistent(state, slot, word):
new_state = apply_assignment(state, slot, word)
for intersecting_slot in get_intersections(slot):
if count_valid_words(intersecting_slot, new_state) == 0:
return false # Dead end guaranteed
return true
This prunes obviously-bad branches without full exploration.
Alternative to exhaustive DFS: Use stochastic optimization as a time-bounded pre-pass when deterministic search is too slow. If it fails, fall back to DFS to preserve completeness.
One interviewee mentioned using "stochastic optimization" and the interviewer said it was a valid approach. This is worth knowing as an alternative.
Simulated Annealing approach:
function solve_stochastic(puzzle):
state = random_assignment(puzzle)
temperature = INITIAL_TEMP
while temperature > MIN_TEMP:
slot = random_slot(state)
new_word = random_valid_word(slot)
old_violations = count_violations(state)
new_state = swap_word(state, slot, new_word)
new_violations = count_violations(new_state)
if new_violations < old_violations:
state = new_state
elif random() < exp((old_violations - new_violations) / temperature):
state = new_state
temperature *= COOLING_RATE
return state if count_violations(state) == 0 else null
Trade-offs:
Faster for some puzzles
No guarantee of finding solution even if one exists
Can be parallelized trivially (run multiple independent searches)
Problem: When one worker finds a solution, how do we stop all others?
Solution: Broadcast cancellation with generation numbers.
Coordinator:
on solution_found(puzzle_id, solution):
store_solution(puzzle_id, solution)
increment_generation(puzzle_id)
broadcast("cancel", puzzle_id, generation)
Worker:
before processing task:
if task.generation < current_generation(puzzle_id):
skip task # Stale work
periodically:
check for cancellation messages
if cancelled:
abandon current work
Worker 3Worker 2CoordinatorWorker 1Worker 3Worker 2CoordinatorWorker 1Solution found!Store solutionCancel (generation=2)Cancel (generation=2)Check generation, abandonCheck generation, abandon
Worker failure:
Coordinator tracks task assignments with heartbeats
If worker dies, re-enqueue its active task
State is stored in database, not just worker memory
Coordinator failure:
Use ZooKeeper for leader election
New coordinator reconstructs state from database
Workers reconnect automatically
Problem: What if the puzzle has no valid solution? How do we know when to stop?
Solution: Track task completion—when all tasks reach dead ends, no solution exists.
Coordinator tracks:
├── total_tasks_created: counter
├── tasks_completed: counter (dead ends + solution found)
└── active_tasks: set
When tasks_completed == total_tasks_created AND active_tasks is empty:
if no solution found:
mark puzzle as "unsolvable"
Key insight: Distributed DFS is complete—it explores the entire search space. If every branch hits a dead end and no tasks remain, we've proven no solution exists.
This is an important edge case to mention in interviews. Most candidates focus on the happy path (solution found) and forget about completeness guarantees.
Before finishing, verify you've covered:
Proved single-machine brute force won't work (10^300 search space)
Explained distributed DFS as the core approach
Described constraint propagation for pruning
Discussed variable ordering heuristics (MCV)
Covered dynamic task splitting and load balancing
Explained dead-end detection and backtracking
Addressed early termination when solution found
Mentioned fault tolerance for workers
Discussed how to detect "no solution exists"
Key takeaway: This problem is about recognizing when a brute-force approach needs to be distributed, and designing the job scheduling infrastructure to parallelize a fundamentally sequential algorithm (DFS) while handling load imbalance and failures.