Points of Interest (POI) / Yelp

System DesignSystem DesignOnsitePhoneSoftware EngineerReported Feb, 2026

Problem Statement

Design a scalable location-based service that allows users to discover nearby points of interest (POIs) such as restaurants, shops, and businesses. The system must efficiently handle geospatial queries to find the nearest POIs, support high read traffic from millions of users, and maintain up-to-date business information.

Scale considerations:

500 million POIs globally

100,000 queries per second (read-heavy workload)

Sub-100ms query latency (p99)

Global geographic distribution

Comprehensive Solution Resources

For detailed architectural solutions and technical implementation:

Design Yelp

Real Interview Experiences

Experience 1: General POI Query System

"Design a system that manages points of interest (POI) for a mapping application. Explain how you would store and retrieve POI data, scale the system for high traffic, and ensure data consistency and integrity."

Experience 2: Nearest K POIs with Deep Geohash/Indexing Focus

"How would you design a system to query the nearest 'k' points of interest (POIs)? Explain in detail your approach, including the use of geohash and indexing to record each block's POIs. Address edge cases by querying adjacent nodes. Additionally, discuss scaling strategies such as sharding and caching."

Experience 3: Index Building and Sharding Strategy

"In a design interview focused on Points of Interest (POI), you were asked about how to build an index and how to shard this index. Please describe your approach to indexing POIs and your strategy for sharding the index to ensure efficient data retrieval and scalability."

Experience 4: Yelp Review Site with Efficient Updates

"Design a system for a review site similar to Yelp. Focus on the basics of designing a scalable application without the need to take user feedback into consideration. Specifically, discuss how to update restaurant data efficiently."

Experience 5: POI Service (Yelp/Uber) with HR Advance Notice

"Prepare a system design for a Point of Interest (POI) service similar to those used by Yelp or Uber. You will be informed about this topic by HR during preparation. Focus on designing a robust and scalable system."

Experience 6: Geohash vs QuadTree Trade-offs (Foursquare)

"Design a 'get_poi' (point of interest) feature for a location-based service like Foursquare. Provide design approaches using geohash and quadtree, and discuss the strengths and weaknesses of each approach."

Experience 7: QuadTree for Exact Nearest N Points

"You might be asked to design a QuadTree that can efficiently find the exactly nearest N points to a given point. Discuss how you would approach this problem, including any initial methods you might consider and how you would iterate upon them to find a better solution."

Experience 8: Extremely Detailed Geo Index Questions

"Similar to design Yelp, they will ask very detailed questions about the geo index part, such as Geohash or quad tree"

Reference solution

#31 Points of Interest (POI) / Yelp — Solution

✦ AI-Generated Solution · System Design · Comprehensive Find nearby POIs at scale: 500M POIs, 100K QPS, p99 < 100ms, global. Across the reported experiences the interviewer drills hardest on the geo index (geohash vs quadtree), index building, and sharding.

1. Requirements

Functional

nearby(lat, lng, radius) and nearest_k(lat, lng, k) over POIs (restaurants, shops…).
Return POI details (name, category, rating).
Update business info / add-remove POIs.

Non-functional

100K QPS, read-heavy (≫ writes).
p99 < 100ms; global distribution.
Eventual consistency for POI details is fine; spatial results must be correct.

2. Capacity

500M POIs × ~1KB ≈ 500GB of metadata → fits a sharded DB; the index is the interesting part.
100K QPS read → heavy caching + replicas; writes are comparatively rare (business updates).

3. The Core Problem: Spatial Indexing

A B-tree on (lat, lng) can't answer "nearby" efficiently (2D range). You need a spatial index that maps 2D → 1D or partitions space:

Approach	Idea	Strengths	Weaknesses
Geohash	Interleave lat/lng bits → Base32 prefix string; nearby points share prefixes	Simple, string-prefix range scans, easy to shard & cache, works in any KV/SQL	Fixed grid → dense cities overflow a cell; edge problem (neighbors in adjacent cells)
Quadtree	Recursively split space into 4 quadrants until ≤ N points per leaf	Adapts to density (deep in cities, shallow in deserts); exact nearest-N	In-memory tree, harder to shard/persist, rebalancing on writes
S2 / H3	Hierarchical cells on a sphere (Google/Uber)	No pole/wraparound distortion	More machinery

Recommendation: Geohash as the primary, shardable, cacheable index for "nearby in a radius"; mention quadtree/S2 when the interviewer pushes on dense-area adaptivity or exact nearest-N. Most strong answers lead with geohash and discuss quadtree as the trade-off.

Handling the geohash edge problem

A POI just across a cell boundary can be closer than ones inside your cell. Always query the target cell plus its 8 neighbors, merge candidates, then sort by true haversine distance and take the radius/k.

def nearby(lat, lng, radius):
    precision = precision_for_radius(radius)         # coarser cell for larger radius
    cell = geohash.encode(lat, lng, precision)
    cells = [cell] + geohash.neighbors(cell)         # 3x3 block -> fixes edge cases
    candidates = []
    for c in cells:
        candidates += index.range_scan(prefix=c)     # shard-local prefix scan
    return sorted(candidates, key=lambda p: haversine(lat, lng, p))[:K]

4. Architecture

POI architecture

Geo Query Service computes the geohash cells and fans out prefix scans.
Geo Index is sharded by geohash prefix so a query touches few shards; POIs are co-located by region.
POI DB is the source of truth; an Index Builder (batch + streaming) keeps the index current.
Redis caches hot cells (popular city centers) — this is what carries the 100K QPS.

5. Sharding & Index Building

Shard by geohash prefix (region) → spatial locality, queries stay shard-local. Beware hotspots (Manhattan): split hot prefixes into finer sub-shards, and load-balance with replicas.
Index build: initial bulk build from the POI DB (MapReduce/Spark over all POIs → geohash → index). Keep it fresh via a CDC/stream from POI writes so adds/edits/moves update the index incrementally.
Avoid sharding purely by poi_id for the index — it would force every query to fan out to all shards.

6. Read & Write Paths

Read: geohash the request → check Redis for the cell block → on miss, scan index shards + enrich details from POI DB → cache → return distance-sorted results.
Write (efficient updates, called out in Experience 4): update POI DB → emit CDC event → index builder updates the affected geohash cell(s) → invalidate that cell's cache entry. Moving a POI = remove from old cell, add to new cell.

7. Scaling & Global

Read replicas + aggressive cell caching absorb the read-heavy 100K QPS.
Geo-distribute: serve users from the nearest region; POIs are inherently region-partitioned.
CDN/edge cache for static POI detail payloads.

8. Summary

Concern	Decision
Primary index	Geohash (prefix-shardable, cacheable)
Adaptive / exact-N	Quadtree / S2 as trade-off for dense areas
Edge correctness	Query cell + 8 neighbors, re-rank by haversine
Sharding	By geohash prefix; split hotspots; replicas
Freshness	Bulk build + CDC stream incremental updates
100K QPS	Redis hot-cell cache + read replicas

WhiteboardAuto-save enabled

Loading whiteboard…