Design a scalable location-based service that allows users to discover nearby points of interest (POIs) such as restaurants, shops, and businesses. The system must efficiently handle geospatial queries to find the nearest POIs, support high read traffic from millions of users, and maintain up-to-date business information.
Scale considerations:
500 million POIs globally
100,000 queries per second (read-heavy workload)
Sub-100ms query latency (p99)
Global geographic distribution
For detailed architectural solutions and technical implementation:
Design Yelp
"Design a system that manages points of interest (POI) for a mapping application. Explain how you would store and retrieve POI data, scale the system for high traffic, and ensure data consistency and integrity."
"How would you design a system to query the nearest 'k' points of interest (POIs)? Explain in detail your approach, including the use of geohash and indexing to record each block's POIs. Address edge cases by querying adjacent nodes. Additionally, discuss scaling strategies such as sharding and caching."
"In a design interview focused on Points of Interest (POI), you were asked about how to build an index and how to shard this index. Please describe your approach to indexing POIs and your strategy for sharding the index to ensure efficient data retrieval and scalability."
"Design a system for a review site similar to Yelp. Focus on the basics of designing a scalable application without the need to take user feedback into consideration. Specifically, discuss how to update restaurant data efficiently."
"Prepare a system design for a Point of Interest (POI) service similar to those used by Yelp or Uber. You will be informed about this topic by HR during preparation. Focus on designing a robust and scalable system."
"Design a 'get_poi' (point of interest) feature for a location-based service like Foursquare. Provide design approaches using geohash and quadtree, and discuss the strengths and weaknesses of each approach."
"You might be asked to design a QuadTree that can efficiently find the exactly nearest N points to a given point. Discuss how you would approach this problem, including any initial methods you might consider and how you would iterate upon them to find a better solution."
"Similar to design Yelp, they will ask very detailed questions about the geo index part, such as Geohash or quad tree"
✦ AI-Generated Solution · System Design · Comprehensive Find nearby POIs at scale: 500M POIs, 100K QPS, p99 < 100ms, global. Across the reported experiences the interviewer drills hardest on the geo index (geohash vs quadtree), index building, and sharding.
Functional
nearby(lat, lng, radius) and nearest_k(lat, lng, k) over POIs (restaurants, shops…).Non-functional
A B-tree on (lat, lng) can't answer "nearby" efficiently (2D range). You need a spatial index that maps 2D → 1D or partitions space:
| Approach | Idea | Strengths | Weaknesses |
|---|---|---|---|
| Geohash | Interleave lat/lng bits → Base32 prefix string; nearby points share prefixes | Simple, string-prefix range scans, easy to shard & cache, works in any KV/SQL | Fixed grid → dense cities overflow a cell; edge problem (neighbors in adjacent cells) |
| Quadtree | Recursively split space into 4 quadrants until ≤ N points per leaf | Adapts to density (deep in cities, shallow in deserts); exact nearest-N | In-memory tree, harder to shard/persist, rebalancing on writes |
| S2 / H3 | Hierarchical cells on a sphere (Google/Uber) | No pole/wraparound distortion | More machinery |
Recommendation: Geohash as the primary, shardable, cacheable index for "nearby in a radius"; mention quadtree/S2 when the interviewer pushes on dense-area adaptivity or exact nearest-N. Most strong answers lead with geohash and discuss quadtree as the trade-off.
A POI just across a cell boundary can be closer than ones inside your cell. Always query the target cell plus its 8 neighbors, merge candidates, then sort by true haversine distance and take the radius/k.
def nearby(lat, lng, radius):
precision = precision_for_radius(radius) # coarser cell for larger radius
cell = geohash.encode(lat, lng, precision)
cells = [cell] + geohash.neighbors(cell) # 3x3 block -> fixes edge cases
candidates = []
for c in cells:
candidates += index.range_scan(prefix=c) # shard-local prefix scan
return sorted(candidates, key=lambda p: haversine(lat, lng, p))[:K]

poi_id for the index — it would force every query to fan out to all shards.| Concern | Decision |
|---|---|
| Primary index | Geohash (prefix-shardable, cacheable) |
| Adaptive / exact-N | Quadtree / S2 as trade-off for dense areas |
| Edge correctness | Query cell + 8 neighbors, re-rank by haversine |
| Sharding | By geohash prefix; split hotspots; replicas |
| Freshness | Bulk build + CDC stream incremental updates |
| 100K QPS | Redis hot-cell cache + read replicas |