Google Calendar is a time-management and scheduling application that allows users to create events and view their schedule across different time ranges. The core challenges are efficiently querying events for different views (day/week/month/year) and synchronizing changes across multiple devices in near real-time.
This walkthrough follows the Interview Framework. Use it as a guide, not a script—adapt based on interviewer cues.
Create events — Users can create events with title, time, location, and description
View calendar — Users can view their events in day, week, month, or year view
Update/delete events — Users can modify or remove existing events
Multi-device sync — Changes on one device appear on all other devices
Out of scope:
Recurring events (complex pattern matching)
Calendar sharing and permissions
Meeting invitations and RSVPs
Integration with video conferencing
User-facing notifications and reminders
Recurring events are a common follow-up. If asked, describe storing a recurrence rule (RRULE) and generating instances on demand.
| Requirement | Target | Notes |
|---|---|---|
| Availability | 99.9% | Calendar is critical for scheduling |
| Latency | < 200ms for reads | Calendar views should load quickly |
| Consistency | Eventual (< 2s sync) | Users expect changes to sync within seconds |
| Scale | 500M users, 250B events | Global scale |
The interviewer may push on consistency requirements. For calendar, eventual consistency is acceptable—if you create an event on your phone, seeing it on your laptop within 2 seconds is fine.
Assumptions:
500M total users, 100M DAU
Average user has 500 events total
5 new events created per user per week
Average event size: 1 KB
Storage:
Total events: 500M users × 500 events = 250B events
Storage: 250B × 1 KB = 250 TB
Throughput:
Event creates: 100M DAU × 5/week ÷ 7 days = ~70M creates/day = ~800 writes/sec
Calendar views: 100M DAU × 10 views/day = ~12K reads/sec
This is a read-heavy system (15:1 read-to-write ratio). The real challenge isn't write throughput—it's efficiently querying events for different time ranges across billions of events.
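The estimates above can be re-derived in a few lines. This is only a sketch of the arithmetic, assuming 1 KB = 1,000 bytes and an 86,400-second day; the variable names are illustrative.

```python
# Back-of-envelope numbers from the stated assumptions.
TOTAL_USERS = 500_000_000
DAU = 100_000_000
EVENTS_PER_USER = 500
CREATES_PER_USER_PER_WEEK = 5
VIEWS_PER_USER_PER_DAY = 10
EVENT_SIZE_BYTES = 1_000      # 1 KB, decimal units for simplicity
SECONDS_PER_DAY = 86_400

total_events = TOTAL_USERS * EVENTS_PER_USER
storage_tb = total_events * EVENT_SIZE_BYTES / 1e12

creates_per_day = DAU * CREATES_PER_USER_PER_WEEK / 7
writes_per_sec = creates_per_day / SECONDS_PER_DAY
reads_per_sec = DAU * VIEWS_PER_USER_PER_DAY / SECONDS_PER_DAY

print(f"events: {total_events:,}")
print(f"storage: {storage_tb:.0f} TB")
print(f"writes/sec: {writes_per_sec:.0f}")
print(f"reads/sec: {reads_per_sec:.0f}")
print(f"read:write ratio: {reads_per_sec / writes_per_sec:.0f}:1")
```

The unrounded ratio works out to about 14:1. The 15:1 figure comes from dividing the rounded 12K reads by the rounded 800 writes, and either is fine for an interview.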
User
├── id: UUID
├── email: string
├── timezone: string
└── created_at: timestamp
Event
├── id: UUID
├── user_id: UUID
├── title: string
├── description: string (optional)
├── location: string (optional)
├── start_time: timestamp
├── end_time: timestamp
├── created_at: timestamp
├── updated_at: timestamp
└── deleted_at: timestamp (null if not deleted, for soft delete)
UserDevice
├── user_id: UUID
├── device_id: string (unique per device)
├── push_token: string (APNs/FCM token)
├── last_sync_time: timestamp (server-issued)
└── device_type: "ios" | "android" | "web"
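For reference, the Event and UserDevice entities could be expressed as Python dataclasses. This is a sketch that mirrors the fields listed above; the `utc_now` helper, the `is_deleted` property, and the choice of `None` for a device that has never synced are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Literal, Optional
from uuid import UUID, uuid4


def utc_now() -> datetime:
    """Current time as a timezone-aware UTC datetime (hypothetical helper)."""
    return datetime.now(timezone.utc)


@dataclass
class Event:
    user_id: UUID
    title: str
    start_time: datetime
    end_time: datetime
    description: Optional[str] = None
    location: Optional[str] = None
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=utc_now)
    updated_at: datetime = field(default_factory=utc_now)
    deleted_at: Optional[datetime] = None  # None means "not deleted"

    @property
    def is_deleted(self) -> bool:
        return self.deleted_at is not None


@dataclass
class UserDevice:
    user_id: UUID
    device_id: str
    push_token: str
    device_type: Literal["ios", "android", "web"]
    last_sync_time: Optional[datetime] = None  # server-issued; None = never synced (assumption)
```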
Why timestamps in UTC?
Consistent storage regardless of user timezone
Convert to local timezone at display time
Handles timezone changes correctly (e.g., traveling)
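A small sketch of "store UTC, convert at display time". The `to_local` helper and the two fixed-offset zones are illustrative assumptions; fixed offsets keep the example self-contained but ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def to_local(ts_utc: datetime, tz: tzinfo) -> datetime:
    """Convert a stored UTC timestamp to the viewer's timezone for display."""
    if ts_utc.tzinfo is None:
        raise ValueError("stored timestamps must be timezone-aware UTC")
    return ts_utc.astimezone(tz)


# The stored value never changes; only the rendering does.
stored = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)

# Fixed offsets, for illustration only.
new_york_winter = timezone(timedelta(hours=-5), "EST")
tokyo = timezone(timedelta(hours=9), "JST")

print(to_local(stored, new_york_winter).isoformat())
print(to_local(stored, tokyo).isoformat())
```

In a real client you would more likely pass `zoneinfo.ZoneInfo("America/New_York")` (Python 3.9+), built from the user's `timezone` field, so that daylight saving transitions are handled.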
Why track updated_at?
Essential for sync—devices query "events changed since X"
Enables incremental sync instead of full refresh
Why soft delete (deleted_at) instead of hard delete?
Sync requires knowing what was deleted since last sync
Hard deletes leave no trace—devices wouldn't know to remove the event
Regular reads filter deleted_at IS NULL; sync queries include deleted_at > last_sync_time
Why separate UserDevice table?
One user has multiple devices (phone, tablet, laptop)
Track sync state per device independently
Store push tokens for each device
Enable targeted notifications—only notify devices that don't have the latest data
We need two types of communication:
REST APIs for CRUD operations
Push mechanism for real-time sync across devices
POST /events
  Request:  { "title": "Team Meeting", "start_time": "2024-01-15T10:00:00Z",
              "end_time": "2024-01-15T11:00:00Z", "location": "Room 101" }
  Response: { "event_id": "abc123", "created_at": "..." }

GET /events?start={timestamp}&end={timestamp}
  Response: { "events": [...] }
  // Only non-deleted events (deleted_at IS NULL)

PUT /events/{event_id}
  Request:  { "title": "Updated Title", ... }
  Response: { "event_id": "abc123", "updated_at": "..." }

DELETE /events/{event_id}
  Response: { "success": true }

GET /sync?since={sync_time}
  Response: { "events": [...], "sync_time": "..." }
  // sync_time is server-issued; if needed, treat it as a cursor (timestamp + event_id)
  // Events include deleted_at field; non-null means deleted

// Server → Device (via APNs/FCM)
{
  "type": "sync_required",
  "user_id": "...",
  "timestamp": "..."
}
The /sync endpoint is crucial for multi-device support. Instead of fetching all events, devices only request changes since their last sync time (server-issued)—dramatically reducing bandwidth and latency.
[Architecture diagram]
Clients: Phone App, Web Browser, Tablet App
Edge Layer: Load Balancer, API Gateway
Application Layer: Calendar Service, Sync Service, Notification Service
Cache Layer: Redis Cache
Storage Layer: PostgreSQL (Events)
Push Infrastructure: APNs, FCM
Push notifications flow back from APNs/FCM to client devices to trigger sync.
[Sequence diagram: event creation and sync]
Participants: Device A, API Gateway, Calendar Service, PostgreSQL, Sync Service, Notification Service, Device B
Create path, in order: POST /events; Create event; INSERT event; OK; event_id; { event_id: "abc123" }
Notify path, in order: Event created notification; Notify other devices; Push: sync_required
Sync path, in order: GET /sync?since=...; Fetch changes; Query changes; Changes; Changes; New event data
The key insight: Device A doesn't wait for Device B's sync to complete. The create operation returns immediately, and sync happens asynchronously.
[Sequence diagram: calendar read path]
Participants: Device, API Gateway, Calendar Service, Redis Cache, PostgreSQL
Request, in order: GET /events?start=...&end=...; Query events; Check cache
[Cache hit]: Cached events
[Cache miss]: Query by time range; Events; Update cache
Response, in order: Events; { events: [...] }
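The cache hit/miss read path can be sketched as a cache-aside lookup. `DictCache` is a hypothetical in-memory stand-in for Redis, `query_db` stands for whatever function runs the time-range query, and the `range` key format is an assumption made for this example.

```python
import json


class DictCache:
    """In-memory stand-in for Redis with a get/set shape (hypothetical helper)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_events(cache, query_db, user_id, start, end):
    """Return events for a time range, consulting the cache before the database."""
    key = f"calendar:{user_id}:range:{start}:{end}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)           # cache hit
    events = query_db(user_id, start, end)  # cache miss: query by time range
    cache.set(key, json.dumps(events))      # populate the cache for the next read
    return events
```

With a real Redis client you would typically also set a TTL on the entry, so a missed invalidation cannot serve stale data indefinitely.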
When Device A creates an event, Device B needs to know about it:
[Sequence diagram: push-triggered sync]
Participants: Device A, Server, Device B (Idle)
Device B last synced at T1
POST /events (new meeting); OK
Push notification: sync_required; Device B wakes up
GET /sync?since=T1; { events: [new meeting], sync_time: T2 }
Device B updates local DB; Last sync = T2
This is the core of the follow-up question. Let's explore the sync mechanism in detail.
| Strategy | Latency | Battery | Bandwidth | Complexity |
|---|---|---|---|---|
| Polling | 30s-60s | High | High | Low |
| Long Polling | 1-5s | Medium | Medium | Medium |
| Push + Pull | 1-2s | Low | Low | Medium |
| WebSocket | <1s | Medium | Low | High |
Push + Pull is the sweet spot for calendar apps. It combines push notifications for instant alerting with incremental sync for efficient data transfer. WebSocket would be overkill—calendar updates are infrequent compared to chat or docs.
Step 1: Push Notification (Wake-up Signal)
When an event is created/updated/deleted:
Calendar Service writes to database
Sync Service identifies which other devices need notification
Notification Service sends lightweight push via APNs/FCM
Push contains minimal data—just "sync required"
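Step 1 boils down to "pick every device except the one that made the change, and send each a tiny payload". A minimal sketch follows, with a hypothetical `Device` record and helper names. The payload fields match the push message shown in the API section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    device_id: str
    push_token: str


def devices_to_notify(devices, origin_device_id):
    """All of the user's devices except the one that made the change."""
    return [d for d in devices if d.device_id != origin_device_id]


def build_push_payload(user_id, timestamp):
    """Wake-up signal only: no event data travels in the push itself."""
    return {"type": "sync_required", "user_id": user_id, "timestamp": timestamp}


devices = [Device("phone", "tok-1"), Device("laptop", "tok-2"), Device("tablet", "tok-3")]
targets = devices_to_notify(devices, origin_device_id="phone")
payload = build_push_payload("user-42", "2024-01-15T10:00:00Z")
```

Keeping event data out of the payload means a dropped or duplicated push is harmless, because the device always fetches the authoritative state through /sync.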
Step 2: Incremental Sync (Pull)
When device receives push (or app opens):
Device calls /sync?since={last_sync_time} (cursor from server, not device clock)
Server queries events with updated_at > since OR deleted_at > since
Response includes all changed events (with deleted_at field) and a new sync_time
Device processes each event: if deleted_at is set, remove locally; otherwise upsert
Device updates its last_sync_time with the returned sync_time
To avoid missing updates with the same timestamp, the cursor can include a tie-breaker (event_id) and the query can compare (updated_at, id).
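To see why the tie-breaker matters, here is an in-memory sketch of paging with a compound cursor. `Change` and `next_page` are hypothetical names. A real server would push the same comparison into SQL rather than sort in application code.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Change(NamedTuple):
    changed_at: datetime  # updated_at, or deleted_at for a soft-deleted row
    id: str


def next_page(changes, cursor, limit):
    """Return rows strictly after `cursor` in (changed_at, id) order, plus the new cursor."""
    ordered = sorted(changes, key=lambda c: (c.changed_at, c.id))
    page = [c for c in ordered if (c.changed_at, c.id) > cursor][:limit]
    new_cursor = (page[-1].changed_at, page[-1].id) if page else cursor
    return page, new_cursor


# Three changes share one timestamp. A timestamp-only cursor with limit=2
# would skip the third row on the next request.
t = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)
rows = [Change(t, "c"), Change(t, "a"), Change(t, "b")]
start = (datetime(1970, 1, 1, tzinfo=timezone.utc), "")

page1, cursor1 = next_page(rows, start, limit=2)
page2, cursor2 = next_page(rows, cursor1, limit=2)
```

Here `page1` holds the rows with ids `a` and `b`, and `page2` picks up `c` even though its timestamp equals the cursor's timestamp.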
-- Server query for sync: get all changed events (created, updated, or deleted)
-- Use server-issued sync_time as the cursor; add an id tie-breaker if needed.
SELECT id, title, start_time, end_time, updated_at, deleted_at
FROM events
WHERE user_id = ?
AND (updated_at > ? OR deleted_at > ?)
ORDER BY COALESCE(deleted_at, updated_at) ASC, id ASC
LIMIT 1000;
-- Client checks deleted_at:
-- If deleted_at IS NOT NULL → remove from local DB
-- If deleted_at IS NULL → upsert to local DB
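On the client, applying a /sync response is a loop over the returned events. This sketch uses a plain dict as the local database and a dict for per-device state; both are stand-ins, and `apply_sync_response` is a hypothetical name.

```python
def apply_sync_response(local_db, state, response):
    """Apply one /sync response, then advance the device's cursor."""
    for event in response["events"]:
        if event.get("deleted_at") is not None:
            # Soft-deleted on the server: remove locally (already missing is fine).
            local_db.pop(event["id"], None)
        else:
            # Created or updated: insert or overwrite.
            local_db[event["id"]] = event
    # Always store the server-issued sync_time, never the device clock.
    state["last_sync_time"] = response["sync_time"]


local_db = {"e1": {"id": "e1", "title": "Old title", "deleted_at": None},
            "e2": {"id": "e2", "title": "Cancelled lunch", "deleted_at": None}}
state = {"last_sync_time": "T1"}
response = {
    "events": [
        {"id": "e1", "title": "New title", "deleted_at": None},
        {"id": "e2", "title": "Cancelled lunch", "deleted_at": "2024-01-15T09:00:00Z"},
        {"id": "e3", "title": "Team Meeting", "deleted_at": None},
    ],
    "sync_time": "T2",
}
apply_sync_response(local_db, state, response)
```

Because both branches are idempotent, replaying the same response after a crash or a duplicate push leaves the local database in the same state.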
Offline Device Comes Online:
Device may have been offline for days/weeks
Use same /sync endpoint—just larger response
For very long offline periods, may need to paginate
Concurrent Edits from Multiple Devices:
Last-write-wins based on updated_at timestamp
For calendar events, this is acceptable (unlike collaborative docs)
If Device A and B both edit same event, last save wins
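Last-write-wins reduces to a single comparison. The sketch below assumes each version carries a comparable `updated_at` and that an incoming write wins ties; both are assumptions, and `resolve_last_write_wins` is a hypothetical name.

```python
from datetime import datetime, timezone


def resolve_last_write_wins(stored, incoming):
    """Keep whichever version has the later updated_at (incoming wins ties)."""
    if incoming["updated_at"] >= stored["updated_at"]:
        return incoming
    return stored


from_device_a = {"id": "e1", "title": "Standup (Room 101)",
                 "updated_at": datetime(2024, 1, 15, 10, 0, 5, tzinfo=timezone.utc)}
from_device_b = {"id": "e1", "title": "Standup (Room 202)",
                 "updated_at": datetime(2024, 1, 15, 10, 0, 9, tzinfo=timezone.utc)}

winner = resolve_last_write_wins(from_device_a, from_device_b)
```

If `updated_at` comes from device clocks, clock skew can pick the wrong winner. Stamping it on the server when the write arrives avoids that.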
Device Sync Falls Behind:
Track last_sync_time (server-issued) per device
If gap is too large (>30 days), do full refresh instead of incremental
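The incremental-versus-full decision can be a small server-side check. The 30-day threshold comes from the text; `choose_sync_mode` and the treatment of a never-synced device as a full refresh are assumptions.

```python
from datetime import datetime, timedelta, timezone

FULL_REFRESH_AFTER = timedelta(days=30)


def choose_sync_mode(last_sync_time, now):
    """Return "full" when the device is too far behind, else "incremental"."""
    if last_sync_time is None:
        return "full"  # never synced (assumption)
    if now - last_sync_time > FULL_REFRESH_AFTER:
        return "full"
    return "incremental"


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(choose_sync_mode(now - timedelta(days=5), now))
print(choose_sync_mode(now - timedelta(days=45), now))
```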
Common mistake: Trying to implement CRDTs or OT for calendar sync. Unlike Google Docs, calendar events don't need character-level conflict resolution—last-write-wins is sufficient for most cases.
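Problem: Without an index, "Show me events for January 2024" would scan billions of events.
Solution: Composite index on (user_id, start_time), with end_time as a trailing column so the end_time filter can be checked against index entries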
CREATE INDEX idx_events_user_time
ON events (user_id, start_time, end_time);
-- Query for month view
SELECT * FROM events
WHERE user_id = ?
AND deleted_at IS NULL
AND start_time < '2024-02-01'
AND end_time > '2024-01-01'
ORDER BY start_time;
Why this works:
Index allows direct seek to user's events
Time range filter eliminates most rows
Average user has ~500 events—small dataset per user
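The two time comparisons in the month-view query form the standard interval-overlap test. Here is a sketch of the same predicate in Python. `month_bounds` and `overlaps` are hypothetical helpers, and the bounds are computed in UTC for simplicity, whereas a real view would derive them from the user's timezone.

```python
from datetime import datetime, timezone


def month_bounds(year, month):
    """Half-open [start, end) bounds of a calendar month, in UTC."""
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    if month == 12:
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    return start, end


def overlaps(event_start, event_end, range_start, range_end):
    """Same predicate as the SQL: start_time < range_end AND end_time > range_start."""
    return event_start < range_end and event_end > range_start


jan_start, jan_end = month_bounds(2024, 1)

# A New Year's Eve party that starts in December still shows up in January.
party = (datetime(2023, 12, 31, 22, tzinfo=timezone.utc),
         datetime(2024, 1, 1, 2, tzinfo=timezone.utc))
print(overlaps(*party, jan_start, jan_end))
```

Filtering only on `start_time` within the month would miss events like this one that begin before the range and run into it.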
Alternative: Time-partitioned tables
For extremely high scale, partition events by month within each user_id shard:
events_2024_01
events_2024_02
events_2023_12
...
Query only touches relevant partitions for the requested time range.
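A sketch of how a query layer might pick partitions for a requested range, using the `events_YYYY_MM` naming above. `partitions_for_range` is a hypothetical helper and assumes events are partitioned by the month of `start_time`.

```python
from datetime import date


def partitions_for_range(start: date, end: date) -> list:
    """Monthly partition names touched by the half-open range [start, end)."""
    if start >= end:
        return []
    names = []
    year, month = start.year, start.month
    while date(year, month, 1) < end:
        names.append(f"events_{year}_{month:02d}")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return names


print(partitions_for_range(date(2024, 1, 1), date(2024, 2, 1)))
print(partitions_for_range(date(2023, 12, 15), date(2024, 2, 2)))
```

Under that assumption, a multi-day event that started in an earlier month lives in an earlier partition, so the range may need widening by the longest event duration you allow.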
What to cache:
Recent calendar views (current week, current month)
User preferences and timezone
Device sync state
Cache invalidation:
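# On event create/update/delete
def invalidate_cache(user_id, event):
    # Collect affected keys in a set so each week/month key is deleted only once.
    # get_days_between, get_week, and get_month are helpers assumed to exist.
    keys = set()
    for day in get_days_between(event.start_time, event.end_time):
        keys.add(f"calendar:{user_id}:day:{day}")
        keys.add(f"calendar:{user_id}:week:{get_week(day)}")
        keys.add(f"calendar:{user_id}:month:{get_month(day)}")
    for key in keys:
        cache.delete(key)
    # On update, call this for the event's previous time range as well.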
Cache at the view level (day/week/month), not individual events. This matches access patterns—users typically load entire views, not single events.
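The invalidation snippet relies on `get_days_between`, `get_week`, and `get_month`, which the text does not define. Here is one possible sketch of them, assuming ISO week numbering and `YYYY-MM` month keys; the key formats are illustrative choices.

```python
from datetime import date, datetime, timedelta


def get_days_between(start: datetime, end: datetime) -> list:
    """Every calendar day an event touches, inclusive of both endpoints."""
    first, last = start.date(), end.date()
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


def get_week(day: date) -> str:
    """ISO week key, e.g. 2024-W01."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


def get_month(day: date) -> str:
    """Month key, e.g. 2024-01."""
    return day.strftime("%Y-%m")


# A late-night event crossing a month boundary touches two days and two months.
days = get_days_between(datetime(2024, 1, 31, 23, 0), datetime(2024, 2, 1, 1, 0))
months = {get_month(d) for d in days}
```

An event that ends exactly at midnight also counts the following day here. That costs one extra cache miss but never leaves a stale view.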
Problem: 250 TB of events is far more than a single database node can store and serve.
Solution: Shard by user_id
[Diagram: shard routing]
The Shard Router sends each request to one of four shards:
Shard 1: hash(user_id) mod 4 = 0
Shard 2: hash(user_id) mod 4 = 1
Shard 3: hash(user_id) mod 4 = 2
Shard 4: hash(user_id) mod 4 = 3
Why shard by user_id:
All queries are user-scoped (my calendar, my events)
No cross-user queries needed
Each user's data is fully contained in one shard
Even distribution with hash-based routing
Why NOT shard by time:
Would require querying all shards for "show my month"
Hot shards for current time periods
User data scattered across shards
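A sketch of hash-based routing by user_id. The choice of SHA-256 and the `shard_for` name are assumptions. The one hard requirement is that the hash is stable across processes and restarts.

```python
import hashlib

NUM_SHARDS = 4


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id to a shard index in [0, num_shards)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # The first 8 bytes, read as an integer, are plenty for an even spread.
    return int.from_bytes(digest[:8], "big") % num_shards


# Every query for this user goes to the same shard.
print(shard_for("3f2b8c1e-user"))
```

Python's built-in `hash()` is unsuitable here because string hashes are randomized per process by default, so the same user could be routed to different shards after a restart.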
Problem: Push notifications can fail (network issues, app uninstalled).
Fallback mechanisms:
App-open sync — Always sync when app opens, regardless of push
Background fetch — iOS/Android periodic background refresh
Exponential backoff — Retry failed pushes with increasing delay
Silent push — Use silent notifications that don't alert user but trigger sync
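def send_sync_notification(user_id, device_id, attempt=0):
    try:
        push_service.send(device_id, {"type": "sync_required"})
    except PushFailed:
        # Schedule a retry; the delay grows with each failed attempt.
        # The retry worker is expected to call this again with attempt + 1.
        retry_queue.add(device_id, backoff=calculate_backoff(attempt))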
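`calculate_backoff` is called above but not defined. Here is one possible sketch: exponential growth with a cap and optional random jitter. The base, cap, and jitter strategy are assumptions to tune for your push volume.

```python
import random


def calculate_backoff(attempt, base=1.0, cap=300.0, jitter=True):
    """Seconds to wait before retry number `attempt` (0-based)."""
    exponent = min(attempt, 30)  # keep 2 ** exponent small
    delay = min(cap, base * (2 ** exponent))
    if jitter:
        # Spread retries out so failed pushes do not all retry at the same moment.
        return random.uniform(0, delay)
    return delay


for attempt in range(5):
    print(attempt, calculate_backoff(attempt, jitter=False))
```

Without jitter the delays run 1, 2, 4, 8, 16 seconds and stop growing at the 300-second cap. After a handful of failures it is usually better to give up and let app-open sync catch the device up.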
Scenario: User creates event on phone in airplane mode.
Options:
Optimistic local-first (Recommended)
Save to local DB immediately
Queue for sync when online
Handle conflicts on reconnection
Pessimistic server-first
Require server confirmation
Better consistency, worse offline experience
Calendar apps should prioritize availability over consistency. Users expect to view and create events offline. Sync conflicts are rare (how often do you edit the same event from two devices simultaneously?) and easy to resolve (last-write-wins).
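The local-first option can be sketched as a local write plus an outbox that is drained when connectivity returns. `OfflineStore` and its `send` callback are hypothetical. The callback is assumed to raise `ConnectionError` while the device is offline.

```python
from collections import deque


class OfflineStore:
    """Write locally first, then push queued changes once the network is back."""

    def __init__(self, send):
        self.local_db = {}
        self.pending = deque()
        self._send = send  # callable(event); raises ConnectionError when offline

    def create_event(self, event):
        self.local_db[event["id"]] = event  # visible in the UI immediately
        self.pending.append(event)          # queued for the server

    def flush(self):
        """Send queued events in order; stop at the first failure. Returns count sent."""
        sent = 0
        while self.pending:
            try:
                self._send(self.pending[0])
            except ConnectionError:
                break
            self.pending.popleft()
            sent += 1
        return sent
```

Generating the event id on the client, for example a UUID, lets the server treat a retried create as the same event, which matters if a request succeeded but its response was lost.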
Over-engineering sync — Calendar events are updated infrequently. You don't need WebSockets or CRDTs. Push notification + incremental pull is sufficient.
Ignoring time zones — Store everything in UTC. Convert to local time at display. Never store local times—they break when users travel.
Full calendar sync on every change — Transferring all events on each sync wastes bandwidth and battery. Use incremental sync with updated_at timestamps.
Sharding by time — Seems intuitive but forces cross-shard queries for "show my calendar." Shard by user_id instead.
Not handling offline — Calendar is a critical app. Users expect it to work without internet. Design for local-first with background sync.
Polling for sync — Drains battery and wastes bandwidth. Use push notifications to wake devices only when changes occur.
Before finishing, verify you've covered:
Explained time-range query optimization (indexing strategy)
Described multi-device sync mechanism (push + pull)
Addressed how incremental sync works (updated_at tracking)
Discussed caching strategy for calendar views
Explained sharding approach (by user_id)
Covered offline support and conflict resolution
Mentioned push notification reliability and fallbacks
| Component | Technology | Purpose |
|---|---|---|
| Event storage | PostgreSQL (sharded) | ACID transactions, time-range queries |
| Caching | Redis | Fast calendar view retrieval |
| Device sync | Push + Pull pattern | Low-latency, battery-efficient sync |
| Push notifications | APNs/FCM | Wake devices for sync |
| Time-range indexing | Composite index (user_id, start_time) | Efficient view queries |
| Offline support | Local-first with queue | Works without internet |