This OpenAI Machine Learning Engineer question was reported as a multi-part coding problem about constructing a schedule for data labeling work.
Part 1 is the easy version: build any valid schedule.
Part 2 adds stronger fairness constraints that must hold at every prefix of the returned list.
Part 3 is a streaming followup: tasks arrive in daily batches and the scheduler must update incrementally.
You are given:
t tasks, indexed 0..t-1
m models, indexed 0..m-1
h human labelers, indexed 0..h-1
a target k
Return a scheduling, represented as a list of tuples:
(task, model, human)
Each tuple means:
human human works on task task
that assignment is paired with model model
Your schedule must satisfy:
Every human labeler participates in at least k assignments total.
Every (task, human) pair appears at most once.
For every task x, and for every prefix of the schedule, the counts across models stay balanced:
max_i count_prefix(x, model=i) - min_i count_prefix(x, model=i) <= 1
For every task x, and for every prefix of the schedule, the counts across humans also stay balanced:
max_j count_prefix(x, human=j) - min_j count_prefix(x, human=j) <= 1
If no valid schedule exists, return failure in whatever form your language uses (None, empty list, exception, etc.).
Ignore the prefix-balance constraints for now. Just return any schedule such that:
each human appears at least k times
each human works on a given task at most once
Because each human can touch a task at most once, a human can do at most t assignments total. Therefore:
k > t => impossible
That is the main feasibility condition for Part 1.
If k <= t and m >= 1, assign each human to k distinct tasks, for example:
for human in range(h):
for offset in range(k):
task = (human + offset) % t
model = 0
schedule.append((task, model, human))
This is already enough for the easy part.
Now add the stronger requirement that the schedule must remain balanced at every intermediate point, not just at the end.
This sounds more complicated than it is. The clean construction is:
Schedule work in exactly k rounds.
In each round, every human gets exactly one assignment.
Rotate tasks cyclically so each human sees a new task each round.
For each task, assign models in local round-robin order.
For round r and human u, choose:
task = (u + r) mod t
This gives every human exactly one assignment per round, and over k rounds each human sees k distinct tasks as long as k <= t.
Now keep a counter task_seen[task], meaning how many times task task has already been scheduled. When task task appears again, assign:
model = task_seen[task] mod m
Then increment task_seen[task].
This makes the model counts for each fixed task cycle through:
0, 1, 2, ..., m-1, 0, 1, 2, ...
So for every task, the counts across models are always as even as possible, even at intermediate prefixes.
The (task, human) balance condition is mostly automatic once you enforce:
each (task, human) pair is used at most once
Since every such count is then either 0 or 1, the difference between the maximum and minimum is never more than 1.
That means the real nontrivial part is balancing models for each task while also giving every human at least k distinct tasks.
Suppose:
t = 3, m = 2, h = 4, k = 2
One valid schedule is:
[
(0, 0, 0),
(1, 0, 1),
(2, 0, 2),
(0, 1, 3),
(1, 1, 0),
(2, 1, 1),
(0, 0, 2),
(1, 0, 3),
]
Check the properties:
every human appears exactly 2 times
no human repeats the same task
for task 0, model counts evolve as (1,0), then (1,1), then (2,1) so the difference never exceeds 1
the same holds for tasks 1 and 2
from typing import List, Optional, Tuple
Assignment = Tuple[int, int, int]
def build_basic_schedule(t: int, m: int, h: int, k: int) -> Optional[List[Assignment]]:
"""
Part 1: ignore prefix balance and build any valid schedule.
"""
if k == 0:
return []
if t <= 0 or m <= 0 or h <= 0:
return None
if k > t:
return None
schedule: List[Assignment] = []
for human in range(h):
for offset in range(k):
task = (human + offset) % t
model = 0
schedule.append((task, model, human))
return schedule
def build_balanced_schedule(t: int, m: int, h: int, k: int) -> Optional[List[Assignment]]:
"""
Part 2: build a schedule such that:
each human appears at least k times
each (task, human) pair appears at most once
for every task, model counts stay prefix-balanced
task-human balance is automatic because counts are only 0/1
Returns a minimal-length valid schedule of exactly h * k assignments,
or None if impossible.
"""
if k == 0:
return []
if t <= 0 or m <= 0 or h <= 0:
return None
if k > t:
return None
schedule: List[Assignment] = []
task_seen = [0] * t
for round_idx in range(k):
for human in range(h):
task = (human + round_idx) % t
model = task_seen[task] % m
schedule.append((task, model, human))
task_seen[task] += 1
return schedule
1. Each human gets at least k assignments
There are exactly k rounds, and each human receives exactly one assignment per round. So each human appears exactly k times.
For a fixed human u, the assigned tasks are:
(u + 0) mod t,
(u + 1) mod t,
...,
(u + k - 1) mod t
These are all distinct when k <= t.
Fix any task x. Each time x appears, we assign the next model in round-robin order:
0, 1, 2, ..., m-1, 0, 1, ...
After any number of occurrences of task x, each model has been used either:
floor(c / m) times, or
ceil(c / m) times
where c is the number of times task x has appeared so far.
Therefore the difference between the largest and smallest model count is always at most 1.
Every tuple increases the total assignment count of exactly one human by 1. Since each of the h humans must reach at least k, any valid schedule needs at least:
h * k
assignments.
Our construction uses exactly h * k, so it is optimal in schedule length.
The construction is straightforward:
Time: O(hk)
Space: O(t) auxiliary space, plus the output list
A separate followup variant turns the same problem into an online one. Instead of returning a full schedule up front, the platform receives a fresh batch of tasks each day and must emit assignments incrementally.
Each day a batch of tasks arrives. The scheduler must produce that day's (task, model, human) tuples before the next day begins.
Each human can do at most one task per day.
History accumulates across days. The same (task, human) pair is still forbidden, even across different days.
The scheduler must support incremental updates. It cannot recompute the entire history from scratch every day.
Long-term goal: across all days combined, model and human counts per task remain balanced (the same prefix-balance condition as in Part 2).
The k target from Parts 1 and 2 is not part of this variant. Instead, balance is measured against the cumulative schedule.
Maintain a small amount of persistent state across days:
next_model[task]: round-robin pointer for the next model to assign to this task
next_human[task]: round-robin pointer for the next human to try for this task
pair_used: set of (task, human) pairs already scheduled in any prior day
Each day, also track:
used_today: set of humans already assigned today (resets each day)
For every task in today's batch, walk the human round-robin starting at next_human[task] and pick the first human that is neither in used_today nor already paired with this task. Pick the model as next_model[task] % m and advance the model pointer.
from typing import Iterable, List, Set, Tuple
Assignment = Tuple[int, int, int]
class StreamingScheduler:
"""
Incremental scheduler for daily batches of labeling tasks.
Each call to schedule_day consumes the tasks that arrived that day
and returns the assignments produced for that day, while updating
persistent state used by subsequent days.
"""
def init(self, t: int, m: int, h: int) -> None:
if t <= 0 or m <= 0 or h <= 0:
raise ValueError("t, m, h must all be positive")
self.t = t
self.m = m
self.h = h
self.next_model = [0] * t
self.next_human = [0] * t
self.pair_used: Set[Tuple[int, int]] = set()
def schedule_day(self, tasks_today: Iterable[int]) -> List[Assignment]:
used_today: Set[int] = set()
result: List[Assignment] = []
for task in tasks_today:
human = self._pick_human(task, used_today)
if human is None:
continue
model = self.next_model[task] % self.m
result.append((task, model, human))
used_today.add(human)
self.pair_used.add((task, human))
self.next_model[task] += 1
self.next_human[task] = (human + 1) % self.h
return result
def _pick_human(self, task: int, used_today: Set[int]) -> int | None:
start = self.next_human[task]
for offset in range(self.h):
human = (start + offset) % self.h
if human in used_today:
continue
if (task, human) in self.pair_used:
continue
return human
return None
Models per task. next_model[task] advances by exactly one whenever task task is scheduled, so the model assigned cycles through 0, 1, ..., m-1, 0, 1, .... After any number of occurrences of task task, model counts differ by at most one. This is the same argument as Part 2, just spread across days.
Humans per task. Each (task, human) pair is used at most once, so per-task human counts are always 0 or 1. The difference between max and min is therefore at most one.
Per-day cap. used_today resets each day and blocks reusing a human, so no labeler is overloaded within a day.
If a task arrives on a day where no human satisfies both constraints (every available human has already labeled this task before, or has already been used today), the scheduler must defer that task. The reference solution above silently skips it. Real systems would push it onto a backlog and retry the next day.
The lifetime hard limit is h assignments per task, since each (task, human) pair can be used at most once. Once every human has labeled a given task, no further work for that task can ever be scheduled, and the total schedule length across all days is bounded by t * h.
Let b be the size of today's batch.
Time: O(b * h) worst case, since picking a human scans up to h candidates.
Space: O(t + P) where P is the cumulative number of (task, human) pairs already used. P grows monotonically and is bounded by t * h.