Scheduling
Orchestrate long-running tasks — cron-style job scheduling, distributed task coordination with locks and leader election, and progress tracking with status APIs.
The Big Picture — Why Tasks Need Scheduling
Not every operation should happen immediately. Reports need to generate at midnight. Stale data needs cleanup every hour. Video transcoding takes minutes and can't block the upload response. These tasks need to run at the right time, exactly once, and with visibility into their progress.
Alarm Clock + Team + Progress Board
Think of a factory with three systems working together. The alarm clock (cron scheduler) decides WHEN tasks run — 'Generate the daily report at 6 AM.' The team coordinator (distributed coordination) ensures only ONE person does the job — if 5 workers hear the alarm, only one actually runs the report. The progress board (status API) shows everyone the current state — 'Report: 60% complete, processing sales data.' Without the alarm, tasks don't run on time. Without coordination, the same report generates 5 times. Without the board, nobody knows if it's done or stuck.
Timing
Tasks must run at specific times or intervals. A daily report at midnight, a cleanup job every hour, a retry after 30 minutes. Without scheduling, someone has to manually trigger every job.
Coordination
In a distributed system with 10 instances, a cron job fires on ALL 10. Without coordination, the same job runs 10 times — 10 duplicate reports, 10 duplicate emails.
Visibility
A video transcoding job takes 15 minutes. Without progress tracking, the user stares at a spinner with no idea if it's 10% done or 90% done — or if it failed silently.
🔥 Key Insight
Scheduling is three problems in one: when to run (timing), who runs it (coordination), and what's happening (visibility). Solving only one creates gaps — a perfectly timed job that runs 10 times, or a coordinated job that nobody can monitor.
Scheduling Architecture
Scheduler
Triggers at time
Coordinator
Ensures single exec
Queue
Buffers work
Worker
Executes task
Status Store
Tracks progress
```
SCHEDULER (timing)
→ Evaluates cron expressions
→ Triggers jobs at the right time
→ Pushes job messages to the queue
→ Does NOT execute the job itself

COORDINATION LAYER (single execution)
→ Distributed lock (Redis SETNX / ZooKeeper)
→ Leader election (only leader triggers jobs)
→ Lease-based: lock expires if holder crashes
→ Prevents duplicate execution across instances

QUEUE (decoupling)
→ Buffers jobs between scheduler and workers
→ Handles backpressure (workers busy → jobs wait)
→ Provides at-least-once delivery with ack

WORKER (execution)
→ Picks up jobs from queue
→ Executes the actual task
→ Updates progress in status store
→ Acks the job on completion (or nacks on failure)

STATUS STORE (visibility)
→ Stores: job_id, status, progress %, logs, errors
→ Status API: GET /api/jobs/{id} → { status, progress, ... }
→ Enables UI progress bars, admin dashboards, alerting
```
Cron-Style Job Scheduling
Cron scheduling runs jobs at fixed times or intervals defined by a cron expression. It's the simplest and most widely used scheduling pattern — every operating system, every cloud provider, and most frameworks support it.
```
Format: minute hour day-of-month month day-of-week

Examples:
"0 * * * *"      → Every hour (at minute 0)
"0 0 * * *"      → Every day at midnight
"0 6 * * 1"      → Every Monday at 6:00 AM
"*/5 * * * *"    → Every 5 minutes
"0 0 1 * *"      → First day of every month at midnight
"0 9-17 * * 1-5" → Every hour from 9 AM to 5 PM, weekdays only

How it works internally:
1. Scheduler evaluates all registered cron expressions every minute
2. For each expression that matches the current time:
   → Create a job message
   → Push to the job queue
3. Worker picks up the job and executes it

Scheduler does NOT execute jobs — it only triggers them.
This separation allows scaling workers independently.
```
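As a concrete illustration of step 1 above, here is a minimal (and deliberately incomplete) cron-field matcher in Python. It handles `*`, step values, ranges, and comma lists, but not the name aliases or `L`/`W` extensions that real implementations like crontab or Quartz support.

```python
from datetime import datetime

def field_matches(field: str, value: int) -> bool:
    """Check one cron field ("*", "*/5", "0", "9-17", "1,15") against a value."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):              # step values, e.g. */5
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:                       # ranges, e.g. 9-17
            lo, hi = map(int, part.split("-"))
            if lo <= value <= hi:
                return True
        elif int(part) == value:                # exact match
            return True
    return False

def cron_matches(expr: str, now: datetime) -> bool:
    """True if a 5-field cron expression matches the given minute."""
    minute, hour, dom, month, dow = expr.split()
    return (field_matches(minute, now.minute)
            and field_matches(hour, now.hour)
            and field_matches(dom, now.day)
            and field_matches(month, now.month)
            and field_matches(dow, now.isoweekday() % 7))  # cron uses 0 = Sunday
```

A scheduler loop would call `cron_matches(expr, now)` once per minute for every registered expression, enqueuing a job message for each match.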
Real-World Use Cases
Daily Reports
Generate sales reports at midnight. Aggregate yesterday's data, build PDF, email to stakeholders. Runs once per day, takes 5-30 minutes.
Data Cleanup
Delete expired sessions every hour. Purge soft-deleted records after 30 days. Archive old logs. Keeps the database lean.
Health Checks
Ping all downstream services every minute. If a service is down, trigger an alert. Continuous monitoring without manual intervention.
Strengths
- ✅ Simple and universally understood
- ✅ Predictable — runs at exact times
- ✅ No external dependencies (built into OS/framework)
- ✅ Easy to audit (cron expression = schedule)
- ✅ Decades of battle-tested reliability
Limitations
- ❌ Not dynamic — can't schedule one-off jobs like 'run 30 minutes from now'
- ❌ Time drift — clock skew between servers causes inconsistency
- ❌ No built-in coordination — fires on every instance
- ❌ No backpressure — triggers even if the previous run isn't done
- ❌ Minimum granularity is typically 1 minute
🎯 Interview Insight
Cron is the foundation, but never use it alone in a distributed system. Say: "I'd use cron to define the schedule, but wrap it with a distributed lock so only one instance triggers the job. The job goes to a queue, and workers execute it. This gives me timing (cron) + coordination (lock) + reliability (queue)."
Distributed Task Coordination
In a distributed system with N instances, a cron job fires on all N simultaneously. Without coordination, the same job executes N times. Distributed coordination ensures each scheduled run is triggered only once — and, paired with idempotent workers, executed effectively once.
```
5 API server instances, each running the same cron:
"0 0 * * *" → Generate daily report

At midnight:
Instance 1: cron fires → generate report ← ✅
Instance 2: cron fires → generate report ← ❌ duplicate
Instance 3: cron fires → generate report ← ❌ duplicate
Instance 4: cron fires → generate report ← ❌ duplicate
Instance 5: cron fires → generate report ← ❌ duplicate

Result: 5 identical reports generated, 5 emails sent.
Users get 5 copies. Database does 5x the work.
```
Coordination Techniques
Distributed Lock (Redis SETNX)
Before executing, the instance tries to acquire a lock: SET lock:daily_report:2025-01-15 &lt;instance-id&gt; NX EX 300. Only one instance succeeds (NX = set only if the key does not exist). The winner executes the job. Others see that the lock exists and skip. The lock expires after 5 minutes (EX 300) in case the winner crashes.
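A sketch of these lock semantics, using a plain dict as a stand-in for Redis so the example is self-contained. With the real redis-py client the equivalent call is `r.set(key, owner, nx=True, ex=300)`; unlike this simulation, Redis makes the check-and-set atomic across processes.

```python
import time

def try_acquire(store: dict, key: str, owner: str, ttl_seconds: int) -> bool:
    """Simulate SET key owner NX EX ttl: succeed only if the key is absent
    or the previous holder's lock has expired."""
    now = time.time()
    entry = store.get(key)
    if entry is not None and entry["expires_at"] > now:
        return False                # lock held by another instance: skip the job
    store[key] = {"owner": owner, "expires_at": now + ttl_seconds}
    return True

# All five instances race for the same dated lock key:
locks: dict = {}
winners = [i for i in range(1, 6)
           if try_acquire(locks, "lock:daily_report:2025-01-15", f"instance-{i}", 300)]
# Only the first caller succeeds; the other four see a live lock and skip.
```

Storing the owner in the lock value makes debugging easier ("who holds the lock?") and, in a fuller implementation, lets the holder verify ownership before releasing.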
Leader Election
One instance is elected as the 'leader' (via ZooKeeper, etcd, or a database row). Only the leader runs the scheduler. Other instances are standby. If the leader crashes, a new leader is elected. This is simpler than per-job locking but has a single point of scheduling.
Lease-Based Execution
A job is 'leased' to a worker for a fixed duration (e.g., 10 minutes). If the worker completes within the lease, it marks the job done. If it crashes, the lease expires and another worker can pick it up. Combines lock + timeout + retry.
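The lease rules can be sketched as follows, assuming a shared job record. In production that record would live in a database row or Redis hash updated with a compare-and-set, not a local dict; `claim` and `LEASE_SECONDS` are illustrative names, not a real API.

```python
import time

LEASE_SECONDS = 600  # 10-minute lease, as in the description above

def claim(job: dict, worker_id: str, now: float) -> bool:
    """A worker may claim a job if it is unleased or the previous lease expired."""
    if job["leased_by"] is not None and now < job["lease_expires_at"]:
        return False                    # another worker still holds a live lease
    job["leased_by"] = worker_id
    job["lease_expires_at"] = now + LEASE_SECONDS
    return True

job = {"id": "job_1", "leased_by": None, "lease_expires_at": 0.0}
t0 = time.time()

assert claim(job, "worker-A", t0)             # worker-A claims the job
assert not claim(job, "worker-B", t0 + 60)    # live lease: worker-B must wait
# worker-A crashes without finishing; once the lease expires, B takes over:
assert claim(job, "worker-B", t0 + LEASE_SECONDS + 1)
```

Because worker-B may re-run steps worker-A already completed, the job body must be idempotent, exactly as the text notes.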
```
At midnight, all 5 instances try:

Instance 1: SET "lock:daily_report:2025-01-15" "instance-1" NX EX 300
→ Response: OK (lock acquired ✅)
→ Execute job → push to queue → release lock

Instance 2: SET "lock:daily_report:2025-01-15" "instance-2" NX EX 300
→ Response: nil (lock exists, someone else has it)
→ Skip execution ✅

Instances 3-5: same as Instance 2 → skip ✅

Key design:
→ Lock key includes the date: prevents re-running tomorrow's job today
→ NX: atomic set-if-not-exists (no race condition)
→ EX 300: lock expires in 5 minutes (crash recovery)
→ Value = instance ID (for debugging: who holds the lock?)

If Instance 1 crashes mid-execution:
→ Lock expires after 300 seconds
→ Next cron tick: another instance acquires the lock
→ Job retries (must be idempotent!)
```
| Technique | How It Works | Pros | Cons | Best For |
|---|---|---|---|---|
| Distributed Lock | SETNX per job execution | Simple, per-job granularity | Lock expiry tuning, Redis dependency | Most cron jobs, simple coordination |
| Leader Election | One instance is the scheduler | Simple logic, no per-job locks | Single point of scheduling, failover delay | Small clusters, few scheduled jobs |
| Lease-Based | Job leased with timeout | Handles crashes gracefully, auto-retry | More complex, needs idempotent jobs | Long-running jobs, unreliable workers |
🎯 Interview Insight
Distributed lock with Redis is the standard answer. Say: "Each instance tries to acquire a lock with SETNX before executing the cron job. Only one succeeds. The lock includes the job name and date to prevent re-execution. It has a TTL for crash recovery. The job must be idempotent in case of lock expiry and re-execution."
Progress Tracking & Status APIs
When a job takes minutes or hours, users and operators need visibility. A status API exposes the current state of every job — pending, running, progress percentage, completion, or failure with error details.
```
States:
PENDING   → Job created, waiting in queue
RUNNING   → Worker picked it up, executing
COMPLETED → Finished successfully
FAILED    → Failed (with error details)
RETRYING  → Failed, scheduled for retry

Transitions:
PENDING  → RUNNING   (worker picks up job)
RUNNING  → COMPLETED (success)
RUNNING  → FAILED    (error, max retries exceeded)
RUNNING  → RETRYING  (error, will retry)
RETRYING → RUNNING   (retry attempt starts)

Status API:
GET /api/jobs/job_abc123

Response:
{
  "id": "job_abc123",
  "type": "video_transcode",
  "status": "RUNNING",
  "progress": 65,
  "created_at": "2025-01-15T10:00:00Z",
  "started_at": "2025-01-15T10:00:05Z",
  "updated_at": "2025-01-15T10:03:22Z",
  "metadata": {
    "input": "video_456.mp4",
    "current_step": "Transcoding 720p variant",
    "steps_completed": 2,
    "steps_total": 4
  },
  "attempts": 1,
  "max_attempts": 3
}
```
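The transition table can also be enforced in code, so a buggy worker cannot, say, jump a job straight from PENDING to COMPLETED. A minimal sketch (`advance` is a hypothetical helper, not a standard API):

```python
# Allowed transitions, taken from the state machine above
TRANSITIONS = {
    "PENDING":   {"RUNNING"},
    "RUNNING":   {"COMPLETED", "FAILED", "RETRYING"},
    "RETRYING":  {"RUNNING"},
    "COMPLETED": set(),   # terminal
    "FAILED":    set(),   # terminal
}

def advance(job: dict, new_status: str) -> None:
    """Move a job to new_status, rejecting illegal jumps."""
    if new_status not in TRANSITIONS[job["status"]]:
        raise ValueError(f"illegal transition {job['status']} -> {new_status}")
    job["status"] = new_status
```

Centralizing transitions like this also gives you one place to attach timestamps, metrics, and alerting on state changes.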
How Workers Report Progress
```
function process_video_transcode(job):
    update_status(job.id, "RUNNING", progress=0)

    // Step 1: Download original
    update_status(job.id, "RUNNING", progress=10, step="Downloading original")
    download(job.input_url)

    // Step 2: Transcode 1080p
    update_status(job.id, "RUNNING", progress=30, step="Transcoding 1080p")
    transcode(input, "1080p")

    // Step 3: Transcode 720p
    update_status(job.id, "RUNNING", progress=55, step="Transcoding 720p")
    transcode(input, "720p")

    // Step 4: Transcode 480p + thumbnail
    update_status(job.id, "RUNNING", progress=80, step="Transcoding 480p")
    transcode(input, "480p")
    generate_thumbnail(input)

    // Step 5: Upload variants
    update_status(job.id, "RUNNING", progress=95, step="Uploading variants")
    upload_all_variants()

    update_status(job.id, "COMPLETED", progress=100)
```

Where status is stored:
→ Redis (fast writes, good for real-time progress)
→ Database (durable, good for audit trail)
→ Both: Redis for live progress, DB for permanent record
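The pseudocode leaves `update_status` undefined; here is one runnable sketch of it in Python, using an in-memory dict as a stand-in for Redis. In production the write would be a Redis `HSET` (or a database update), and `get_status` is roughly what the GET /api/jobs/{id} endpoint would serialize.

```python
import json
import time

# In-memory stand-in for Redis; production would use HSET job:{id} ... or a DB row
status_store: dict = {}

def update_status(job_id: str, status: str, progress: int = 0, step: str = "") -> None:
    """Record the latest job state; called by the worker at each step."""
    record = status_store.setdefault(job_id, {"id": job_id,
                                              "created_at": time.time()})
    record.update(status=status, progress=progress,
                  current_step=step, updated_at=time.time())

def get_status(job_id: str) -> str:
    """Roughly what GET /api/jobs/{job_id} would return as JSON."""
    return json.dumps(status_store[job_id])
```

Note the write is cheap and happens on the hot path of the job, which is why a fast store like Redis is the usual choice for live progress, with the database keeping the durable audit record.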
What to Track
- ✅ Job status (pending, running, completed, failed)
- ✅ Progress percentage (0-100%)
- ✅ Current step description ('Transcoding 720p')
- ✅ Timestamps (created, started, updated, completed)
- ✅ Attempt count and max attempts
- ✅ Error details on failure (message, stack trace for internal use)
How Clients Consume Status
- ✅ Polling: GET /api/jobs/{id} every 2-5 seconds
- ✅ Long polling: server holds the request until status changes
- ✅ WebSocket: server pushes updates in real-time
- ✅ Webhook: server calls the client's URL on completion
- ✅ SSE: server streams status updates over HTTP
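Of these, polling is the simplest to sketch. The loop below assumes a `fetch_job` callable standing in for the HTTP GET; the interval and timeout values are illustrative.

```python
import time

def poll_until_done(fetch_job, job_id: str,
                    interval_seconds: float = 3.0,
                    timeout_seconds: float = 300.0) -> dict:
    """Poll a status source until the job reaches a terminal state.

    fetch_job(job_id) stands in for GET /api/jobs/{id}; it should return a
    dict with at least a "status" key.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        job = fetch_job(job_id)
        if job["status"] in ("COMPLETED", "FAILED"):
            return job                     # terminal state: stop polling
        time.sleep(interval_seconds)       # wait before the next poll
    raise TimeoutError(f"job {job_id} still not finished after {timeout_seconds}s")
```

A real client would also render `progress` between polls and add jitter to the interval so thousands of clients don't poll in lockstep.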
🎯 Interview Insight
Progress tracking transforms UX. Say: "The worker updates progress in Redis as it completes each step. The client polls GET /api/jobs/{id} every 3 seconds to show a progress bar. On completion, we send a webhook to the client's callback URL. This gives users real-time visibility without blocking."
End-to-End Scenario
Let's design a job scheduling system for a platform that generates daily analytics reports and processes user-uploaded videos.
```
SCHEDULED JOBS (cron + coordination):

Job: Daily Analytics Report
Cron: "0 0 * * *" (midnight)

Flow:
1. Cron fires on all 5 instances at midnight
2. Each instance tries: SETNX "lock:analytics:2025-01-15" EX 300
3. Instance 3 wins the lock
4. Instance 3 pushes job to queue: { type: "analytics_report", date: "2025-01-15" }
5. Worker picks up job → queries data warehouse → builds report
6. Worker updates status: RUNNING → 30% → 60% → 90% → COMPLETED
7. Report stored in S3, link emailed to stakeholders

ON-DEMAND JOBS (user-triggered):

Job: Video Transcoding
Trigger: User uploads video

Flow:
1. Upload completes → API creates job:
   POST /api/jobs { type: "video_transcode", input: "s3://uploads/video_789.mp4" }
2. Job pushed to priority queue (user-facing = high priority)
3. Worker picks up → transcodes → updates progress every 10%
4. Client polls: GET /api/jobs/job_xyz → { status: "RUNNING", progress: 65 }
5. UI shows: "Processing your video... 65%"
6. Worker completes → status: COMPLETED → webhook fires
7. Client receives webhook → shows "Video ready!"

COORDINATION:
→ Scheduled jobs: Redis distributed lock (prevent duplicates)
→ On-demand jobs: queue handles coordination (each message consumed once)
→ Both: idempotent workers (safe to retry on failure)

MONITORING:
→ Dashboard: all jobs, status, duration, failure rate
→ Alerts: job stuck in RUNNING > 30 min, failure rate > 5%
→ Dead-letter queue: failed jobs after max retries → manual review
```
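The scheduled-job flow (cron firing on every instance, one lock winner, queue, worker, status store) can be condensed into a runnable toy using Python's stdlib `queue`. The dicts stand in for Redis, and there is no TTL or crash recovery here, unlike the real design.

```python
import queue

job_queue: queue.Queue = queue.Queue()
locks: dict = {}    # stand-in for Redis; real locks need SET ... NX EX for TTL
status: dict = {}   # stand-in for the status store

def cron_tick(instance_id: str, date: str) -> bool:
    """Runs on every instance at midnight; only the lock winner enqueues the job."""
    key = f"lock:analytics:{date}"
    if key in locks:
        return False                         # lock taken: skip silently
    locks[key] = instance_id
    job_queue.put({"type": "analytics_report", "date": date})
    return True

def worker() -> None:
    """Consume one job, reporting progress along the way."""
    job = job_queue.get()
    job_id = f"{job['type']}:{job['date']}"
    for pct in (30, 60, 90):
        status[job_id] = {"status": "RUNNING", "progress": pct}
    status[job_id] = {"status": "COMPLETED", "progress": 100}
    job_queue.task_done()

# All five instances tick at "midnight"; exactly one wins and enqueues:
fired = [cron_tick(f"instance-{i}", "2025-01-15") for i in range(1, 6)]
worker()
```

The dated lock key means tomorrow's tick uses a fresh key, so the job runs once per day even though every instance attempts it.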
💡 This Is How Production Systems Work
Airflow, Temporal, and Celery all implement this pattern: scheduler (timing) + queue (decoupling) + workers (execution) + status store (visibility) + coordination (single execution). The specific tools vary, but the architecture is universal.
Trade-offs & Decision Making
| Decision | Option A | Option B | Choose A When | Choose B When |
|---|---|---|---|---|
| Scheduling approach | Cron (fixed schedule) | Dynamic (run at arbitrary time) | Recurring jobs (daily reports, cleanup) | User-triggered (process in 30 min, retry at specific time) |
| Coordination | Distributed lock (per-job) | Leader election (single scheduler) | Many different jobs, independent schedules | Few jobs, simple setup, small cluster |
| Progress delivery | Polling (client pulls) | Push (WebSocket/webhook) | Simple, stateless, most use cases | Real-time UX needed, long-running jobs |
| Scheduler | In-app (library-based) | External (Airflow, Temporal) | Simple cron jobs, small team | Complex DAGs, dependencies, large team |
🔧 Simple Stack (most teams)
- Cron expression in app config
- Redis distributed lock for coordination
- SQS/RabbitMQ for job queue
- Redis for progress, PostgreSQL for audit
- Polling API for status
🏗️ Advanced Stack (large teams)
- Airflow / Temporal for orchestration
- DAG-based job dependencies
- Built-in retry, timeout, alerting
- UI dashboard for job management
- Webhook + SSE for real-time status
Interview Questions
Q: How does cron scheduling work in a distributed system?
A: Cron defines WHEN a job should run (e.g., '0 0 * * *' = midnight). In a distributed system with N instances, the cron fires on all N simultaneously. To prevent duplicate execution, wrap it with a distributed lock: each instance tries SETNX in Redis before executing. Only one succeeds. The winner pushes the job to a queue, and a worker executes it. The lock key includes the job name and date (e.g., 'lock:daily_report:2025-01-15') to prevent re-execution. TTL on the lock handles crash recovery.
Q: How do you prevent duplicate job execution?
A: Three approaches: (1) Distributed lock (Redis SETNX) — before executing, acquire a lock. Only one instance succeeds. Lock has TTL for crash recovery. (2) Leader election — one instance is the scheduler, others are standby. Only the leader triggers jobs. (3) Queue-based dedup — push the job to a queue with a deduplication ID. The queue ensures each message is delivered once. In all cases, workers should be idempotent — if a job accidentally runs twice (lock expired, leader failover), the result should be the same.
Q: How do you track job progress in a long-running task?
A: The worker updates a status store (Redis for real-time, DB for persistence) at each step: { job_id, status: 'RUNNING', progress: 65, step: 'Transcoding 720p' }. Clients consume this via: (1) Polling — GET /api/jobs/{id} every 3 seconds. (2) WebSocket — server pushes updates. (3) Webhook — server calls client's URL on completion. The status includes: state (pending/running/completed/failed), progress %, current step, timestamps, attempt count, and error details on failure.
You're designing a report generation system that runs daily for 10,000 tenants
How would you schedule and coordinate this?
Answer: (1) Cron triggers at midnight: push 10,000 job messages to a queue (one per tenant). (2) Distributed lock ensures only one instance triggers the batch. (3) Workers consume from the queue in parallel — each generates one tenant's report. (4) Each worker updates progress in Redis. (5) Status API: GET /api/reports/{tenant_id}/latest → { status, progress, download_url }. (6) On completion, webhook notifies the tenant. (7) Failed jobs go to a dead-letter queue for manual review. Size the worker pool for the deadline: at 5 min/report, 20 workers would need 500 × 5 min ≈ 42 hours, while 500 workers finish all 10,000 reports in about 100 minutes.
Common Pitfalls
Duplicate job execution
Cron fires on all 5 instances. No distributed lock. The daily report generates 5 times. 5 emails sent to every stakeholder. The database does 5x the aggregation work. Users lose trust in the system.
✅ Always wrap cron jobs with a distributed lock (Redis SETNX with TTL). The lock key should include the job name and execution date. Only the instance that acquires the lock triggers the job. All others skip silently.
Clock drift between servers
Server A's clock is 3 seconds ahead of Server B. Server A's cron fires first, acquires the lock, and runs the job. But sometimes Server B fires first due to NTP corrections. Jobs run at inconsistent times, and occasionally both fire within the lock's acquisition window.
✅ Use NTP to synchronize clocks across all servers. Set lock TTL longer than the maximum expected clock drift (e.g., 60 seconds). Use the queue as the source of truth for job execution — the scheduler only triggers, the queue ensures exactly-once delivery.
No visibility into job state
A video transcoding job is submitted. The user sees 'Processing...' for 20 minutes with no progress indicator. Is it 10% done? 90% done? Did it fail? The user refreshes, submits again (duplicate), and contacts support. Support has no way to check the job status either.
✅ Implement a status API from day one. Workers update progress at each step. The API returns: status, progress %, current step, timestamps, and error details. The UI shows a progress bar. Support can look up any job by ID. Alerts fire if a job is stuck in RUNNING for too long.
Poor failure handling
A worker crashes mid-execution. The job is marked as RUNNING forever — it never completes, never fails, never retries. The lock is held indefinitely (no TTL). The next scheduled run can't acquire the lock. The job never runs again.
✅ Use lease-based execution: locks have TTL (expire if not renewed). Workers send heartbeats to extend the lease. If a worker crashes, the lease expires and another worker can pick up the job. Set max execution time — if a job exceeds it, mark as FAILED and retry. Always have a dead-letter queue for jobs that fail after max retries.