Each layer is independently scalable and independently replaceable. The boundaries between them are where most "export feature" bugs actually live.
User clicks Export → API enqueues a job → returns instantly with a job ID. The HTTP request is done in < 50ms regardless of how big the export is. Use BullMQ on Redis (the modern successor to Bull). It gives you priorities, retries with exponential backoff, concurrency caps per worker, dead-letter handling, and Prometheus-friendly metrics out of the box.
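Here's a minimal sketch of those retry and priority knobs made concrete, assuming a local Redis; `exportQueue` is the queue the endpoint below adds jobs to, and the payload fields are placeholders:

```js
import { Queue } from 'bullmq';

const exportQueue = new Queue('exports', {
  connection: { host: 'localhost', port: 6379 },
  defaultJobOptions: {
    attempts: 5,                                   // retry failed jobs up to 5 times
    backoff: { type: 'exponential', delay: 1000 }, // 1s, 2s, 4s, ... between attempts
    removeOnComplete: { age: 86400 },              // keep completed jobs for 24h
  },
});

// Per-job priority: in BullMQ, a lower number means higher priority.
await exportQueue.add('csv', { userId, query, format: 'csv' }, { priority: 1 });
```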
Worker pulls rows from a DB cursor (Postgres: pg-cursor, Mongo: native cursor), pipes them through a CSV/JSONL transform (csv-stringify in object mode), pipes through gzip (zlib.createGzip()), pipes into an S3 multipart upload (@aws-sdk/lib-storage Upload class). Every layer is a Node stream — backpressure handles itself.
When the upload completes, generate an S3 presigned URL (24-hour expiry typical) and notify the user via the channel they expect — email (Resend / Postmark / SES), webhook, in-app notification, or websocket. User downloads from S3 directly. Your API never sees the bytes.
Concrete versions of each layer. Swap any piece for an equivalent in your stack — the architecture survives.
```js
// 1. API endpoint — instant ACK, returns job ID
app.post('/exports', async (req, res) => {
  const job = await exportQueue.add('csv', {
    userId: req.user.id,
    query: req.body.query,
    format: 'csv',
  });
  res.json({ jobId: job.id, status: 'queued' });
});

// 2. Worker (separate process) — streams the export
import { once } from 'events';
import { PassThrough } from 'stream';
import { createGzip } from 'zlib';
import { Worker } from 'bullmq';
import { Upload } from '@aws-sdk/lib-storage';
import { GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import Cursor from 'pg-cursor';
import { stringify } from 'csv-stringify';

new Worker('exports', async (job) => {
  const client = await pool.connect();
  try {
    const cursor = client.query(new Cursor(job.data.query));
    const body = new PassThrough();
    const upload = new Upload({
      client: s3,
      params: {
        Bucket: 'exports',
        Key: `${job.id}.csv.gz`,   // stable key per job ID → retries overwrite cleanly
        Body: body,
        ContentType: 'text/csv',
        ContentEncoding: 'gzip',
      },
      partSize: 5 * 1024 * 1024,   // 5 MB multipart chunks (S3 minimum)
    });
    const uploadPromise = upload.done();

    const csv = stringify({ header: true });
    csv.pipe(createGzip()).pipe(body);

    let rowsRead = 0;
    while (true) {
      const rows = await cursor.read(1000);
      if (!rows.length) break;
      for (const r of rows) {
        if (!csv.write(r)) await once(csv, 'drain'); // respect backpressure
      }
      rowsRead += rows.length;
      await job.updateProgress(rowsRead);
    }
    csv.end();            // flush CSV → gzip → S3
    await uploadPromise;  // resolves once the multipart upload completes
    await cursor.close();

    // 3. Deliver presigned URL (24-hour expiry)
    const url = await getSignedUrl(
      s3,
      new GetObjectCommand({ Bucket: 'exports', Key: `${job.id}.csv.gz` }),
      { expiresIn: 86400 }
    );
    await sendEmail(job.data.userId, url);
  } finally {
    client.release();     // return the connection even if the export throws
  }
}, { connection: redis, concurrency: 4 });
```
If you're already on AWS and don't want to manage Redis, the architecture is identical — every layer maps cleanly to a native AWS service. Same 3 layers, less infrastructure to babysit.
| Layer | BullMQ/Redis stack | AWS-native equivalent |
|---|---|---|
| Job queue | BullMQ on Redis | Amazon SQS (standard or FIFO) |
| Worker process | Always-on Node process (EC2 / ECS / Fargate) | AWS Lambda triggered by SQS (zero infra, pay per message) |
| Retries + DLQ | Manual via BullMQ retry options | SQS DLQ built-in (set maxReceiveCount) |
| Job state | Redis hash per job | DynamoDB table keyed by job ID (or just SQS message attributes) |
| Multi-stage orchestration | Chained BullMQ jobs (manual) | AWS Step Functions — declarative state machine, automatic retry/branching |
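A minimal sketch of the DLQ row above: attaching a redrive policy with the AWS SDK. It assumes the dead-letter queue already exists, and the env var names are placeholders.

```js
import { SQSClient, SetQueueAttributesCommand } from '@aws-sdk/client-sqs';

const sqs = new SQSClient({ region: 'us-east-1' });

// Redrive policy: after 5 failed receives, SQS moves the message
// to the dead-letter queue automatically — no retry code to write.
await sqs.send(new SetQueueAttributesCommand({
  QueueUrl: process.env.EXPORT_QUEUE_URL,
  Attributes: {
    RedrivePolicy: JSON.stringify({
      deadLetterTargetArn: process.env.EXPORT_DLQ_ARN, // pre-created DLQ
      maxReceiveCount: '5',
    }),
  },
}));
```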
```js
// 1. API endpoint — enqueues to SQS
import { randomUUID } from 'crypto';
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';

const sqs = new SQSClient({ region: 'us-east-1' });

app.post('/exports', async (req, res) => {
  const jobId = randomUUID();
  await sqs.send(new SendMessageCommand({
    QueueUrl: process.env.EXPORT_QUEUE_URL,
    MessageBody: JSON.stringify({ jobId, userId: req.user.id, query: req.body.query }),
  }));
  res.json({ jobId, status: 'queued' });
});

// 2. Lambda handler — triggered by SQS, streams the export
import { once } from 'events';
import { PassThrough } from 'stream';
import { createGzip } from 'zlib';
import { Upload } from '@aws-sdk/lib-storage';
import { GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import Cursor from 'pg-cursor';
import { stringify } from 'csv-stringify';

export const handler = async (event) => {
  for (const record of event.Records) {
    const { jobId, userId, query } = JSON.parse(record.body);
    const client = await pool.connect();
    try {
      const cursor = client.query(new Cursor(query));
      const body = new PassThrough();
      const upload = new Upload({
        client: s3,
        params: {
          Bucket: process.env.EXPORT_BUCKET,
          Key: `${jobId}.csv.gz`,
          Body: body,
          ContentType: 'text/csv',
          ContentEncoding: 'gzip',
        },
      });
      const uploadPromise = upload.done();

      const csv = stringify({ header: true });
      csv.pipe(createGzip()).pipe(body);

      while (true) {
        const rows = await cursor.read(1000);
        if (!rows.length) break;
        for (const r of rows) {
          if (!csv.write(r)) await once(csv, 'drain'); // respect backpressure
        }
      }
      csv.end();
      await uploadPromise;
      await cursor.close();

      // 3. Presigned URL → SES email
      const url = await getSignedUrl(
        s3,
        new GetObjectCommand({ Bucket: process.env.EXPORT_BUCKET, Key: `${jobId}.csv.gz` }),
        { expiresIn: 86400 }
      );
      await sendSesEmail(userId, url);
    } finally {
      client.release();
    }
  }
};
```
When to pick which stack:

- **BullMQ + Redis:** you already run Redis, want a job-monitor UI like Bull Board, need fine-grained concurrency control, or run multi-cloud / on-prem.
- **SQS + Lambda:** you're already on AWS, don't want Redis ops, jobs finish in under 15 minutes, and you want pay-per-use pricing.
Each of these anti-patterns looks reasonable until row count crosses a threshold; then everything breaks at once. Worth a checklist before you ship.

- **Building the payload with `JSON.stringify`.** Works perfectly at 10k rows. OOMs your worker at 1M. Fix: stream from a database cursor instead.
- **`OFFSET 1000000 LIMIT 1000` pagination.** Every page rescans all earlier rows, O(n²) total. The last page on a 10M-row table can take minutes. Fix: keyset pagination, `WHERE id > $lastId ORDER BY id LIMIT 1000`. Constant time per page (see the sketch after this list). Or just use a server-side cursor and skip pagination entirely.
- **Running exports in the API process.** One concurrent export pegs the event loop for everyone else's requests. p99 latency goes nonlinear. Fix: a separate worker process.
- **Serving downloads through the API.** Load balancer timeouts, broken retries, exhausted egress, a frontend that hangs for 12 minutes. Fix: presigned URLs, so users download from S3 directly.
- **Non-idempotent retries.** Worker crashes 80% of the way through. Job retries — restarts from row 1 with no record of progress, double-writes the partial file. Fix: stable S3 key per job ID plus checkpointed progress (covered in the FAQ below).
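A minimal sketch of that keyset loop. The `orders` table and `id` column are hypothetical; `pool` is the pg pool and `csv` the stringify stream from the worker above.

```js
import { once } from 'events';

// Keyset pagination: each query seeks straight to the next id range,
// so page N costs the same as page 1.
let lastId = 0;
while (true) {
  const { rows } = await pool.query(
    'SELECT * FROM orders WHERE id > $1 ORDER BY id LIMIT 1000',
    [lastId]
  );
  if (!rows.length) break;
  for (const r of rows) {
    if (!csv.write(r)) await once(csv, 'drain'); // respect backpressure
  }
  lastId = rows[rows.length - 1].id; // next page starts after the last seen id
}
```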
The capability layer is solved. Every cloud has S3-equivalent object storage, every language has streaming primitives, every queue library on Redis handles distributed work. The pieces are abundant.
The translation gap is wiring those pieces together correctly the first time. Most teams ship a synchronous export endpoint, hit a memory ceiling, then bolt on a queue. Then hit OFFSET slowness, bolt on cursors. Then hit egress costs, bolt on S3. Each retrofit is 2-4 weeks of work that the original architecture would have absorbed for free.
The 3-layer pattern is the operator-translation — turning "Node.js can do this" into "here's the wiring that works in production from day one." Worth getting right before row counts force the rewrite.
Capability is abundant. The differentiation is wiring.
Three layers, each doing one job — and the export problem is permanently solved.
The questions readers send most often after reading the architecture. Honest answers, both stacks covered.
**Should I use BullMQ + Redis or SQS + Lambda?**
If you're already running Redis, want a job-monitor UI like Bull Board, need fine-grained concurrency control, or are multi-cloud / on-prem — use BullMQ. If you're already on AWS, don't want Redis ops, jobs typically complete in under 15 minutes, and want pay-per-use cost — use SQS + Lambda. SQS + Lambda is the default for AWS-native shops because it has zero infrastructure to babysit and SQS DLQ is built in.
**Why does my CSV export run out of memory?**
Almost certainly because you're loading the full result set into memory then JSON.stringify-ing it. Works perfectly at 10K rows, OOMs at 1M. The fix is to stream from a database cursor (pg-cursor for Postgres, native cursor for MongoDB) through a CSV transform (csv-stringify in object mode), through gzip (zlib.createGzip), into an S3 multipart upload. Memory stays flat at 50-100MB regardless of row count because every layer is a Node stream and backpressure handles itself.
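The same pipeline can also be expressed with `stream.promises.pipeline`, which wires up backpressure and error propagation automatically. A minimal sketch, assuming `pool` and `s3` exist; `runExport` is a hypothetical wrapper, not part of any library:

```js
import { Readable, PassThrough } from 'stream';
import { pipeline } from 'stream/promises';
import { createGzip } from 'zlib';
import { Upload } from '@aws-sdk/lib-storage';
import Cursor from 'pg-cursor';
import { stringify } from 'csv-stringify';

// Adapt the batched cursor into an async iterable of rows.
async function* readRows(cursor) {
  while (true) {
    const batch = await cursor.read(1000);
    if (!batch.length) return;
    yield* batch;
  }
}

export async function runExport(query, bucket, key) {
  const client = await pool.connect();
  try {
    const cursor = client.query(new Cursor(query));
    const body = new PassThrough();
    const upload = new Upload({
      client: s3,
      params: { Bucket: bucket, Key: key, Body: body },
    });
    // pipeline() handles backpressure and tears everything down on error.
    await Promise.all([
      pipeline(Readable.from(readRows(cursor)), stringify({ header: true }), createGzip(), body),
      upload.done(),
    ]);
    await cursor.close();
  } finally {
    client.release();
  }
}
```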
**What about Lambda's 15-minute execution cap?**
Lambda has a hard 15-minute execution cap. For exports that may exceed that, route them to AWS Batch (containerized worker, no time limit, runs to completion) or break the export into stages with AWS Step Functions for declarative orchestration. For most "export this report" use cases, 15 minutes is plenty — 1M rows of CSV gzipped to S3 typically completes in 2-4 minutes from a well-indexed query. If your exports regularly approach the cap, the database query is usually the bottleneck (missing index, OFFSET pagination), not Lambda.
**Why is OFFSET pagination so slow for exports?**
OFFSET pagination at scale is O(n²) — every page rescans all earlier rows. The last page on a 10M-row table can take minutes to return. The fix is keyset pagination: WHERE id > $lastId ORDER BY id LIMIT 1000. Constant time per page regardless of how deep into the dataset you are. Even better for exports: skip pagination entirely and use a server-side cursor that streams rows continuously.
**Can the API process run exports itself?**
No. One concurrent export will peg the event loop for everyone else's requests, and your p99 latency goes nonlinear. Always separate the worker process from the API. Horizontal scaling becomes a matter of running more worker containers (or in AWS-native land, Lambda autoscales for you). The decoupling also means you can cap export concurrency without affecting API throughput, and a worker crash doesn't take down the API.
**How do I make export retries idempotent?**
Use a stable S3 object key per job ID — the same job retrying writes to the same S3 location, so retries overwrite cleanly instead of producing duplicate partial files. Persist progress to job state (BullMQ job.updateProgress, or DynamoDB for SQS+Lambda). For multi-hour exports, write checkpoints (last_processed_id) so a crashed worker can resume from the checkpoint instead of restarting from row 1.
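A sketch of the checkpoint bookkeeping only; resuming the output stream itself is out of scope here. It assumes BullMQ persists job progress on the job record across retries, and `pool`, `redis`, and the `orders` table are placeholders:

```js
import { Worker } from 'bullmq';

new Worker('exports', async (job) => {
  // On a retry, pick up where the last attempt checkpointed.
  let lastId = (typeof job.progress === 'object' && job.progress?.lastId) || 0;
  while (true) {
    const { rows } = await pool.query(
      'SELECT * FROM orders WHERE id > $1 ORDER BY id LIMIT 1000',
      [lastId]
    );
    if (!rows.length) break;
    // ...write rows into the CSV → gzip → S3 stream as in the worker above...
    lastId = rows[rows.length - 1].id;
    await job.updateProgress({ lastId }); // checkpoint after each batch
  }
}, { connection: redis });
```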
**What does the AWS-native stack cost?**
SQS + Lambda + S3 is the cheapest at-scale stack on AWS for export workloads under 15 minutes. SQS is pay-per-message (fractions of a cent per export). Lambda is pay-per-invocation + execution-time. S3 standard storage is cheap, and presigned URLs let users download directly without proxying through your infrastructure (which would burn data-egress costs). For a startup running a few hundred exports per day, the entire stack typically costs under $10/month.
**Why not serve the download through my API?**
Three reasons. (1) Load balancer timeouts — gigabyte downloads through your API will exceed the 30-60 second timeout most LBs enforce. (2) Broken retries — if the download fails partway, the user has to re-trigger the entire export instead of just resuming the download. (3) Egress costs — proxying gigabytes through your API doubles your data-transfer bill. S3 presigned URLs (24-hour expiry typical) let the user download directly from S3, which was built for exactly this pattern.
If you're sketching the queue + streaming + delivery pieces and want a second set of eyes on a specific shape — text the row count, DB, and current bottleneck. Most export problems collapse to two or three architectural decisions. No funnel, no SOW.
Text PJ · 858-461-8054

Don't see what you were looking for?
Text PJ a sentence about what you actually need — I'll build you a free custom shareable on the house. No email, no funnel, no SOW.
📲 Text PJ — free shareable