Chunked Uploads, Persistent Sessions

I was wiring up a file upload system in Node.js — not the flashy, frontend kind with progress bars and drag-and-drop zones — but the backend guts. The part that takes a stream of chunks, holds onto them safely, and eventually stitches them together into the final file.

I didn’t want to build an HTTP server. I didn’t want a frontend client. I just needed a Node.js class that could handle uploads in pieces — reliably, resumably, and testably.

Turns out, that second word — resumably — is where things got tricky.


Why Chunked Uploads?

Let’s set the stage. Chunked uploads matter when:

  • Files are large (video, images, backups)
  • Network stability is unreliable
  • You want to support pause/resume functionality

I wanted a simple API:

startUpload(id, filename);
appendChunk(id, chunkIndex, data);
completeUpload(id);
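
For context, a minimal in-memory version of that first API might look like the sketch below. The ChunkManager name, the uploadDir layout, and the .part naming are my illustration of the idea, not the exact implementation:

import * as fs from "fs";
import * as path from "path";

class ChunkManager {
  private uploadDir = "./uploads";
  private filenames = new Map<string, string>();

  startUpload(id: string, filename: string): void {
    this.filenames.set(id, filename);
    // One directory per upload holds that upload's chunks
    fs.mkdirSync(path.join(this.uploadDir, id), { recursive: true });
  }

  appendChunk(id: string, chunkIndex: number, data: Buffer): void {
    // Each chunk lands in its own numbered file, so arrival order doesn't matter
    fs.writeFileSync(path.join(this.uploadDir, id, `${chunkIndex}.part`), data);
  }

  completeUpload(id: string): void {
    // Stitch the chunks back together in index order
    const dir = path.join(this.uploadDir, id);
    const parts = fs
      .readdirSync(dir)
      .filter((f) => f.endsWith(".part"))
      .sort((a, b) => parseInt(a, 10) - parseInt(b, 10));
    const buffers = parts.map((f) => fs.readFileSync(path.join(dir, f)));
    fs.writeFileSync(
      path.join(this.uploadDir, this.filenames.get(id) ?? id),
      Buffer.concat(buffers)
    );
  }
}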

Later, I added a session-based version with metadata and visibility support:

startSession(id, totalChunks, originalName, targetPath, visibility);
receiveChunk(id, chunkIndex, chunk);
finalizeUpload(id);
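
Behind that API, each session carries enough state to resume. The exact fields are my guess at a minimal shape, chosen to line up with the metadata file shown later:

interface UploadSession {
  id: string;
  originalName: string;
  totalChunks: number;
  receivedChunks: Set<number>; // indices of chunks that have arrived
  targetPath: string;          // where the finalized file should land
  visibility: "public" | "private";
}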

Two APIs, one underlying system. It worked. Until I restarted the process.


The Problem: Memory Is Not Persistence

All session state lived in memory:

private sessions = new Map<string, UploadSession>();

So if the process crashed or restarted, it lost track of which chunks had been received. It couldn’t resume an upload. The actual chunk files were still on disk, but without the metadata they were just orphaned data.

It hit me: resumability isn’t just about chunk files — it’s about remembering the upload state.

That meant one thing: I had to persist session metadata to disk.


The Fix: Metadata Files

Every session now gets a metadata.json file alongside its chunks:

{
  "id": "upload123",
  "originalName": "photo.jpg",
  "totalChunks": 3,
  "receivedChunks": [0, 1],
  "targetPath": "/storage/uploads/photo.jpg",
  "visibility": "public"
}
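
Writing that file out is a small serialization step. Continuing the ChunkManager sketch from above (persistSession is a hypothetical helper name):

private persistSession(session: UploadSession): void {
  const meta = {
    ...session,
    // JSON can't represent a Set, so store the indices as a sorted array
    receivedChunks: [...session.receivedChunks].sort((a, b) => a - b),
  };
  fs.writeFileSync(
    path.join(this.uploadDir, session.id, "metadata.json"),
    JSON.stringify(meta, null, 2)
  );
}

Calling this after every received chunk keeps the on-disk record in step with memory, at the cost of one small write per chunk.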

When the server starts, I call:

loadSessionsFromDisk();

It crawls the upload directory, finds metadata files, and reconstructs the sessions map. Just like before — but now durable.
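
A sketch of that recovery pass, assuming the per-session directory layout from the earlier sketches:

loadSessionsFromDisk(): void {
  if (!fs.existsSync(this.uploadDir)) return;
  for (const entry of fs.readdirSync(this.uploadDir)) {
    const metaPath = path.join(this.uploadDir, entry, "metadata.json");
    if (!fs.existsSync(metaPath)) continue; // not a session directory
    const meta = JSON.parse(fs.readFileSync(metaPath, "utf8"));
    this.sessions.set(meta.id, {
      ...meta,
      // Turn the persisted array back into the in-memory Set
      receivedChunks: new Set<number>(meta.receivedChunks),
    });
  }
}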

This one change made everything more robust. Unexpected crash? No problem. The upload picks up right where it left off.


Logging Progress

While I was at it, I added a tiny improvement: progress reporting.

getProgress(sessionId); // → 66.6%

This was especially helpful in tests (and will be even more useful in real UIs). It’s based on the receivedChunks.size / totalChunks ratio.
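
As a sketch, the whole method is just that ratio plus a lookup:

getProgress(sessionId: string): number {
  const session = this.sessions.get(sessionId);
  if (!session) throw new Error(`Unknown session: ${sessionId}`);
  // Fraction of expected chunks that have arrived, as a percentage
  return (session.receivedChunks.size / session.totalChunks) * 100;
}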


Testing the Whole Flow

I didn’t want to trust myself to remember how this worked in 6 months. So I wrote unit tests:

  • ✅ Session creation
  • ✅ Chunk appending
  • ✅ Upload completion
  • ✅ Progress tracking
  • ✅ Session recovery after restart

One test even manually simulates a restart:

// Start session, write chunks
// Simulate restart: new ChunkManager()
// Call loadSessionsFromDisk()
// Finalize and assert output file exists
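
Fleshed out, that restart test might look something like this. I’m using Node’s built-in test runner as an illustration (the real suite may use a different framework), and the import path is hypothetical:

import { test } from "node:test";
import assert from "node:assert";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import { ChunkManager } from "./ChunkManager"; // illustrative path

test("recovers a session after a restart", () => {
  const target = path.join(os.tmpdir(), "photo.jpg");

  // Start a session and receive both chunks
  const first = new ChunkManager();
  first.startSession("upload123", 2, "photo.jpg", target, "public");
  first.receiveChunk("upload123", 0, Buffer.from("hello "));
  first.receiveChunk("upload123", 1, Buffer.from("world"));

  // Simulate a restart: a fresh instance with empty in-memory state
  const second = new ChunkManager();
  second.loadSessionsFromDisk();

  // The recovered session knows both chunks arrived, so it can finalize
  second.finalizeUpload("upload123");
  assert.ok(fs.existsSync(target));
});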

What I Learned

  1. State is the hard part. Writing files is easy. Remembering what’s been written — and why — is where the real design lives.
  2. Tests give you courage. I could refactor, extend, and restart with confidence because tests had my back.
  3. Resumability isn’t magic. It’s just careful persistence, plus a willingness to think through failure modes.

What’s Next?

  • Support for retrying failed chunks
  • Expiring old sessions
  • Moving files to cloud storage after finalization
  • Exposing this to a real client or CLI

If you’re building file upload systems in Node, I hope this helps you skip the mental potholes I hit. And if you’re like me — building to understand — I hope this gave you a little more clarity than you had before.

As always, I’m still learning. If you’ve built something like this before, I’d love to hear how you approached it.

Did I make a mistake? Please consider sending me an email.