
Understanding Web Development Limits
In February 2025, a developer posted on the MongoDB Community Forums with the kind of message nobody wants to write: their entire production database was down. Not a slow query. Not a failed write. The whole service, offline.
The culprit was a single document that had quietly grown past MongoDB's 16MB BSON limit. The app had been running fine for months. Every deploy was clean. Every test passed. But somewhere in production, one document was silently accumulating data, an array getting longer with every user interaction, a nested object growing one field at a time. Nobody was monitoring document sizes because why would they? It worked yesterday.
Then one day it didn't. The document crossed 16MB, MongoDB threw a BSONObjectTooLarge error, and the service crashed. Not gracefully. Not with a helpful warning. It just went down.
This is what hitting a web development limit looks like. Not a dramatic explosion we can see coming. A slow, invisible creep toward a hard ceiling that nobody checked for, followed by a production outage that takes everyone by surprise.
Web development is full of these ceilings. They don't show up in tutorials. They barely get a footnote in the docs. And they love to reveal themselves at the worst possible moment. An API payload gets rejected at 10MB. A Lambda function chokes at 6MB. A browser refuses to store more than 5MB in LocalStorage. A server drops a request because the headers are too large. Every layer of the stack has limits baked in, and most of them are invisible until we've already crossed them.
Worse, some of the "rules" developers repeat to each other haven't been true for years. We'll call those out as we go.
Myths That Won't Die
Web development is full of numbers that get repeated long after they've stopped being accurate. Here are the ones this post corrects:

| The Myth | The Reality |
|---|---|
| URLs max out at 2,048 characters | That was an Internet Explorer limit. Chrome supports 2MB. The real bottleneck today is the web server (4KB-8KB default). |
| The V8 engine defaults to a 1.4GB heap | Since Node.js 12, V8 is container-aware and scales to ~50% of available memory, up to 2GB. |
| Browsers limit you to 6 concurrent connections | That's HTTP/1.1. HTTP/2 multiplexes requests over a single connection, making the limit irrelevant on modern servers. |
| LocalStorage gives you 5MB and that's fine for most things | 5MB is a ceiling, not a floor, and some browsers are more aggressive. IndexedDB offers gigabytes and was built for real local storage. |
| File uploads "just work" | Nginx's default upload limit is 1MB. One megabyte. A single iPhone photo won't make it through without explicit server configuration. |

If any of these surprised you, that's exactly the point. The limits are there whether we check for them or not.
Once we know where these walls are, we can stop running into them face-first. Let's walk through the limits that will eventually bite every web developer, and what to do when they do.
1. API Payload Size Limits
We'd all like to think sending data through an API is straightforward. And it is, right up until the payload gets too big and the gateway slams the door with a 413 Payload Too Large.
AWS API Gateway has a hard 10MB payload limit for both REST and HTTP APIs. Hard as in "no, we cannot call support and get it raised" hard. And if there's a Lambda function behind that gateway, the synchronous invocation payload caps at 6MB. That's the one that usually gets us first.
What to do about it:
Chunk the data. If we're sending a big dataset, paginate it across multiple requests. Compress payloads with gzip or Brotli before sending. Batch small operations into single requests to cut down on round trips. And for large files, skip the API entirely. Generate a presigned S3 URL, let the client upload directly to storage, and have a Lambda pick it up from there. The API Gateway never needs to touch the file.
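To make the chunk-and-compress advice concrete, here is a minimal sketch. The 6MB budget mirrors the Lambda synchronous limit mentioned above; the function names are made up for illustration, and the receiving side is assumed to be set up to decompress gzip bodies.

```python
import gzip
import json

# Assumption: a 6 MB budget, matching the Lambda synchronous limit in the text.
MAX_BYTES = 6 * 1024 * 1024


def batch_under_limit(items, max_bytes=MAX_BYTES):
    """Group items into batches whose JSON encoding stays under max_bytes."""
    batches, current, size = [], [], 2  # 2 bytes for the enclosing "[]"
    for item in items:
        # Item size plus the ", " separator json.dumps puts between elements.
        encoded = len(json.dumps(item).encode("utf-8")) + 2
        if encoded + 2 > max_bytes:
            raise ValueError("a single item exceeds the payload budget")
        if current and size + encoded > max_bytes:
            batches.append(current)
            current, size = [], 2
        current.append(item)
        size += encoded
    if current:
        batches.append(current)
    return batches


def compress(batch):
    """Gzip one batch; the receiver must know to decompress it."""
    return gzip.compress(json.dumps(batch).encode("utf-8"))
```

The size estimate deliberately overcounts by a couple of bytes per batch, so a batch never lands just over the limit. Compression only helps if the limit is enforced on the bytes actually sent, so it is worth checking which side of the gateway does the measuring.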
2. MongoDB Document Size Limits
MongoDB documents max out at 16MB (BSON), and if we read the intro, we already know how this ends. 16MB sounds like a lot until we start nesting objects three levels deep, stuffing arrays with thousands of entries, or (please don't) embedding base64 images directly in our documents. The danger isn't that we'll build something obviously too big. It's that documents grow slowly, invisibly, one push to an array at a time, until they cross the line.
What to do about it:
Normalize the data. If our instinct is to cram everything into one document "for convenience," future-us will regret it. Break related data into separate collections and link them. Use GridFS for binary data like images and files. And monitor document sizes proactively using $bsonSize in aggregation pipelines. Document bloat sneaks up one field at a time, and by the time we get that BSONObjectTooLarge error, it's already too late.
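As a rough sketch of the $bsonSize monitoring idea, the function below builds an aggregation pipeline that lists the largest documents in a collection. The 12MB warning threshold and the function name are assumptions for illustration, and $bsonSize requires MongoDB 4.4 or newer.

```python
# Assumption: warn at 12 MB, i.e. 75% of the 16 MB BSON limit.
WARN_BYTES = 12 * 1024 * 1024


def largest_documents_pipeline(threshold=WARN_BYTES, limit=10):
    """Build a pipeline listing documents at or above threshold bytes, largest first."""
    return [
        {"$project": {"size_bytes": {"$bsonSize": "$$ROOT"}}},
        {"$match": {"size_bytes": {"$gte": threshold}}},
        {"$sort": {"size_bytes": -1}},
        {"$limit": limit},
    ]


# With pymongo installed, this could run as (collection name is hypothetical):
#   for doc in db.user_profiles.aggregate(largest_documents_pipeline()):
#       print(doc["_id"], doc["size_bytes"])
```

Running something like this on a schedule, and alerting when it returns anything, turns the invisible creep into a visible one.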
3. Browser Storage Limitations
The browser gives us a few places to stash data locally, and every one of them comes with strings attached.
LocalStorage and SessionStorage cap out at roughly 5MB per origin in most browsers (some allow up to 10MB, but don't count on it). IndexedDB is far more generous, often allowing hundreds of megabytes to several gigabytes depending on the browser and available disk space. And cookies? About 4KB each, with most browsers allowing somewhere between 50 and 180 per domain.
What you've heard: 5MB of LocalStorage is plenty for caching data on the client.
What's actually true: 5MB is the ceiling, not a comfortable budget. Some browsers and modes are more restrictive. And LocalStorage is synchronous, blocks the main thread, and has no indexing or querying. IndexedDB offers gigabytes of space and was purpose-built for real client-side storage. If we're reaching for LocalStorage to store anything beyond a handful of preferences, we're using the wrong tool.
What to do about it:
For applications that need real local storage, use IndexedDB. It's more complex than LocalStorage, but it's built for it. Implement cleanup routines so we're not hoarding stale data forever. And for anything substantial, push it to the cloud. LocalStorage was never meant to be a database, even though we've all used it like one.
4. URL Length Restrictions
Here's a fun one. We've all probably heard that URLs max out at 2,048 characters. That number gets repeated everywhere.
What you've heard: URLs max out at 2,048 characters.
What's actually true: That number came from Internet Explorer, a browser that's been discontinued. Modern browsers are way more permissive: Chrome handles URLs up to 2MB, Firefox supports around 64KB, and Safari goes up to about 80,000 characters. The real bottleneck in 2026 is the server. Apache defaults to about 8KB for the request line and headers. Nginx is similar at 4KB-8KB. That's where our URLs will actually break.
What to do about it:
Stop putting entire application state in query strings. Use POST requests with proper request bodies for complex queries. Implement client-side state management instead of encoding everything in the URL. And if we genuinely need long URLs for a specific use case, check the web server configuration. That's almost certainly where the limit lives.
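One way to apply this is to measure the URL before sending and fall back to POST when it would not fit. The sketch below assumes a conservative 4KB budget (the low end of the server defaults above); plan_request is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlencode

# Assumption: a conservative 4 KB budget, the low end of the server defaults above.
MAX_URL_BYTES = 4 * 1024


def plan_request(base_url, params, max_url_bytes=MAX_URL_BYTES):
    """Return ("GET", url, None) if the URL fits, else ("POST", base_url, params)."""
    url = f"{base_url}?{urlencode(params, doseq=True)}"
    if len(url.encode("utf-8")) <= max_url_bytes:
        return ("GET", url, None)
    # Too long for the query string: send the parameters in the request body.
    return ("POST", base_url, params)
```

The server's limit usually applies to the whole request line, which also includes the method and HTTP version, so leaving some headroom under the configured value is a sensible precaution.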
5. Memory and Timeout Constraints
Every application runs inside a box, and that box has a finite amount of memory and patience.
What you've heard: The V8 engine defaults to about 1.4GB of heap on 64-bit systems.
What's actually true: That was accurate years ago, but modern V8 is smarter. Since Node.js 12, the engine has been container-aware: it reads cgroup limits and sets the heap to roughly 50% of available memory, up to a default maximum of about 2GB. We can still override it with --max-old-space-size if we need more, but the days of hard-coding memory flags for every deployment are mostly behind us.
One caveat worth knowing: the container-aware behavior broke in some environments when clusters upgraded to cgroup v2, causing Node.js to read host memory instead of container limits. That's been patched in newer versions, but if we're running older Node images on cgroup v2, it's worth verifying that V8 is actually seeing the right memory ceiling. A quick check of v8.getHeapStatistics().heap_size_limit against the container's cgroup limit will confirm.
Timeouts are the other silent killer. Most APIs enforce a 30-second limit for synchronous requests. AWS API Gateway defaults to 29 seconds (yes, 29, not 30). Since mid-2024, we can request a higher timeout for Regional and private REST APIs, but it comes at the cost of reduced throttle quota. Lambda itself can run for up to 15 minutes, but if we're behind API Gateway with the default timeout, that doesn't help.
What to do about it:
Clean up after ourselves. Dereference objects we're done with. Break long-running work into smaller async chunks. Use streams for large data processing so we're not loading entire files into memory. And for anything that takes more than a few seconds, move it to a background job and give the client a webhook or a polling endpoint.
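The streaming advice is easiest to see with a small example. This sketch hashes a binary stream one chunk at a time, so memory use stays near the chunk size rather than the file size; the 64KB chunk size is an arbitrary assumption.

```python
import hashlib
import io


def sha256_stream(stream, chunk_size=64 * 1024):
    """Hash a binary stream chunk by chunk instead of reading it all at once."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        digest.update(chunk)
    return digest.hexdigest()


# In-memory stand-in for a large file; a real caller would pass open(path, "rb").
digest = sha256_stream(io.BytesIO(b"x" * 1_000_000))
```

The same read-a-chunk, process, discard loop applies to parsing, uploading, or transforming large files.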
6. Browser Connection Limits
What you've heard: Browsers limit concurrent connections to about 6 per domain.
What's actually true: That's an HTTP/1.1 constraint. HTTP/2 multiplexes many requests over a single connection, so the old 6-connection ceiling is basically irrelevant if the server supports it. And in 2026, most servers and CDNs do.
If we're firing off 20 requests at once under HTTP/1.1, 14 of them are sitting in a queue, waiting their turn. With HTTP/2, they all go through.
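The arithmetic behind that queue, as a deliberately simplified model: it assumes every request takes about the same time, so requests proceed in waves of at most 6. Real browsers schedule more dynamically, and http1_waves is a made-up name.

```python
import math


def http1_waves(requests, max_connections=6):
    """Return (in_flight, queued, waves) under a fixed per-domain connection cap."""
    in_flight = min(requests, max_connections)
    queued = requests - in_flight
    waves = math.ceil(requests / max_connections)
    return in_flight, queued, waves


# 20 requests over 6 connections: 6 start immediately, 14 wait, 4 waves in total.
print(http1_waves(20))
```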
What to do about it:
Make sure the server supports HTTP/2 (it probably already does). If we're stuck on HTTP/1.1 for some reason, batch requests and lazy-load non-critical resources. Use a CDN, which naturally distributes traffic across edge servers. But honestly, if we're still fighting the 6-connection limit in production, the fix might just be enabling HTTP/2.
7. Cache Storage Limits
Browser caches aren't infinite, and mobile devices can be especially aggressive about clearing them when storage gets tight. The Cache API (used by Service Workers) shares a storage quota with IndexedDB and other origin-level storage, typically capped at a percentage of available disk space.
What to do about it:
Cache what matters and let go of the rest. Implement a stale-while-revalidate strategy so the app stays fast without hoarding stale assets. And don't assume cached data will always be there. Our Service Workers should handle cache misses gracefully, because on a budget Android phone, the browser will evict our cache without a second thought.
8. File Upload Size Restrictions
This one catches people because the defaults are so much lower than what we'd expect.
What you've heard: File uploads just work if the server is running.
What's actually true: Nginx's default client_max_body_size is 1MB. One megabyte. A single iPhone photo won't make it through. PHP's default upload_max_filesize is 2MB, a cap that often gets attributed to Apache. Various hosting platforms set their own caps on top of that. The theoretical ceiling for most servers is around 2GB (32-bit integer constraints), but the actual limit is whatever we've configured. And if we haven't configured it, the answer is "not much."
What to do about it:
First, explicitly set upload limits in the server config. Don't rely on defaults. For large files, implement chunked uploads so files get broken into smaller pieces and can resume if the connection drops halfway through. Libraries like tus and Uppy make this surprisingly painless.
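The core idea behind chunked, resumable uploads fits in a few lines. This is only a sketch of the splitting and resuming logic, not the tus protocol or Uppy's API; iter_chunks and its parameters are hypothetical.

```python
def iter_chunks(data, chunk_size, start_offset=0):
    """Yield (offset, chunk) pairs, starting at start_offset to resume an upload."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(start_offset, len(data), chunk_size):
        yield offset, data[offset:offset + chunk_size]
```

After a dropped connection, the client asks the server how many bytes it already has and passes that number as start_offset, so only the remaining chunks are sent again.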
9. Database Connection Limits
Every database has a ceiling on concurrent connections, and cloud-hosted databases make that ceiling very real. MongoDB Atlas free-tier clusters cap at around 500 connections. Managed PostgreSQL services vary by tier. Exceed the limit and the app starts throwing connection errors at our users.
This is one of those limits that bites us specifically when things are going well. The app gets popular, traffic spikes, and suddenly every request is fighting for a database connection. The errors look intermittent and random, which makes them incredibly hard to debug if we don't know what we're looking for.
What to do about it:
Connection pooling. Always. Reuse connections instead of spinning up a new one for every request. Use read replicas to spread the load for read-heavy workloads. Monitor connection count in production, because pool exhaustion tends to manifest as intermittent, hard-to-reproduce failures that will drive us absolutely crazy.
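Most database drivers ship their own pool, and that is what we should use in practice. Purely to illustrate what bounding connections means, here is a minimal sketch of a pool that reuses idle connections and never creates more than max_size. BoundedPool and the connect factory are hypothetical names.

```python
import queue
import threading
from contextlib import contextmanager


class BoundedPool:
    """A minimal pool that never has more than max_size connections checked out."""

    def __init__(self, connect, max_size):
        self._connect = connect                      # hypothetical connection factory
        self._idle = queue.LifoQueue()               # connections waiting for reuse
        self._slots = threading.BoundedSemaphore(max_size)
        self._lock = threading.Lock()
        self.created = 0

    @contextmanager
    def connection(self, timeout=5.0):
        # Wait for a free slot instead of opening connection number max_size + 1.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("connection pool exhausted")
        try:
            try:
                conn = self._idle.get_nowait()       # reuse an idle connection
            except queue.Empty:
                conn = self._connect()               # or create one, still under the cap
                with self._lock:
                    self.created += 1
            try:
                yield conn
            finally:
                self._idle.put(conn)                 # hand it back for reuse
        finally:
            self._slots.release()
```

A caller would write something like `with pool.connection() as conn:` around each query. A production pool would also check that a returned connection is still healthy, which this sketch skips.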
10. TCP Ephemeral Port Limits
This one is arguably the most invisible limit on this entire list. When our application makes a TCP connection to a server, the OS assigns a local port on our side of the connection. We never see it, never configure it, and never think about it. These are called ephemeral ports, and on Linux, the default range is 32768 to 60999, giving us roughly 28,000 of them. That sounds like plenty until it isn't.
In April 2026, Bluesky went down because of this exact limit. Their data plane service talked to memcached, and a new internal service started sending batch requests containing 15,000-20,000 URIs each. For each URI, the service spawned a goroutine that opened a new TCP connection to memcached. The connection pool only held 1,000 idle connections, so 14,000+ new connections had to be created per batch. Each one consumed an ephemeral port. And here's where it gets worse: after a connection closes, the port doesn't become available immediately. It enters a TCP state called TIME_WAIT for about 60 seconds. Two waves of these batch requests back to back, and the entire ephemeral port range was exhausted. No new connections could be made. The service went down. The traffic was only 3 requests per second. It wasn't volume that killed them. It was batch size.
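The numbers in that story are worth working through. The sketch below is a simplified model: it assumes each URI beyond the 1,000 pooled connections costs exactly one ephemeral port, that both batches land inside a single TIME_WAIT window, and it ignores ports used by anything else on the host.

```python
PORT_LOW, PORT_HIGH = 32768, 60999  # Linux default ephemeral range from the text


def ports_available(low=PORT_LOW, high=PORT_HIGH):
    """Number of ports in an inclusive range."""
    return high - low + 1


def ports_consumed(batch_size, pooled, batches):
    """Ports used by batches arriving within one TIME_WAIT window (about 60 seconds)."""
    return max(batch_size - pooled, 0) * batches


print(ports_available())                 # 28232
print(ports_consumed(15_000, 1_000, 2))  # 28000, nearly the whole range
print(ports_consumed(20_000, 1_000, 2))  # 38000, well past it
```

Even at the low end of the batch sizes, two back-to-back batches leave only a couple hundred ports for everything else.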
What to do about it:
Cap concurrency. If we're fanning out thousands of concurrent connections from a single process, we need explicit limits on how many can be active at once (Go's errgroup with SetLimit exists for exactly this reason). Use connection pools that actually bound the number of concurrent connections, not just idle ones. If we're making heavy use of localhost connections, consider SO_REUSEADDR or tuning net.ipv4.ip_local_port_range and net.ipv4.tcp_tw_reuse. And monitor ephemeral port usage in production, because by the time we see "cannot assign requested address" (or "bind: address already in use") in the logs, we're already in an outage.
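The text mentions Go's errgroup with SetLimit; the same idea in Python, swapped in here for illustration, is an asyncio.Semaphore. This is a minimal sketch with a hypothetical gather_limited helper, not a drop-in replacement for a real client library.

```python
import asyncio


async def gather_limited(coroutine_factories, limit):
    """Run zero-argument coroutine factories with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run(factory):
        async with semaphore:  # wait here until one of the `limit` slots frees up
            return await factory()

    return await asyncio.gather(*(run(f) for f in coroutine_factories))
```

With a limit of 100, a batch of 20,000 lookups never holds more than 100 connections open at once, no matter how large the batch gets.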
11. WebAssembly Memory Constraints
WebAssembly memory is allocated in 64KB pages, with a theoretical max of 4GB (the 32-bit address space limit). In practice, browser implementations may cap it lower, and mobile devices will hit physical memory limits well before reaching that ceiling.
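The page arithmetic is simple enough to sketch. This assumes 32-bit Wasm memory as described above; the helper name is made up.

```python
WASM_PAGE_BYTES = 64 * 1024        # one Wasm page is 64 KB
WASM32_MAX_BYTES = 4 * 1024 ** 3   # 4 GB, the 32-bit address space ceiling
WASM32_MAX_PAGES = WASM32_MAX_BYTES // WASM_PAGE_BYTES


def pages_needed(byte_count):
    """Round a byte count up to whole 64 KB pages, rejecting anything over 4 GB."""
    pages = (byte_count + WASM_PAGE_BYTES - 1) // WASM_PAGE_BYTES
    if pages > WASM32_MAX_PAGES:
        raise ValueError("exceeds the 32-bit Wasm memory ceiling")
    return pages


print(WASM32_MAX_PAGES)                  # 65536
print(pages_needed(256 * 1024 * 1024))   # 4096 pages for a 256 MB buffer
```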
What to do about it:
Be deliberate about memory allocation and deallocation in Wasm modules. Use streaming compilation to avoid loading everything upfront. Profile the application to find memory bottlenecks early. If we're pushing Wasm to its limits, we're probably doing something interesting, but that also means we need to be careful.
12. LLM API Token Limits
If we're building anything that touches a large language model, token limits are part of the job now. These numbers shift constantly as new models drop, but here's where things stand:
OpenAI's GPT-4.1 family supports up to 1 million tokens of context. The older GPT-4o handled 128K tokens. Legacy models like GPT-3.5-turbo (4K-16K tokens) are effectively deprecated at this point. Anthropic's Claude models support up to 200K tokens. Embedding models typically handle around 8K tokens per request.
What to do about it:
Chunk text intelligently. Use sliding windows for overlapping context when we need continuity across chunks. Summarize long documents before feeding them to the model. Count tokens before sending requests (libraries like tiktoken make this easy). And build the architecture to be model-agnostic, because the numbers in this paragraph will probably be outdated within six months. Always check the provider's docs for current limits.
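Here is a minimal sketch of sliding-window chunking. It counts whitespace-separated words as a stand-in for tokens, which is an assumption: real token counts come from the model's tokenizer (tiktoken, for OpenAI models) and will differ.

```python
def count_tokens(text):
    """Stand-in tokenizer (assumption): whitespace words, not real model tokens."""
    return len(text.split())


def sliding_window_chunks(text, max_tokens, overlap):
    """Split text into chunks of up to max_tokens words, sharing `overlap` words."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be >= 0 and smaller than max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # this chunk already reached the end of the text
    return chunks
```

The overlap means the end of one chunk is repeated at the start of the next, which helps the model keep continuity across chunk boundaries at the cost of some duplicated tokens.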
13. HTTP Header Size Limits
This one's sneaky. Most of us never think about header sizes until we're debugging a 431 Request Header Fields Too Large error at 4 PM on a Friday.
Nginx defaults to 4KB-8KB for headers. Apache allows about 8KB. CDNs enforce similar limits. The usual culprits? Bloated cookies, long JWT tokens, excessive custom headers, or a combination of all three.
What to do about it:
Keep cookies lean. Don't store entire session objects in them. Use shorter token formats where possible. If we need to pass a large amount of metadata with a request, move it into the request body instead of cramming it into headers. And if we're using JWTs, audit the claims. We'd be surprised how much unnecessary data ends up in tokens over time.
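To audit headers before they become a 431, we can estimate their size and find the biggest offenders. This sketch assumes an 8KB budget (the upper end of the defaults above) and approximates the HTTP/1.1 wire format; the helper names are hypothetical.

```python
# Assumption: an 8 KB budget, the upper end of the defaults mentioned above.
MAX_HEADER_BYTES = 8 * 1024


def _line_size(name, value):
    # One header line on the wire: "Name: value" followed by CRLF.
    return len(f"{name}: {value}\r\n".encode("utf-8"))


def header_bytes(headers):
    """Approximate total size in bytes of a dict of request headers."""
    return sum(_line_size(n, v) for n, v in headers.items())


def largest_headers(headers, top=3):
    """Names of the biggest headers, largest first."""
    ranked = sorted(headers, key=lambda n: _line_size(n, headers[n]), reverse=True)
    return ranked[:top]
```

Logging header_bytes for a sample of requests in production is a cheap way to notice cookies or tokens drifting toward the limit.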
Quick Reference
| Limit Type | Constraint | Solution Strategy |
|---|---|---|
| API Payload Size | 10MB (API Gateway), 6MB (Lambda sync) | Chunking, compression, presigned URLs |
| MongoDB Document | 16MB per document (BSON) | Schema normalization, GridFS for binaries |
| Browser Storage | ~5-10MB (LocalStorage), GBs (IndexedDB) | Use IndexedDB, implement cleanup routines |
| URL Length | 2MB (Chrome), 4KB-8KB (servers) | POST requests, state management, server config |
| Memory & Timeouts | ~2GB default V8 heap max, 29s API Gateway default | Async operations, streaming, memory cleanup |
| Browser Connections | 6 per domain (HTTP/1.1), moot with HTTP/2 | Upgrade to HTTP/2, request batching, CDNs |
| Cache Storage | Device and quota dependent | Focus on critical data, cache strategies |
| File Uploads | 1MB (Nginx default), configurable | File chunking, explicit server configuration |
| Database Connections | Service and tier-specific limits | Connection pooling, read replicas |
| Ephemeral Ports | ~28K (Linux default range) | Cap concurrency, tune TIME_WAIT, monitor usage |
| WebAssembly Memory | 64KB pages, up to 4GB | Efficient allocation, streaming compilation |
| LLM API Tokens | 128K-1M+ tokens (model-dependent) | Text chunking, sliding windows, summarization |
| HTTP Headers | 4KB-8KB (typical server defaults) | Lean cookies, shorter tokens, body over headers |
The Pattern
Every one of these limits follows the same arc. Something works fine in development. It works fine in staging. It works fine in production for weeks or months. And then one day, the data gets a little bigger, the traffic gets a little higher, or a document grows one field too many, and the whole thing falls over.
That MongoDB developer's documents grew slowly over months before crashing the service. Bluesky's ephemeral ports ran out not because of high traffic, but because one endpoint sent large batches that nobody thought to cap. Nobody wakes up planning to hit a limit. But the limits are there whether we plan for them or not. The 16MB BSON ceiling doesn't care that our schema looked reasonable when we designed it. The 28,000 ephemeral ports don't care that our request rate is only 3 per second. The 10MB API Gateway cap doesn't care that our payload was 9MB last week. The 5MB LocalStorage limit doesn't care that we only meant to cache "a few things."
The best architectures aren't the ones that ignore these constraints. They're the ones that respect them from day one, monitor for them in production, and have a plan for when things get close to the edge. Because they always do.
Have you run into a limit that made you question your career choices? Drop it in the comments. Every war story helps someone else avoid the same 2 AM debugging session.
References
- MongoDB BSONObjectTooLarge Production Crash (Feb 2025)
- AWS API Gateway Quotas
- AWS Lambda Payload Limits
- AWS API Gateway Integration Timeout Increase (June 2024)
- MongoDB BSON Document Size Limit
- MDN Web Storage API
- MDN IndexedDB API
- Chromium URL Display Guidelines
- Node.js 20+ Memory Management in Containers (Red Hat)
- Node.js cgroup Memory Limits (PR #27508)
- Node.js cgroup v2 Memory Detection Issue (#47259)
- V8 Heap Size Limit
- Nginx client_max_body_size Directive
- Bluesky April 2026 Outage: Ephemeral Port Exhaustion (Surfing Complexity)
- WebAssembly Memory (MDN)
- OpenAI GPT-4.1 Announcement
- Anthropic Claude Models