Non-Functional Requirements
Don't draw boxes until you know what the system demands. For each NFR this doc covers what it means, how the answer changes your architecture layer by layer, key terms, and which real systems make it their top priority. Pick the ones most relevant to the system and let them drive your design.
Scale
How big is the system, and where does the load actually hit? Scale affects every layer — not just the database.
Ask:
- How many daily active users?
- What's the read/write ratio?
- Any bursty traffic patterns (holidays, events)?
DAU → QPS: Use 100,000 seconds/day (rounded up from 86,400) for easy mental math. QPS = DAU × requests_per_user_per_day ÷ 100,000. Peak ≈ 2–3× average. The table below assumes roughly 10 requests per user per day. See Capacity Estimation for worked examples and per-technology limits.
| DAU | QPS (est.) | Key Architectural Decisions |
|---|---|---|
| 10K | ~1 | Single server. No LB, replicas, or cache needed. |
| 100K | ~10 | Add LB for redundancy (not load). CDN for static assets. |
| 1M | ~100 | Multiple app servers. DB read replicas (1–2). Connection pooler. |
| 10M | ~1,000 | Kafka for async writes. Redis Cluster. Read replicas sufficient — don't shard yet. |
| 100M+ | ~10,000+ | Multi-region. DB sharding or distributed SQL. CDN absorbs 80%+ of traffic. |
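The DAU → QPS shortcut can be checked with a few lines of Python. This is a minimal sketch: the function name `estimate_qps`, the 10 requests per user per day, and the 3× peak factor are illustrative assumptions, not measurements from a real system.

```python
SECONDS_PER_DAY_APPROX = 100_000  # rounded up from 86,400 for mental math


def estimate_qps(dau: int, requests_per_user_per_day: float,
                 peak_factor: float = 3.0) -> tuple[float, float]:
    """Return (average QPS, peak QPS) using the 100,000 seconds/day shortcut."""
    avg = dau * requests_per_user_per_day / SECONDS_PER_DAY_APPROX
    return avg, avg * peak_factor


# Assumed workload: 10 requests per user per day (hypothetical).
for dau in (10_000, 100_000, 1_000_000, 10_000_000, 100_000_000):
    avg, peak = estimate_qps(dau, 10)
    print(f"{dau:>11,} DAU -> ~{avg:,.0f} avg QPS, ~{peak:,.0f} peak QPS")
```

Changing `requests_per_user_per_day` shifts every row proportionally, which is why the per-user request rate is worth asking about early.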
Read/Write ratio shapes your architecture:
- Read-heavy (100:1) → cache aggressively (Redis, CDN), DB read replicas. Twitter feed, Reddit homepage.
- Write-heavy (1:10) → message queues (Kafka) to absorb bursts, append-only logs, async consumers. Consider CQRS (separate write model from read model) to prevent reads from competing with writes. Logging pipeline, analytics ingestion.
- Balanced → general-purpose horizontal scaling.
Stateless vs stateful scaling: Stateless services (REST APIs, GraphQL) scale horizontally by adding nodes — any instance can handle any request. Stateful services (WebSocket servers, in-memory session stores) can't: a client connected to Server A can't transparently be routed to Server B. Fix with either session affinity (sticky routing via the LB) or by pushing state to an external store (Redis) so any app server can serve the session.
Burst traffic: If traffic spikes at predictable times (Black Friday, live events), design for auto-scaling and queue-based buffering, not steady-state peak capacity.
Storage estimate: DAU × events_per_user_per_day × avg_event_size × retention_days
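A worked instance of the storage formula, as a rough sketch. All inputs (1M DAU, 10 events per user per day, 1 KB per event, one year of retention) are hypothetical, and the result ignores replication and index overhead.

```python
def estimate_storage_bytes(dau: int, events_per_user_per_day: int,
                           avg_event_size_bytes: int, retention_days: int) -> int:
    """Raw storage before replication, indexes, or compression."""
    return dau * events_per_user_per_day * avg_event_size_bytes * retention_days


total = estimate_storage_bytes(
    dau=1_000_000,
    events_per_user_per_day=10,
    avg_event_size_bytes=1_000,
    retention_days=365,
)
print(f"{total / 1e12:.2f} TB raw")
```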
Most critical for: Twitter/X (read-heavy feed), YouTube (video storage + CDN), Uber (surge traffic), ticketing systems (flash sales).
Latency
How fast must the system respond? This determines where you place compute, what stays synchronous, and what you offload.
Ask:
- What's the acceptable p99 response time?
- Are there specific operations that must be fast?
| Target | What It Means | Design Impact |
|---|---|---|
| < 10ms | Ultra-low. Real-time systems. | Data must live in in-process memory. No network hops. Compute co-located with data. |
| < 100ms | Feels instant to users. | Read from Redis (≈1ms), not DB (≈10ms). CDN serves assets from edge, not origin. Precompute results offline. |
| < 500ms | Interactive. Standard web UX. | Cache reads from Redis. Async writes (publish to queue, return 200). DB reads must hit indexes. |
| 1–5s | Tolerable for complex queries. | Background jobs for heavy computation. DB aggregations OK if indexed. Show loading states. |
| > 5s | Batch is fine. | Async processing, queues, offline jobs. No sync response needed. |
Why measure p99, not average
P99 = the response time that 99% of requests complete faster than. Average hides the worst 1%.
98 requests at 50ms + 2 requests at 10,000ms:
Average = 249ms ← looks healthy
P99 = 10,000ms ← system is on fire
At any meaningful scale, 1% is a lot of users. Average would never surface it.
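The gap between average and P99 is easy to reproduce. A minimal sketch using a hypothetical sample of 1,000 requests in which 2% are slow; `percentile` here is a nearest-rank implementation written for this example, not a library call.

```python
from statistics import mean


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # ceil(p/100 * n) in integer arithmetic
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: 980 fast requests and 20 slow ones.
latencies_ms = [50.0] * 980 + [10_000.0] * 20

print(f"average = {mean(latencies_ms):.0f} ms")
print(f"p50     = {percentile(latencies_ms, 50):.0f} ms")
print(f"p99     = {percentile(latencies_ms, 99):.0f} ms")
```

The median and average both look acceptable here, while P99 exposes the slow path.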
P99 is also an early overload warning. When a system gets busy, the tail degrades first — P50 stays flat while P99 spikes. By the time P50 looks bad, you're already deep in trouble.
| State | P50 | P99 |
|---|---|---|
| Healthy | 40ms | 80ms |
| Getting busy | 45ms | 400ms |
| Overloaded | 80ms | 2,500ms |
Watch P99 — it gives you the window to act (shed load, scale out, open a circuit breaker) before most users feel anything.
Sync vs async — the biggest single latency lever. Every synchronous call in the chain adds to total response time and the user waits for all of it. The most impactful latency decision is whether the response must be synchronous at all. If the user doesn't need the result immediately — order confirmation, payment processing, sending a notification — make it async: return 202 Accepted immediately and process out-of-band. Synchronous means the user waits for every hop. Async breaks the chain.
Cascading latency in microservices. Serial service calls add up: 5 services × 50ms each = 250ms minimum before any slow path or retry. Identify the critical path. Parallelize calls that are independent of each other. Cache aggressively between services. Every extra synchronous hop is a latency tax.
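A small sketch of why parallelizing independent calls matters. The service names are hypothetical and `asyncio.sleep` stands in for a real network call; the point is only that serial latency is the sum of the hops while parallel latency is roughly the slowest hop.

```python
import asyncio
import time


async def call_service(name: str, latency_s: float) -> str:
    """Stand-in for a downstream call; the sleep simulates network + processing time."""
    await asyncio.sleep(latency_s)
    return name


async def serial(services: list[str], latency_s: float) -> float:
    start = time.perf_counter()
    for name in services:
        await call_service(name, latency_s)  # each hop waits for the previous one
    return time.perf_counter() - start


async def parallel(services: list[str], latency_s: float) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(call_service(name, latency_s) for name in services))
    return time.perf_counter() - start


services = ["profile", "pricing", "inventory", "reviews", "recommendations"]  # hypothetical
serial_s = asyncio.run(serial(services, 0.05))
parallel_s = asyncio.run(parallel(services, 0.05))
print(f"serial: {serial_s * 1000:.0f} ms, parallel: {parallel_s * 1000:.0f} ms")
```

This only helps when the calls do not depend on each other's results; dependent calls stay on the critical path.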
What actually causes tail latency (the worst 1%). Understanding the cause points to the fix:
- GC pauses: stop-the-world garbage collection can freeze a JVM for tens to hundreds of milliseconds (Go's pauses are typically far shorter, but the collector still competes for CPU). Tune heap size; use GC-friendly data structures.
- Lock contention — threads queuing for the same mutex. Reduce shared mutable state; prefer lock-free structures or actor models.
- Connection pool exhaustion — all DB connections in use; new requests wait. Right-size the pool; add a connection pooler (PgBouncer, RDS Proxy).
- Cold cache misses — first request after a deploy or eviction hits the DB. Warm the cache on startup; use a longer TTL for stable data.
Latency vs throughput: Latency is how fast one request completes. Throughput is how many complete per second. Batching increases throughput but adds per-request latency — know which the interviewer cares about.
Per-component costs: For per-hop numbers (LB, Redis, DB, Kafka, S3) and end-to-end breakdowns, see the Latency Reference Table in System Design Layers.
Most critical for: Search/autocomplete (Yelp, Google — < 100ms), stock trading (HFT — microseconds), multiplayer gaming, ride-matching (Uber — driver must get request fast).
Availability
How much downtime is acceptable? This drives redundancy, replication topology, and failover strategy across all layers.
Ask:
- What's the uptime requirement?
- What happens to users if this goes down?
| SLA | Downtime/Month | Architecture Pattern |
|---|---|---|
| 99% | ~7.2 hrs | Single region, single DB |
| 99.9% | ~43 min | Multi-AZ, auto-failover DB, Redis Sentinel |
| 99.99% | ~4.3 min | Active-active multi-region, CDN as buffer |
| 99.999% | ~26 sec | No SPOF anywhere — sync replication, blue/green deploys |
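The downtime column follows directly from the availability percentage. A minimal sketch, assuming a 30-day month; the same arithmetic gives the error budget for an SLO.

```python
def allowed_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Minutes of downtime permitted in a window of `days` at the given availability."""
    return (100.0 - availability_pct) / 100.0 * days * 24 * 60


for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% -> {minutes:.2f} min/month ({minutes * 60:.0f} s)")
```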
How each layer survives failure:
- Load Balancer: Health checks pull unhealthy app servers from rotation in seconds. Deploy across AZs so one AZ outage doesn't take the LB down.
- App Servers: Keep stateless — no local state — so any instance can handle any request. Auto-scaling group replaces failed instances automatically.
- Cache (Redis): Redis Sentinel auto-promotes a replica when the primary dies. Redis Cluster adds sharding so a dead node loses only its slice, not everything.
- Database: Primary + replica with automatic failover on primary crash. Cross-region replication (async for performance, sync for zero data loss) at higher SLA tiers.
- Message Queue: Kafka replication factor ≥ 3 — two broker deaths don't lose messages. Consumers resume from their last committed offset.
- CDN: Globally distributed by design. Absorbs traffic from a partially down origin and serves cached content during brief outages.
CAP Theorem tradeoff: During a network partition, choose availability (keep serving, possibly stale) or consistency (stop serving until consistent). Most consumer apps choose availability. Payment systems choose consistency.
Graceful degradation: When a dependency fails, degrade to a lesser but still useful response — don't fail completely. Netflix returns cached thumbnails when the recommendations service is down. A search service returns cached results when the index is unavailable. A checkout flow disables the "suggested add-ons" widget but still processes the order. Decide in advance what each service degrades to — it shouldn't be an incident-time decision.
SLI / SLO / SLA — know the difference:
| Term | What It Is | Example |
|---|---|---|
| SLI | The measured metric | "97.8% of requests completed in < 200ms this week" |
| SLO | Internal target your team is held to | "99.9% of requests must complete in < 200ms" |
| SLA | Customer-facing contract with financial penalties | "99.5% uptime or credits issued" |
SLA is always looser than SLO — if they were equal, every internal incident would trigger customer credits.
Error budget: 1 − SLO. A 99.9% SLO gives you ~43 min/month to spend on incidents and deploys. When the budget is gone, freeze non-critical changes until the window resets.
Each extra 9 is roughly 10× harder and more expensive. Push back if the requirement seems over-engineered.
Most critical for: Payment processors (Stripe, Visa — 99.999%), AWS infrastructure, healthcare systems, any system where downtime = revenue loss or safety risk.
Consistency
When a write happens, when do all nodes and users see it? This is the core CAP tradeoff in practice.
Ask:
- Can users see slightly stale data?
- If two users write at the same time, does it matter which one wins?
| Model | What It Means | When to Use | Real Example |
|---|---|---|---|
| Strong (Linearizable) | Every read sees the latest write across all nodes immediately. | Payments, inventory, bank balances | PostgreSQL, ZooKeeper, Spanner |
| Read-your-writes | You always see your own latest write. Others may lag briefly. | Profile updates, settings | Most social apps for own data |
| Eventual | All nodes converge on the same value — eventually. Briefly stale is OK. | Social feeds, like counts, view counts | Cassandra, DynamoDB default, DNS |
| Causal | Cause before effect, globally. Unrelated writes can appear in any order. | Comments/replies, chat, collaborative editing | MongoDB causally consistent sessions |
Read-your-writes in practice: Route your reads to the primary (or the specific replica that received your write) for a short window after a write. Without this, you may hit a lagging replica and see your own edits disappear — a confusing UX even if technically within the eventual consistency contract.
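One way to sketch the routing rule described above: remember when each user last wrote and pin their reads to the primary for a short window. The class name, the 5-second window, and the `"primary"`/`"replica"` labels are assumptions for illustration; a real router would size the window from observed replication lag.

```python
import time


class ReadYourWritesRouter:
    """Routes a user's reads to the primary for a short window after that user writes."""

    def __init__(self, pin_seconds: float = 5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # assumed upper bound on replica lag
        self.clock = clock
        self._last_write_at: dict[str, float] = {}

    def record_write(self, user_id: str) -> None:
        self._last_write_at[user_id] = self.clock()

    def target_for_read(self, user_id: str) -> str:
        last = self._last_write_at.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"  # replica may not have this user's write yet
        return "replica"


router = ReadYourWritesRouter()
router.record_write("alice")
print(router.target_for_read("alice"), router.target_for_read("bob"))
```

Only the writer is pinned, so other users keep reading from replicas and the primary takes little extra load.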
Causal consistency explained: If Alice posts "I'm going to the store" and Bob replies "I'll come with you", causal consistency guarantees Carol always sees Alice's post before Bob's reply — because Bob's reply causally depends on Alice's post. Carol might see Alice's post before or after Dave's unrelated status update — that's fine, they're not causally linked.
This is stronger than eventual (which could show Bob's reply before Alice's post) but weaker than strong (which globally orders every single write). It's the right choice when order matters within a thread or conversation, but not globally.
Conflict resolution — what happens when two concurrent writes conflict: With eventual consistency, two nodes can receive different writes to the same key simultaneously. Systems handle this in three ways:
- Last-write-wins (LWW): the write with the latest timestamp wins. Simple, but it silently discards the losing concurrent write, and clock drift can make the wrong one win. Default in Cassandra.
- Vector clocks: each write carries a version vector tracking causality. The system detects genuine conflicts and surfaces them to the application to resolve. Used in Amazon's original Dynamo design and in Riak (DynamoDB itself does not expose vector clocks).
- CRDTs (Conflict-free Replicated Data Types): data structures that merge automatically without conflicts by design. Counters, sets, and append-only logs are natural CRDTs. Used in Redis Enterprise Active-Active databases and collaborative editors.
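The difference between last-write-wins and a CRDT is easiest to see side by side. The sketch below uses two toy types written for this example, `LWWRegister` and `GCounter` (a grow-only counter); neither is a real database API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LWWRegister:
    """Last-write-wins: keep the value with the higher timestamp."""
    value: str
    timestamp: float

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        return self if self.timestamp >= other.timestamp else other


@dataclass
class GCounter:
    """Grow-only counter CRDT: one slot per node, merge takes the per-node max."""
    counts: dict = field(default_factory=dict)

    def increment(self, node_id: str, amount: int = 1) -> None:
        self.counts[node_id] = self.counts.get(node_id, 0) + amount

    def merge(self, other: "GCounter") -> "GCounter":
        nodes = self.counts.keys() | other.counts.keys()
        return GCounter({n: max(self.counts.get(n, 0), other.counts.get(n, 0)) for n in nodes})

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# LWW: two concurrent writes, only the later timestamp survives.
a = LWWRegister("cart=[book]", timestamp=100.0)
b = LWWRegister("cart=[pen]", timestamp=100.5)
print(a.merge(b).value)

# G-Counter: increments made on different replicas are both kept.
replica_1, replica_2 = GCounter(), GCounter()
replica_1.increment("node-1", 3)
replica_2.increment("node-2", 2)
print(replica_1.merge(replica_2).value)
```

The counter's merge is commutative and idempotent, so replicas can exchange state in any order and still converge.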
Consistency vs latency, the core tradeoff: Stronger consistency costs latency because it requires cross-node coordination before acking a write. A strongly consistent write waits for a quorum of replicas (or all of them) to confirm, which adds at least one replica round trip to every write. Eventual consistency acks immediately and propagates asynchronously. You can't have both zero latency and strong consistency across distributed nodes.
Tunable consistency (Cassandra / DynamoDB): Many systems let you choose per operation. In Cassandra, QUORUM reads paired with QUORUM writes (R + W > N) guarantee that a read overlaps the latest acknowledged write, at higher latency. A ONE read returns as soon as one replica responds: fast, but potentially stale. DynamoDB exposes the same choice as strongly consistent vs eventually consistent reads. This lets you use strong consistency only where it matters, and eventual everywhere else.
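A common way to reason about tunable consistency is the overlap rule: with N replicas, W write acknowledgements, and R read responses, a read is guaranteed to touch at least one replica holding the latest acknowledged write when R + W > N. A tiny sketch; the function name is made up for this example.

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    return r + w > n


n = 3
quorum = n // 2 + 1  # majority of 3 replicas is 2

print(quorum_overlaps(n, w=quorum, r=quorum))  # QUORUM write + QUORUM read
print(quorum_overlaps(n, w=1, r=1))            # ONE write + ONE read
print(quorum_overlaps(n, w=n, r=1))            # ALL write + ONE read
```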
ACID vs BASE:
- ACID (relational DBs) — Atomic, Consistent, Isolated, Durable. All or nothing, always correct.
- BASE (most NoSQL) — Basically Available, Soft state, Eventually consistent. Always up, eventually right.
Choosing a DB is often choosing between these two philosophies.
Interview signal: "It's fine if the like count is off by a few seconds" → eventual consistency, scale horizontally. "Double-charging a user is unacceptable" → strong consistency, accept the latency cost.
For retry safety and duplicate prevention, see Idempotency below.
Most critical for: Banking and payments (double-spend prevention), inventory systems (Amazon — can't oversell), booking systems (airline seats, hotel rooms).
Idempotency
If a client retries a request, will it cause duplicate side effects? This shapes how you design APIs and payment flows.
Ask:
- Can clients retry failed requests safely?
- Are there operations where duplicates are catastrophic (charges, transfers, order submissions)?
Why duplicates happen — delivery semantics:
| Delivery | Guarantee | Implication |
|---|---|---|
| At-most-once | Delivered 0 or 1 times | May lose messages. No duplicates. |
| At-least-once | Delivered 1 or more times | No loss. May duplicate. Requires idempotent consumers. |
| Exactly-once | Delivered exactly 1 time | No loss, no duplicates. Expensive — requires coordination. |
Most queues and networks use at-least-once — they guarantee delivery but may retry on failure, causing duplicates. The practical pattern: use at-least-once (cheaper) + idempotent consumers (so duplicates are harmless).
The problem: A client sends a payment request. The server processes it, but the response is lost in transit. The client retries. Without idempotency, the user gets charged twice.
The solution — idempotency keys: Client generates a unique key per logical operation (e.g., UUID) and sends it with the request. Server stores (idempotency_key → result) and returns the cached result on any duplicate. Use Redis with a TTL of 24h–7 days (retries happen within a short window, so indefinite storage isn't needed). For payment-critical flows where the guarantee must survive a Redis restart, back it with a DB row as well.
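A minimal sketch of the idempotency-key flow. `PaymentService` and its in-memory dict are stand-ins invented for this example; a production version would keep the key → result mapping in Redis with a TTL and/or a DB table, and would need an atomic check-and-set so two concurrent retries cannot both pass the lookup.

```python
import uuid


class PaymentService:
    """In-memory stand-in for a server that deduplicates by idempotency key."""

    def __init__(self) -> None:
        self._results: dict[str, dict] = {}  # idempotency_key -> stored response
        self.charges_made = 0

    def charge(self, idempotency_key: str, user_id: str, amount_cents: int) -> dict:
        cached = self._results.get(idempotency_key)
        if cached is not None:
            return cached  # duplicate request: replay the original response
        self.charges_made += 1  # the side effect that must not repeat
        result = {"charge_id": str(uuid.uuid4()), "user_id": user_id,
                  "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
key = str(uuid.uuid4())  # generated once by the client per logical operation
first = service.charge(key, "user-42", 1999)
retry = service.charge(key, "user-42", 1999)  # e.g. the first response was lost
print(first == retry, service.charges_made)
```

The client must reuse the same key for every retry of the same operation; a fresh key per attempt defeats the mechanism.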
| Operation Type | Naturally Idempotent? | Fix |
|---|---|---|
| GET, DELETE | Yes (GET reads, DELETE on missing is no-op) | Nothing needed |
| PUT (replace entire resource) | Yes | Nothing needed |
| POST (create, charge, transfer) | No | Add idempotency key |
| Message consumer processing | No | Track processed message IDs in DB |
Idempotency in queues: A Kafka consumer that crashes mid-processing will re-receive the same message on restart (at-least-once delivery). Design consumers to be idempotent — check if the event was already processed (by storing the event ID) before acting on it. Kafka does support exactly-once semantics (EOS) via idempotent producers and transactions, but the operational complexity is high — at-least-once delivery with idempotent consumers is almost always the simpler and preferred approach.
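An idempotent consumer can be sketched with a processed-events table. The example uses an in-memory SQLite database and hypothetical table names; the key idea is that the dedupe marker and the side effect commit in the same transaction, so a crash between them cannot leave one without the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE balances (user_id TEXT PRIMARY KEY, cents INTEGER NOT NULL)")
conn.execute("INSERT INTO balances VALUES ('user-42', 0)")
conn.commit()


def handle_event(event_id: str, user_id: str, amount_cents: int) -> bool:
    """Apply the event once; return False if it was already processed."""
    with conn:  # one transaction: marker and side effect commit together
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)", (event_id,)
        )
        if cur.rowcount == 0:
            return False  # redelivery: marker already exists, skip the side effect
        conn.execute(
            "UPDATE balances SET cents = cents + ? WHERE user_id = ?",
            (amount_cents, user_id),
        )
        return True


print(handle_event("evt-1", "user-42", 500))  # first delivery
print(handle_event("evt-1", "user-42", 500))  # redelivery after a consumer crash
```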
Most critical for: Payment APIs (Stripe uses idempotency keys on every charge endpoint), order submission, booking systems, any POST that creates or transfers.
Durability
How much data loss is acceptable if the system crashes or a node goes down?
Ask:
- If we lose a server right now, what's the worst acceptable outcome?
- Can we replay events from a log?
RPO = Recovery Point Objective: how much data can we lose? Measured as a time window: an RPO of 0 means no loss, 1s means up to the last second of writes, 1hr up to the last hour. RTO = Recovery Time Objective: how long can the system be down during recovery?
RPO and RTO are independent axes. A system can have RPO=0 (zero data loss) but RTO=minutes (takes time to promote a replica). Or RPO=hours (some loss tolerable) but RTO=seconds (instant recovery from a warm standby). Set them separately based on what the business actually needs.
| Term | Definition | Design Impact | Latency Cost |
|---|---|---|---|
| RPO = 0 | Zero data loss. | Synchronous replication: primary waits for all replicas to confirm before acking the write. | +5–20ms per write at DB layer. +10–50ms if two-phase commit across services. |
| RPO = seconds | Tiny loss OK. | Async replication. WAL (write-ahead log) shipped to replica continuously. | No extra latency — write acks immediately, replication happens in background. |
| RPO = hours | Some loss tolerable. | Periodic snapshots or nightly backups. | No latency impact. |
| RTO = seconds | Must recover near-instantly. | Hot standby replica already running, promoted automatically on failure (~30–60s). | No latency impact on normal path. |
| RTO = minutes | Fast recovery needed. | Warm standby: replica exists but not serving traffic. Promoted manually or semi-auto. | — |
| RTO = hours | Slower recovery OK. | Restore from backup. Spin up new instance. | — |
Real examples:
- Banking: RPO = 0. Every transaction written synchronously to multiple replicas before confirmation.
- Social media posts: RPO = seconds is fine. Async replication acceptable.
- Object storage: 11 nines of durability via cross-AZ redundant storage (AWS S3, GCP Cloud Storage).
Most critical for: Banking and financial systems, medical records (Epic, FHIR), legal document storage, payment transaction logs — any system where lost data = legal or financial liability.
Fault Tolerance
How well does the system handle partial failures without going fully down?
Ask:
- What happens when one server crashes?
- What happens when a whole datacenter goes down?
- What if a dependency is slow or unavailable?
| Failure Type | Strategy | Example |
|---|---|---|
| Single node crash | Redundant replicas, auto-failover | DB primary/replica, load balancer health checks |
| Slow dependency | Timeouts + circuit breaker | Stop calling a failing service; return fallback |
| Datacenter outage | Multi-AZ or multi-region active-active | Route traffic to surviving region |
| Data corruption | Checksums, write-ahead logs, point-in-time restore | Detect and roll back bad writes |
| Cascading failures | Bulkheads (isolate failure domains), rate limiting | Don't let one slow service take down everything |
Always set explicit timeouts on external calls. Without a timeout, a slow dependency hangs your thread indefinitely — the thread pool fills up and your whole service stops responding. Set timeouts on every DB query, HTTP call, and queue operation. This is the prerequisite for everything else in this section.
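A minimal sketch of an explicit timeout with a fallback, using `asyncio.wait_for`. The 5-second sleep simulates a hung dependency and the 100ms budget is an arbitrary example value.

```python
import asyncio


async def slow_dependency() -> str:
    await asyncio.sleep(5)  # simulates a hung downstream call
    return "fresh"


async def fetch_with_timeout(timeout_s: float = 0.1) -> str:
    try:
        return await asyncio.wait_for(slow_dependency(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return "fallback"  # fail fast instead of holding the worker


result = asyncio.run(fetch_with_timeout())
print(result)
```

The caller gets an answer within the budget whether or not the dependency ever responds.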
Circuit breaker pattern: Wraps external calls and tracks failure rate. Three states:
- Closed — normal operation. All requests pass through.
- Open — too many failures. All requests blocked immediately; fallback returned. Dependency gets time to recover.
- Half-open — after a cooldown period, let a small number of requests through as a probe. If they succeed, close the circuit. If they fail, reopen.
Return a cached or default response in Open state rather than propagating the error.
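The three states can be sketched as a small class. This is an illustrative, single-threaded implementation written for these notes (consecutive-failure counting, injectable clock), not the API of any particular resilience library; the threshold and cooldown values are assumptions.

```python
import time
from typing import Optional


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, cooldown_s: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.cooldown_s:
            return "half-open"
        return "open"

    def call(self, func, fallback):
        if self.state == "open":
            return fallback()  # fail fast; do not touch the dependency
        try:
            result = func()  # closed, or a half-open probe
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # (re)open and restart the cooldown
            return fallback()
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result
```

Usage is `breaker.call(lambda: client.get(...), lambda: cached_value)`, where `client` and `cached_value` are whatever the caller already has; a concurrent version would also need a lock and a cap on simultaneous half-open probes.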
Retry with exponential backoff + jitter: When a request fails, wait before retrying — and double the wait each attempt (backoff). Add random jitter so all retrying clients don't slam the service at the same moment (thundering herd). A common sequence: retry after 1s, 2s, 4s, 8s with ±30% jitter, then give up and dead-letter.
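The 1s, 2s, 4s, 8s sequence with ±30% jitter can be sketched as a generator plus a small retry loop. Function names and the injectable `rng`/`sleep` parameters are choices made for this example so it can be tested without real waiting.

```python
import random
import time


def backoff_delays(base_s: float = 1.0, attempts: int = 4, jitter: float = 0.3,
                   rng=random.random):
    """Yield base, 2*base, 4*base, ... each scaled by a random factor in [1 - jitter, 1 + jitter]."""
    for attempt in range(attempts):
        factor = 1.0 + jitter * (2.0 * rng() - 1.0)
        yield base_s * (2 ** attempt) * factor


def retry(func, attempts: int = 4, base_s: float = 1.0, sleep=time.sleep):
    """Call func; on failure wait with backoff and retry. Re-raises once retries run out."""
    delays = backoff_delays(base_s=base_s, attempts=attempts)
    while True:
        try:
            return func()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: caller can dead-letter the work
            sleep(delay)


print([round(d, 2) for d in backoff_delays()])
```

Only retry operations that are safe to repeat; a non-idempotent call retried this way can duplicate its side effects.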
Bulkhead pattern: Isolate resources (thread pools, connection pools, memory) by service so one slow dependency can't exhaust shared resources and take down everything else. Named after ship compartments that contain flooding to one section. Example: give the payment service its own thread pool; if payment calls hang, they only exhaust that pool — the order service, running in its own pool, keeps serving normally.
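A bulkhead can be sketched with one concurrency limit per dependency. This example swaps the dedicated thread pool described above for a semaphore, which is a lighter-weight variant of the same idea; the `Bulkhead` class and the limits are hypothetical.

```python
import threading


class Bulkhead:
    """Caps concurrent calls to one dependency; excess calls are rejected immediately."""

    def __init__(self, name: str, max_concurrent: int):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, fallback):
        if not self._slots.acquire(blocking=False):
            return fallback()  # compartment is full: shed load instead of queuing
        try:
            return func()
        finally:
            self._slots.release()


payments = Bulkhead("payments", max_concurrent=10)
orders = Bulkhead("orders", max_concurrent=50)
print(payments.call(lambda: "charged", lambda: "payments busy"))
```

If every payment slot is held by hung calls, further payment calls get the fallback at once, while `orders` still has its own untouched capacity.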
Dead Letter Queue (DLQ): When a message fails processing repeatedly (after N retries), route it to a DLQ instead of blocking the queue. The DLQ holds poison messages for inspection and manual replay. Without a DLQ, one bad message can stall an entire consumer group indefinitely. Common options: AWS SQS dead-letter queues, GCP Pub/Sub dead-letter topics, or a separate Kafka topic.
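The retry-then-park behaviour can be sketched with an in-memory queue. The message shape, `MAX_ATTEMPTS = 3`, and the `"poison"` payload are assumptions for illustration; managed queues implement the same idea with a redelivery counter and a configured dead-letter target.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry limit (the "N" above)


def consume(queue: deque, handler, dead_letters: list) -> None:
    """Drain the queue; messages that keep failing are moved to dead_letters."""
    while queue:
        message = queue.popleft()
        try:
            handler(message["body"])
        except Exception as exc:
            message["attempts"] = message.get("attempts", 0) + 1
            if message["attempts"] >= MAX_ATTEMPTS:
                dead_letters.append({**message, "error": repr(exc)})  # park for inspection
            else:
                queue.append(message)  # requeue behind the other messages


processed: list = []
dlq: list = []


def handler(body: str) -> None:
    if body == "poison":
        raise ValueError("cannot parse")
    processed.append(body)


queue = deque({"body": b} for b in ["a", "poison", "b"])
consume(queue, handler, dlq)
print(processed, [m["body"] for m in dlq])
```

The healthy messages are processed even though the bad one fails every time, which is the stall the DLQ is meant to prevent.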
Most critical for: Microservices architectures (each service can fail independently), distributed databases, any system with SLA > 99.9%.
Security
What data does the system handle and who should access it? Drives auth, encryption, and regulatory design.
Ask:
- Does this handle PII, payments, or health data?
- Who are the users — public, internal, B2B?
Key terms:
- Authentication (AuthN) — who are you? Verify the caller's identity. JWT, OAuth2, API keys.
- Authorization (AuthZ) — what can you do? Check permissions after identity is confirmed. RBAC, ACL.
- PII (Personally Identifiable Information): any data that can identify a person, such as name, email, phone, SSN, or IP address. Triggers GDPR obligations; health-related data additionally falls under HIPAA.
- TLS (Transport Layer Security) — encrypts data in transit (the "S" in HTTPS). Prevents interception.
- AES-256 (Advanced Encryption Standard) — standard algorithm for encrypting data at rest. Used in S3, databases, filesystems.
- JWT (JSON Web Token) — signed token the client sends with each request to prove identity. Stateless, server doesn't store sessions.
- OAuth2: standard for delegated authorization. "Sign in with Google" is OpenID Connect, an identity layer built on top of OAuth2. Separates identity from your app.
- mTLS (Mutual TLS) — both sides verify certificates. Used for service-to-service auth inside your system.
- RBAC (Role-Based Access Control) — users get roles (admin, editor, viewer), roles get permissions. Simpler than per-user rules.
- ACL (Access Control List): per-resource list of who can do what. More granular than RBAC (e.g. file-system permissions, S3 object ACLs).
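The RBAC idea (users get roles, roles get permissions) fits in a few lines. The role names, permission strings, and users below are hypothetical; a real system would load them from a policy store and run this check after authentication has established who the caller is.

```python
ROLE_PERMISSIONS = {  # hypothetical roles and permissions
    "viewer": {"doc:read"},
    "editor": {"doc:read", "doc:write"},
    "admin": {"doc:read", "doc:write", "doc:delete"},
}
USER_ROLES = {"alice": {"admin"}, "bob": {"viewer"}}  # hypothetical users


def is_allowed(user: str, permission: str) -> bool:
    """AuthZ check: does any of the user's roles grant the permission?"""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "doc:delete"), is_allowed("bob", "doc:write"))
```

Unknown users and unknown roles fall through to "denied", which keeps the default closed.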
Security by layer:
| Layer | What Goes Here |
|---|---|
| CDN | DDoS protection, WAF (Web Application Firewall) blocks malicious requests before they reach origin |
| Load Balancer | TLS termination (decrypt HTTPS here, forward HTTP internally), IP whitelisting |
| API Gateway | Authentication (verify JWT/OAuth token), rate limiting (token bucket), request validation |
| App Server | Authorization (RBAC checks — "can this user do this action?"), input validation, business logic security |
| Cache (Redis) | Don't cache raw PII if avoidable. Redis AUTH password. Encrypt sensitive values if stored. |
| Database | AES-256 encryption at rest. Row-level security for multi-tenant data. Least-privilege DB users. Audit log here — append-only table logging who accessed what and when. |
| Object Storage | Signed URLs for private files (time-limited access). Bucket policies. Server-side encryption. |
| Secrets | Never hardcode credentials or put them in env files checked into source control. Inject at runtime from a secrets manager (AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault). Rotate automatically. Least-privilege IAM roles instead of long-lived keys where possible. |
Most critical for: Healthcare (HIPAA — audit every access to patient records), financial systems (PCI-DSS — card data tokenized immediately), auth systems (OAuth provider), any multi-tenant SaaS.
Compliance
Are there legal or regulatory constraints that shape the architecture?
Ask:
- What region are users in?
- Does this handle health, financial, or personal data?
Key terms:
- GDPR (General Data Protection Regulation) — EU law. Applies to any system with EU users, regardless of where the company is located.
- HIPAA (Health Insurance Portability and Accountability Act) — US law governing health data. Applies to any app handling patient records.
- PCI-DSS (Payment Card Industry Data Security Standard) — required for any system that stores, processes, or transmits card data.
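- SOC 2: auditing standard from the AICPA (US) for service providers such as SaaS companies. Type I = point-in-time assessment. Type II = evidence that controls operated effectively over an observation period (commonly 3–12 months). Often required by enterprise buyers.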
| Regulation | Who It Affects | Key Architecture Constraint |
|---|---|---|
| GDPR (EU) | Any system with EU users | Restrictions on moving personal data outside the EU/EEA (often met by keeping it in EU regions). Right to delete (complicates append-only logs). Breach notification in 72hrs. |
| HIPAA (US healthcare) | Medical records, health apps | Audit log every data access. Encryption in transit and at rest. Business associate agreements with vendors. |
| PCI-DSS (payments) | Any system touching card data | Card data never stored raw — tokenize immediately on receipt. Annual third-party audits. Network segmentation. |
| SOC 2 | B2B SaaS | Documented security controls. Access reviews. Incident response plan. |
GDPR complicates event-sourcing: Append-only logs make "right to delete" hard — you can't erase a past event. Solve with tombstone records or keep PII in a separate deletable store and only store user IDs in the event log.
Most critical for: Healthcare apps, payment processors, social platforms with EU users, any enterprise B2B SaaS sold to regulated industries.
Monitoring & Observability
How do you know the system is healthy in production? Drives logging, metrics, and alerting design.
Ask:
- Do you need real-time alerting?
- How quickly must the team detect and diagnose production issues?
| Signal | What It Covers | Tools |
|---|---|---|
| Metrics | QPS, latency, error rate, CPU/memory/disk | Prometheus, Datadog, AWS CloudWatch, GCP Cloud Monitoring |
| Logs | What happened and in what order | ELK stack, Splunk, AWS CloudWatch Logs, GCP Cloud Logging |
| Traces | Where time was spent across services | Jaeger, Zipkin, AWS X-Ray, GCP Cloud Trace |
| Alerts | Notify when an SLO is at risk or breached | PagerDuty, Opsgenie |
The four golden signals (Google SRE): Latency, Traffic, Errors, Saturation. Build monitoring around these first.
- Latency — how long requests take (track p99, not average)
- Traffic — how much load the system is under (QPS, requests/sec)
- Errors — rate of failed requests (5xx errors, timeouts, exceptions)
- Saturation — how "full" a resource is. CPU at 95%, disk at 98%, connection pool nearly exhausted — saturation predicts future failure before users feel it. Monitor: CPU %, memory %, disk I/O, DB connection pool usage, queue depth.
Distributed tracing — finding the slow hop. In microservices, a high p99 could come from any service in the call chain. Distributed tracing gives you a waterfall view across all hops. The mechanism: attach a correlation ID (UUID) to every incoming request and propagate it in headers through every downstream call. Each service logs its span (start time, duration, service name) tagged with that ID. Tools like Jaeger or AWS X-Ray stitch spans into a full trace — you can see exactly which service added the latency.
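The correlation-ID mechanism can be sketched without any tracing library. The header name `X-Correlation-ID`, the two service functions, and the in-memory `spans` list are assumptions for this example; a real setup would send spans to a collector such as Jaeger or X-Ray through its client library.

```python
import time
import uuid
from contextlib import contextmanager

TRACE_HEADER = "X-Correlation-ID"  # assumed header name
spans: list = []  # stand-in for a log pipeline or trace collector


def ensure_correlation_id(headers: dict) -> str:
    """Reuse the caller's ID if present; otherwise this is the edge, so mint one."""
    return headers.get(TRACE_HEADER) or str(uuid.uuid4())


@contextmanager
def span(correlation_id: str, service: str):
    start = time.perf_counter()
    try:
        yield {TRACE_HEADER: correlation_id}  # headers to send downstream
    finally:
        spans.append({
            "correlation_id": correlation_id,
            "service": service,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })


def inventory_service(headers: dict) -> None:
    with span(ensure_correlation_id(headers), "inventory"):
        time.sleep(0.01)  # simulated work


def checkout_service(headers: dict) -> None:
    with span(ensure_correlation_id(headers), "checkout") as downstream_headers:
        inventory_service(downstream_headers)


checkout_service({})  # request arrives at the edge with no ID yet
for s in spans:
    print(s["correlation_id"][:8], s["service"], f"{s['duration_ms']:.1f} ms")
```

Grouping the recorded spans by `correlation_id` gives the waterfall for one request.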
Alert on burn rate, not just thresholds. A threshold alert (e.g. "error rate > 1%") fires after you've already breached. Burn rate alerting asks how fast you're consuming the error budget — a 10× burn rate means you'll exhaust the month's budget in 3 days. Alert early and act before most users are affected.
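Burn rate is the observed error rate divided by the error rate the SLO allows. A minimal sketch with hypothetical function names, assuming a 30-day budget window:

```python
def burn_rate(observed_error_rate: float, slo: float) -> float:
    """How many times faster than budgeted the error budget is being spent."""
    return observed_error_rate / (1.0 - slo)


def days_until_exhausted(rate: float, window_days: int = 30) -> float:
    """Days until the whole window's budget is gone if the current rate holds."""
    return window_days / rate


rate = burn_rate(observed_error_rate=0.01, slo=0.999)  # 1% errors against a 99.9% SLO
print(f"burn rate ~{rate:.1f}x, budget gone in ~{days_until_exhausted(rate):.1f} days")
```

A burn rate of 1× spends the budget exactly over the window; anything well above that is worth a page before the budget is gone.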
Structured logging. Log in JSON, not free text. Structured logs can be queried, filtered, and aggregated programmatically (level=error service=payments userId=42). Free-text logs are hard to analyse at scale and require fragile regex parsing.
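A minimal sketch of JSON logging with Python's standard `logging` module. `JsonFormatter` and the `fields` key passed through `extra` are conventions chosen for this example, not part of the standard library.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "logger": record.name,
        }
        payload.update(getattr(record, "fields", {}))  # structured context passed via `extra`
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("payments")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.error("charge failed", extra={"fields": {"service": "payments", "userId": 42}})
```

Each line is now a parseable object, so a query such as "all errors for userId 42" becomes a field filter instead of a regex.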
Most critical for: Any system with a strict SLA, microservices (failures are hard to trace without correlation IDs), financial systems where bugs cost real money.
Environment Constraints
Are there non-standard constraints on the environment the system runs in?
Ask:
- Are clients on mobile or constrained devices?
- Are there low-bandwidth or offline scenarios to handle?
| Constraint | Design Impact |
|---|---|
| Mobile clients | Minimize payload size. Compress responses. Offline-first with local cache. |
| Low bandwidth (3G/rural) | Adaptive bitrate streaming (YouTube, Netflix). Delta sync instead of full sync. |
| Limited battery | Batch network calls. Avoid polling — use push (WebSockets, FCM). |
| Edge/IoT devices | Lightweight protocols (MQTT). Local processing before cloud sync. |
| Offline-first | Local DB (SQLite), sync on reconnect, conflict resolution strategy. |
Most critical for: Uber driver app (poor network in some cities), Google Maps offline, WhatsApp (works on 2G), IoT sensor pipelines, healthcare apps in hospitals with restricted networks.
Quick Reference — Which NFR Matters Most
| System | Top NFRs to Prioritize |
|---|---|
| Banking / payments | Consistency, Idempotency, Durability, Security, Compliance |
| Social feed (Twitter, Instagram) | Scale, Availability, Latency |
| Healthcare records | Durability, Security, Compliance, Availability |
| Search / autocomplete (Yelp, Google) | Latency, Scale |
| Ride-sharing (Uber) | Availability, Latency, Fault Tolerance, Environment |
| Video streaming (Netflix, YouTube) | Scale, Availability, Latency, Environment |
| Chat / messaging (WhatsApp) | Availability, Durability, Environment |
| Ticketing / booking (Airbnb, airlines) | Consistency, Availability, Scale |
| Enterprise SaaS | Security, Compliance, Availability |
| IoT / sensor pipeline | Scale, Fault Tolerance, Environment, Durability |