DEV Community

# distributedsystems

Topics related to systems where components are on different networked computers.

Posts

Distributed Database Internals: The Engineering Behind Log-Structured Merge (LSM) Trees

1
Comments
4 min read
A System Design Deep Dive — Question by Question

1
Comments
5 min read
Building TitanMQ: Designing a Message Queue That Learns from Kafka, RabbitMQ, and ZeroMQ

1
Comments 2
10 min read
ELI25: Apache Kafka Quick Notes for Interviews

Comments
4 min read
Distributed Tracing in ML Pipelines: From Preprocessing to Inference

1
Comments
12 min read
We Tried to Break a Production IoT State Arbitration API With the Most Extreme Payloads We Could Design. It Didn't Break.

1
Comments
19 min read
Why Your "Fail-Fast" Strategy is Killing Your Distributed System (and How to Fix It)

1
Comments
9 min read
The Worlds of Distributed Systems — Align Your Team’s Mental Model

Comments
5 min read
Chapter 1 — Thinking About Rollback in Distributed Systems Through Three Worlds (RML-1/2/3)

Comments
6 min read
Why Your Object Storage Is Slow (And How Parallelism Over HDDs Fixes It)

1
Comments
5 min read
Temporal Workflow Engine: The Reliability Layer Your Distributed System Is Missing [2026 Guide]

1
Comments 2
7 min read
Distributed Transaction Tango: Why Your Microservices Need Sagas

Comments 1
3 min read
Week 1 — When LLM Failures Weren’t About Load, But Timing (ZooKeeper + Distributed Locking)

1
Comments
3 min read
A 10% traffic spike took down a stable system in 3 minutes and 47 seconds.

3
Comments
3 min read
AI Agent Architecture Patterns: Engineering for Autonomy, Resilience, and Control

Comments
11 min read