Jayesh Pamnani
The hidden problem with long-running API requests

A request taking 2-3 seconds locally does not feel like a problem.

In production, it becomes one very quickly.

Most backend issues I’ve seen around APIs were not caused by bad logic.

They were caused by requests staying open for too long.


Why this becomes dangerous

Long-running requests hold resources.

Usually:

  • database connections
  • memory
  • worker threads
  • external API sessions

One slow request is manageable.

Hundreds of slow requests at the same time start creating bottlenecks across the entire system.

And the worst part is that it often happens gradually.

Everything works fine in staging.

Production traffic exposes the real problem.


Common causes

1. Too much business logic inside a single request

A request comes in and the API tries to:

  • validate data
  • generate reports
  • process images
  • call external APIs
  • update multiple systems
  • send emails

All before returning a response.

This is one of the biggest architectural mistakes in backend systems.


2. Waiting on third-party APIs

External services are unpredictable.

Even if your own system is optimized, a slow payment gateway or ERP API can keep your request hanging for several seconds.

Now multiply that by hundreds of concurrent users.


3. Database queries that grow over time

A query that works fine with 10,000 rows behaves very differently with 10 million.

This is why APIs suddenly become slow months after deployment.

The code did not change.

The data volume did.
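One common reason for this degradation is a filter column with no index: the database quietly switches from a cheap lookup to a full table scan as the table grows. A minimal SQLite sketch (table and column names are made up for illustration) makes the difference visible in the query plan:

```python
import sqlite3

# In-memory DB: a filter on an unindexed column forces a full table scan,
# so query cost grows linearly with row count.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10_000)],
)

# Without an index: the plan shows a SCAN over the whole table.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 7"
).fetchone()

# With an index: the plan becomes an index SEARCH, and cost stays
# roughly flat as the table grows from thousands to millions of rows.
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 7"
).fetchone()

print(plan_before[3])
print(plan_after[3])
```

The same query, same code, different plan. This is exactly why an endpoint can be fast at launch and slow six months later.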


4. File processing during requests

Uploading files is fine.

Processing them synchronously is where problems start.

PDF generation, image optimization, AI processing, video conversion: these should rarely happen inside the request lifecycle.


What long-running requests actually cause

People usually think:
“Worst case, the API is slow.”

The real impact is much worse.

You start seeing:

  • request queues
  • worker exhaustion
  • timeout errors
  • database connection starvation
  • memory spikes
  • cascading failures across services

One slow endpoint can affect unrelated parts of the system.


The fix is usually architectural

The solution is not increasing server size forever.

The real fix is separating:

  • immediate response
  • background execution

A better pattern

Instead of this:

Client → API → Heavy processing → Response

Do this:

Client → API → Queue job → Immediate response

Worker → Process asynchronously

The API should respond quickly.

Heavy operations should happen in workers, queues, or event-driven systems.
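The pattern above can be sketched with nothing but the standard library. This is a toy illustration, not a production setup: in real systems the queue would be Celery, RQ, SQS, or a message broker, and the worker would run as a separate process or fleet. All names here (`handle_request`, `flaky` payload strings) are hypothetical.

```python
import itertools
import queue
import threading

jobs = queue.Queue()       # stand-in for a real job queue / message broker
results = {}               # stand-in for a results store (DB, cache, ...)
job_ids = itertools.count(1)

def worker():
    # Runs outside the request lifecycle, so slow work never blocks a response.
    while True:
        job_id, payload = jobs.get()
        results[job_id] = f"processed:{payload}"  # stand-in for heavy work
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(payload):
    # The "API": enqueue the job and respond immediately (HTTP 202 style),
    # handing back a job id the client can poll or be notified about.
    job_id = next(job_ids)
    jobs.put((job_id, payload))
    return {"status": "accepted", "job_id": job_id}

response = handle_request("report-42")
jobs.join()  # only for this demo; a real client polls or gets a webhook
print(response, results)
```

The key property: the request handler's latency no longer depends on how long the heavy work takes.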


Another important fix: timeouts

A surprising number of systems have no proper timeout handling.

Every external request should have:

  • connection timeout
  • read timeout
  • retry strategy
  • failure handling

Otherwise your workers end up waiting forever.
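With an HTTP client like `requests`, the first two items are a matter of passing `timeout=(connect, read)` to every call. The retry and failure-handling parts can be sketched with the standard library; `flaky_gateway` below is a hypothetical stand-in for a slow third-party API, and the attempt counts and delays are illustrative:

```python
import time

def call_with_retries(fn, attempts=3, base_delay=0.1):
    # Every attempt is bounded, failures back off exponentially, and we
    # give up after a fixed number of tries instead of waiting forever.
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # in real code, catch timeout errors specifically
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("external service unavailable") from last_error

# Fake flaky dependency: times out twice, then succeeds.
calls = {"n": 0}
def flaky_gateway():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("gateway timed out")
    return "payment-ok"

result = call_with_retries(flaky_gateway)
print(result, calls["n"])
```

The design point is the upper bound: whatever the external service does, your worker is back in circulation after a predictable maximum wait.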


The mindset shift

Fast APIs are not only about speed.

They are about system stability.

A backend that responds quickly under load is usually designed around:

  • short-lived requests
  • async processing
  • isolation between services
  • controlled retries

That architecture matters more than raw server power.


Most production performance problems are not caused by traffic alone.

They come from APIs trying to do too much before returning a response.


How we handle this at BrainPack

At BrainPack, we design backend systems with this in mind from the start.

Long-running operations are separated from the request lifecycle using queues, workers, event-driven flows, and execution layers that keep APIs responsive even under heavy operational load.

The goal is simple:

Keep the system reactive for users while heavy processing happens safely in the background.
