Request Hedging: Accelerate Your App by Racing Duplicate Calls

Users notice slow requests; even if 99 % finish quickly, the 1 % of "long‑tail" requests can make your app feel sluggish. Request hedging addresses this by speculatively sending a duplicate request after a short delay and taking whichever response arrives first, so outliers are beaten before they ever impact the UI.

Why the slowest 1 % of requests matter

The time it takes for the slowest 1 % of requests to finish is known as P99 latency. (P99.9 is the slowest 0.1 %, and so on.)
Users are sensitive to slowness. One long request is all it takes for an app to feel sluggish.
In an architecture where a page render hits 50 microservices, one slow service can drag the whole page down.
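
To see why fan-out amplifies the tail, a rough calculation helps. The sketch below assumes each of the 50 calls independently has a 1 % chance of landing in its own slow tail; the independence assumption and the function name `chanceOfSlowPage` are illustrative only.

```typescript
// Probability that at least one of `calls` independent requests is slow,
// given that each one is slow with probability `pSlow`.
function chanceOfSlowPage(calls: number, pSlow: number): number {
  return 1 - Math.pow(1 - pSlow, calls);
}

// 50 services, each slow 1 % of the time (that is, beyond its own P99).
const slowShare = chanceOfSlowPage(50, 0.01);
console.log(`${(slowShare * 100).toFixed(1)} % of page renders hit at least one slow call`);
```

Under these assumptions roughly 39.5 % of page renders wait on at least one tail-latency call, which is why a "rare" 1 % event is felt so often.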

[Figure: long-tail latency]

In a Google benchmark against Bigtable, sending a second copy of a read after just 10 milliseconds cut P99.9 latency by about 96 % (from 1,800 ms to 74 ms) while adding only 2 % extra requests. That is a small price for a far more predictable tail.

What exactly is request hedging?

Send the original request; if no response arrives within a small hedge delay, send a duplicate to another healthy replica. Return whichever finishes first and cancel the other.

[Figure: request hedging pattern]

Why it works:

Outliers are random. Network hiccups don’t hit every server at once.
Cheap insurance. Most requests finish fast, so the duplicate rarely runs long. You pay a small burst of extra load to avoid a big, visible stall.
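
The effect on a single request can be modelled with simple arithmetic. This is only a sketch, assuming the hedge is sent exactly when the primary has not answered by the hedge delay; `hedgedLatency`, `HedgeOutcome`, and the sample numbers are hypothetical.

```typescript
interface HedgeOutcome {
  latencyMs: number;  // what the caller observes
  hedgeSent: boolean; // whether the duplicate was needed at all
}

// The caller sees the primary's latency, or the hedge delay plus the
// backup's latency, whichever is smaller.
function hedgedLatency(primaryMs: number, backupMs: number, delayMs: number): HedgeOutcome {
  if (primaryMs <= delayMs) {
    return { latencyMs: primaryMs, hedgeSent: false };
  }
  return { latencyMs: Math.min(primaryMs, delayMs + backupMs), hedgeSent: true };
}

console.log(hedgedLatency(12, 15, 50));  // fast primary: 12 ms, no hedge sent
console.log(hedgedLatency(900, 15, 50)); // outlier: 65 ms instead of 900 ms
```

The fast path costs nothing extra, while the outlier is capped at the hedge delay plus one normal response time.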

How to fit hedging into a Next.js + Sitecore Headless + .NET stack

1. Next.js – browser or Vercel Edge

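// Lightweight helper (TypeScript): send to urls[0]; if no response arrives
// within delayMs, also send to urls[1]. The first successful fetch wins.
export async function hedgedFetch(urls: string[], delayMs = 50): Promise<Response> {
  const controllers: AbortController[] = [];
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Each attempt gets its own controller so the loser can be cancelled
  // without aborting the winner's body stream.
  const attempt = async (url: string) => {
    const controller = new AbortController();
    controllers.push(controller);
    const response = await fetch(url, { signal: controller.signal });
    return { response, controller };
  };

  const attempts = [attempt(urls[0])];
  if (urls.length > 1) {
    // The hedge is sent only if the timer fires before the primary responds.
    const wait = new Promise<void>(resolve => { timer = setTimeout(resolve, delayMs); });
    attempts.push(wait.then(() => attempt(urls[1])));
  }

  try {
    const winner = await Promise.any(attempts);
    controllers.filter(c => c !== winner.controller).forEach(c => c.abort());
    return winner.response;
  } finally {
    clearTimeout(timer); // no hedge is sent once a winner exists
  }
}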

Example: Hedging a front‑end GraphQL fetch

const response = await hedgedFetch([
  "https://edge-usw.example.com/graphql",
  "https://edge-use.example.com/graphql"
]);
const json = await response.json();

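This code races two region endpoints, returns the first response to arrive, and cancels the slower request via AbortController. Set delayMs close to your own measured P95 latency; 50 ms is only a default. As written, the helper issues plain GET requests, so a real GraphQL call would also need to pass the query and request options through to fetch.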

2. Next.js API routes or App Router server actions

Same pattern, but tune delayMs lower (20–30 ms) because the call is already inside the data‑center.

3. Envoy / Istio sidecars

An Envoy or Istio sidecar is a small proxy container that runs alongside your application container in the same Kubernetes pod. All inbound and outbound traffic passes through this proxy, so you can add behaviors such as retries, TLS, rate‑limiting, and request hedging by updating proxy settings instead of touching application code.

If you skip sidecars in your Next.js application
You can still hedge browser and server‑side calls by writing helpers (like hedgedFetch) or using Polly/gRPC policies. However, each service must implement and maintain its own logic, and any calls that come into your app from other services will not be hedged, leaving long‑tail spikes unprotected. Over time this scattered approach increases maintenance overhead and risks inconsistent latency behavior across the stack.

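route:
  retry_policy:
    retry_on: "5xx"
    num_retries: 1            # one hedge, so at most two attempts in flight
    per_try_timeout: 0.02s    # 20 ms: acts as the hedge delay
  hedge_policy:
    hedge_on_per_try_timeout: true

In raw Envoy, this goes on a route in the route configuration: when the per-try timeout expires, Envoy sends the retry without resetting the original request, so both stay in flight and the first to succeed is returned. Istio's VirtualService API has no dedicated hedging field, so under Istio the same policy is usually applied through an EnvoyFilter patch. Hedge only calls that are idempotent, meaning safe to repeat without side effects (e.g., GET /product/123).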

4. .NET back‑end callers

What is gRPC? gRPC is an open‑source remote procedure call framework, originally developed at Google, that lets services invoke functions on other services as though they were local methods. It rides on HTTP/2 for efficient, multiplexed connections, uses Protocol Buffers for small binary messages, and generates type‑safe client and server code in many languages. Built‑in features like deadlines, retries, and hedging policies make it a natural place to enable request hedging without extra plumbing.

gRPC

{
  "methodConfig": [{
    "name": [{ "service": "ProductCatalog" }],
    "hedgingPolicy": {
      "maxAttempts": 2,
      "hedgingDelay": "0.03s",
      "nonFatalStatusCodes": ["UNAVAILABLE", "DEADLINE_EXCEEDED"]
    }
  }]
}

HTTP

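builder.Services.AddHttpClient("edge")
    .AddStandardHedgingHandler() // from Microsoft.Extensions.Http.Resilience
    // One hedge (two attempts in total), sent after 30 ms without a response.
    .Configure(o => { o.Hedging.MaxHedgedAttempts = 1; o.Hedging.Delay = TimeSpan.FromMilliseconds(30); });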

5. Sitecore Experience Edge

Experience Edge already runs in multiple regions. Expose two region‑specific GraphQL URLs to the client and let the hedged fetch pick the fastest.

Roll‑out checklist

Measure first. Capture your current P50, P95, P99, P99.9 latencies per hop.
Pick a hedge delay ≈ P95. Too short wastes capacity, too long misses outliers.
Restrict to idempotent reads. Avoid duplicate writes unless your API supports idempotency keys.
Cap attempts to two. Start small; you rarely need more.
Instrument and watch. Expose metrics like hedged_attempts, cancels, and tail percentiles. Aim for <5 % load overhead.
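
The first two checklist items can be turned into numbers with a few lines of code. The following is a minimal sketch, assuming you already have per-request latency samples in milliseconds; `percentile` (nearest-rank method), `hedgeOverhead`, and the sample data are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p % of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Fraction of requests still pending at the hedge delay. Each of them
// triggers one duplicate, so this approximates the extra load.
function hedgeOverhead(samplesMs: number[], delayMs: number): number {
  const hedged = samplesMs.filter(ms => ms > delayMs).length;
  return hedged / samplesMs.length;
}

// Hypothetical measurements: 1 ms, 2 ms, ..., 100 ms.
const measuredMs = Array.from({ length: 100 }, (_, i) => i + 1);
const hedgeDelayMs = percentile(measuredMs, 95);
console.log(hedgeDelayMs, hedgeOverhead(measuredMs, hedgeDelayMs)); // 95 0.05
```

With the delay set at P95, about 5 % of requests trigger a hedge, which is where the overhead target above comes from.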

Risks and how to mitigate them

Risk	Mitigation
Extra traffic / CPU	Monitor overhead; two attempts at most keeps it predictable.
Duplicate side effects on POST / PUT	Keep hedging to GET / GraphQL `query` unless you have idempotency tokens.
Window where both copies run	Cancel losers immediately with `AbortController`, gRPC cancellations, or Envoy resets.
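
The duplicate-side-effects mitigation can be enforced with a small guard in front of any hedging helper. This is a sketch under stated assumptions: `isSafeToHedge` and `OutgoingRequest` are hypothetical names, the GraphQL check is a deliberately naive prefix test, and the `Idempotency-Key` header only helps if your API actually deduplicates on it.

```typescript
interface OutgoingRequest {
  method: string;
  headers?: Record<string, string>;
  graphqlQuery?: string; // the GraphQL document, when the body is a GraphQL request
}

// Decides whether a request may be sent twice without duplicate side effects.
function isSafeToHedge(req: OutgoingRequest): boolean {
  const method = req.method.toUpperCase();
  if (method === "GET" || method === "HEAD") return true;

  // A POST carrying a GraphQL read is safe; mutations are not.
  if (method === "POST" && req.graphqlQuery !== undefined) {
    const doc = req.graphqlQuery.trim();
    if (doc.startsWith("query") || doc.startsWith("{")) return true;
  }

  // Anything else qualifies only when it carries an idempotency key
  // (assumed header name; use whatever your API defines).
  return Object.keys(req.headers ?? {}).some(h => h.toLowerCase() === "idempotency-key");
}

console.log(isSafeToHedge({ method: "GET" }));                                        // true
console.log(isSafeToHedge({ method: "POST", graphqlQuery: "mutation { addItem }" })); // false
```

Requests that fail the check should simply bypass hedging and go out once.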

Key takeaway

Request hedging is a small change with outsized rewards. A few lines of code (or a short proxy configuration) can tame those embarrassing long‑tail spikes and make your Next.js + Sitecore + .NET experience feel consistently fast.
