
    Request Hedging: Accelerate Your App by Racing Duplicate Calls

    July 31, 2025

    Users notice slow requests. Even if 99 % finish quickly, the remaining 1 % of “long‑tail” requests can make your app feel sluggish. Request hedging addresses this by speculatively sending a duplicate request after a short delay and using whichever response arrives first, so outliers are beaten before they reach the UI.


    Why the slowest 1 % of requests matter

    • P99 latency is the response time that 99 % of requests beat; only the slowest 1 % take longer. (P99.9 marks the slowest 0.1 %, and so on.)
    • Users are sensitive to slowness. One long request is all it takes for an app to feel sluggish.
    • In an architecture where a page render hits 50 microservices, one slow service can drag the whole page down.
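To make the percentile and fan-out points concrete, here is a minimal sketch. The function names and the synthetic latency samples are illustrative assumptions, not measurements, and the fan-out estimate assumes the 50 calls are independent.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Chance that at least one of `calls` independent requests lands in the slow tail.
function chanceOfSlowPage(calls: number, slowFraction: number): number {
  return 1 - Math.pow(1 - slowFraction, calls);
}

// Synthetic data (assumption): 980 fast requests at 20 ms, 20 slow ones at 800 ms.
const samples: number[] = [
  ...new Array<number>(980).fill(20),
  ...new Array<number>(20).fill(800),
];

console.log(percentile(samples, 50)); // 20
console.log(percentile(samples, 95)); // 20
console.log(percentile(samples, 99)); // 800
console.log(chanceOfSlowPage(50, 0.01).toFixed(3)); // "0.395"
```

The median and P95 look healthy while P99 is 40 times worse, and with 50 calls per page roughly two in five page renders would touch at least one tail request.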

    [Figure: long-tail latency]

    In a Google benchmark on Bigtable, reported in the paper “The Tail at Scale”, sending a second copy of a read after just 10 milliseconds cut P99.9 latency by about 96 % (from 1,800 ms to 74 ms) while adding only 2 % extra requests. In many setups that is cheaper than adding capacity, and more predictable.


    What exactly is request hedging?

    Send the original request; if no response arrives within a small hedge delay, send a duplicate to another healthy replica. Return whichever finishes first and cancel the other.

    [Figure: request hedging pattern]

    Why it works:

    1. Outliers are random. Network hiccups don’t hit every server at once.
    2. Cheap insurance. Most requests finish fast, so the duplicate rarely runs long. You pay a small burst of extra load to avoid a big, visible stall.
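The pattern can be sketched independently of any HTTP client. The version below is a minimal illustration under stated assumptions: `hedge`, `Task`, and the simulated `replica` helper are hypothetical names, and the replicas are timers rather than real services.

```typescript
type Task<T> = (signal: AbortSignal) => Promise<T>;

// Run `primary`; if it has not succeeded after `delayMs`, also run `backup`.
// Resolves with the first success and aborts the attempt that lost.
function hedge<T>(primary: Task<T>, backup: Task<T>, delayMs: number): Promise<T> {
  const controllers = [new AbortController(), new AbortController()];
  return new Promise<T>((resolve, reject) => {
    let failures = 0;
    const launch = (task: Task<T>, index: number): void => {
      task(controllers[index].signal).then(
        value => {
          clearTimeout(timer); // no hedge needed if it has not fired yet
          controllers[1 - index].abort(); // cancel the loser only
          resolve(value);
        },
        error => {
          failures += 1;
          if (failures === 2) reject(error); // both attempts failed
        },
      );
    };
    const timer = setTimeout(() => launch(backup, 1), delayMs);
    launch(primary, 0);
  });
}

// Simulated replica (assumption): answers with its name after `ms` unless aborted first.
function replica(label: string, ms: number): Task<string> {
  return signal =>
    new Promise<string>((resolve, reject) => {
      const t = setTimeout(() => resolve(label), ms);
      signal.addEventListener("abort", () => {
        clearTimeout(t);
        reject(new Error(`${label} aborted`));
      });
    });
}

// The primary is stuck in the tail (300 ms), so the hedge sent at 20 ms wins.
hedge(replica("replica-a", 300), replica("replica-b", 10), 20)
  .then(winner => console.log(winner)); // logs "replica-b"
```

When the primary answers before the delay expires, the timer is cleared and the backup is never sent, which is why the extra load stays small.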

    How to fit hedging into a Next.js + Sitecore Headless + .NET stack

    1. Next.js – browser or Vercel Edge

    // lightweight helper (TypeScript)
    type Attempt = { res: Response; index: number };

    export async function hedgedFetch(urls: string[], delayMs = 50): Promise<Response> {
      // One controller per attempt, so the loser can be cancelled without touching the winner.
      const controllers = urls.map(() => new AbortController());
      let timer: ReturnType<typeof setTimeout> | undefined;

      // The primary request starts immediately.
      const attempts: Promise<Attempt>[] = [
        fetch(urls[0], { signal: controllers[0].signal }).then(res => ({ res, index: 0 })),
      ];
      // The duplicate starts only if nothing has come back after delayMs.
      if (urls.length > 1) {
        attempts.push(new Promise<Attempt>((resolve, reject) => {
          timer = setTimeout(() => {
            fetch(urls[1], { signal: controllers[1].signal })
              .then(res => resolve({ res, index: 1 }), reject);
          }, delayMs);
        }));
      }

      try {
        const winner = await Promise.any(attempts);
        // Abort only the losers; aborting the winner would also kill its unread body.
        controllers.forEach((c, i) => { if (i !== winner.index) c.abort(); });
        return winner.res;
      } finally {
        clearTimeout(timer);
      }
    }

    Example: Hedging a front‑end GraphQL fetch

    const response = await hedgedFetch([
      "https://edge-usw.example.com/graphql",
      "https://edge-use.example.com/graphql"
    ]);
    const json = await response.json();

    This code races two region endpoints, returns the fastest response, and cancels the slower request via AbortController. Adjust delayMs if your P95 latency is lower than the default 50 ms.
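One caveat worth sketching: `fetch` resolves for any HTTP status, so a fast 503 would count as the "fastest response". A small wrapper can turn non-2xx responses into rejections so the race moves on to the other attempt. `requireOk` and the `HttpLike` type are hypothetical names introduced here for illustration.

```typescript
// Minimal shape shared by fetch's Response and by test doubles.
type HttpLike = { ok: boolean; status: number };

// Reject on non-2xx so a racing combinator such as Promise.any skips this attempt.
async function requireOk<T extends HttpLike>(attempt: Promise<T>): Promise<T> {
  const res = await attempt;
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res;
}

// Possible use inside a hedged race (not executed here):
//   fetch(url, { signal }).then(res => ...)  becomes
//   requireOk(fetch(url, { signal })).then(res => ...)
```

Whether an error response should lose the race is a design choice; for some APIs a fast 404 is a perfectly valid answer and should be returned as is.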

    2. Next.js API routes or App Router server actions

    Same pattern, but tune delayMs lower (20–30 ms) because the call is already inside the data center, where round trips are shorter.

    3. Envoy / Istio sidecars

    An Envoy or Istio sidecar is a small proxy container that runs alongside your application container in the same Kubernetes pod. All inbound and outbound traffic passes through this proxy, so you can add behaviors such as retries, TLS, rate‑limiting, and request hedging by updating proxy settings instead of touching application code.

    If you skip sidecars in your Next.js application
    You can still hedge browser and server‑side calls by writing helpers (like hedgedFetch) or using Polly/gRPC policies. However, each service must implement and maintain its own logic, and any calls that come into your app from other services will not be hedged, leaving long‑tail spikes unprotected. Over time this scattered approach increases maintenance overhead and risks inconsistent latency behavior across the stack.

    route:
      retry_policy:
        retry_on: 5xx
        num_retries: 1            # one hedge, so at most two attempts
        per_try_timeout: 0.02s    # 20 ms; acts as the hedge delay
      hedge_policy:
        hedge_on_per_try_timeout: true

    Put this on a route in an Envoy route configuration. In Envoy, hedging is driven by the per-try timeout: when it expires, a second request is sent without cancelling the first, and the first successful response wins. Istio's VirtualService does not expose a hedging field, so with Istio the equivalent setting has to be applied through an EnvoyFilter. Limit it to calls that are safe to repeat without side effects (e.g., GET /product/123), otherwise known as idempotent.

    4. .NET back‑end callers

    What is gRPC? gRPC is an open‑source remote procedure call framework, originally developed at Google, that lets services invoke functions on other services as though they were local methods. It rides on HTTP/2 for efficient, multiplexed connections, uses Protocol Buffers for small binary messages, and generates type‑safe client and server code in many languages. Built‑in features like deadlines, retries, and hedging policies make it a natural place to enable request hedging without extra plumbing.

    • gRPC
      {
        "methodConfig": [{
          "name": [{ "service": "ProductCatalog" }],
          "hedgingPolicy": {
            "maxAttempts": 2,
            "hedgingDelay": "0.03s",
            "nonFatalStatusCodes": ["UNAVAILABLE", "DEADLINE_EXCEEDED"]
          }
        }]
      }
      

    • HTTP
      // Requires the Microsoft.Extensions.Http.Resilience package.
      builder.Services.AddHttpClient("edge")
          .AddStandardHedgingHandler()
          .Configure(o => o.Hedging.MaxHedgedAttempts = 1); // one hedge = two attempts in total
    

    5. Sitecore Experience Edge

    Experience Edge already runs in multiple regions. Expose two region‑specific GraphQL URLs to the client and let the hedged fetch pick the fastest.

    Roll‑out checklist

    1. Measure first. Capture your current P50, P95, P99, P99.9 latencies per hop.
    2. Pick a hedge delay ≈ P95. Too short wastes capacity, too long misses outliers.
    3. Restrict to idempotent reads. Avoid duplicate writes unless your API supports idempotency keys.
    4. Cap attempts to two. Start small; you rarely need more.
    5. Instrument and watch. Expose metrics like hedged_attempts, cancels, and tail percentiles. Aim for <5 % load overhead.
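Steps 2 and 5 are linked: every request still pending when the hedge delay expires triggers a duplicate, so the share of samples slower than the delay approximates the extra load. The sketch below derives a delay from recorded latencies under an overhead budget; `hedgeRate`, `pickHedgeDelay`, and the sample data are illustrative assumptions.

```typescript
// Fraction of requests still pending at `delayMs`, i.e. how many would trigger a hedge.
function hedgeRate(samplesMs: number[], delayMs: number): number {
  if (samplesMs.length === 0) return 0;
  return samplesMs.filter(ms => ms > delayMs).length / samplesMs.length;
}

// Smallest observed latency whose hedge rate stays within the overhead budget.
// Quadratic in the number of samples, which is fine for an offline calculation.
function pickHedgeDelay(samplesMs: number[], maxOverhead = 0.05): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const delay = sorted.find(ms => hedgeRate(sorted, ms) <= maxOverhead);
  return delay === undefined ? sorted[sorted.length - 1] : delay;
}

// Synthetic latencies (assumption): 1 ms, 2 ms, ..., 100 ms.
const latencies = Array.from({ length: 100 }, (_, i) => i + 1);

console.log(pickHedgeDelay(latencies)); // 95
console.log(hedgeRate(latencies, 95));  // 0.05
console.log(hedgeRate(latencies, 10));  // 0.9
```

With a 5 % budget the chosen delay lands on P95, which matches the rule of thumb in step 2. A delay of 10 ms would hedge 90 % of requests and nearly double the traffic.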

    Risks and how to mitigate them

    Risk | Mitigation
    Extra traffic / CPU | Monitor overhead; two attempts at most keeps it predictable.
    Duplicate side effects on POST / PUT | Keep hedging to GET / GraphQL query unless you have idempotency tokens.
    Window where both copies run | Cancel losers immediately with AbortController, gRPC cancellations, or Envoy resets.
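The second row can be enforced with a small guard in front of the hedging helper. This is a sketch under assumptions: `OutgoingRequest` and `isSafeToHedge` are hypothetical names, the `Idempotency-Key` header is a common convention that your API may or may not honor, and the GraphQL check is a naive prefix test that only looks at the first operation in the document.

```typescript
interface OutgoingRequest {
  method: string;
  headers?: Record<string, string>;
  graphqlQuery?: string; // the GraphQL document, if this is a GraphQL call
}

function isSafeToHedge(req: OutgoingRequest): boolean {
  const method = req.method.toUpperCase();
  if (method === "GET" || method === "HEAD") return true;

  // GraphQL usually travels over POST; only read-only operations are safe to repeat.
  if (req.graphqlQuery !== undefined) {
    return !/^\s*(mutation|subscription)\b/.test(req.graphqlQuery);
  }

  // Other writes are safe only if the server deduplicates by key (assumption).
  const headers = req.headers === undefined ? {} : req.headers;
  return Object.keys(headers).some(h => h.toLowerCase() === "idempotency-key");
}

console.log(isSafeToHedge({ method: "GET" }));                                         // true
console.log(isSafeToHedge({ method: "POST", graphqlQuery: "query { product { id } }" })); // true
console.log(isSafeToHedge({ method: "POST", graphqlQuery: "mutation { addToCart }" }));   // false
console.log(isSafeToHedge({ method: "PUT" }));                                         // false
```

Requests that fail the check should simply go out once, without a duplicate.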

    Key takeaway

    Request hedging is a tiny change that brings outsized rewards. A few lines of code (or a single header) can erase those embarrassing long‑tail spikes and make your Next.js + Sitecore + .NET experience feel nearly instantaneous.

