Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 16, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 16, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 16, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 16, 2025

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025

      Minecraft licensing robbed us of this controversial NFL schedule release video

      May 16, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      The power of generators

      May 16, 2025
      Recent

      The power of generators

      May 16, 2025

      Simplify Factory Associations with Laravel’s UseFactory Attribute

      May 16, 2025

      This Week in Laravel: React Native, PhpStorm Junie, and more

      May 16, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025
      Recent

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Privacy Implications and Comparisons of Batch Sampling Methods in Differentially Private Stochastic Gradient Descent (DP-SGD)

    Privacy Implications and Comparisons of Batch Sampling Methods in Differentially Private Stochastic Gradient Descent (DP-SGD)

    December 2, 2024

    Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for training machine learning models like neural networks while ensuring privacy. It modifies the standard gradient descent process by clipping individual gradients to a fixed norm and adding noise to the aggregated gradients of each mini-batch. This approach enables privacy by preventing sensitive information in the data from being revealed during training. DP-SGD has been widely adopted in various applications, including image recognition, generative modeling, language processing, and medical imaging. The privacy guarantees depend on noise levels, dataset size, batch sizes, and the number of training iterations.

    Data is typically shuffled globally and split into fixed-size mini-batches to train models using DP-SGD. However, this differs from theoretical approaches that construct mini-batches probabilistically, leading to variable sizes. This practical difference introduces subtle privacy risks, as some information about data records can leak during batching. Despite these challenges, shuffle-based batching remains the most common method due to its efficiency and compatibility with modern deep-learning pipelines, emphasizing the balance between privacy and practicality.

    Researchers from Google Research examine the privacy implications of different batch sampling methods in DP-SGD. Their findings reveal significant disparities between shuffling and Poisson subsampling. Shuffling, commonly used in practice, poses challenges in privacy analysis, while Poisson subsampling offers clearer accounting but is less scalable. The study demonstrates that using Poisson-based privacy metrics for shuffling implementations can underestimate privacy loss. This highlights the crucial influence of batch sampling on privacy guarantees, urging caution in reporting privacy parameters and emphasizing the need for accurate analysis in DP-SGD implementations.

    Differential Privacy (DP) mechanisms map input datasets to distributions over an output space and ensure privacy by limiting the likelihood of identifying changes in individual records. Adjacent datasets differ by one record, formalized as add-remove, substitution, or zero-out adjacency. The Adaptive Batch Linear Queries (ABLQ) mechanism uses batch samplers and an adaptive query method to estimate data with Gaussian noise for privacy. Dominating pairs probability distributions representing worst-case privacy loss simplify DP analysis for ABLQ mechanisms. For deterministic (D) and Poisson (P) samplers, tightly dominating pairs are established, while shuffle (S) samplers have conjectured dominating pairs, enabling fair privacy comparisons.

    The privacy loss comparisons between different mechanisms show that ABLQS offers stronger privacy guarantees than ABLQD, as shuffling cannot degrade privacy. ABLQD and ABLQP exhibit incomparable privacy loss, with ABLQD having a greater loss for smaller ε, while ABLQP’s loss exceeds ABLQD’s for larger ε. This difference arises from variations in total variation distances and set constructions. ABLQP provides stronger privacy protection than ABLQS, particularly for small ε, because ABLQS is more sensitive to non-differing records. At the same time, ABLQP does not depend on such records, leading to more consistent privacy.

    In conclusion, the work highlights key gaps in the privacy analysis of adaptive batch linear query mechanisms, particularly under deterministic, Poisson, and shuffle batch samplers. While shuffling improves privacy over deterministic sampling, Poisson sampling can result in worse privacy guarantees at large ε. The study also reveals that shuffle batch sampling’s amplification is limited compared to Poisson subsampling. Future work includes developing tighter privacy accounting methods for shuffle batch sampling, extending the analysis to multiple epochs, and exploring alternative privacy amplification techniques like DP-FTRL. More sophisticated privacy analysis is also needed for real-world data loaders and non-convex models.


    Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. If you like our work, you will love our newsletter.. Don’t Forget to join our 55k+ ML SubReddit.

    🎙 🚨 ‘Evaluation of Large Language Model Vulnerabilities: A Comparative Analysis of Red Teaming Techniques’ Read the Full Report (Promoted)

    The post Privacy Implications and Comparisons of Batch Sampling Methods in Differentially Private Stochastic Gradient Descent (DP-SGD) appeared first on MarkTechPost.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleEnsuring Success: The Role of QA in Dynamics 365 Implementation
    Next Article Cohere Evolves Enterprise AI in 2024: Innovations in Generative Models, Multilingual Processing, and Developer Tools

    Related Posts

    Security

    Nmap 7.96 Launches with Lightning-Fast DNS and 612 Scripts

    May 17, 2025
    Common Vulnerabilities and Exposures (CVEs)

    CVE-2024-47893 – VMware GPU Firmware Memory Disclosure

    May 17, 2025
    Leave A Reply Cancel Reply

    Continue Reading

    Malicious PyPI Packages Stole Cloud Tokens—Over 14,100 Downloads Before Removal

    Development

    This Asus Copilot+ PC has one of the best displays I’ve seen on a laptop (and it exudes premium)

    Development

    Build multi-tenant architectures on Amazon Neptune

    Databases

    Invisibility

    Web Development

    Highlights

    Development

    Ten Wild Examples of Llama 3.1 Use Cases

    August 4, 2024

    Meta’s recent release of Llama 3.1 has stirred excitement in the AI community, offering an…

    Who will launch the personalized banking UX age: on-device Apple AI or banks’ cloud-based AI?

    June 26, 2024

    Apple’s iPhone SE 4 could be even better than the iPhone 14 at a fraction of the cost

    December 31, 2024

    New Cuttlefish Malware Hijacks Router Connections, Sniffs for Cloud Credentials

    May 2, 2024
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.