    Announcing New Language Support for PII Text Redaction and Expanding Entity Detection

    July 27, 2024

    At AssemblyAI, we’re focused on ensuring that you can extract maximum value and insights from voice data, while also keeping privacy and security at the forefront to keep you and your end users safe. Today, we’re announcing updates to our PII Text Redaction and Entity Detection features to give you more power and control in protecting sensitive information.

    What’s New

    • PII Text Redaction now available in 47 additional languages
    • 16 new entity types added to Entity Detection for a total of 44 types available

    PII Redaction: Safeguarding Sensitive Information Across Languages

    Our latest update brings expanded language support to our PII Text Redaction feature, now available in 47 additional languages. 

    This enhancement ensures that your Personally Identifiable Information (PII) — any information that can be used to identify a person — is safeguarded regardless of location or language, making robust privacy measures more accessible.

    With this feature, you can:

    • Securely handle customer service calls containing personal information
    • Safely share and analyze user-generated content in media applications
    • Protect participant privacy in market research studies and surveys

    With PII Redaction, you can identify and remove personal data such as addresses, phone numbers, and credit card details from your transcripts. You can accomplish this in two ways:

    • Text Redaction: Generates a transcript with PII removed. Example: “You can reach me at [phone_number]” or “You can reach me at ###.”
    • Audio Redaction: “Beeps out” sensitive information in your audio file.

    With a few extra lines of code, you can quickly redact PII from your transcripts.

    import assemblyai as aai

    aai.settings.api_key = "YOUR API KEY"

    audio_url = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3"

    # Enable speaker labels and redact names, organizations, and occupations,
    # substituting hash characters for the redacted text.
    config = aai.TranscriptionConfig(speaker_labels=True).set_redact_pii(
        policies=[
            aai.PIIRedactionPolicy.person_name,
            aai.PIIRedactionPolicy.organization,
            aai.PIIRedactionPolicy.occupation,
        ],
        substitution=aai.PIISubstitutionPolicy.hash,
    )

    transcript = aai.Transcriber().transcribe(audio_url, config)

    # Print the redacted transcript speaker by speaker.
    for utterance in transcript.utterances:
        print(f"Speaker {utterance.speaker}: {utterance.text}")

    # Print the full redacted transcript as a single string.
    print(transcript.text)

    Example output

    Speaker A: Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US. Skylines from Maine to Maryland to Minnesota are gray and smoggy. And in some places, the air quality warnings include the warning to stay inside. We wanted to better understand what’s happening here and why. So we called ##### #######, an ######### ######### in the ########## ## ############# ###### ### ########### at ##### ####### ##########. Good morning, #########.

    Speaker B: Good morning.

    Speaker A: So what is it about the conditions right now that have caused this round of wildfires to affect so many people so far away?

    Speaker B: Well, there’s a couple of things. The season has been pretty dry already, and then the fact that we’re getting hit in the US is because there’s a couple weather systems that…
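
    To make the two text substitution styles concrete, here is a minimal local sketch of what each one does to a string once the PII spans are already known. It does not call the AssemblyAI API: the `PiiSpan` class and `redact` function are hypothetical names invented for this illustration, and the hash style (masking every non-space character) is an assumption inferred from the example output above rather than a documented rule.

```python
from dataclasses import dataclass


@dataclass
class PiiSpan:
    """Hypothetical stand-in for one detected PII span (not an SDK type)."""
    start: int        # character offset where the span begins
    end: int          # character offset just past the span
    entity_type: str  # label such as "phone_number"


def redact(text: str, spans: list[PiiSpan], style: str = "hash") -> str:
    """Replace each span with a [label] ("entity_name") or with '#' marks ("hash")."""
    out = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        out.append(text[cursor:span.start])  # keep the text before the span
        original = text[span.start:span.end]
        if style == "entity_name":
            out.append(f"[{span.entity_type}]")
        elif style == "hash":
            # Assumption: mask every non-space character, keep spacing intact.
            out.append("".join(c if c.isspace() else "#" for c in original))
        else:
            raise ValueError(f"unknown style: {style}")
        cursor = span.end
    out.append(text[cursor:])  # keep the text after the last span
    return "".join(out)


text = "You can reach me at 555 0100."
phone = "555 0100"  # made-up number used only for this illustration
start = text.index(phone)
spans = [PiiSpan(start, start + len(phone), "phone_number")]

print(redact(text, spans, "entity_name"))  # You can reach me at [phone_number].
print(redact(text, spans, "hash"))         # You can reach me at ### ####.
```

    The label style keeps the transcript readable for downstream analysis, while the hash style hides even the type of information that was removed.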

    Plus, the PII models achieve 99%+ precision, accuracy, and recall in major languages, including English, French, German, Italian, Portuguese, Spanish, Korean, Hindi, Russian, Tagalog, and Ukrainian. 

    This ensures reliable protection of sensitive information. For EU-based operations, we support PII Text Redaction in 13 languages, meeting regional data residency requirements.

    Expanding Entity Detection

    Extracting meaningful insights from large volumes of audio data can be time-consuming and resource-intensive. We enhanced our Entity Detection with 16 new entity types for a total of 44 different entities so you can extract more value from your audio data.

    You can automatically identify and categorize key information in your transcripts, providing detailed entity lists and timestamps.

    Here are a few examples of what you can detect:

    • Names of people
    • Organizations
    • Addresses
    • Phone numbers
    • Medical data
    • Social security numbers

    Here’s how you can enable Entity Detection within your app:

    import assemblyai as aai

    aai.settings.api_key = "YOUR API KEY"

    audio_url = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3"

    # Turn on Entity Detection for this transcription request.
    config = aai.TranscriptionConfig(entity_detection=True)

    transcript = aai.Transcriber().transcribe(audio_url, config)

    # Print each detected entity with its type and position in the audio.
    for entity in transcript.entities:
        print(entity.text)
        print(entity.entity_type)
        print(f"Timestamp: {entity.start} – {entity.end}\n")

    You’ll receive precise entity data, such as:

    Canada
    location
    Timestamp: 2548 – 3130

    the US
    location
    Timestamp: 5498 – 6350

    …

    By eliminating manual review, you can efficiently search and categorize audio content across multiple languages and markets. This opens up new possibilities for in-depth analysis and data-driven decision making.
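
    As a rough illustration of the "search and categorize" step, the sketch below groups detected entities by type so that each mention can be looked up by category. The `Entity` dataclass is a hypothetical stand-in that mirrors the fields printed in the example above, `index_by_type` is a name made up for this sketch, and the conversion assumes the timestamps are in milliseconds.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in mirroring the fields printed in the post."""
    text: str
    entity_type: str
    start: int  # assumed to be milliseconds from the start of the audio
    end: int


def index_by_type(entities):
    """Group mentions by entity type, keeping the text and start time in seconds."""
    index = defaultdict(list)
    for entity in entities:
        index[entity.entity_type].append((entity.text, entity.start / 1000))
    return dict(index)


# Sample values taken from the example output in the post.
entities = [
    Entity("Canada", "location", 2548, 3130),
    Entity("the US", "location", 5498, 6350),
]

index = index_by_type(entities)
for text, seconds in index["location"]:
    print(f"{text} at {seconds:.2f}s")
```

    With real transcripts, the same grouping could be fed the objects from `transcript.entities` directly, since only the `text`, `entity_type`, and `start` attributes are read.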

    With Entity Detection, you’ll be able to run key use cases like:

    • Rapidly analyze call center interactions to improve customer service
    • Categorize and search media content more effectively
    • Extract key trends and patterns from market research data

    Entity Detection delivers reliable results with 99% accuracy in major languages. It also supports EU data residency for 13 languages, helping you meet regional compliance requirements. Quickly unlock the full potential of your audio data, gaining deeper insights into customer interactions and market trends across a broad range of contexts.

    Frequently Asked Questions

    I want to store and process my data in the EU. Will the expanded PII Text Redaction and Entity Detection languages be supported by EU Data Residency?

    Yes, 13 languages in our “Best ASR” offering will be supported by EU Data Residency: English, Finnish, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, Turkish, Ukrainian, and Vietnamese.
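
    If you route some traffic through EU Data Residency, a small pre-flight check like the one below can catch unsupported languages before a request is sent. This is only a sketch: the mapping is written by hand from the list above using ISO 639-1 codes, `supports_eu_residency` is a hypothetical helper rather than part of any SDK, and treating regional variants such as `en-US` as their base language is an assumption.

```python
# ISO 639-1 codes for the 13 languages listed above. The mapping is written
# by hand for this sketch; it is not fetched from any API.
EU_RESIDENCY_LANGUAGES = {
    "en": "English", "fi": "Finnish", "fr": "French", "de": "German",
    "it": "Italian", "ko": "Korean", "pl": "Polish", "pt": "Portuguese",
    "ru": "Russian", "es": "Spanish", "tr": "Turkish", "uk": "Ukrainian",
    "vi": "Vietnamese",
}


def supports_eu_residency(language_code: str) -> bool:
    """Return True if the base language of the code is in the EU list."""
    # Assumption: "en-US" or "en_us" counts as its base language "en".
    base = language_code.lower().replace("_", "-").split("-")[0]
    return base in EU_RESIDENCY_LANGUAGES


print(supports_eu_residency("fr"))     # True
print(supports_eu_residency("en-US"))  # True
print(supports_eu_residency("ja"))     # False
```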

    What is the quality of PII Text Redaction and Entity Detection across languages?

    The highest quality PII Text Redaction and Entity Detection is found in English, French, German, Italian, Portuguese, Spanish, Korean, Hindi, Dutch, Japanese, Mandarin, Russian, Tagalog, and Ukrainian. These languages have fully trained corpora with verified 99%+ precision, accuracy, and recall results.

    PII Redaction and Entity Detection in other supported languages perform well, but the training corpus is actively being improved to bring it up to the same level as the other languages.

    How secure is my data when using AssemblyAI’s PII Redaction and Entity Detection?

    AssemblyAI prioritizes data security with enterprise-grade encryption both in transit and at rest. We adhere to stringent data security practices to ensure your sensitive information is protected. Additionally, users can request the deletion of their data at any time, and these requests are handled promptly.



    Start building with new security features today.
