Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 22, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 22, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 22, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 22, 2025

      Google DeepMind’s CEO says Gemini’s upgrades could lead to AGI — but he still thinks society isn’t “ready for it”

      May 21, 2025

      Windows 11 is getting AI Actions in File Explorer — here’s how to try them right now

      May 21, 2025

      Is The Alters on Game Pass?

      May 21, 2025

      I asked Copilot’s AI to predict the outcome of the Europa League final, and now I’m just sad

      May 21, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      Enhance Email Validation with Laravel’s Fluent Email Rule Object

      May 22, 2025
      Recent

      Enhance Email Validation with Laravel’s Fluent Email Rule Object

      May 22, 2025

      Sublime Text Releases Update With Support for Right Sidebar

      May 22, 2025

      Celebrating GAAD by Committing to Universal Design: Equitable Use

      May 21, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      FOSS Weekly #25.21: Oh My Bash, Ubuntu’s New Terminal, Pixelify Android, Fedora’s Wayland Gamble and More

      May 22, 2025
      Recent

      FOSS Weekly #25.21: Oh My Bash, Ubuntu’s New Terminal, Pixelify Android, Fedora’s Wayland Gamble and More

      May 22, 2025

      What are MCP Servers and Why People are Crazy About It?

      May 22, 2025

      Popout3D creates 3D images with a phone or camera

      May 22, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Machine Learning»Automating complex document processing: How Onity Group built an intelligent solution using Amazon Bedrock

    Automating complex document processing: How Onity Group built an intelligent solution using Amazon Bedrock

    May 20, 2025

    In the mortgage servicing industry, efficient document processing can mean the difference between business growth and missed opportunities. This post explores how Onity Group, a financial services company specializing in mortgage servicing and origination, used Amazon Bedrock and other AWS services to transform their document processing capabilities.

    Onity Group, founded in 1988, is headquartered in West Palm Beach, Florida. Through its primary operating subsidiary, PHH Mortgage Corporation, and Liberty Reverse Mortgage brand, the company provides mortgage servicing and origination solutions to homeowners, business clients, investors, and others.

    Onity processes millions of pages across hundreds of document types annually, including legal documents such as deeds of trust where critical information is often contained within dense text. The company also had to manage inconsistent handwritten entries and the need to verify notarization and legal seals—tasks that traditional optical character recognition (OCR) and AI and machine learning (AI/ML) solutions struggled to handle effectively. By using foundation models (FMs) provided by Amazon Bedrock, Onity achieved a 50% reduction in document extraction costs while improving overall accuracy by 20% compared to their previous OCR and AI/ML solution.

    Onity’s intelligent document processing (IDP) solution dynamically routes extraction tasks based on content complexity, using the strengths of both its custom AI models and generative AI capabilities provided by Amazon Web Services (AWS) through Amazon Bedrock. This dual-model approach enabled Onity to address the scale and diversity of its mortgage servicing documents more efficiently, driving significant improvements in both cost and accuracy.

    “We needed a solution that could evolve as quickly as our document processing needs,” says Raghavendra (Raghu) Chinhalli, VP of Digital Transformation at Onity Group.

    “By combining AWS AI/ML and generative AI services, we achieved the perfect balance of cost, performance, accuracy, and speed to market,” adds Priyatham Minnamareddy, Director of Digital Transformation & Intelligent Automation.

    Why traditional OCR and ML models fall short

    Traditional document processing presented several fundamental challenges that drove Onity’s search for a more sophisticated solution. The following are key examples:

    • Verbose documents with data elements not clearly identified
      • Issue – Key documents in mortgage servicing contain verbose text with critical data elements embedded without clear identifiers or structure
      • Example – Identifying the exact legal description from a deed of trust, which might be buried within paragraphs of legalese
    • Inconsistent handwritten text
      • Issue – Documents contain handwritten elements that vary significantly in quality, style, and legibility
      • Example – Simple variations in writing formats—such as state names (GA and Georgia) or monetary values (200K or 200,000)—create significant extraction challenges
    • Notarization and legal seal detection
      • Issue – Identifying whether a document is notarized, detecting legal court stamps, verifying if a notary’s commission has expired, or extracting data from legal seals, which come in multiple shapes, requires a deeper understanding of visual and textual cues that traditional methods might miss
    • Limited contextual understanding
      • Issue – Traditional OCR models, although adept at digitizing text, often lack the capacity to interpret the semantic context within a document, hindering a true understanding of the information contained

    These complexities in mortgage servicing documents—ranging from verbose text to inconsistent handwriting and the need for specialized seal detection—proved to be significant limitations for traditional OCR and ML models. This drove Onity to seek a more sophisticated solution to address these fundamental challenges.

    Solution overview

    To address these document processing challenges, Onity built an intelligent solution combining AWS AI/ML and generative AI services.

    Amazon Textract is a ML service that automates the extraction of text, data, and insights from documents and images. By using Amazon Textract, organizations can streamline document processing workflows and unlock valuable data to power intelligent applications.

    Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies. Through a single API, Amazon Bedrock provides access to models from providers such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon, along with a broad set of capabilities to build secure, private, and responsible generative AI applications.

    Amazon Bedrock gives you the flexibility to choose the FM that best suits your needs. For IDP, common solutions use text and vision models such as Amazon Nova Pro or Anthropic’s Claude Sonnet. Beyond model access, Amazon Bedrock provides enterprise-grade security with data processing within your Amazon virtual private cloud (VPC), built-in guardrails for responsible AI use, and comprehensive data protection capabilities that are essential for handling sensitive financial documents. You can select the model that strikes the right balance of accuracy, performance, and cost efficiency for your specific application.

    The following figure shows how the solution works.

    1. Document ingestion – Documents are uploaded to Amazon Simple Storage Service (Amazon S3). Uploading triggers automated processing workflows.
    2. Preprocessing – Before analysis, documents undergo optimization through image enhancement, noise reduction, and layout analysis. These preprocessing steps help facilitate maximum accuracy for subsequent OCR processing.
    3. Classification – Classification occurs through a three-step intelligent workflow orchestrated by Onity’s document classification application. The process outputs each page’s document type and page number in JSON format:
      1. The application uses Amazon Textract to extract document contents.
      2. Extracted content is processed by Onity’s custom AI model. If the model’s confidence score meets the predetermined threshold, classification is complete.
      3. If the document isn’t recognized because the model isn’t trained with that document type, the application automatically routes the document to Anthropic’s Claude Sonnet in Amazon Bedrock. This foundation model, along with other text and vision models such as Anthropic’s Claude and Amazon Nova, can classify documents without additional training, analyzing both text and images. This dual-model approach, using both Onity’s custom model and the generative AI capabilities of Amazon, helps to optimally balance cost efficiency with speed to market.
    4. Extraction – Onity’s document extraction application employs an algorithm-driven approach that queries an internal database to retrieve specific extraction rules for each document type and data element. It then dynamically routes extraction tasks between Amazon Textract and Amazon Bedrock FMs based on the complexity of the content.
      For example, verifying notarization requires complex visual and textual analysis. In these cases, the application uses the capabilities of Amazon Bedrock advanced text and vision models. The solution is built on the Amazon Bedrock API, which allows Onity to use different FMs that provide the optimal balance of cost and accuracy for each document type. This dynamic routing of extraction tasks allows Onity to optimize the balance between cost, performance, and accuracy.
    5. Persistence – The extracted information is stored in a structured format in Onity’s operational databases and in a semi-structured format in Amazon S3 for further downstream processing.

    Security overview

    When processing sensitive financial documents, Onity implements robust data protection measures. Data is encrypted at rest using AWS Key Management Service (AWS KMS) and in transit using TLS protocols. Access to data is strictly controlled using AWS Identity and Access Management (IAM) policies. For architectural best practices building financial services Industry (FSI) applications in AWS, refer to AWS Financial Services Industry Lens. This solution is implemented using AWS Security best practice guidance using Security Pillar – AWS Well-Architected Framework. For AWS security and compliance best practices, refer to Best Practices for Security, Identity, & Compliance.

    Transforming document processing with Amazon Bedrock: Sample use cases

    This section demonstrates how Onity uses Amazon Bedrock to automate the extraction of critical information from complex mortgage servicing documents.

    Deed of trust data extraction

    A deed of trust is a critical legal document that creates a security interest in real property. These documents are typically verbose, containing multiple pages of legal text with critical information including notarization details, legal stamps, property descriptions, and rider attachments. The intelligent extraction solution has reduced data extraction costs by 50% while improving overall accuracy by 20% compared to the previous OCR and AI/ML solution.

    Notarization information extraction

    The following is a sample of a notarized document that combines printed and handwritten text and a notary seal. The document image is passed to the application with a prompt to extract the following information: state, county, notary date, notary expiry date, presence of notary seal, person signed before notary, and notary public name. The prompt also instructs that if a field is manually crossed out or modified, the manually written or modified text should be used for that field in the output.

    Example output:

    {
        "state": "Indiana",
        "county": "Monroe",
        "notary_date": "8/13/2024",
        "notary_expiry_date": "8/24/25",
        "notary_seal": "Present",
        "person_signed": "[Redacted]",
        "notary_public": "[Redacted]"
    }

    Extract rider information

    The following image is of a rider that includes text and a series of check boxes (selected and unselected). The document image is passed to the application with a prompt to extract both checked riders and other riders listed on the document in a provided JSON format.

    Example output:

    {
    "riders_checked": [],
    "Others_listed": ["Manufactured Home Rider", "Manufactured Home Affidavit of Affixation"]
    }

    Automation of the checklist review of home appraisal documents

    Home appraisal reports contain detailed property comparisons and valuations that require careful review of multiple data points, including room counts, square footage, and property features. Traditionally, this review process required manual verification and cross-referencing, making it time-consuming and prone to errors. The automated solution now validates property comparisons and identifies potential discrepancies, significantly reducing review times while improving accuracy by 65% over the manual process.

    The following example shows a document in a grid layout with rows and columns of information. The document image is passed to the application with a prompt to verify if the room counts are identical across the subject and comparables in the appraisal report and if square footages are within a specified percentage of the subject property’s square footage. The prompt also requests an explanation of the analysis results. The application then extracts the required information and provides detailed justification for its findings.

    Example output:

    {
        "Result": "Yes",
        "Explanation": "Both conditions are met. Room counts match at 4-2-2.0 (total-bedrooms-baths) across all properties. Subject property is 884 sq ft, and all comparable (884 sq ft, 884 sq ft, and 1000 sq ft) fall within 15% variance range (751.4-1016.6 sq ft). Comparable #3 at 1000 sq ft is within acceptable 15% range."
    }

    Automated credit report analysis

    Credit reports are essential documents in mortgage servicing that contain critical borrower information from multiple credit bureaus. These reports arrive in diverse formats with scattered information, making manual data extraction time-consuming and error-prone. The solution automatically extracts and standardizes credit scores and scoring models across different report formats, achieving approximately 85% accuracy.

    The following image shows a credit report that combines rows and columns with number and text values. The document image is passed to the application using a prompt instructing it to extract the required information.

    Example output:

     {
        "EFX": {
            "Score": 683,
            "ScoreModel": "Equifax Beacon 5.0"
        },
        "XPN": {
            "Score": 688,
            "ScoreModel": "Experian Fair Isaac V2"
        },
        "TRU": {
            "Score": 691,
            "ScoreModel": "FICO Risk Score Classic 04"
        }
    }

    Conclusion

    Onity’s implementation of intelligent document processing, powered by AWS generative AI services, demonstrates how organizations can transform complex document handling challenges into strategic advantages. By using the generative AI capabilities of Amazon Bedrock, Onity achieved a remarkable 50% reduction in document extraction costs while improving overall accuracy by 20% compared to their previous OCR and AI/ML solution. The impact was even more dramatic in specific use cases—their credit report processing achieved accuracy rates of up to 85%—demonstrating the solution’s exceptional capability in handling complex, multiformat documents.

    The flexible FM selection provided by Amazon Bedrock enables organizations to choose and evolve their AI capabilities over time, helping to strike the optimal balance between performance, accuracy, and cost for each specific use case. The solution’s ability to handle complex documents, including verbose legal documents, handwritten text, and notarized materials, showcases the transformative potential of modern AI technologies in financial services. Beyond the immediate benefits of cost savings and improved accuracy, this implementation provides a blueprint for organizations seeking to modernize their document processing operations while maintaining the agility to adapt to evolving business needs. The success of this solution proves that thoughtful application of AWS AI/ML and generative AI services can deliver tangible business results while positioning organizations for continued innovation in document processing capabilities.

    If you have similar document processing challenges, we recommend starting with Amazon Textract to evaluate if its core OCR and data extraction capabilities meet your needs. For more complex use cases requiring advanced contextual understanding and visual analysis, use Amazon Bedrock text and vision foundation models, such as Amazon Nova Lite, Nova Pro, Anthropic’s Claude Sonnet, and Anthropic’s Claude. Using an Amazon Bedrock model playground, you can quickly experiment with these multimodal models and then compare the best foundation models across different metrics such as accuracy, robustness, and cost using Amazon Bedrock model evaluation. Through this process, you can make informed decisions about which model provides the best balance of performance and cost-effectiveness for your specific use case.


    About the author

    Ramesh Eega is a Global Accounts Solutions Architect based out of Atlanta, GA. He is passionate about helping customers throughout their cloud journey.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleBuild a domain‐aware data preprocessing pipeline: A multi‐agent collaboration approach
    Next Article Enhancing Language Model Generalization: Bridging the Gap Between In-Context Learning and Fine-Tuning

    Related Posts

    Machine Learning

    How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

    May 22, 2025
    Machine Learning

    SPD: Sync-Point Drop for Efficient Tensor Parallelism of Large Language Models

    May 22, 2025
    Leave A Reply Cancel Reply

    Continue Reading

    Hugging Face Introduces a Free Model Context Protocol (MCP) Course: A Developer’s Guide to Build and Deploy Context-Aware AI Agents and Applications

    Machine Learning

    AI chatbots distort the news, BBC finds – see what they get wrong

    News & Updates

    Tweaking BIOS settings of patched Raptor Lake motherboards could trash your CPU anyway

    Development

    CVE-2025-45489 – Linksys E5600 Command Injection Vulnerability

    Common Vulnerabilities and Exposures (CVEs)

    Highlights

    CVE-2025-39392 – Mojoomla WPAMS Cross-site Scripting

    May 19, 2025

    CVE ID : CVE-2025-39392

    Published : May 19, 2025, 8:15 p.m. | 2 hours, 33 minutes ago

    Description : Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’) vulnerability in mojoomla WPAMS allows Reflected XSS.This issue affects WPAMS: from n/a through 44.0 (17-08-2023).

    Severity: 7.1 | HIGH

    Visit the link for more details, such as CVSS details, affected products, timeline, and more…

    Hades 2 gets a familiar god to thirst over and a new boss battle in its second massive update

    February 20, 2025

    Xbox Cloud Gaming has hit 140 million playtime hours according to Microsoft CEO Satya Nadella

    January 30, 2025

    Beginner’s guide to GitHub repositories: How to create your first repo

    June 24, 2024
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.