    Impel enhances automotive dealership customer experience with fine-tuned LLMs on Amazon SageMaker

    June 4, 2025

    This post is co-written with Tatia Tsmindashvili, Ana Kolkhidashvili, Guram Dentoshvili, and Dachi Choladze from Impel.

    Impel transforms automotive retail through an AI-powered customer lifecycle management solution that drives dealership operations and customer interactions. Their core product, Sales AI, provides all-day personalized customer engagement, handling vehicle-specific questions and automotive trade-in and financing inquiries. By replacing their existing third-party large language model (LLM) with a fine-tuned Meta Llama model deployed on Amazon SageMaker AI, Impel achieved a 20% improvement in accuracy and greater cost control. The implementation used the comprehensive feature set of Amazon SageMaker, including model training, Activation-Aware Weight Quantization (AWQ), and Large Model Inference (LMI) containers. This domain-specific approach not only improved output quality but also enhanced security and reduced operational overhead compared to general-purpose LLMs.

    In this post, we share how Impel enhances the automotive dealership customer experience with fine-tuned LLMs on SageMaker.

    Impel’s Sales AI

    Impel optimizes how automotive retailers connect with customers by delivering personalized experiences at every touchpoint, from initial research to purchase, service, and repeat business. It acts as a digital concierge for vehicle owners while giving retailers personalization capabilities for customer interactions. Sales AI uses generative AI to provide instant responses around the clock to prospective customers through email and text. Maintaining this engagement during the early stages of a customer’s car buying journey leads to showroom appointments or direct connections with sales teams. Sales AI has three core features to provide this consistent customer engagement:

    • Summarization – Summarizes past customer engagements to derive customer intent
    • Follow-up generation – Provides consistent follow-up to engaged customers to help prevent stalled customer purchasing journeys
    • Response personalization – Personalizes responses to align with retailer messaging and customer’s purchasing specifications

    Two key factors drove Impel to transition from their existing LLM provider: the need for model customization and cost optimization at scale. Their previous solution’s per-token pricing model became cost-prohibitive as transaction volumes grew, and limitations on fine-tuning prevented them from fully using their proprietary data for model improvement. By deploying a fine-tuned Meta Llama model on SageMaker, Impel achieved the following:

    • Cost predictability through hosted pricing, avoiding per-token charges
    • Greater control of model training and customization, leading to 20% improvement across core features
    • Secure processing of proprietary data within their AWS account
    • Automatic scaling to meet the spike in inference demand
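The cost argument above can be made concrete with a break-even calculation. The following is a minimal sketch, not Impel's actual cost model: the function names are hypothetical, and both prices are placeholder values chosen only for illustration. It estimates the hourly token volume above which a flat hourly instance price becomes cheaper than per-token billing.

```python
def breakeven_tokens_per_hour(instance_price_per_hour: float,
                              price_per_1k_tokens: float) -> float:
    """Hourly token volume at which flat instance pricing matches per-token billing."""
    if instance_price_per_hour < 0 or price_per_1k_tokens <= 0:
        raise ValueError("instance price must be >= 0 and token price must be > 0")
    return instance_price_per_hour / price_per_1k_tokens * 1000


def per_token_cost(tokens_per_hour: float, price_per_1k_tokens: float) -> float:
    """Hourly bill under per-token pricing."""
    return tokens_per_hour / 1000 * price_per_1k_tokens


# Placeholder prices for illustration only; not Impel's or AWS's actual rates.
INSTANCE_PRICE_PER_HOUR = 10.0
PRICE_PER_1K_TOKENS = 0.002

threshold = breakeven_tokens_per_hour(INSTANCE_PRICE_PER_HOUR, PRICE_PER_1K_TOKENS)
print(f"Break-even volume: {threshold:,.0f} tokens per hour")

for volume in (1_000_000, 10_000_000):
    cost = per_token_cost(volume, PRICE_PER_1K_TOKENS)
    cheaper = "hosted" if cost > INSTANCE_PRICE_PER_HOUR else "per-token"
    print(f"{volume:,} tokens/hour: per-token ${cost:.2f} "
          f"vs hosted ${INSTANCE_PRICE_PER_HOUR:.2f} -> {cheaper} is cheaper")
```

A real comparison would also need to account for instance count, utilization, and scaling behavior, which this sketch ignores.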

    Solution overview

    Impel chose SageMaker AI to fine-tune a Meta Llama model for Sales AI. SageMaker AI is a fully managed cloud service for building, training, and deploying machine learning (ML) models using AWS infrastructure, tools, and workflows. Meta Llama is a powerful model, well-suited for industry-specific tasks due to its strong instruction-following capabilities, support for extended context windows, and efficient handling of domain knowledge.

    Impel used SageMaker LMI containers to deploy LLM inference on SageMaker endpoints. These purpose-built Docker containers offer optimized performance for models like Meta Llama with support for LoRA fine-tuned models and AWQ. Impel used LoRA fine-tuning, an efficient and cost-effective technique to adapt LLMs for specialized applications, through Amazon SageMaker Studio notebooks running on ml.p4de.24xlarge instances. This managed environment simplified the development process, enabling Impel’s team to seamlessly integrate popular open source tools like PyTorch and torchtune for model training. For model optimization, Impel applied AWQ techniques to reduce model size and improve inference performance.
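To give a rough sense of why LoRA and weight quantization are described as efficient, here is a back-of-the-envelope sketch. It assumes the standard LoRA formulation, in which a frozen weight matrix of shape d_out x d_in is adapted by two trainable low-rank matrices of shapes d_out x r and r x d_in. The layer size, rank, parameter count, and 4-bit setting are illustrative assumptions; the post does not state Impel's actual configuration.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one frozen d_out x d_in weight matrix."""
    # B has shape (d_out, rank) and A has shape (rank, d_in).
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters updated if the same matrix were fully fine-tuned."""
    return d_in * d_out


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB, ignoring quantization scales and other metadata."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumed example values, not taken from the post.
D_MODEL = 4096
RANK = 16
NUM_PARAMS = 8e9

lora = lora_trainable_params(D_MODEL, D_MODEL, RANK)
full = full_params(D_MODEL, D_MODEL)
print(f"LoRA trains {lora:,} of {full:,} parameters per matrix ({lora / full:.2%})")

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
int4_gb = weight_memory_gb(NUM_PARAMS, 4)
print(f"Weights: about {fp16_gb:.0f} GB at 16-bit vs about {int4_gb:.0f} GB at 4-bit")
```

The quantized figure is a lower bound, since 4-bit formats such as AWQ also store per-group scaling data alongside the weights.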

    In production, Impel deployed inference endpoints on ml.g6e.12xlarge instances, powered by four NVIDIA GPUs and high memory capacity, suitable for serving large models like Meta Llama efficiently. Impel used the built-in SageMaker automatic scaling feature to adjust the number of serving containers based on concurrent requests, which helped meet variable production traffic demands while optimizing for cost.
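The idea behind scaling on concurrent requests can be sketched as a simple target-tracking calculation. This is a simplified illustration, not the algorithm SageMaker uses, which also involves metric aggregation and cooldown periods. The function name, the per-instance target, and the instance limits are all hypothetical.

```python
import math


def desired_instance_count(concurrent_requests: int,
                           target_per_instance: int,
                           min_instances: int,
                           max_instances: int) -> int:
    """Instances needed to keep per-instance concurrency at or below the target."""
    if target_per_instance <= 0:
        raise ValueError("target_per_instance must be positive")
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    needed = math.ceil(concurrent_requests / target_per_instance)
    # Clamp to the configured capacity range.
    return max(min_instances, min(max_instances, needed))


# Assumed settings: 8 concurrent requests per instance, between 1 and 4 instances.
for load in (0, 20, 100):
    count = desired_instance_count(load, target_per_instance=8,
                                   min_instances=1, max_instances=4)
    print(f"{load} concurrent requests -> {count} instance(s)")
```

Clamping to a minimum keeps the endpoint available during quiet periods, and the maximum caps spend during spikes.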

    The following diagram illustrates the solution architecture, showcasing model fine-tuning and customer inference.

    AWS ML deployment architecture showing how engineers use SageMaker to serve fine-tuned models to customers via APIs

    Impel’s Sales AI reference architecture.

    Impel’s R&D team partnered closely with various AWS teams, including its Account team, GenAI strategy team, and SageMaker service team. This virtual team collaborated over multiple sprints leading up to the fine-tuned Sales AI launch date to review model evaluations, benchmark SageMaker performance, optimize scaling strategies, and identify the optimal SageMaker instances. This partnership encompassed technical sessions, strategic alignment meetings, and cost and operational discussions for post-implementation. The tight collaboration between Impel and AWS was instrumental in realizing the full potential of Impel’s fine-tuned model hosted on SageMaker AI.

    Fine-tuned model evaluation process

    Impel’s transition to its fine-tuned Meta Llama model delivered improvements across key performance metrics, with noticeable gains in understanding automotive-specific terminology and generating personalized responses. In the evaluation process, Impel’s R&D team graded various use cases served by the incumbent LLM provider and by Impel’s fine-tuned models. These structured human evaluations revealed enhancements in critical customer interaction areas: personalized replies improved from 73% to 86% accuracy, conversation summarization increased from 70% to 83%, and follow-up message generation showed the most significant gain, jumping from 59% to 92% accuracy. The following screenshot shows how customers interact with Sales AI.

    Customer service interaction showing automated dealership response offering appointment scheduling for Toyota Highlander XLE

    Example of a customer interaction with Sales AI.
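The per-feature accuracy figures reported above can be tied back to the headline improvement with a few lines of arithmetic. The post does not say how the 20% figure was derived; one reading that is consistent with the numbers, assumed here, is the unweighted mean of the three gains measured in percentage points.

```python
# Accuracy before and after fine-tuning, in percent, as reported in the post.
results = {
    "personalized replies": (73, 86),
    "conversation summarization": (70, 83),
    "follow-up message generation": (59, 92),
}

# Gain per feature in percentage points.
gains = {name: after - before for name, (before, after) in results.items()}

# Assumption: the headline figure is an unweighted mean across the three features.
mean_gain = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
print(f"mean gain: +{mean_gain:.1f} percentage points")
```

Under this reading the mean gain is just under 20 percentage points, driven largely by follow-up message generation.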

    In addition to output quality, Impel measured latency and throughput to validate the model’s production readiness. Using awscurl for SigV4-signed HTTP requests, the team confirmed these improvements in real-world performance metrics, ensuring optimal customer experience in production environments.
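Latency and throughput measurements like these are usually summarized as percentiles and requests per second. The sketch below assumes per-request timings have already been collected, for example by timing each signed request. The function name and the sample timings are hypothetical and are not Impel's measurements.

```python
import math
import statistics


def summarize_latencies(latencies_s: list[float], wall_clock_s: float) -> dict[str, float]:
    """Summarize request latencies (seconds) collected over a test of wall_clock_s seconds."""
    if not latencies_s:
        raise ValueError("latencies_s must not be empty")
    if wall_clock_s <= 0:
        raise ValueError("wall_clock_s must be positive")
    ordered = sorted(latencies_s)

    def percentile(p: float) -> float:
        # Nearest-rank percentile: smallest value with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "mean_s": statistics.fmean(ordered),
        "p50_s": percentile(50),
        "p95_s": percentile(95),
        "throughput_rps": len(ordered) / wall_clock_s,
    }


# Made-up sample timings for illustration.
sample = [0.2, 0.4, 0.6, 0.8, 1.0]
summary = summarize_latencies(sample, wall_clock_s=2.5)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

Tail percentiles such as p95 matter more than the mean for conversational workloads, because a few slow responses are what customers notice.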

    Using domain-specific models for better performance

    Impel’s evolution of Sales AI progressed from a general-purpose LLM to a domain-specific, fine-tuned model. Using anonymized customer interaction data, Impel fine-tuned a publicly available foundation model, resulting in several key improvements. The new model exhibited a 20% increase in accuracy across core features, showcasing enhanced automotive industry comprehension and more efficient context window utilization. By transitioning to this approach, Impel achieved three primary benefits:

    • Enhanced data security through in-house processing within their AWS accounts
    • Reduced reliance on external APIs and third-party providers
    • Greater operational control for scaling and customization

    These advancements, coupled with the significant output quality improvement, validated Impel’s strategic shift towards a domain-specific AI model for Sales AI.

    Expanding AI innovation in automotive retail

    Impel’s success deploying fine-tuned models on SageMaker has established a foundation for extending its AI capabilities to support a broader range of use cases tailored to the automotive industry. Impel is planning to transition to in-house, domain-specific models to extend the benefits of improved accuracy and performance throughout their Customer Engagement Product suite.

    Looking ahead, Impel’s R&D team is advancing their AI capabilities by incorporating Retrieval Augmented Generation (RAG) workflows, advanced function calling, and agentic workflows. These innovations can help deliver adaptive, context-aware systems designed to interact, reason, and act across complex automotive retail tasks.

    Conclusion

    In this post, we discussed how Impel has enhanced the automotive dealership customer experience with fine-tuned LLMs on SageMaker.

    For organizations considering similar transitions to fine-tuned models, Impel’s experience demonstrates how working with AWS can help achieve both accuracy improvements and model customization opportunities while building long-term AI capabilities tailored to specific industry needs. Connect with your account team or visit Amazon SageMaker AI to learn how SageMaker can help you deploy and manage fine-tuned models.


    About the Authors

    Nicholas Scozzafava is a Senior Solutions Architect at AWS, focused on startup customers. Prior to his current role, he helped enterprise customers navigate their cloud journeys. He is passionate about cloud infrastructure, automation, DevOps, and helping customers build and scale on AWS.

    Sam Sudakoff is a Senior Account Manager at AWS, focused on strategic startup ISVs. Sam specializes in technology landscapes, AI/ML, and AWS solutions. Sam’s passion lies in scaling startups and driving SaaS and AI transformations. Notably, his work with AWS’s top startup ISVs has focused on building strategic partnerships and implementing go-to-market initiatives that bridge enterprise technology with innovative startup solutions, while maintaining strict adherence to data security and privacy requirements.

    Vivek Gangasani is a Lead Specialist Solutions Architect for Inference at AWS. He helps emerging generative AI companies build innovative solutions using AWS services and accelerated compute. Currently, he is focused on developing strategies for fine-tuning and optimizing the inference performance of large language models. In his free time, Vivek enjoys hiking, watching movies, and trying different cuisines.

    Dmitry Soldatkin is a Senior AI/ML Solutions Architect at AWS, helping customers design and build AI/ML solutions. Dmitry’s work covers a wide range of ML use cases, with a primary interest in generative AI, deep learning, and scaling ML across the enterprise. He has helped companies in many industries, including insurance, financial services, utilities, and telecommunications. Prior to joining AWS, Dmitry was an architect, developer, and technology leader in data analytics and machine learning fields in the financial services industry.

    Tatia Tsmindashvili is a Senior Deep Learning Researcher at Impel with an MSc in Biomedical Engineering and Medical Informatics. She has over 5 years of experience in AI, with interests spanning LLM agents, simulations, and neuroscience. You can find her on LinkedIn.

    Ana Kolkhidashvili is the Director of R&D at Impel, where she leads AI initiatives focused on large language models and automated conversation systems. She has over 8 years of experience in AI, specializing in large language models, automated conversation systems, and NLP. You can find her on LinkedIn.

    Guram Dentoshvili is the Director of Engineering and R&D at Impel, where he leads the development of scalable AI solutions and drives innovation across the company’s conversational AI products. He began his career at Pulsar AI as a Machine Learning Engineer and played a key role in building AI technologies tailored to the automotive industry. You can find him on LinkedIn.

    Dachi Choladze is the Chief Innovation Officer at Impel, where he leads initiatives in AI strategy, innovation, and product development. He has over 10 years of experience in technology entrepreneurship and artificial intelligence. Dachi is the co-founder of Pulsar AI, Georgia’s first globally successful AI startup, which later merged with Impel. You can find him on LinkedIn.

    Deepam Mishra is a Sr Advisor to Startups at AWS and advises startups on ML, Generative AI, and AI Safety and Responsibility. Before joining AWS, Deepam co-founded and led an AI business at Microsoft Corporation and Wipro Technologies. Deepam has been a serial entrepreneur and investor, having founded 4 AI/ML startups. Deepam is based in the NYC metro area and enjoys meeting AI founders.
