Welcome to People Who Ship! In this new video and blog series, we’ll be bringing you behind-the-scenes stories and hard-won insights from developers building and shipping production-grade AI applications using MongoDB.
In each month’s episode, your host—myself, Senior AI Developer Advocate at MongoDB—will chat with developers from both inside and outside MongoDB about their projects, tools, and lessons learned along the way. Are you a developer? Great! This is the place for you; People Who Ship is by developers, for developers. And if you’re not (yet) a developer, that’s great too! Stick around to learn how your favorite applications are built.
In this episode, John Ziegler, Engineering Lead on MongoDB’s internal generative AI (Gen AI) tooling team, shares technical decisions made and practical lessons learned while developing a centralized infrastructure called Central RAG (RAG = Retrieval Augmented Generation), which enables teams at MongoDB to rapidly build RAG-based chatbots and copilots for diverse use cases.
John’s top three insights
During our conversation, John shared a number of insights learned during the Central RAG project. Here are the top three:
1. Enforce access controls across all operations
Maintaining data sensitivity and privacy is a key requirement when building enterprise-grade AI applications. This is especially important when curating data sources and building centralized infrastructure that teams and applications across the organization can use. In the context of Central RAG, for example, users should only be able to select or link data sources they have access to as knowledge sources for their LLM applications. Even at query time, the LLM should only pull in information the querying user has access to as context for answering their query. Access controls like these are typically enforced by an authorization service using access control lists (ACLs) that define the relationships between users and resources.
In Central RAG, this is managed by Credal’s permissions service. You can check out this article, which shows how to build an authorization layer using Credal’s permissions service and other tools like OpenFGA.
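To make the query-time piece concrete, here is a minimal sketch of ACL-style filtering at retrieval time using metadata pre-filters in MongoDB Atlas Vector Search. This is not how Central RAG itself is wired (that goes through Credal’s permissions service); the collection name, index name, and allowed_groups field are hypothetical, and you would populate user_groups from your identity provider or authorization service at request time.

```python
# Minimal sketch: ACL-aware retrieval with MongoDB Atlas Vector Search.
# Assumes each chunk document stores an "allowed_groups" array and that the
# vector index ("vector_index") declares "allowed_groups" as a filter field.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<cluster-uri>")  # placeholder connection string
collection = client["rag"]["chunks"]                 # hypothetical database/collection


def retrieve_for_user(query_embedding: list[float], user_groups: list[str], k: int = 5):
    """Return only the chunks the querying user is allowed to see."""
    pipeline = [
        {
            "$vectorSearch": {
                "index": "vector_index",
                "path": "embedding",
                "queryVector": query_embedding,
                "numCandidates": 20 * k,
                # Pre-filter: documents outside the user's ACL groups are never retrieved
                "filter": {"allowed_groups": {"$in": user_groups}},
                "limit": k,
            }
        },
        {"$project": {"text": 1, "source": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]
    return list(collection.aggregate(pipeline))
```

Filtering inside the `$vectorSearch` stage, rather than after retrieval, means documents a user cannot see never reach the LLM’s context in the first place.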
2. Anchor your evaluations in the problem you are trying to solve
Evaluation is a critical aspect of shipping software, including LLM applications. It is not a one-and-done process—each time you change any component of the system, you need to ensure that it does not adversely impact the system’s performance. The evaluation metrics depend on your application’s specific use cases.
For Central RAG, which aims to help teams securely access relevant and up-to-date data sources for building LLM applications, the team incorporates the following checks as integration and end-to-end tests in its CI/CD pipeline:
Ensure access controls are enforced when adding data sources.
Ensure access controls are enforced when retrieving information from data sources.
Ensure that data retention policies are respected, so that removed data sources are no longer retrieved or referenced downstream.
Use an LLM-as-a-judge approach to evaluate response quality across various use cases against a curated dataset of question-answer pairs (a minimal sketch of such a check follows this list).
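As a rough illustration of that last check, here is a minimal, hypothetical sketch of an LLM-as-a-judge test that could run in CI. The judge model (OpenAI’s gpt-4o-mini here), the grading prompt, the placeholder golden dataset, and the 90% pass threshold are all assumptions for illustration, not the Central RAG team’s actual setup.

```python
# Minimal sketch of an LLM-as-a-judge check over a curated question-answer set.
# The judge model, grading prompt, golden set, and pass threshold are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

GOLDEN_SET = [
    {"question": "<curated question>", "reference": "<expected answer>"},
    # ...more curated question-answer pairs
]

JUDGE_PROMPT = """You are grading a RAG chatbot.
Question: {question}
Reference answer: {reference}
Candidate answer: {candidate}
Reply with a single word: PASS if the candidate is factually consistent
with the reference and answers the question, otherwise FAIL."""


def judge(question: str, reference: str, candidate: str) -> bool:
    """Ask the judge model whether the candidate answer is acceptable."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, reference=reference, candidate=candidate)}],
    )
    return resp.choices[0].message.content.strip().upper().startswith("PASS")


def check_response_quality(rag_answer) -> None:
    """Fail the CI run if the pass rate drops below an agreed threshold.

    `rag_answer` is the system under test: a callable that takes a question
    and returns the chatbot's answer.
    """
    passed = sum(
        judge(ex["question"], ex["reference"], rag_answer(ex["question"]))
        for ex in GOLDEN_SET
    )
    assert passed / len(GOLDEN_SET) >= 0.9, "Response quality regression detected"
```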
If you would like to learn more about evaluating LLM applications, we have a detailed tutorial with code.
3. Educate your users on what’s possible and what’s not
User education is critical yet often overlooked when deploying software. This is especially true for this new generation of AI applications, where explaining best practices and setting clear expectations can prevent data security issues and user frustration.
For Central RAG, teams must review the acceptable use policies, legal guidelines, and documentation on available data sources and appropriate use cases before gaining access to the platform. These materials also highlight scenarios to avoid, such as connecting sensitive data sources, and provide guidance on prompting best practices to ensure users can effectively leverage the platform within its intended boundaries.
John’s AI tool recommendations
The backbone of Central RAG is a tool called Credal. Credal provides a platform for teams to quickly create AI applications on top of their data. As the maintainers of Central RAG, John’s team uses Credal to curate the list of data sources teams can choose from and to manage the applications created by different teams.
Teams can choose from the curated list or connect custom data sources via connectors, select from a wide range of large language models (LLMs), configure system prompts, and deploy their applications to platforms such as Slack, directly from the Credal UI or via its API.
Surprising and delighting users
Overall, John describes his team’s goal with Central RAG as “making it stunningly easy for teams to build RAG applications that surprise and delight people.” We see several organizations adopting this centralized RAG model, both to democratize the development of AI applications and to reduce their teams’ time to impact.
If you are working on similar problems and want to learn about how MongoDB can help, submit a request to speak with one of our specialists. If you would like to explore on your own, check out our self-paced AI Learning Hub and our gen AI examples GitHub repository.