Aligning AI with human values

Senior Audrey Lorvo is researching AI safety, which seeks to ensure increasingly intelligent AI models are reliable and can benefit humanity. The growing field focuses on technical challenges like robustness and AI alignment with human values, as well as societal concerns like transparency and accountability. Practitioners are also concerned with the potential existential risks associated with increasingly powerful AI tools.

“Ensuring AI isn’t misused or acts contrary to our intentions is increasingly important as we approach artificial general intelligence (AGI),” says Lorvo, a computer science, economics, and data science major. AGI describes the potential of artificial intelligence to match or surpass human cognitive capabilities.

An MIT Schwarzman College of Computing Social and Ethical Responsibilities of Computing (SERC) scholar, Lorvo looks closely at how AI might automate AI research and development processes and practices. A member of the Big Data research group, she’s investigating the social and economic implications of AI’s potential to accelerate research on itself, as well as how to communicate these ideas and their potential impacts effectively to general audiences, including legislators and strategic advisors.

Lorvo emphasizes the need to critically assess AI’s rapid advancements and their implications, ensuring organizations have proper frameworks and strategies in place to address risks. “We need to both ensure humans reap AI’s benefits and that we don’t lose control of the technology,” she says. “We need to do all we can to develop it safely.”

Her participation in efforts like the AI Safety Technical Fellowship reflects her investment in understanding the technical aspects of AI safety. The fellowship provides opportunities to review existing research on aligning AI development with considerations of potential human impact. “The fellowship helped me understand AI safety’s technical questions and challenges so I can potentially propose better AI governance strategies,” she says. According to Lorvo, companies on AI’s frontier continue to push boundaries, which means we’ll need to implement effective policies that prioritize human safety without impeding research.

Value from human engagement

When Lorvo arrived at MIT, she knew she wanted to pursue a course of study that would allow her to work at the intersection of science and the humanities. The variety of offerings at the Institute made her choices difficult, however.

“There are so many ways to help advance the quality of life for individuals and communities,” she says, “and MIT offers so many different paths for investigation.”

Beginning with economics — a discipline she enjoys because of its focus on quantifying impact — Lorvo investigated math, political science, and urban planning before choosing Course 6-14.

“Professor Joshua Angrist’s econometrics classes helped me see the value in focusing on economics, while the data science and computer science elements appealed to me because of the growing reach and potential impact of AI,” she says. “We can use these tools to tackle some of the world’s most pressing problems and hopefully overcome serious challenges.”

Lorvo has also pursued concentrations in urban studies and planning and international development.

As she’s narrowed her focus, Lorvo finds she shares an outlook on humanity with other members of the MIT community like the MIT AI Alignment group, from whom she learned quite a bit about AI safety. “Students care about their marginal impact,” she says.

Marginal impact, the additional effect of a specific investment of time, money, or effort, is a way to measure how much a contribution adds to what is already being done, rather than focusing on the total impact. This can potentially influence where people choose to devote their resources, an idea that appeals to Lorvo.

“In a world of limited resources, a data-driven approach to solving some of our biggest challenges can benefit from a tailored approach that directs people to where they’re likely to do the most good,” she says. “If you want to maximize your social impact, reflecting on your career choice’s marginal impact can be very valuable.”

Lorvo also values MIT’s focus on educating the whole student and has taken advantage of opportunities to investigate disciplines like philosophy through MIT Concourse, a program that facilitates dialogue between science and the humanities. Concourse hopes participants gain guidance, clarity, and purpose for scientific, technical, and human pursuits.

Student experiences at the Institute

Lorvo invests her time outside the classroom in creating memorable experiences and fostering relationships with her classmates. “I’m fortunate that there’s space to balance my coursework, research, and club commitments with other activities, like weightlifting and off-campus initiatives,” she says. “There are always so many clubs and events available across the Institute.”

These opportunities to expand her worldview have challenged her beliefs and exposed her to new interest areas that have altered her life and career choices for the better. Lorvo, who is fluent in French, English, Spanish, and Portuguese, also applauds MIT for the international experiences it provides for students.

“I’ve interned in Santiago de Chile and Paris with MISTI and helped test a water vapor condensing chamber that we designed in a fall 2023 D-Lab class in collaboration with the Madagascar Polytechnic School and Tatirano NGO [nongovernmental organization],” she says, “and have enjoyed the opportunities to learn about addressing economic inequality through my International Development and D-Lab classes.”

As president of MIT’s Undergraduate Economics Association, Lorvo connects with other students interested in economics while continuing to expand her understanding of the field. She enjoys the relationships she’s building while also participating in the association’s events throughout the year. “Even as a senior, I’ve found new campus communities to explore and appreciate,” she says. “I encourage other students to continue exploring groups and classes that spark their interests throughout their time at MIT.”

After graduation, Lorvo wants to continue investigating AI safety and researching governance strategies that can help ensure AI’s safe and effective deployment.

“Good governance is essential to AI’s successful development and ensuring humanity can benefit from its transformative potential,” she says. “We must continue to monitor AI’s growth and capabilities as the technology continues to evolve.”

Understanding technology’s potential impacts on humanity, doing good, continually improving, and creating spaces where big ideas can see the light of day continue to drive Lorvo. Merging the humanities with the sciences animates much of what she does. “I always hoped to contribute to improving people’s lives, and AI represents humanity’s greatest challenge and opportunity yet,” she says. “I believe the AI safety field can benefit from people with interdisciplinary experiences like the kind I’ve been fortunate to gain, and I encourage anyone passionate about shaping the future to explore it.”
