When GitHub first shipped the pull request (PR) back in 2008, it wrapped a plain-text diff in a social workflow: comments, approvals, and a merge button that, crucially, refused to light up without at least one thumbs-up from another developer. That design decision hard-wired accountability into modern software development and let maintainers scale far beyond hallway conversations or emailed patches.
Seventeen years later, just about every “agentic” coding tool, from research demos to enterprise platforms, still funnels its work through that same merge gate. The PR remains the audit log, the governance layer, and the social contract that says nothing ships until a person is willing to own it.
Now that large language models (LLMs) can scaffold projects, file PRs, and even reply to review comments they wrote themselves, the obvious next question is: who is accountable for code that ships when part of it comes from a model?
At GitHub, we think the answer hasn’t fundamentally changed: it’s the developer who hits “Merge.” But what has changed is everything that happens before that click.
In this article, we’ll explore how we’re re-thinking code reviews for a world where developers increasingly work with AI (and how your team can, too).
What we learned from GitHub Copilot’s code review capabilities
Earlier this year, the GitHub Copilot code review team conducted in-depth interviews with developers about their code review process, and those developers walked us through their review workflows. The interviews revealed three consistent patterns:
- No special treatment for AI: Reviewers grilled model-generated diffs as hard as those from other developers.
- Self-reviews raised the floor: Developers who ran a Copilot review before opening a PR often wiped out an entire class of trivial nitpicks (e.g., unused imports, missing tests), cutting review back-and-forth by roughly a third.
- AI was no replacement for human judgment: Programming often involves trade-offs. LLMs can inform you about those trade-offs, but someone has to make the call about which path to take based on your organization’s goals and standards.
An overarching principle quickly became clear: AI augments developer judgment; it can’t replace it. And our findings, from confidence scores to red-flag explanations, are informing how we’re building Copilot’s code review features.
What AI can (and can’t) handle today
LLMs are already great at the “grind” layer of a review:
- Mechanical scanning. “Is there a typo?” “Are all arguments used?”
- Pattern matching. “This looks like SQL injection” or “You forgot to await that promise.”
- Pedantic consistency. “Variable names snake_case here, camelCase there.” (The sketch below shows the kind of diff this layer catches.)
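To make that grind layer concrete, here’s a contrived Python snippet seeded with the kinds of issues listed above. The function and names are invented for illustration, not taken from any real review:

```python
import os  # mechanical scanning: this import is never used


def fetch_user(db, user_id, verbose):  # mechanical scanning: `verbose` is never used
    # Pattern matching: interpolating input directly into SQL looks like injection.
    query = f"SELECT * FROM users WHERE id = {user_id}"
    userRecord = db.execute(query)  # pedantic consistency: camelCase amid snake_case
    return userRecord
```

None of these findings requires understanding what the product does, which is exactly why they’re safe to delegate.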
Soon they’ll be able to do even more, such as understand product and domain context. But they still fall short on:
- Architecture and trade-offs. Should we split this service? Cache locally?
- Mentorship. Explaining why a pattern matters and when to break it.
- Values. Should we build this feature at all?
Those gaps keep developers in the loop and in the pilot’s seat. That principle is foundational for us as we continue to develop GitHub Copilot.
A playbook for modern code reviews
The most effective approach to AI-assisted code reviews starts before you even submit your pull request. Think of it as the golden rule of development: Treat code reviewers the way you’d like them to treat you.
Use AI to self-review your code in your IDE
Before pushing your code, run GitHub Copilot code review in your IDE to catch the obvious stuff so your teammates can focus on the nuanced issues that require developer insight. Copilot code review can comb your staged diff, suggest docstrings, and flag null dereferences. From there, you can fix everything it finds before you submit your PR so teammates never see the noise.
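As a contrived illustration of what that pass tends to surface (this isn’t Copilot’s actual output, and the function is invented), a flagged null dereference plus a missing docstring might turn into:

```python
def total_owed(invoices):
    """Sum the `amount` field across invoices, treating None as empty."""
    # Self review flagged that `invoices` can be None, which crashed the sum.
    if invoices is None:
        return 0
    return sum(invoice["amount"] for invoice in invoices)
```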
Take ownership of your code
Just because you used AI to generate code doesn’t mean it’s not your code. Once you commit code, you’re responsible for it. That means understanding what it does, ensuring it follows your team’s standards, and making sure it integrates well with the rest of your codebase.
If an AI agent writes code, it’s on me to clean it up before my name shows up in git blame.
Jon Wiggins, Machine Learning Engineer at Respondology
Run your code through automated CI gates
Your pipeline should already be running unit tests, secret scanning, CodeQL, dependency checks, and style linters. Keep doing that. Fail fast, fail loudly.
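The exact gates depend on your stack. As a minimal sketch of the fail-fast idea, assuming ruff, pytest, and pip-audit are the tools in play:

```python
import subprocess
import sys

# Gates run in order; the first failure stops everything immediately.
GATES = [
    ["ruff", "check", "."],  # style linting
    ["pytest", "--quiet"],   # unit tests
    ["pip-audit"],           # dependency vulnerability check
]


def main() -> None:
    for gate in GATES:
        print(f"Running gate: {' '.join(gate)}")
        if subprocess.run(gate).returncode != 0:
            sys.exit(f"Gate failed: {' '.join(gate)}")  # fail fast, fail loudly
    print("All gates passed.")


if __name__ == "__main__":
    main()
```

Running the same gates locally and in CI means a PR never reaches a reviewer carrying failures a machine could have caught.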
Practical tips for personal code hygiene:
- Review your own code in your IDE.
- Ensure variable names, comments, and structure match your team’s conventions.
- Test AI-generated code thoroughly before including it in pull requests (a minimal test sketch follows this list).
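For that last point, even a small table-driven test goes a long way. A minimal pytest sketch, assuming a hypothetical `slugify` helper that an agent generated:

```python
import pytest


def slugify(title: str) -> str:
    # Hypothetical AI-generated helper under test.
    return "-".join(title.lower().split())


@pytest.mark.parametrize(
    ("title", "expected"),
    [
        ("Hello World", "hello-world"),
        ("  leading spaces", "leading-spaces"),
        ("", ""),  # edge case: empty input
    ],
)
def test_slugify(title, expected):
    assert slugify(title) == expected
```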
Use AI to focus on the areas where your judgment is critical
The real power of AI in code reviews isn’t in replacing developers as the reviewers. It’s in handling the routine work that can bog down the review process, freeing developers to focus where their judgment is most valuable.
AI doesn’t replace your existing automated checks.
Make sure tests pass, coverage metrics are met, and static analysis tools have done their work before developer reviews begin. This creates a solid foundation for more meaningful discussion.
You can use an LLM to catch not just syntax issues, but also patterns, potential bugs, and style inconsistencies. Ironically, LLMs are particularly good at catching the sorts of mistakes that LLMs make, which is increasingly relevant as more AI-generated code enters our codebases.
Clearly define roles
Set clear expectations about when AI feedback should be considered versus when human judgment takes precedence. For example, you should rely on other developers for code architecture and consistency with business goals and organizational values. AI, on the other hand, is especially useful for reviewing long, repetitive PRs where it can be easy to miss little things.
Implementation tips for building a sustainable AI-assisted review process
- Document clear guidelines that specify when to use AI in code reviews, what types of feedback to trust, and how to escalate when developers disagree with an AI code review. With GitHub Copilot, for instance, you can use custom instructions to set clear rules for how Copilot engages with your code (a sketch follows this list).
- Update guidelines regularly based on team feedback and evolving AI capabilities. Remember that as your codebase and AI tools evolve, what works today might not work tomorrow.
- Encourage open team discussions about the strengths and limitations of AI-assisted reviews. Share both positive and negative experiences to help everyone learn and improve their approach.
- Refine automation continuously by using feedback from reviewers to improve your automated testing strategy. Identify patterns where solutions to recurring issues could be automated.
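As one hedged example of such guidelines, repository-level rules for Copilot can live in a custom instructions file. The path follows GitHub’s `.github/copilot-instructions.md` convention; the rules themselves are invented for illustration:

```markdown
<!-- .github/copilot-instructions.md (rules invented for illustration) -->
- Prefer our in-house logging wrapper over print statements.
- Flag any newly added dependency; we pin versions explicitly.
- In reviews, prioritize security findings over style nits.
- Escalate to a human reviewer for changes touching the billing module.
```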
Developer judgment remains crucial
While AI can handle much of the routine work in code reviews, developer judgment remains irreplaceable for architectural decisions, mentoring and knowledge transfer, and context-specific decisions that require understanding of your product and users.
And even as LLMs get smarter, three review tasks remain stubbornly human:
- Architecture trade-offs: Should we split this service? Cache locally? Pay tech debt now or later?
- Mentorship and culture: PR threads are team classrooms. A bot can’t tell a junior engineer the war story behind that odd regex.
- Ethics and product values: “Should we even build this?” is a question AI can’t answer.
The goal is to make developers more effective by letting them focus on what they do best.
Learn more about code reviews with GitHub Copilot >