Last Week in AI #309 - OpenAI keeps non-profit & launches Codex, AlphaEvolve, and more!

Top News

OpenAI says non-profit will remain in control after backlash

OpenAI has announced that it no longer intends to become a fully for-profit entity outside the control of its non-profit board. Under the revised plan, OpenAI will remain under the control of its non-profit board, while its for-profit arm transitions into a public benefit corporation. The announcement states:

“Our for-profit LLC, which has been under the nonprofit since 2019, will transition to a Public Benefit Corporation (PBC)–a purpose-driven company structure that has to consider the interests of both shareholders and the mission…

We made the decision for the nonprofit to retain control of OpenAI after hearing from civic leaders and engaging in constructive dialogue with the offices of the Attorney General of Delaware and the Attorney General of California.”

OpenAI launches Codex, an AI coding agent, in ChatGPT

OpenAI has launched Codex, an AI coding agent that uses the company’s codex-1 model, which is optimized for software engineering tasks. Codex operates in a cloud-based virtual computer and can connect to GitHub to preload a user’s code repositories. It can write simple features, fix bugs, answer questions about a codebase, and run tests, with each task taking between one and 30 minutes to complete. The tool is available to ChatGPT Pro, Enterprise, and Team subscribers, with plans to expand access to ChatGPT Plus and Edu users. OpenAI aims for Codex to act as a “virtual teammate,” autonomously completing tasks that would take human engineers significant time.

DeepMind claims its newest AI tool is a whiz at math and science problems

DeepMind, Google’s AI R&D lab, has developed a new AI system named AlphaEvolve, designed to tackle problems with machine-gradable solutions. The system uses models to generate, critique, and evaluate a pool of possible answers to a question, thereby reducing the tendency of AI models to ‘hallucinate’ or make things up. AlphaEvolve uses state-of-the-art Gemini models, making it more capable than previous AI systems. However, it has limitations, such as only being able to solve problems it can self-evaluate and only describing solutions as algorithms, making it unsuitable for non-numerical problems. Despite these limitations, DeepMind claims that AlphaEvolve has been successful in rediscovering the best-known answers to a set of math problems 75% of the time and finding improved solutions in 20% of cases.
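To make the "generate, evaluate, keep the best" idea concrete, here is a minimal sketch of an evolutionary search loop driven by an automatic grader. It is not DeepMind's implementation: the toy objective in `evaluate`, the random-mutation `propose` step (standing in for a language model proposing candidates), and all function names and parameters are assumptions made purely for illustration.

```python
import random


def evaluate(candidate):
    """Machine-gradable score, higher is better (hypothetical toy objective)."""
    x, y = candidate
    return -((x - 3) ** 2) - ((y + 1) ** 2)


def propose(parent, rng):
    """Stand-in for a model's proposal step: a small random mutation."""
    x, y = parent
    return (x + rng.choice([-1, 0, 1]), y + rng.choice([-1, 0, 1]))


def evolve(generations=200, pool_size=8, seed=0):
    """Keep a pool of candidates and repeatedly replace the weakest ones."""
    rng = random.Random(seed)
    pool = [(rng.randint(-10, 10), rng.randint(-10, 10)) for _ in range(pool_size)]
    for _ in range(generations):
        # Tournament selection: pick the best of three random pool members.
        parent = max(rng.sample(pool, 3), key=evaluate)
        pool.append(propose(parent, rng))
        # Elitism: only the top-scoring candidates survive to the next round.
        pool.sort(key=evaluate, reverse=True)
        pool = pool[:pool_size]
    best = max(pool, key=evaluate)
    return best, evaluate(best)


if __name__ == "__main__":
    best, score = evolve()
    print("best candidate:", best, "score:", score)
```

The sketch also shows why such a system is limited to problems it can self-evaluate: the whole loop depends on `evaluate` returning a trustworthy score without human input.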

Trump’s Mideast Visit Opens Floodgate of AI Deals Led by Nvidia

The Trump administration is advancing agreements with Saudi Arabia and the United Arab Emirates (UAE) to expand their access to cutting-edge AI chips from U.S. tech giants like Nvidia and AMD, marking a major geopolitical and commercial shift in AI policy. This move, part of Trump’s broader Middle East business diplomacy, coincides with a rollback of Biden-era restrictions on AI chip exports and is attracting billions in tech investments from U.S. firms. Nvidia will supply advanced processors to Saudi AI firm Humain, while AMD will support a $10 billion regional data center initiative. Tech heavyweights including Amazon, Cisco, Super Micro, Qualcomm, and OpenAI are also launching or expanding projects in the Gulf, ranging from AI zones and cloud services to new data centers and chip infrastructure.

Despite the commercial optimism, the initiatives have sparked national security concerns in Washington over potential Chinese access to American AI hardware via Gulf intermediaries, particularly involving UAE’s G42 and Huawei.

Other News

Tools

Hugging Face releases a free Operator-like agentic AI tool – Hugging Face’s Open Computer Agent, a cloud-hosted AI tool, demonstrates the growing capabilities and affordability of open AI models despite its current limitations in handling complex tasks and CAPTCHA challenges.

Anthropic rolls out an API for AI-powered web search – Anthropic’s new API enables developers to enhance their Claude AI models with real-time web search capabilities, allowing for more accurate and current information retrieval, customizable search behavior, and integration with Claude Code for coding tasks.

Figma releases new AI-powered tools for creating sites, app prototypes, and marketing assets – Figma’s new AI-powered tools, including Figma Sites and Figma Make, aim to streamline the creation of websites, app prototypes, and marketing assets, positioning the company as a competitor to platforms like Canva and Adobe by offering features such as collaborative editing, AI-generated code, and bulk asset creation.

Lightricks shakes up AI video creation with powerful open-source model – Lightricks Ltd. is throwing down the gauntlet to artificial intelligence powerhouses OpenAI, Google LLC and others with the release of its latest open-source video generation model, LTX Video-13B.

Google’s bringing Gemini to your car with Android Auto – Google is integrating its generative AI, Gemini, into Android Auto to enhance the in-car experience with advanced voice assistance and conversational capabilities, aiming to make driving more productive and enjoyable.

Mistral claims its newest AI model delivers leading performance for the price – Mistral Medium 3, a new AI model from French startup Mistral, offers high performance at a competitive price, excelling in coding, STEM tasks, and multimodal understanding, and is available on multiple platforms including Amazon SageMaker.

Business

OpenAI Reaches Agreement to Buy Startup Windsurf for $3 Billion – OpenAI has agreed to buy Windsurf, an artificial intelligence-assisted coding tool formerly known as Codeium, for about $3 billion, according to people familiar with the matter, marking the ChatGPT maker’s largest acquisition to date.

OpenAI pledges to publish AI safety test results more often – OpenAI is enhancing transparency by regularly updating a new Safety Evaluations Hub with metrics on AI model safety, addressing past criticisms of inadequate safety testing and communication.

Anthropic launches a program to support scientific research – Anthropic’s AI for Science program aims to accelerate scientific research in biology and life sciences by providing selected researchers with API credits and access to advanced AI models, despite ongoing skepticism about AI’s current reliability in scientific discovery.

Google launches new initiative to back startups building AI – Google’s AI Futures Fund aims to support AI startups by providing investment, early access to DeepMind’s AI models, and collaboration opportunities with Google experts, while operating on a flexible, rolling basis without a fixed application window.

Microsoft employees are banned from using DeepSeek app, president says – Microsoft has banned its employees from using the DeepSeek app due to concerns over data security and potential Chinese propaganda, despite offering DeepSeek’s R1 model on its Azure cloud service.

Netflix will show generative AI ads midway through streams in 2026 – Netflix plans to introduce interactive mid-roll and pause ads using generative AI in 2026, following the success of its ad subscription tier and in-house advertising platform.

Hedra, the app used to make talking baby podcasts, raises $32M from a16z – Hedra, a startup specializing in AI-generated video content with expressive characters, has raised $32 million in funding to enhance its technology and capitalize on the growing trend of AI-generated talking baby podcasts.

Research

Absolute Zero: Reinforced Self-play Reasoning with Zero Data – Absolute Zero introduces a new paradigm for reasoning models that enables self-evolution through self-play without relying on external data, achieving remarkable performance in math and coding tasks by leveraging a reinforcement learning framework that mirrors human learning and reasoning.

OpenAI Launches HealthBench, a Dataset That Benchmarks Healthcare AI Models – HealthBench, developed with input from 262 physicians across 60 countries, evaluates AI healthcare models by scoring their responses to realistic health scenarios against a physician-written rubric, with OpenAI’s o3 reasoning model currently achieving the highest score.

X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains – X-Reasoner demonstrates that reasoning capabilities trained on general-domain text can effectively generalize across different modalities and domains, achieving state-of-the-art performance on both general and specialized tasks, including a medical-specific variant, X-Reasoner-Med.

Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers – RL^V enhances reinforcement learning by integrating value functions for verification, significantly improving test-time compute scaling and accuracy in tasks like MATH, while demonstrating strong generalization and performance gains.

Continuous Thought Machines – The Continuous Thought Machine (CTM) introduces neuron-level temporal processing and neural synchronization to enhance deep learning models with biologically inspired neural dynamics, demonstrating strong performance across various complex tasks.

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures – The paper describes how DeepSeek-V3 works around hardware limitations, using Multi-head Latent Attention to improve memory efficiency and a Mixture of Experts architecture to improve computational trade-offs, and it discusses directions for future hardware.

AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale – AM-Thinking-v1 is a reasoning-optimized language model that demonstrates state-of-the-art performance among dense models of its size by employing a meticulously designed post-training pipeline, including Supervised Fine-Tuning and Reinforcement Learning, to achieve reasoning capabilities comparable to larger Mixture-of-Experts models without relying on private data or massive architectures.

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset – BLIP3-o, a family of state-of-the-art unified multimodal models, utilizes diffusion transformers and flow matching on CLIP features, demonstrating superior performance in image understanding and generation tasks through a sequential training strategy and a curated instruction-tuning dataset.

Aya Vision: Advancing the Frontier of Multilingual Multimodality – Aya Vision introduces innovative techniques for creating high-quality multilingual multimodal language models that overcome challenges like data scarcity and catastrophic forgetting, achieving superior performance compared to larger models.

MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder – MiniMax-Speech introduces a novel text-to-speech model that leverages a learnable speaker encoder and Flow-VAE to achieve high-fidelity, zero-shot voice cloning across 32 languages, enhancing both audio quality and speaker similarity.

Concerns

Grok Pivots From ‘White Genocide’ to Being ‘Skeptical’ About the Holocaust – Grok, a chatbot from Elon Musk’s xAI, faced controversy for unauthorized modifications that led it to promote false narratives about “white genocide” and express skepticism about the Holocaust, prompting xAI to implement measures for transparency and reliability.

One of Google’s Recent Gemini AI Models Scores Worse on Safety – Google’s Gemini 2.5 Flash AI model demonstrates a trade-off between improved instruction-following and increased policy violations, highlighting the challenges of balancing permissiveness and safety in AI development.

The Professors Are Using ChatGPT, and Some Students Aren’t Happy About It – Students are expressing dissatisfaction with professors’ increasing use of AI tools like ChatGPT, arguing it undermines the value of their education and raises concerns about the authenticity of feedback and grading.

Policy

OpenAI wants to team up with governments to grow AI infrastructure – OpenAI is launching the OpenAI for Countries program to collaborate with governments on building local AI infrastructure and promoting the use of Western AI models over Chinese alternatives.

Trump administration officially rescinds Biden’s AI diffusion rules – The U.S. Department of Commerce has rescinded Biden’s AI Diffusion Rule, opting instead for a strategy of direct negotiations with countries and issuing guidance to protect AI chip supply chains.

Elton John, Dua Lipa, Coldplay Among 400 Artists Seeking Copyright Protection Amid A.I. Surge – Over 400 artists, including Elton John and Dua Lipa, are urging the UK government to update copyright laws to protect creative works from being used without permission in AI training, supporting a bill that promotes transparency and licensing agreements.

Pope Leo signals he will closely follow Francis and says AI represents challenge for humanity – Pope Leo XIV, the first US-born pope, plans to continue Pope Francis’ legacy while addressing the challenges posed by artificial intelligence and advocating for social justice and church reforms.

Analysis

Your A.I. Radiologist Will Not Be With You Soon – Despite predictions of their obsolescence, radiologists remain in high demand as AI enhances rather than replaces their work by improving efficiency and augmenting human capabilities.

Why We’re Unlikely to Get Artificial General Intelligence Anytime Soon – Despite bold predictions from some technologists, many experts argue that current AI technology is insufficient for achieving Artificial General Intelligence, with significant disagreement on defining and identifying such intelligence.
