Last Week in AI #297 - QwQ-32B-Preview, DeepSeek-R1-Lite-Preview, OLMo 2, Luma Photon

Top News

Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model

Alibaba’s Qwen team has developed a new AI model, QwQ-32B-Preview, which rivals OpenAI’s o1 model in reasoning capabilities. The model, which contains 32.5 billion parameters and can consider prompts up to 32,000 words in length, outperforms OpenAI’s o1-preview and o1-mini models on certain benchmarks, including the AIME and MATH tests. However, it has some limitations, such as switching languages unexpectedly and underperforming on tasks requiring common sense reasoning. The model is available for download under an Apache 2.0 license, but only certain components have been released, making it impossible to fully replicate or understand its inner workings.

DeepSeek Introduces DeepSeek-R1-Lite-Preview with Complete Reasoning Outputs Matching OpenAI o1

DeepSeek has launched a new AI model, DeepSeek-R1-Lite-Preview, designed to address the reasoning gaps in current AI models by providing complete reasoning outputs. The model matches the performance of OpenAI’s o1-preview and is available for testing through DeepSeek’s chat interface. It incorporates Chain-of-Thought (CoT) reasoning, presenting its thought process in real time, which matters for users who need detailed insight into how an AI model arrives at its conclusions. The model has demonstrated its capabilities on benchmarks like AIME and MATH, and is set to be open-sourced, making it accessible to the broader community for experimentation and integration.

Ai2 releases new language models competitive with Meta’s Llama

A scatter plot comparing language models by performance (y-axis, measured in average performance on 10 benchmarks) versus training computational cost (x-axis, in approximate FLOPs). The plot shows OLMo 2 models (marked with stars) achieving Pareto-optimal efficiency among open models, with OLMo-2-13B and OLMo-2-7B sitting at the performance frontier relative to other open models like DCLM, Llama 3.1, StableLM 2, and Qwen 2.5. The x-axis ranges from 4x10^22 to 2x10^24 FLOPs, while the y-axis ranges from 35 to 70 benchmark points.

The nonprofit AI research organization Ai2 has released a new family of AI models, OLMo 2, which meets the Open Source Initiative’s definition of open source AI. This means that the tools and data used to develop it are publicly available. The OLMo 2 family includes two models, one with 7 billion parameters (OLMo 7B) and one with 13 billion parameters (OLMo 13B), which can perform a range of text-based tasks. Ai2 used a dataset of 5 trillion tokens to train the models, resulting in models that are competitive with other open models like Meta’s Llama 3.1. The OLMo 2 models can be downloaded from Ai2’s website and used commercially under the Apache 2.0 license.

Creating AI video just got easier - Luma Labs gives Dream Machine a huge upgrade

Luma Labs has announced a significant upgrade to its Dream Machine generative AI platform, including a new image model called Photon and a more collaborative approach to AI video creation. The upgrade, the largest since Dream Machine’s launch in June, offers faster video generation and improved natural language understanding. Photon, the new text-to-image model, is touted to be up to 800% faster than similar models, with accurate text rendering and easy character creation. The Dream Machine platform, available on both web and iOS, is also getting a new user interface and will be able to understand instructions and context, allowing users to brainstorm ideas without needing to learn prompt engineering.

Other News

Tools

OpenAI gives ChatGPT an upgrade - reclaims top spot in LLM leaderboard – OpenAI’s latest update to the GPT-4o model has significantly enhanced ChatGPT’s creative writing abilities, allowing it to surpass Google’s Gemini in the LLM leaderboard and become more engaging and insightful in its responses.

Google is prepping Gemini to take action inside of apps – Google’s Gemini Assistant is set to gain agentic-like abilities through a new “app functions” API in Android 16, allowing it to perform tasks within apps, similar to Apple’s upcoming enhancements for Siri in iOS 18.

NVIDIA Introduces Hymba 1.5B: A Hybrid Small Language Model Outperforming Llama 3.2 and SmolLM v2 – NVIDIA’s Hymba 1.5B model combines transformer attention and state space models in a hybrid-head parallel architecture, achieving superior performance and efficiency on smaller devices compared to other sub-2B models.

Anthropic launches tool to connect AI systems directly to datasets – Anthropic’s Model Context Protocol (MCP) allows AI systems to universally connect to various data sources, streamlining integration and enhancing performance across different platforms.

Rabbit now lets you teach the R1 to perform tasks for you – Rabbit’s new “teach mode” for R1 devices allows users to create AI agents that learn and perform demonstrated tasks, though it remains experimental and may face challenges with CAPTCHA-protected sites.

Introducing AI Backgrounds, HD Video Calls, Noise Suppression and More for Messenger Calling – Messenger has introduced new features such as AI-generated backgrounds, HD video calls, noise suppression, and hands-free calling to enhance the user experience during video and audio calls.

Microsoft will soon let you clone your voice for Teams meetings – Microsoft’s new Interpreter tool for Teams will allow users to clone their voices for real-time multilingual speech-to-speech translation, raising both opportunities for enhanced communication and concerns about potential misuse and security risks.

AI coding tool Cursor adds autonomous coding agents in latest update – Cursor’s latest update introduces AI agents capable of autonomously handling coding tasks and terminal operations, enhancing automation and project management within the modified Visual Studio Code environment.

Nvidia claims a new AI audio generator can make sounds never heard before – Nvidia’s Fugatto AI music editor can generate unprecedented sounds and transform audio inputs into unique compositions, including altering voices and creating novel sound effects.

Runway launches Frames - a new AI image generator that creates custom worlds – Runway’s new AI image generator, Frames, offers enhanced stylistic control and visual fidelity, allowing users to create consistent and unique worlds across video generations.

Anthropic says Claude AI can match your unique writing style – Anthropic’s Claude AI now allows users to customize the chatbot’s writing style to match their own or choose from preset options, enhancing personalization and appropriateness for various communication tasks.

ElevenLabs’ new feature is a NotebookLM competitor for creating GenAI podcasts – ElevenLabs’ new GenFM feature lets users create AI-generated multispeaker podcasts by uploading various content types. It incorporates natural human elements like “ums” and “ahs” to enhance conversational flow and supports 32 languages.

Google has a new chess game that lets you design the pieces with AI – GenChess, a free game by Google, allows players to use AI to design custom chess pieces and play against computer-generated opponents.

Business

Nvidia Doubles Profit as A.I. Chip Sales Soar – Nvidia’s significant profit increase and optimistic revenue forecast highlight the strong demand for its new A.I. chip, Blackwell, despite concerns about the sustainability of its market dominance.

OpenAI moves to trademark its ‘reasoning’ models – OpenAI is seeking to trademark its new “reasoning” AI model, o1, to protect its intellectual property, while also expanding its series of models designed for complex tasks.

Baidu says self-driving vehicle costs drop to US$34,525 as mass production ramps up – Baidu has significantly reduced the production cost of its Apollo RT6 self-driving vehicle, positioning it as the world’s only mass-produced Level-4 autonomous vehicle and enhancing its competitive edge in the autonomous driving market.

AI chip startup MatX, founded by Google alums, raises Series A at $300M+ valuation, sources say – MatX, co-founded by former Google engineers, has raised $80 million in a Series A round led by Spark Capital to develop chips optimized for large AI workloads, aiming to outperform Nvidia’s GPUs.

Google’s DeepMind and YouTube built and shelved ‘Orca,’ a ‘mind-blowing’ music AI tool that hit a copyright snag – Orca, an AI music tool developed by Google’s DeepMind and YouTube, was shelved due to copyright concerns despite its ability to generate authentic-sounding music by mimicking artists using simple prompts.

OpenAI’s Sora video generator appears to have leaked – A group leaked access to OpenAI’s unreleased Sora video generator to protest the company’s alleged exploitation of artists and lack of transparency, leading to a temporary shutdown of the early access program.

Chinese Driverless-Tech Firm Pony AI Said to Raise $260 Million in US IPO – Pony AI Inc.’s successful $260 million US IPO highlights robust investor enthusiasm for both autonomous-driving technology and Chinese companies listing in New York.

PlayAI Clones Voices on Command – PlayAI, a voice cloning and text-to-speech platform, has raised $21 million in seed funding to enhance its AI voice models and address ethical concerns, despite facing criticism over safety measures and potential misuse of its technology.

Amazon develops video AI model, The Information reports – Amazon’s new AI model, Olympus, can process images and videos to enhance search capabilities and reduce dependency on Anthropic’s Claude chatbot.

New updates give Mistral AI’s Le Chat an edge over ChatGPT – French generative artificial intelligence startup Mistral AI is taking the fight to OpenAI with a host of updates announced today.

Research

Meet The Matrix: A New AI Approach to Infinite-Length and Real-Time Video Generation – The Matrix, developed by a team from Alibaba, the University of Hong Kong, and the University of Waterloo, is a groundbreaking AI model that generates infinite-length, high-quality video simulations with real-time interactivity, using advanced diffusion techniques and learning from both game and real-world data.

Meet LLaVA-o1: The First Visual Language Model Capable of Spontaneous, Systematic Reasoning Similar to GPT-o1 – LLaVA-o1, developed by a team of researchers, introduces a structured four-stage reasoning process and stage-level beam search to significantly enhance systematic reasoning in vision-language models, outperforming larger models with improved accuracy and efficiency.

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions – Marco-o1 explores the potential of large reasoning models to generalize across domains lacking clear standards by utilizing advanced techniques like Chain-of-Thought fine-tuning and Monte Carlo Tree Search.

Multimodal Autoregressive Pre-training of Large Vision Encoders – AIMV2, a new family of vision encoders, demonstrates exceptional performance in both multimodal and vision-specific tasks by using a multimodal autoregressive pre-training approach that integrates image and text data.

Bringing robot skills from simulation to the real world – Simulation is increasingly being used to generate diverse and high-quality data for training general-purpose robot policies, addressing challenges in real-world data collection and enabling advancements in sim-to-real transfer for tasks like navigation and manipulation.

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models – Investigating the pretraining data of large language models reveals that procedural knowledge significantly influences their reasoning capabilities, distinguishing it from mere retrieval of factual information.

Large Language Models as Markov Chains – The paper establishes an equivalence between large language models and Markov chains, providing insights into their performance through theoretical analysis and experimental validation.

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models – VBench++ offers a comprehensive and versatile benchmark suite for evaluating video generative models by dissecting video generation quality into specific dimensions, aligning with human perception, and supporting both text-to-video and image-to-video evaluations.
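As a rough illustration of the Markov-chain view mentioned in the "Large Language Models as Markov Chains" item above (this is not the paper's code): if a model has a finite vocabulary and a fixed context window, each possible context can be treated as one state, and generating a token moves the chain to the next state. The two-token vocabulary, the window size, and the `next_token_probs` rule below are all made up for the sketch.

```python
from itertools import product

VOCAB = ("a", "b")  # hypothetical two-token vocabulary
WINDOW = 2          # hypothetical context window, in tokens


def next_token_probs(context):
    """Toy stand-in for a language model: P(next token | context)."""
    # Arbitrary illustrative rule: favor repeating the last token.
    if not context:
        return {"a": 0.5, "b": 0.5}
    last = context[-1]
    return {tok: (0.75 if tok == last else 0.25) for tok in VOCAB}


def all_states():
    """Every token sequence of length 0..WINDOW; each one is a chain state."""
    return [seq for n in range(WINDOW + 1) for seq in product(VOCAB, repeat=n)]


def step(state, token):
    """Append a token, then keep only the last WINDOW tokens."""
    return (state + (token,))[-WINDOW:]


def transition_matrix():
    """Build P[s][t] = probability of moving from state s to state t."""
    states = all_states()
    matrix = {s: {t: 0.0 for t in states} for s in states}
    for s in states:
        for tok, p in next_token_probs(s).items():
            matrix[s][step(s, tok)] += p
    return matrix


P = transition_matrix()
```

Each row of `P` sums to 1, so `P` is an ordinary transition matrix. The number of states grows exponentially with the window size, which is why this enumeration is only practical for toy settings.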

Concerns

OpenAI sued by Canada’s biggest media outlets – Canadian media companies are suing OpenAI for allegedly using their journalism without permission to train its GPT model, seeking damages and an injunction against future use.

OpenAI accidentally erases potential evidence in training data lawsuit – OpenAI engineers inadvertently erased crucial evidence in a lawsuit over AI training data, complicating efforts to trace the use of news articles in building AI models, despite attempts to recover the lost data.

Study of ChatGPT citations makes dismal reading for publishers – A study by the Tow Center for Digital Journalism reveals that ChatGPT frequently produces inaccurate citations for publishers’ content, raising concerns about the reliability and transparency of its sourcing, regardless of whether publishers have licensing deals with OpenAI.

Netflix removes AI art poster for Arcane after an outcry from creators – Netflix removed an AI-generated poster for Arcane’s second season after backlash from fans and creators, highlighting ongoing debates about AI’s role in art and its impact on artistic integrity.

Analysis

A Revolution in How Robots Learn – Robots are increasingly learning to perform complex tasks through AI-driven imitation learning, marking a significant shift from traditional programming to self-teaching capabilities.

Fun

AI created a Minecraft AI village with up to 1,000 inhabitants - Project Sid sees AI bots implement a taxation system and spread Pastafarianism religion – AI startup Altera’s Project Sid successfully created a dynamic AI society within Minecraft, where AI agents autonomously developed roles, implemented a taxation system, and even spread a parody religion, showcasing emergent human-like behaviors.



