    Last Week in AI #304 – OpenAI Audio, Ernie 4.5, Claude Websearch

    March 24, 2025

    Top News

    OpenAI Unveils New Audio Models to Make AI Agents Sound More Human Than Ever

    OpenAI has introduced a suite of new audio models aimed at making AI voice agents sound more human-like and responsive. The release includes two new speech-to-text models, GPT-4o-transcribe and GPT-4o-mini-transcribe, which outperform previous models in transcription accuracy across multiple languages, even in challenging scenarios such as understanding different accents and filtering background noise. The new GPT-4o-mini-tts text-to-speech model allows developers to control the tone and delivery of the AI’s speech, a feature OpenAI refers to as “steerability”. Additionally, an updated Agents SDK simplifies the conversion of text agents into voice agents.

    Baidu launches two new versions of its AI model Ernie

    Chinese tech giant Baidu has introduced two new versions of its artificial intelligence model, Ernie – Ernie 4.5 and Ernie X1. The company claims that Ernie X1 performs at the same level as DeepSeek R1 but at half the cost, while Ernie 4.5 has been enhanced to understand memes and satire due to its “high EQ”. Both models possess multimodal capabilities, meaning they can process video, images, audio, and text. Despite being an early competitor to OpenAI’s ChatGPT, Baidu has faced challenges in achieving widespread adoption. The company plans to launch Ernie 5 later this year, promising further multimodal enhancements.

    Anthropic adds web search to its Claude chatbot

    Anthropic’s AI chatbot, Claude, has been upgraded with a web search feature, allowing it to search the internet for information to inform its responses. The feature is currently available for paid users in the U.S., with plans to extend it to free users and other countries. The web search function works with the latest model, Claude 3.7 Sonnet, and provides direct citations for fact-checking. However, the feature does not trigger consistently for questions about current events. This update brings Claude in line with other AI chatbots like OpenAI’s ChatGPT, Google’s Gemini, and Mistral’s Le Chat, despite previous claims that Claude was designed to be self-contained.

    Meta AI is finally coming to the EU, but with limitations

    Meta has announced the launch of its AI-powered virtual assistant, Meta AI, in the European Union, despite ongoing regulatory issues with European privacy authorities. The tool, which has been available in the U.S. since 2023, will be rolled out across Meta’s social platforms, including WhatsApp in the U.K., but with a more limited feature set due to the EU’s stringent privacy regulations. Meta AI, which can chat, answer questions, and generate images, has not been trained on local users’ data in the EU, so the company won’t be notifying users or seeking their consent. The launch represents Meta’s first step in bringing more AI to Europe, despite the company’s criticism of Europe’s AI regulations.

    Other News

    Tools

    Roblox’s new AI model can generate 3D objects – Roblox’s Cube 3D model, which is open-sourced, aims to enhance 3D creation efficiency by generating 3D models from text prompts and will eventually support multimodal inputs like images and videos.

    Allen Institute for AI (AI2) Releases OLMo 32B: A Fully Open Model to Beat GPT 3.5 and GPT-4o mini on a Suite of Multi-Skill Benchmarks – OLMo 2 32B, released by the Allen Institute for AI, is a fully open large language model that surpasses GPT-3.5 Turbo and GPT-4o mini on a suite of multi-skill benchmarks.

    NVIDIA Launches Family of Open Reasoning AI Models for Developers and Enterprises to Build Agentic AI Platforms – NVIDIA’s Llama Nemotron models are open models enhanced for reasoning and decision-making, aimed at developers and enterprises building agentic AI platforms.

    Stability AI’s new AI model turns photos into 3D scenes – Stability AI’s Stable Virtual Camera model allows users to create immersive 3D videos from 2D images by generating novel views and dynamic camera paths, although it may struggle with complex scenes and certain textures.

    Google brings a ‘canvas’ feature to Gemini, plus Audio Overview – Google has introduced a new Canvas feature to its Gemini chatbot, allowing users to collaboratively create and refine writing and coding projects, alongside an Audio Overview feature that generates podcast-style audio summaries of documents.

    Canopy Labs Releases Orpheus, a Permissively-Licensed LLM for Convincing Text to Speech – Canopy Labs has launched Orpheus, a family of large language models for text-to-speech generation, capable of conveying emotions and performing zero-shot voice cloning, with the three-billion-parameter model available under an open-source license.

    xAI launches an API for generating images – xAI’s new image generation API, featuring the “grok-2-image-1212” model, offers competitive pricing and limited customization options as the company seeks to expand its revenue streams and investor interest.

    Business

    1X will test humanoid robots in ‘a few hundred’ homes in 2025 – 1X plans to test its humanoid robot, Neo Gamma, in a few hundred homes in 2025, using teleoperators to compensate for its current limitations while addressing privacy concerns and collecting data to improve its AI capabilities.

    Mark Zuckerberg says that Meta’s Llama models have hit 1B downloads – Meta’s Llama models have reached 1 billion downloads despite facing legal and competitive challenges, with plans for new model releases and significant investment in AI development.

    Elon Musk’s AI company, xAI, acquires a generative AI video startup – xAI’s acquisition of Hotshot suggests plans to develop competitive video generation models, potentially integrating them into its Grok chatbot platform.

    Perplexity is reportedly in talks to raise up to $1B at an $18B valuation – Perplexity, an AI-powered search startup, is reportedly in early talks to raise up to $1 billion at an $18 billion valuation, double its previous valuation, amid increasing competition and expansion into new areas like enterprise solutions and an “agentic” browser.

    Apple Shuffles AI Executive Ranks in Bid to Turn Around Siri – Apple is restructuring its AI leadership by appointing Vision Pro creator Mike Rockwell to lead Siri development, aiming to address delays and improve its AI technology, which has been lagging behind competitors.

    OpenAI’s o1-pro is the company’s most expensive AI model yet – OpenAI’s o1-pro model, despite its high cost and increased computational power, has received mixed reviews for its performance improvements over the standard o1 model, particularly in solving complex problems.

    BotQ: US firm’s factory where humanoids will build robots, deliver 12,000 units a year – BotQ’s factory will utilize vertical integration and advanced software systems like MES, PLM, and ERP to ensure high-quality, efficient production and management of humanoid robots.

    Research

    Measuring AI Ability to Complete Long Tasks – AI performance, measured by the length of tasks it can complete, has been increasing exponentially with a doubling time of around 7 months, suggesting that within a few years, AI could autonomously handle tasks that currently require weeks of human effort.
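
To make the doubling-time claim in the long-tasks item concrete, here is a minimal arithmetic sketch. Only the 7-month doubling time comes from the summary above; the starting task length (1 hour) and the target (80 hours, i.e. two 40-hour work weeks) are hypothetical inputs chosen for illustration, not figures from the paper.

```python
import math

# The only figure taken from the summary: task length doubles about every 7 months.
DOUBLING_TIME_MONTHS = 7.0


def growth_factor(months: float, doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """How many times longer tasks get after `months` of exponential growth."""
    return 2.0 ** (months / doubling_months)


def months_to_reach(start_hours: float, target_hours: float,
                    doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Months needed for the task length to grow from start_hours to target_hours."""
    return doubling_months * math.log2(target_hours / start_hours)


if __name__ == "__main__":
    # Hypothetical inputs (assumptions, not from the source).
    start_hours = 1.0
    target_hours = 80.0

    months = months_to_reach(start_hours, target_hours)
    print(f"Growth over 12 months: x{growth_factor(12):.2f}")
    print(f"Months to go from {start_hours:g}h to {target_hours:g}h tasks: {months:.1f}")
```

Under these assumed inputs the answer comes out at about 44 months, which is in line with the "within a few years" framing. A different starting point shifts the result by 7 months per factor of two.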

    EXAONE Deep: Reasoning Enhanced Language Models – EXAONE Deep models, developed by LG AI Research, are fine-tuned for enhanced reasoning tasks using techniques like Supervised Fine-Tuning, Direct Preference Optimization, and Online Reinforcement Learning, outperforming several existing models across different scales.

    Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers – Vamba, a hybrid Mamba-Transformer model, enhances hour-long video understanding by reducing computational complexity and memory usage through efficient modules like Mamba-2 blocks and cross-attention layers, achieving superior performance on benchmarks such as LVBench.

    FlowTok: Flowing Seamlessly Across Text and Image Tokens – FlowTok introduces a streamlined framework for seamless flow matching between text and image tokens, achieving efficient and state-of-the-art multimodal generation without complex conditioning mechanisms.

    CoRe^2: Collect, Reflect and Refine to Generate Better and Faster – CoRe^2 is a novel, plug-and-play sampling framework that enhances generative models’ performance by efficiently refining image quality and semantic faithfulness without being architecture-specific, achieving superior results across various benchmarks.

    Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification – Scaling up sampling-based search with random sampling and self-verification enhances model performance, revealing that larger response pools improve verification accuracy and highlighting the need for better out-of-box verification capabilities in frontier models.

    Concerns

    ChatGPT hit with privacy complaint over defamatory hallucinations – OpenAI faces a privacy complaint in Europe over ChatGPT’s generation of false and defamatory information, highlighting concerns about compliance with GDPR’s accuracy requirements and the potential reputational damage caused by AI hallucinations.

    Policy

    Ben Stiller, Mark Ruffalo and More Than 400 Hollywood Names Urge Trump to Not Let AI Companies ‘Exploit’ Copyrighted Works – Hollywood creative leaders are urging the Trump administration to maintain strong copyright protections against AI companies like OpenAI and Google, which seek to use copyrighted works for AI training without permission or compensation.

    A.I. Art Generated With Text Prompts Cannot Be Copyrighted, U.S. Rules – Art generated by artificial intelligence (A.I.) from a text prompt cannot be copyrighted even if an artist uses long, targeted inputs or creates multiple iterations of a work before they are satisfied with the final output, according to new guidance from the U.S. Copyright Office.
