DAI#45 – New top model, lawsuit blues, and puzzled AI

Welcome to this week’s roundup of hand-assembled bespoke AI news.

This week Anthropic knocked OpenAI off pole position.

AI audio generators face the music in court.

And the top LLMs struggle with a puzzle your kids can solve.

Let’s dig in.

Claude vs GPT-4o

After months of AI models claiming to be ‘almost as good as GPT-4’, we’ve finally got a model that pushes OpenAI off its top spot on the leaderboards.

Anthropic released Claude 3.5 Sonnet, an upgraded version of its mid-size Claude model. Benchmark results, including MMLU, show it beating GPT-4o and Google’s Gemini 1.5 Pro in almost every test.

With an even more powerful Claude 3.5 Opus expected soon, what will OpenAI’s response be?

Claude 3.5 Sonnet is not like the other LLMs

11 impressive demos of the new model: pic.twitter.com/2oHZdArz6J

- Proper (@ProperPrompter) June 26, 2024

After Meta called off its launch of Meta AI in the EU, Apple is doing the same due to strict laws in the region.

Apple has delayed the rollout of its Apple Intelligence features there as EU tech fans watch the rest of the world get first dibs.

Sounds familiar…

AI companies are getting sued, and for a change, it’s not OpenAI or Meta.

Text-to-audio platforms Suno and Udio generate impressive music, but how did they get so good?

The Recording Industry Association of America is suing the companies, saying they “stole copyrighted sound recordings” to train their AI. When the judge listens to these sample clips, it might be a short day in court.

An AI company using copyrighted material to train its models without paying the creators? We’re as unsurprised as you are.

Recreating copyrighted music isn’t the worst thing AI is being used for, though. A DeepMind study says that the leading form of AI misuse is bad actors creating deepfakes to manipulate opinion.

The rest of the AI misuse list makes for interesting reading.

Are you sure that’s right?

AI models are really good at generating very plausible but completely wrong information.

AI scientists say hallucinations can’t be fixed, but a University of Oxford study identified when AI hallucinations are more likely to occur.

“Semantic entropy” checks the AI model’s confidence level, and it’s also my new polite way to say someone is talking BS.


Even the most advanced LLMs make stuff up when presented with surprisingly simple puzzles. This week users on X posted examples of how the smartest models can’t solve a simple river crossing puzzle.

Is it evidence that LLMs aren’t good at reasoning, or is something else happening here?

AI might struggle with some riddles but it knows you better than you think. A new study found that an AI system can predict how anxious you are from how you react to photos.

The ability of these models to infer human emotions could be very helpful, but might be a source of human anxiety too.

AI open season

When AI companies use the word “open” to describe their models, it rarely means what you think it does.

How “open” are these AI models? Sam took a closer look at which AI models are truly open and why some companies keep certain aspects very much closed.

This week saw an exciting development in the open model space. EvolutionaryScale’s ESM3 is a generative model for biology that turns prompts into proteins.

Previously, scientists looking for a novel protein would have to wait for nature to come up with it or try a hit-or-miss approach in the lab.

Now ESM3 enables scientists to program biology and create proteins beyond nature.

AI events

If you want to level up your marketing efforts then check out the MarTech Summit Hong Kong 2024 happening on 9 July.

The AI Accelerator Institute presents the Generative AI Summit Austin 2024 on 10 July. The agenda sees industry leaders discuss the latest trends in real-world generative AI applications.

In other news…

Here are some other clickworthy AI stories we enjoyed this week:

Meta is incorrectly marking real photos as ‘Made by AI’.
SoftBank CEO says AI that is 10,000 times smarter than humans will come out in 10 years.
OpenAI delays the launch of GPT-4o’s voice assistant to address safety issues.
Anthropic debuts collaboration tools for its Claude AI assistant.
Chinese AI firms woo OpenAI users as the US company plans API restrictions.
OpenAI acquires collaborative screen sharing tool creator Multi.
Toys “R” Us sparks an online backlash after releasing an ad created with OpenAI’s Sora.

this toys r us commercial is made entirely with AI which means the kid is disgusting and ghoulish, the sentiment hollow, and the toys r us brand is dead for at least the third time pic.twitter.com/IRprWZKN8O

- Chris Alsikkan (@AlsikkanTV) June 25, 2024

And that’s a wrap.

Have you tried out the upgraded Claude? The Artifacts window is seriously cool. It’s a sure bet that ChatGPT will get a similar feature very soon.

I love playing with Udio and Suno, but there’s no denying they rip off copyrighted music. Is this the price of progress or is it a showstopper?

I’m still surprised that AI models struggle with a simple river crossing puzzle. We should probably fix that before letting AI control really important stuff like power grids or hospitals.

Let us know what you think and keep sending us links to interesting AI news and research we may have missed.

The post DAI#45 – New top model, lawsuit blues, and puzzled AI appeared first on DailyAI.
