DAI#57 – Tricky AI, exam challenge, and conspiracy cures

Welcome to this week’s roundup of AI news made by humans, for humans.

This week, OpenAI told us that it’s pretty sure o1 is kinda safe.

Microsoft gave Copilot a big boost.

And a chatbot can cure your belief in conspiracy theories.

Let’s dig in.

It’s pretty safe

We were caught up in the excitement of OpenAI’s release of its o1 models last week until we read the fine print. The model’s system card offers interesting insight into the safety testing OpenAI did, and the results may raise some eyebrows.

It turns out that o1 is smarter but also more deceptive, with a “medium” danger level according to OpenAI’s rating system.

Despite o1 being very sneaky during testing, OpenAI and its red teamers say they’re fairly sure it’s safe enough to release. Not so safe if you’re a programmer looking for a job.

If OpenAI’s o1 can pass OpenAI’s research engineer hiring interview for coding – 90% to 100% rate…

…then why would they continue to hire actual human engineers for this position?

Every company is about to ask this question. pic.twitter.com/NIIn80AW6f

– Benjamin De Kraker (@BenjaminDEKR) September 12, 2024

Copilot upgrades

Microsoft unleashed Copilot “Wave 2”, which will give your productivity and content production an additional AI boost. If you were on the fence about Copilot’s usefulness, these new features may be the clincher.

The Pages feature and the new Excel integrations are really cool. The way Copilot accesses your data does raise some privacy questions though.

More strawberries

If all the recent talk about OpenAI’s Strawberry project gave you a craving for the berry, then you’re in luck.

Researchers have developed an AI system that promises to transform how we grow strawberries and other agricultural products.

This open-source application could have a huge impact on food waste, harvest yields, and even the price you pay for fresh fruit and veg at the store.

Too easy

AI models are getting so smart that our benchmarks for measuring them are just about obsolete. Scale AI and CAIS launched a project called Humanity’s Last Exam to fix this.

They want you to submit tough questions that you think could stump leading AI models. If an AI can answer PhD-level questions, we’ll get a sense of how close we are to achieving expert-level AI systems.

If you think you have a good one, you could win a share of $500,000. It’ll have to be really tough, though.

Curing conspiracies

I love a good conspiracy theory, but some of the things people believe are just crazy. Have you tried convincing a flat-earther with simple facts and reasoning? It doesn’t work. But what if we let an AI chatbot have a go?

Researchers built a chatbot using GPT-4 Turbo and they had impressive results in changing people’s minds about the conspiracy theories they believed in.

It does raise some awkward questions about how persuasive AI models are and who decides what ‘truth’ is.

Just because you’re paranoid doesn’t mean they’re not after you.

Stay cool

Is having your body cryogenically frozen part of your backup plan? If so, you’ll be happy to hear AI is making this crazy idea slightly more plausible.

A company called Select AI used AI to accelerate the discovery of cryoprotectant compounds. These compounds stop organic matter from turning into crystals during the freezing process.

For now, the application is for better transport and storage of blood or temperature-sensitive medicines. But if AI helps them find a really good cryoprotectant, cryogenic preservation of humans could go from a moneymaking racket to a plausible option.

AI is contributing to the medical field in other ways that might make you a little nervous. New research shows that a surprising number of doctors are turning to ChatGPT for help diagnosing patients. Is that a good thing?

If you’re excited about what’s happening in medicine and considering a career as a doctor, you may want to rethink that, according to this professor.

This is the final warning for those considering careers as physicians: AI is becoming so advanced that the demand for human doctors will significantly decrease, especially in roles involving standard diagnostics and routine treatments, which will be increasingly replaced by AI.… pic.twitter.com/VJqE6rvkG0

– Derya Unutmaz, MD (@DeryaTR_) September 13, 2024

In other news…

Here are some other clickworthy AI stories we enjoyed this week:

Google’s NotebookLM turns your written content into a podcast. This is crazy good.
When Japan switches on the world’s first zetta-class supercomputer in 2030, it will be 1,000 times faster than the world’s current fastest supercomputer.
SambaNova challenges OpenAI’s o1 model with an open-source Llama 3.1-powered demo.
More than 200 tech industry players sign an open letter asking Gavin Newsom to veto the SB 1047 AI safety bill.
Gavin Newsom signed two bills into law to protect living and deceased performers from AI cloning.
Sam Altman departs OpenAI’s safety committee to make it more “independent”.
OpenAI says the signs of life shown by ChatGPT in initiating conversations are just a glitch.
RunwayML launches Gen-3 Alpha Video to Video feature to paid users of its app.

Gen-3 Alpha Video to Video is now available on web for all paid plans. Video to Video represents a new control mechanism for precise movement, expressiveness and intent within generations. To use Video to Video, simply upload your input video, prompt in any aesthetic direction… pic.twitter.com/ZjRwVPyqem

– Runway (@runwayml) September 13, 2024

And that’s a wrap.

It’s not surprising that AI models like o1 present more risk as they get smarter, but the sneakiness during testing was weird. Do you think OpenAI will stick to its self-imposed safety level restrictions?

The Humanity’s Last Exam project was an eye-opener. Humans are struggling to find questions tough enough to stump AI. What happens after that?

If you believe in conspiracy theories, do you think an AI chatbot could change your mind? Amazon Echo is always listening, the government uses big tech to spy on us, and Mark Zuckerberg is a robot. Prove me wrong.

Let us know what you think, follow us on X, and send us links to cool AI stuff we may have missed.

The post DAI#57 – Tricky AI, exam challenge, and conspiracy cures appeared first on DailyAI.
