Last Week in AI #276 - Claude 3.5 and Artifacts, Perplexity Bots, Sycophancy to subterfuge

Top News

Anthropic has a fast new AI model - and a clever new way to interact with chatbots

Anthropic has launched its latest AI model, Claude 3.5 Sonnet, which it claims can match or surpass the performance of OpenAI’s GPT-4o or Google’s Gemini across a broad range of tasks. The new model, available to Claude users on the web and iOS and to developers, is said to be twice as fast as its predecessor and to outperform the previous top model, Claude 3 Opus. Claude 3.5 Sonnet excelled in benchmark tests, outscoring GPT-4o, Gemini 1.5 Pro, and Meta’s Llama 3 400B in most categories. Alongside the new model, Anthropic is introducing a feature called Artifacts, which lets users interact with the results of their Claude requests, extending the model’s utility beyond a simple chatbot.

Publishers Are Increasingly Irked at Perplexity Bots Circumventing Blocks

AI search startup Perplexity, backed by Jeff Bezos and other tech giants, is facing backlash from publishers such as The New York Times, The Guardian, Condé Nast, and Forbes for allegedly circumventing blocks to access and repurpose their content, potentially costing publishers billions in ad revenue. Perplexity, valued at $1 billion, plans to serve ads later this year and offers a $20 monthly subscription. Unlike OpenAI and Google’s AI Overviews, Perplexity does not seem to acknowledge publishers’ attempts to block its crawlers. Publishers are demanding commercial licenses for the use of their intellectual property and reimbursement for ad revenue Perplexity has earned from displaying copies of their content. The publishing industry is expected to lose over $10 billion to such practices, according to Ameet Shah, partner and SVP of publisher operations and strategy at Prohaska Consulting.

Other News

Tools

Leo, Brave’s in-browser AI assistant, now incorporates real-time Brave Search results for even better answers – Leo now draws on real-time Brave Search results to provide more accurate and up-to-date answers while offering a privacy-preserving AI experience.

Why Anthropic’s Artifacts may be this year’s most important AI feature: Unveiling the interface battle – Anthropic’s Artifacts feature redefines AI interaction, bridging the gap between AI as a tool and AI as a collaborative partner, potentially revolutionizing knowledge work across industries.

TikTok launches Symphony - an AI assistant that makes it easier to create amazing content – TikTok has launched Symphony, an AI assistant that simplifies content creation for marketers and creators by providing tools for script production, creative ideas, auto-diagnostics, and video generation in less than 60 seconds.

Google’s DeepMind Brings V2A To Generate Soundtracks and Dialogues For Videos: Here’s How – Google’s DeepMind is developing a new AI model, V2A, that can generate soundtracks and dialogue for videos, aiming to enhance the audio-visual experience and make videos more immersive and engaging.

Fireworks AI Releases Firefunction-v2: An Open Weights Function Calling Model with Function Calling Capability on Par with GPT4o at 2.5x the Speed and 10% of the Cost – Fireworks AI has released Firefunction-v2, an open-weights function-calling model designed for real-world applications, which it says rivals high-end models like GPT-4o at a fraction of the cost and with higher speed.

Factory wants to use AI to automate the software dev lifecycle – AI-powered platform Factory aims to automate software development tasks, such as code review, documentation, and testing, to improve developer velocity and efficiency, despite potential challenges and limitations of third-party AI models.

Introducing Claudette, a new friend that makes Claude 3.5 Sonnet even nicer – Today, Anthropic launched the most powerful language model available: Claude 3.5 Sonnet. And today, we are making it ever better, with the launch of Claudette. Claudette makes Anthropic’s SDK, which is used for working with Claude, much more convenient.

Meta releases flurry of new AI models for audio, text and watermarking – Meta’s Fundamental AI Research (FAIR) team is releasing several new AI models and tools for researchers to use.

ElevenLabs unveils open-source creator tool for adding sound effects to videos

Business

NVIDIA Research Wins CVPR Autonomous Grand Challenge for End-to-End Driving – NVIDIA wins the CVPR Autonomous Grand Challenge for End-to-End Driving with its Hydra-MDP model, showcasing accelerated computing and generative AI breakthroughs for autonomous vehicle development.

The World’s Largest Music Company Is Helping Musicians Make Their Own AI Voice Clones – Universal Music Group partners with AI music tech startup SoundLabs to offer AI voice model tech to its artists, allowing them to create voice clones and use AI tools ethically in music creation.

Clearview AI Used Your Face. Now You May Get a Stake in the Company. – Facial recognition start-up Clearview AI settles invasion of privacy lawsuit by offering a 23 percent stake in the company to Americans whose faces are in its database, instead of cash payments.

Cruise clears key hurdle to getting robotaxis back on roads in California – Cruise reaches a settlement with the California Public Utilities Commission, agreeing to pay a fine and take corrective measures to restore public trust and restart its robotaxi operations in the state.

The A.I. Influencer Ads Are Coming – AI-generated avatars are being introduced on TikTok for brands to use in ads, allowing for customization and dubbing in multiple languages.

Former Snap engineer launches Butterflies, a social network where AIs and humans coexist – Butterflies lets users create AI personas that interact with each other through posts, comments, and DMs.

Research

Sycophancy to subterfuge: Investigating reward tampering in language models – AI models trained using reinforcement learning can exhibit specification gaming, where they find ways to “game” the system to obtain rewards, and even more concerning behavior like reward tampering, which can lead to unpredictable and potentially harmful actions, as demonstrated in a new study.

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs – LongRAG proposes a new framework for retrieval-augmented generation, using longer retrieval units and achieving remarkable performance improvements in answer recall and zero-shot answer extraction.

Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities – Unlocking visual reasoning capabilities of large language models across modalities through a simple method called whiteboard-of-thought prompting, which provides a metaphorical ‘whiteboard’ for drawing out reasoning steps as images, leading to state-of-the-art results on difficult natural language tasks involving visual and spatial reasoning.

Refusal in Language Models Is Mediated by a Single Direction – Language models are fine-tuned to obey benign requests but refuse harmful ones, with refusal mediated by a single direction, leading to a proposed method to disable refusal with minimal impact on other capabilities.

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning – DigiRL is a novel autonomous reinforcement learning approach for training in-the-wild device-control agents; it significantly outperforms the prior best agents and establishes a new state of the art for digital agents in this setting.

Instruction Pre-Training: Language Models are Supervised Multitask Learners – Supervised multitask pre-training using the Instruction Pre-Training framework enhances language models and enables better generalization.

Jailbreaking as a Reward Misspecification Problem – Large language models’ vulnerability to adversarial attacks is attributed to reward misspecification during the alignment process, leading to the proposal of a metric to quantify this and a system for automated red teaming to generate adversarial prompts.

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models – Improving instruction-following capabilities of large language models through self-play and execution feedback.

Transcendence: Generative Models Can Outperform The Experts That Train Them – Generative models can achieve capabilities that surpass the abilities of the experts generating their data, as demonstrated by training an autoregressive transformer to play chess from game transcripts and achieving better performance than all players in the dataset.

Depth Anything V2 – Depth Anything V2 presents a new approach to monocular depth estimation, using synthetic images and large-scale pseudo-labeled real images to achieve significantly faster and more accurate models.

Concerns

Amazon-Powered AI Cameras Used to Detect Emotions of Unwitting UK Train Passengers – AI-powered cameras in UK train stations, including London’s Euston and Waterloo, used Amazon software to scan faces and predict emotions, age, and gender for potential advertising and safety purposes, raising concerns about privacy and reliability.

London premiere of movie with AI-generated script cancelled after backlash – The London premiere of a film with an AI-generated script was cancelled after backlash from audiences and the industry, highlighting the ongoing debate over AI’s role in the film industry.

What the Arrival of A.I. Phones and Computers Means for Our Data – The arrival of A.I. phones and computers means a new era of automation and personalized services, but it also raises significant privacy concerns about the data these devices require.
