Note: apologies for the newsletter being late this week; sickness delayed preparation of this post.
Top News
Apple Intelligence: every new AI feature coming to the iPhone and Mac
Apple has announced “Apple Intelligence,” a suite of AI features for iPhone, Mac, and more at WWDC 2024. Key features include a more conversational Siri, AI-generated “Genmoji,” and integration with OpenAI’s GPT-4o for handling complex requests. Available on iPhone 15 Pro and 15 Pro Max, and M1 or later iPads and Macs, these features will roll out in iOS 18, iPadOS 18, and macOS Sequoia. Siri will gain capabilities like managing notifications, summarizing texts, and cross-app actions with “onscreen awareness.” Apple emphasizes privacy, ensuring many features work on-device and using “Private Cloud Compute” for complex tasks without storing user data. Additionally, AI enhancements include the “Image Playground” for dynamic image creation, improved photo search and editing in the Photos app, and seamless integration of ChatGPT for enhanced AI-driven responses and content creation.
OpenAI and Apple announce partnership to integrate ChatGPT into Apple experiences
Introducing Apple’s On-Device and Server Foundation Models
Apple stock surges to record high after AI announcements
Why Apple is taking a small-model approach to generative AI
Luma AI’s Dream Machine expands access to generative AI video creation
Luma AI has launched the public beta of its new AI video generation model, Dream Machine, which has garnered overwhelming user interest. Dream Machine enables users to create high-quality videos from simple text prompts such as “a cute Dalmatian puppy running after a ball on the beach at sunset.” The model can generate 120 frames in 120 seconds, a generation rate of one frame per second. Unlike competitors such as OpenAI’s Sora and Lightricks Inc.’s LTX Studio, Luma AI’s platform is open-source, accessible to all users immediately, and promises future integrations with creative tools like Adobe. Early beta testers praise Dream Machine’s ability to render detailed objects and coherent stories, though it struggles with natural movements and morphing effects.
Runway unveils new hyper realistic AI video model Gen-3 Alpha, capable of 10-second-long clips
Runway ML, a New York City-based startup, has launched Gen-3 Alpha, an advanced generative AI video creation model. Following their previous Gen-1 and Gen-2 models, the Gen-3 Alpha aims to re-establish Runway’s prominence in the market dominated by competitors like OpenAI and Luma AI. The Gen-3 Alpha can generate high-quality, realistic 10-second video clips with rapid generation times of 45 seconds for 5-second clips and 90 seconds for 10-second clips. Initially available to paid subscribers, the model will eventually be accessible to free-tier users. Gen-3 Alpha is part of Runway’s effort to build General World Models capable of simulating diverse real-world interactions, powered by large-scale multimodal training infrastructure.
Other News
Tools
Announcing the Open Release of Stable Diffusion 3 Medium, Our Most Sophisticated Image Generation Model to Date – Collaboration with NVIDIA and AMD has enhanced the performance of Stable Diffusion 3 Medium, which is now available for download and use under specific licenses, with plans for continuous improvement.
NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models – NVIDIA introduces Nemotron-4 340B, a family of open models for generating synthetic data to train large language models across various industries, offering a scalable and cost-effective solution for developers.
Google brings Gemini Nano to more Pixel devices and enhances Recorder summaries – Google introduces new features and upgrades for Pixel devices, including expanded access to the Gemini Nano generative AI model, enhanced Recorder summaries, and car crash detection on Pixel Watch 2.
Leonardo AI image generator adds new video mode – Leonardo AI has launched a new image-to-video tool called Motion, which can turn a generated image into short video clips with impressive results, and is available to all users.
Leo, Brave’s in-browser AI assistant, now incorporates real-time Brave Search results for even better answers – Brave’s in-browser AI assistant, Leo, now incorporates real-time Brave Search results, providing more accurate and up-to-date answers, especially for current events or topics where the initial language model training may be outdated or lack full context.
AI music startups Suno and Udio both add audio uploads feature – AI music startups Suno and Udio have added a new feature allowing users to upload their own audio to use as seeds for creation, with strict copyright and usage guidelines in place.
LinkedIn leans on AI to do the work of job hunting – LinkedIn is launching new AI tools to help users with job hunting, cover letter and job application writing, personalized learning, and a new search experience.
Business
Report: OpenAI Doubled Annualized Revenue in 6 Months – OpenAI’s annualized revenue has more than doubled in the last six months, reaching $3.4 billion, and the company has announced new executive appointments and a partnership with Apple.
Report: OpenAI Could Become a For-Profit Business – OpenAI is considering a shift to become a for-profit company, potentially leading to an IPO and increased regulatory oversight, while also collaborating with Apple to enhance AI capabilities.
OpenAI to use Oracle’s chips for more AI compute – OpenAI partners with Oracle and Microsoft to increase AI compute capacity for ChatGPT, utilizing Oracle’s chips and Microsoft Azure AI platform on Oracle’s infrastructure.
OpenAI welcomes Sarah Friar (CFO) and Kevin Weil (CPO) – OpenAI welcomes Sarah Friar as CFO and Kevin Weil as CPO, both bringing extensive experience from prominent companies and a shared commitment to advancing AI responsibly.
Mistral closes €600m at €5.8bn valuation with new lead investor – Mistral, a Paris-based AI company, has closed a €600m funding round at a €5.8bn valuation, with new lead investor DST Global and existing backers General Catalyst, Lightspeed, and Andreessen Horowitz, among others.
Elon Musk drops suit against OpenAI and Sam Altman – Elon Musk withdraws lawsuit against OpenAI and co-founders Sam Altman and Greg Brockman, following public criticism of OpenAI’s partnership with Apple.
China Is Testing More Driverless Cars Than Any Other Country – China is leading the way in testing driverless cars, with a fleet of 500 taxis navigating the busy streets of Wuhan and plans to add 1,000 more, while the government provides significant support and limits public discussion of safety incidents.
Picsart teams up with Getty to take on Adobe’s ‘commercially-safe’ AI – Picsart and Getty Images collaborate to launch a commercially-safe AI image generator trained exclusively on licensed content, addressing concerns about AI-generated content violating copyright laws.
Meta pauses AI models launch in Europe – Meta Platforms is delaying the launch of its AI models in Europe at the request of the Irish privacy regulator, following complaints and an advocacy group’s call to data protection authorities in multiple countries.
Japan’s Sakana AI by Google alums to become unicorn in under a year – Former Google researchers’ startup Sakana AI is set to become a unicorn in under a year, with plans for a funding round that would value it at over $1 billion.
Tesla investors sue Elon Musk for launching a rival AI company – Shareholders have sued Elon Musk and the Tesla board, accusing them of diverting resources to the rival AI company xAI.
Tesla shareholders sue Musk for starting competing AI company – The suit alleges that Musk diverted talent and resources from Tesla to a competing AI company, and claims breach of fiduciary duties and unjust enrichment.
Adobe overhauls terms of service to say it won’t train AI on customers’ work – Adobe is updating its terms of service to clarify that it won’t train AI on customers’ work, aiming to win back trust after facing backlash from users who feared their work would be used for AI training.
Tesla claims it has 2 Optimus humanoid robots working autonomously in factory – Tesla says two of its Optimus humanoid robots are working autonomously in a factory, marking a significant step forward in their development and potential future sale.
Google DeepMind Shifts From Research Lab to AI Product Factory – Google DeepMind’s shift from a research lab to an AI product factory is evident as two companies announce AI products built using Google’s breakthroughs.
Research
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models – Multimodal language models are enhanced with a visual sketchpad, enabling them to draw and reason visually, resulting in significant performance improvements across various tasks.
An Empirical Study of Mamba-based Language Models – Mamba-based language models are compared to Transformers. Pure SSMs match or exceed Transformers on many tasks but lag behind on tasks requiring strong copying or in-context learning abilities, whereas the 8B Mamba-2-Hybrid exceeds the 8B Transformer on all standard tasks and is predicted to be up to 8x faster when generating tokens at inference time.
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling – Samba, a hybrid architecture combining Mamba and Sliding Window Attention, efficiently models sequences with infinite context length, outperforms state-of-the-art models, and extrapolates efficiently to 256K context length with perfect memory recall.
Introducing Lamini Memory Tuning: 95% LLM Accuracy, 10x Fewer Hallucinations – Lamini Memory Tuning is a breakthrough method that improves factual accuracy and reduces hallucinations in AI language models, achieving 95% accuracy for a Fortune 500 customer compared to 50% with other approaches.
Mixture-of-Agents Enhances Large Language Model Capabilities – Leveraging the collective strengths of multiple large language models through a Mixture-of-Agents (MoA) methodology achieves state-of-the-art performance in natural language understanding and generation tasks.
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models – A novel thought-augmented reasoning approach called Buffer of Thoughts (BoT) enhances the accuracy, efficiency, and robustness of large language models (LLMs) by using a meta-buffer to store informative high-level thoughts and dynamically updating that meta-buffer to solve reasoning-intensive tasks.
Generative AI takes robots a step closer to general purpose – Generative AI in robotics, particularly the use of diffusion models, is a promising approach to training general-purpose humanoid robots to perform a wide range of tasks, improving task performance by 20% and enabling them to adapt to unfamiliar tasks.
BERTs are Generative In-Context Learners – BERTs, specifically DeBERTa, can operate as generative models without additional training, matching and even surpassing GPT-3 in in-context learning capabilities.
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities – A vision model called 4M-21 is trained on diverse modalities and tasks, expanding its capabilities to solve at least 3x more tasks/modalities than existing models without a loss in performance.
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers – VALL-E 2 is a neural codec language model that achieves human parity in zero-shot text-to-speech synthesis, surpassing previous systems in speech robustness, naturalness, and speaker similarity.
Can Language Models Serve as Text-Based World Simulators? – Language models can potentially serve as text-based world simulators, offering a new way to understand and interact with the world.
The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models – Concise Chain-of-Thought (CCoT) prompting reduces response length without significantly impacting problem-solving performance in large language models, with practical implications for AI systems engineers and researchers.
“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models – Large language models are being manipulated by jailbreak prompts, which are evolving to bypass safeguards and elicit harmful content, posing new challenges for vendors in proactive detection and defense.
OpenVLA: An Open-Source Vision-Language-Action Model – OpenVLA is an open-source vision-language-action model trained on diverse robot demonstrations, demonstrating strong results for generalist manipulation and efficient fine-tuning for new tasks.
Matching Anything by Segmenting Anything – A novel method called MASA uses object segmentation to learn instance-level correspondence and achieve robust instance association learning across diverse video domains without tracking labels.
Concerns
It Looked Like a Reliable News Site. It Was an A.I. Chop Shop. – An AI-generated news article falsely implicated an Irish broadcaster in a misconduct trial, causing outrage and highlighting the dangers of AI misinformation.
Waymo issues software and mapping recall after robotaxi crashes into a telephone pole – Waymo issues a voluntary software recall after a driverless vehicle collides with a telephone pole, prompting increased regulatory scrutiny of the autonomous vehicle industry.
Buzzy AI Search Engine Perplexity Is Directly Ripping Off Content From News Outlets – AI-powered search startup Perplexity appears to be plagiarizing journalists’ work through its newly launched feature, Perplexity Pages, which lets people curate content on a particular topic.
Microsoft Delays Release of Its Controversial Recall AI Feature – Microsoft is moving Recall to a beta program to address security concerns and restricting it to new Copilot+ PCs.
25-year-old Anthropic employee says she may only have 3 years left to work because AI will replace her – AI is predicted to replace jobs across all levels, with a 25-year-old employee at Anthropic stating that she may only have 3 years left to work, as AI is expected to excel at any kind of online work and could lead to mass unemployment.
Analysis
Indian election was awash in deepfakes – but AI was a net positive for democracy – AI played a significant role in the Indian election, with political parties using deepfakes and AI-generated content for targeted communication, translation of speeches, and personalized voter outreach, demonstrating the potential for AI to positively impact participatory democracy.
Copyright © 2024 Skynet Today, All rights reserved.