Top News
OpenAI closes the largest VC round of all time
OpenAI has raised $6.6 billion in a new funding round, led by Thrive Capital, valuing the company at $157 billion. Major investors include Microsoft, Khosla Ventures, Nvidia, and SoftBank. Microsoft contributed $750 million on top of its previous $13 billion investment. The funding will drive AI research and expand OpenAI’s computing capacity as the company builds on its success with tools like ChatGPT, which has gained 250 million weekly active users.
This deal positions OpenAI among the largest venture-backed startups, alongside SpaceX and ByteDance, and highlights the tech industry’s belief in AI’s potential. OpenAI is also exploring a transition from a nonprofit to a for-profit structure, potentially awarding CEO Sam Altman equity. The company faces growing competition from rivals such as Google and Amazon, but has discouraged investors from supporting competitors like Anthropic and xAI.
OpenAI’s DevDay brings Realtime API and other treats for AI app developers
OpenAI has announced several new tools at its 2024 DevDay, including a public beta of its “Realtime API” for building apps with low-latency, AI-generated voice responses. The Realtime API allows developers to create nearly real-time, speech-to-speech experiences in their apps, with six distinct voices provided by OpenAI. The company also introduced vision fine-tuning in its API, enabling developers to use images and text to fine-tune their applications of GPT-4o, and a model distillation feature to improve the performance of smaller AI models. Despite these advancements, OpenAI did not announce any new AI models during DevDay this year, and developers awaiting the release of OpenAI o1 or the video generation model, Sora, will have to wait longer.
15 Key Takeaways From OpenAI Dev Day
OpenAI Just Announced 4 New AI Features and They’re Available Now
People are using Google study software to make AI podcasts—and they’re weird and amazing
Users are turning to Google’s study software, NotebookLM, to create AI-generated podcasts: the tool generates a podcast called Deep Dive that features realistic male and female voices discussing uploaded content. The AI system is designed to create engaging audio in an “upbeat, hyper-interested tone,” according to Raiza Martin, the product lead for NotebookLM. The company is now working on adding more customization options, such as changing the length, format, voices, and languages. Despite its success, the tool is not immune to issues that affect generative AI, such as hallucinations and bias. Notably, Andrej Karpathy, a member of OpenAI’s founding team and former director of AI at Tesla, used NotebookLM to create his own AI podcast series, Histories of Mysteries, in just two hours.
Black Forest Labs releases Flux 1.1 Pro and an API
Black Forest Labs (BFL), the startup founded by the original creators of the Stable Diffusion AI image generation model, has released a new, faster text-to-image model called Flux 1.1 Pro, along with a paid application programming interface (API). The new model, which is six times faster than its predecessor, Flux 1.0 Pro, improves image quality, prompt adherence, and diversity, and is available through partners like together.ai, Replicate, fal.ai, and Freepik. The BFL API allows developers to integrate the company’s generative capabilities into their own applications, with pricing starting at 4 cents per image. This release comes amid a contentious legal landscape, with generative AI companies like Stability AI and Midjourney facing lawsuits over their training datasets, and positions BFL as a major player in the AI-driven media space.
Other News
Tools
OpenAI launches new ‘Canvas’ ChatGPT interface tailored to writing and coding projects – The interface opens a separate workspace window where users can generate writing or code, and then select sections for the AI model to edit. Canvas is currently in beta and is being rolled out to ChatGPT Plus and Teams users, with Enterprise and Edu users to follow.
Microsoft gives Copilot a voice and vision in its biggest redesign yet – Microsoft’s Copilot AI assistant is undergoing a major redesign, adding voice and vision capabilities to create a more personalized and interactive experience.
Meta announces Movie Gen, an AI-powered video generator – The model produces high-definition footage complete with sound and can edit existing footage or still images using text inputs.
Pika 1.5 is now live — AI video generator just got major upgrades – Pika Labs has launched Pika 1.5, an advanced AI video generation model with a strong focus on hyper-realism, offering lifelike human and creature movements and sophisticated camera techniques to enhance video creation.
Google Photos is rolling out AI-powered search now – and it could be its biggest upgrade in years – Google Photos is rolling out a Gemini-powered upgrade that allows users to search their photo library with natural language questions, potentially replacing traditional search methods and raising privacy concerns.
Pinterest rolls out genAI tools for product imagery to advertisers – Pinterest introduces genAI tools for advertisers to enhance product imagery, attract more clicks, and create campaigns with less input, resulting in higher clickthrough rates and lower cost-per-click.
Google’s Visual Search Can Now Answer Even More Complex Questions – Google’s visual search tool, Google Lens, has evolved to support multimodal searches, expanded shopping features, and real-time video capture, potentially paving the way for a new kind of smart glasses.
Hacking Generative AI for Fun and Profit – Generative AI can be powerful for prototyping new tools, as demonstrated by a group at Sundai Club using it to build a tool to help journalists identify interesting research papers.
Business
What the Heck Is Going On At OpenAI? – Executives leaving OpenAI express concerns about the company’s accelerationist approach to AI development and its shift to a for-profit model, raising worries about safety and control of artificial general intelligence.
Anthropic hires OpenAI co-founder Durk Kingma – Durk Kingma, co-founder of OpenAI, has announced his move to Anthropic, expressing excitement to contribute to the development of responsible AI systems.
Cerebras, an A.I. Chipmaker Trying to Take On Nvidia, Files for an I.P.O. – Cerebras, a chip company, is set to debut on the stock market, aiming to challenge Nvidia in the artificial intelligence industry.
AI coding startup Poolside raises $500M from eBay, Nvidia and others – The round brings Poolside’s total raised to $626 million and its valuation to $3 billion, with plans to use the funding to train future AI models and bolster go-to-market and R&D efforts.
‘We have to build a very strong revenue engine’: Poolside AI gets ready to release product after a year of secrecy – AI startup Poolside, after securing significant funding, is preparing to release its GenAI model to the market and focus on building a strong revenue engine, while also expanding its team across Europe and the US.
Y Combinator is being criticized after it backed an AI startup that admits it basically cloned another AI startup – Y Combinator-backed AI startup PearAI faces criticism for cloning another AI project and initially using a closed license, sparking controversy over open source principles and YC’s selection process.
The Race to Block OpenAI’s Scraping Bots Is Slowing Down – AI companies are striking deals with publishers to prevent their web crawlers from being blocked, with OpenAI scoring a clear win as its crawlers are no longer getting blocked at the rate they once were.
Waymo to add Hyundai EVs to robotaxi fleet under new multiyear deal – Waymo and Hyundai have formed a strategic partnership, with Waymo adding Hyundai’s Ioniq 5 electric vehicle to its robotaxi fleet and integrating its autonomous technology, with initial on-road testing to begin by late 2025.
Waymo hires Tesla’s head of vehicle programs ahead of Robotaxi unveiling – Daniel Ho, Tesla’s former head of vehicle programs, joins Waymo to accelerate autonomous vehicle technology just ahead of Tesla’s Robotaxi unveiling.
Character.ai leaves LLM building behind due to expense – Character.ai shifts focus from building larger LLMs to enhancing consumer products.
India to fabricate its first chip in two years as Nvidia, AMD and Micron pledge to expand to the country – India’s commerce and industry minister, Piyush Goyal, said that Indian behemoth Tata and other domestic companies are working to make India’s semiconductor dream a reality.
Research
AI simulation gives people a glimpse of their potential future self – The simulation lets users interact with a virtual version of their future selves, which offers insights and advice based on the user’s input, helping to improve their sense of future self-continuity and reduce anxiety about the future.
Apple releases Depth Pro, an AI model that rewrites the rules of 3D vision – Apple’s new AI model, Depth Pro, revolutionizes 3D vision by generating high-resolution depth maps from single 2D images in just 0.3 seconds, without relying on camera data, and excels in accuracy and detail, making it versatile for applications like augmented reality and autonomous vehicles.
‘In awe’: scientists impressed by latest ChatGPT model o1 – OpenAI’s latest large language model, OpenAI o1, impresses scientists with its detailed and coherent responses in the field of quantum physics, its ability to beat PhD-level scholars in challenging tests, and its potential to accelerate scientific research.
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions – EMOVA empowers large language models with end-to-end speech capabilities and achieves state-of-the-art performance on vision-language and speech benchmarks while supporting omni-modal spoken dialogue with vivid emotions.
Were RNNs All We Needed? – Decade-old RNNs like LSTMs and GRUs can be efficiently trained in parallel by removing the hidden state dependencies from their gates, leading to minimal versions that achieve comparable performance to recent sequence models.
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models – Large language models struggle to generate responses of specific lengths, but a new model-agnostic approach called Ruler, using Meta Length Tokens, enhances their ability to adhere to length constraints, demonstrating versatility and generalization.
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning – MM1.5, a new family of multimodal large language models (MLLMs), is designed to enhance capabilities in text-rich image understanding, visual referring, and multi-image reasoning through careful data curation and training strategies.
MIO: A Foundation Model on Multimodal Tokens – Introducing MIO, a foundation model capable of understanding and generating speech, text, images, and videos in an end-to-end, autoregressive manner, showing competitive, and in some cases superior, performance compared to previous baselines.
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation – Flex3D is a two-stage framework that leverages a fine-tuned multi-view image diffusion model and a video diffusion model to generate a pool of candidate views, curating high-quality and reliable views for 3D reconstruction, and employing a Flexible Reconstruction Model (FlexRM) to achieve superior performance in 3D generation tasks.
Concerns
AI Safety Culture Confronts Capitalism – Leading AI labs are grappling with the challenge of prioritizing safety over profit in the development and deployment of AI systems, leading to cultural clashes, high-profile exits, and regulatory debates.
OpenAI’s newest creation is raising shock, alarm, and horror among staffers: a new logo – OpenAI’s staff is shocked and alarmed by the proposed new logo, preferring to keep the current hexagonal flower symbol, as the company undergoes a redesign and corporate restructuring.
Policy
A Narrow Path – To ensure the safe development of artificial intelligence, “A Narrow Path” proposes a three-phase approach involving international governance, safety measures, and scientific understanding, aiming to prevent the emergence of superintelligent AI within the next two decades.
Judge blocks California’s new AI law in case over Kamala Harris deepfake Musk reposted – California’s new AI law, AB 2839, which aimed to target the spread of AI deepfakes on social media, has been temporarily blocked by a federal judge due to concerns about its broad and potentially unconstitutional nature.
Artificial intelligence presented in California does not come to European soil: severely limited options for Google Pixel 9 and iPhone 16 – European legislation severely limits the advanced AI features of the iPhone 16 and Google Pixel 9, restricting access to personal data and hindering functions like Siri Suggestions and Google Assistant Gemini, making it questionable whether these devices are worth investing in for European users.
Analysis
What Kind of Writer Is ChatGPT? – Using ChatGPT for writing assistance raises questions about originality and plagiarism, as it serves more as a sounding board than a perfect plagiarism tool, allowing writers to explore ideas and sharpen prose.