Machine Learning

Understanding Language Model Distillation

August 11, 2024

Knowledge Distillation (KD) has become a key technique in the field of Artificial Intelligence, especially in the context of Large…

Development

Revolutionizing AI with Mamba: A Survey of Its Capabilities and Future Directions

August 11, 2024

Deep learning has revolutionized various domains, with Transformers emerging as a dominant architecture. However, Transformers must improve the processing of…

Development

Integrating Stereoelectronic Effects into Molecular Graphs: A Novel Approach for Enhanced Machine Learning Representations and Molecular Property Predictions

August 11, 2024

Traditional molecular representations, primarily focused on covalent bonds, have neglected crucial aspects like delocalization and non-covalent interactions. Existing machine learning…

Development

CMU-MATH Team’s Innovative Approach Secures 2nd Place at the AIMO Prize

August 11, 2024

Recently, our CMU-MATH team proudly clinched 2nd place in the Artificial Intelligence Mathematical Olympiad (AIMO) out of 1,161 participating teams,…

Development

Crab Framework Released: An AI Framework for Building LLM Agent Benchmark Environments in a Python-Centric Way

August 11, 2024

The development of autonomous agents capable of performing complex tasks across various environments has gained significant traction in artificial intelligence…

Development

This AI Paper from Shanghai AI Laboratory Introduces Lumina-mGPT: A High-Resolution Text-to-Image Generation Model with Multimodal Generative Pretraining

August 11, 2024

Multimodal generative models represent an exciting frontier in artificial intelligence, focusing on integrating visual and textual data to create systems…

Development

World’s First Major Artificial Intelligence AI Law Enters into Force in EU: Here’s What It Means for Tech Giants

August 11, 2024

The European Artificial Intelligence Act came into force on August 1, 2024. It is a significant milestone in the global…

Development

TestART: Achieving 78.55% Pass Rate and 90.96% Coverage with a Co-Evolutionary Approach to LLM-Based Unit Test Generation and Repair

August 11, 2024

Unit testing aims to identify and resolve bugs at the earliest stages by testing individual components or units of code.…

Development

Meet Reducto: An AI-Powered Startup Building Vision Models to Turn Complex Documents into LLM-Ready Inputs

August 11, 2024

Unstructured file types, such as spreadsheets and PDFs, account for about 80% of all company data. PDFs constitute the de facto…

Development

LiteLLM: Call 100+ LLMs Using the Same Input/Output Format

August 11, 2024

Managing and optimizing API calls to various Large Language Model (LLM) providers can be complex, especially when dealing with different…

Development

BiomedGPT: A Versatile Transformer-Based Foundation Model for Biomedical AI with Enhanced Multimodal Capabilities and Performance

August 11, 2024

Traditional biomedical AI models are often specialized and lack flexibility, making them less effective for real-world applications requiring integrating…

Development

CodexGraph: An Artificial Intelligence AI System that Integrates LLM Agents with Graph Database Interfaces Extracted from Code Repositories

August 11, 2024

Large Language Models (LLMs) have demonstrated exceptional performance on isolated code tasks, such as HumanEval and MBPP, but they struggle…

Development

DistillGrasp: A Unique AI Method for Integrating Features Correlation with Knowledge Distillation for Depth Completion of Transparent Objects

August 11, 2024

RGB-D cameras have a difficult time accurately capturing the depth of transparent objects because of the optical effects of reflection…

Development

LLaVA-OneVision: A Family of Open Large Multimodal Models (LMMs) for Simplifying Visual Task Transfer

August 11, 2024

A key goal in the development of AI is the creation of general-purpose assistants utilizing Large Multimodal Models (LMMs). Building…

Development

Researchers at FPT Software AI Center Introduce AgileCoder: A Multi-Agent System for Generating Complex Software, Surpassing MetaGPT and ChatDev

August 10, 2024

Code Large Language Models (CodeLLMs) have demonstrated remarkable proficiency in generating code. However, they struggle with complex software engineering…

Development

Qwen2-Math Released: A Comprehensive AI Suite Featuring Models Ranging from 1.5B to 72B Parameters, Transforming Mathematical Computation

August 10, 2024

The Qwen Team has recently released the Qwen2-Math series. This release, encompassing several model variants tailored for distinct applications,…

Development

Unraveling Human Reward Learning: A Hybrid Approach Combining Reinforcement Learning with Advanced Memory Architectures

August 10, 2024

Human reward-guided learning is often modeled using simple RL algorithms that summarize past experiences into key variables like Q-values, representing…

Development

Parler-TTS Released: A Fully Open-Sourced Text-to-Speech Model with Advanced Speech Synthesis for Complex and Lightweight Applications

August 10, 2024

Parler-TTS has emerged as a robust text-to-speech (TTS) library, offering two powerful models: Parler-TTS Large v1 and Parler-TTS Mini v1.…

Development

Abacus AI Introduces LiveBench AI: A Super Strong LLM Benchmark that Tests all the LLMs on Reasoning, Math, Coding and more

August 10, 2024

Abacus.AI, a prominent player in AI, has recently unveiled its latest innovation: LiveBench AI. This new tool is designed to…

Development

Trinity-2-Codestral-22B and Tess-3-Mistral-Large-2-123B Released: Pioneering Open Source Advances in Computational Power and AI Integration

August 10, 2024

Migel Tissera has recently unveiled two groundbreaking projects on Hugging Face: Trinity-2-Codestral-22B and Tess-3-Mistral-Large-2-123B. These projects represent a leap forward…