    Instruct-MusicGen: A Novel AI Approach to Text-to-Music Editing that Fosters Joint Musical and Textual Controls

    June 12, 2024

    Researchers from C4DM, Queen Mary University of London, Sony AI, and Music X Lab, MBZUAI, have introduced Instruct-MusicGen to address the challenge of text-to-music editing, where textual queries are used to modify music, such as changing its style or adjusting instrumental components. Existing methods either train specialized models from scratch, which is resource-intensive, or rely on additional steps to reconstruct the edited audio, which leads to imprecise results. The study aims to develop a more efficient and effective method that leverages pre-trained models to perform high-quality music editing based on textual instructions.

    Current methods for text-to-music editing include training specialized models from scratch, which is inefficient and resource-heavy, and using large language models to interpret and edit music, often resulting in imprecise audio reconstruction. These methods are either too costly or fail to deliver accurate results. To overcome these challenges, the researchers propose Instruct-MusicGen, a novel approach that fine-tunes a pre-trained MusicGen model to follow editing instructions efficiently. This approach introduces a text fusion module and an audio fusion module to the original MusicGen architecture, allowing it to process instruction texts and audio inputs concurrently. Instruct-MusicGen significantly reduces the need for extensive training and additional parameters while achieving superior performance across various tasks.

    Instruct-MusicGen enhances the original MusicGen model by incorporating two new modules: the audio fusion module and the text fusion module. The audio fusion module allows the model to accept and process external audio inputs, enabling precise audio editing. This is achieved by duplicating self-attention modules and incorporating cross-attention between the original music and the conditional audio. The text fusion module modifies the behavior of the text encoder to handle instruction inputs, allowing the model to follow text-based editing commands effectively. The combined modules enable Instruct-MusicGen to add, separate, and remove stems from music audio based on textual instructions.
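To make the cross-attention idea concrete, here is a minimal sketch of scaled dot-product cross-attention in plain Python. It is not the authors' implementation: the function names, the toy vectors, and the omission of learned query/key/value projections are all assumptions made for illustration. It only shows how states of the music being generated (queries) can attend to features of a conditional audio input (keys and values).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one sequence
    and keys/values from another (hypothetical helper, no learned weights)."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        fused = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(fused)
    return outputs


# Toy data (assumed): two music states attend to three conditional-audio frames.
music_states = [[1.0, 0.0], [0.0, 1.0]]
cond_audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused_states = cross_attention(music_states, cond_audio, cond_audio)
```

Each output row is a weighted average of the conditional-audio frames, so the generated music's representation is pulled toward the parts of the input audio it matches most closely. A real model would apply learned projections and multiple heads on top of this.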

    The model was trained using a synthesized dataset created from the Slakh2100 dataset, which includes high-quality audio tracks and corresponding MIDI files. The training process was optimized to require only 8% additional parameters compared to the original MusicGen model and completed within 5,000 steps, significantly reducing resource usage. The performance of Instruct-MusicGen was evaluated on two datasets: the Slakh test set and the out-of-domain MoisesDB dataset. The model outperformed existing baselines in various tasks, demonstrating its efficiency and effectiveness in text-to-music editing. It achieved superior audio quality, alignment with textual descriptions, and signal-to-noise ratio improvements.
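For readers unfamiliar with signal-level metrics, the sketch below computes a plain signal-to-noise ratio in decibels between a reference waveform and an edited estimate. The article does not say which exact variant the evaluation used, so the plain SNR definition, the function name `snr_db`, and the toy samples are assumptions for illustration only.

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB, treating (reference - estimate) as noise.

    Hypothetical helper; both inputs are equal-length lists of samples.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(r * r for r in reference)
    if signal_power == 0:
        raise ValueError("reference signal is silent")
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    return 10 * math.log10(signal_power / noise_power)


# Toy example (assumed): the estimate is the reference scaled to 90% amplitude.
reference = [1.0, 1.0, 1.0, 1.0]
estimate = [0.9, 0.9, 0.9, 0.9]
score = snr_db(reference, estimate)
```

A higher value means the edited output stays closer to the target waveform sample by sample, which is why a metric of this kind complements perceptual audio-quality and text-alignment scores.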

    In conclusion, Instruct-MusicGen addresses the limitations of existing methods in text-to-music editing by leveraging pre-trained models and proposing efficient training techniques. The proposed approach significantly reduces the computational resources required and achieves high-quality results in music editing tasks. While it performs well across various metrics, some limitations remain, such as relying on synthetic training data and potential inaccuracies in signal-level precision. The development of Instruct-MusicGen marks a meaningful step forward in the field of AI-assisted music creation, combining efficiency with high performance.


    The post Instruct-MusicGen: A Novel Artificial Intelligence AI Approach to Text-to-Music Editing that Fosters Joint Musical and Textual Controls appeared first on MarkTechPost.
