Salesforce AI Research Introduces Moirai-MoE: A MoE Time Series Foundation Model that Achieves Token-Level Model Specialization Autonomously

Time series forecasting has long been integral to finance, healthcare, meteorology, and supply chain management. Its main objective is to predict future data points based on historical observations, which can be challenging due to the complex and varying nature of time series data. Recent advancements in machine learning, particularly foundation models, have transformed this domain by creating generalized models capable of handling various time series without specialized, case-specific training. These foundation models mark a significant shift from traditional approaches that required multiple models tailored to specific datasets. However, the diversity in time series characteristics, such as variations in frequency, seasonality, and underlying patterns, continues to present substantial challenges for unified model training.

A key problem in time series forecasting is handling data heterogeneity effectively. Time series data from different sources vary significantly in frequency, distribution, and structure. Current forecasting models often rely on human-defined, frequency-based specialization to address this diversity. However, frequency alone is not a reliable indicator of a time series pattern: data with similar frequencies may exhibit distinct behaviors, and data with different frequencies may display similar patterns. Frequency-based grouping therefore fails to capture the complexity and diversity inherent in real-world time series. Another challenge lies in the non-stationary nature of time series data: the statistical properties of the data change over time, which makes them difficult to model accurately with frequency-based grouping.

Existing time series forecasting methods address data variability in different ways. For instance, models such as TEMPO and UniTime incorporate language-based prompts to help the model discern different data sources, which achieves only limited, dataset-level specialization. Other models, like TimesFM, maintain frequency-specific embedding dictionaries to help distinguish between data types based on frequency. Many models, including the widely recognized Chronos series, instead opt for a generalized structure without specialized modules, which increases model complexity and demands a large number of parameters. The shared weakness of these methods is that they cannot fully capture the diverse nature of time series data: frequency does not always correlate with the underlying data patterns, which leads to inefficiencies and compromised model accuracy.

Researchers from Salesforce AI Research, the National University of Singapore, and the Hong Kong University of Science and Technology introduced an innovative model called MOIRAI-MoE. MOIRAI-MoE integrates a sparse mixture of experts (MoE) within its Transformer architecture, allowing token-level specialization without human-defined frequency heuristics. This data-driven approach minimizes dependency on predefined frequency-based layers and uses a single input/output projection layer, enabling the model to automatically capture and represent diverse patterns. By achieving token-level specialization, MOIRAI-MoE provides a more flexible and efficient solution capable of better representing the unique characteristics of varied time series data without requiring distinct models for each frequency category.

MOIRAI-MoE's architecture uses a gating function that assigns each token to an appropriate expert within the Transformer layers, based on token clusters derived from a pretrained model. The assignment is guided by the Euclidean distance to the cluster centroids, so that tokens with similar patterns are processed by the same expert while dissimilar tokens go to different specialized experts. By incorporating 32 expert networks, each focusing on distinct time series characteristics, and activating only some of them per token, MOIRAI-MoE reduces computational overhead while improving its ability to generalize across different data types. This approach also helps MOIRAI-MoE represent non-stationary time series data by adapting dynamically to pattern shifts within the data.
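The routing idea described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the top-2 selection, and the softmax over negative distances are assumptions made for illustration, and the toy setup uses 4 experts in a 2-dimensional space rather than 32 experts. Only the Python standard library is used.

```python
import math

def route_token(token, centroids, top_k=2):
    """Return [(expert_index, weight), ...] for the top_k nearest centroids."""
    # Euclidean distance from the token to each expert's cluster centroid.
    distances = [math.dist(token, c) for c in centroids]
    # Indices of the top_k closest centroids, nearest first.
    nearest = sorted(range(len(centroids)), key=lambda i: distances[i])[:top_k]
    # Assumption: turn negative distances into weights with a softmax.
    scores = [math.exp(-distances[i]) for i in nearest]
    total = sum(scores)
    return [(i, s / total) for i, s in zip(nearest, scores)]

def moe_layer(token, centroids, experts, top_k=2):
    """Weighted sum of the outputs of the selected experts only."""
    output = [0.0] * len(token)
    for idx, weight in route_token(token, centroids, top_k):
        expert_out = experts[idx](token)  # only selected experts are evaluated
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output

# Hypothetical toy setup: 4 centroids and 4 "experts" that just scale the input.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]

routing = route_token((0.9, 0.1), centroids)
mixed = moe_layer((0.9, 0.1), centroids, experts)
```

Here the token (0.9, 0.1) is routed to expert 1 first and expert 0 second, because those centroids are the closest; the two remaining experts are never evaluated, which is where the sparsity comes from.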

Extensive testing across 39 datasets demonstrated the superior performance of MOIRAI-MoE in both in-distribution and zero-shot forecasting scenarios. For in-distribution forecasting, MOIRAI-MoE outperformed its dense model counterpart by up to 17%, showcasing a significant improvement in accuracy while utilizing up to 65 times fewer activated parameters than other leading models, including TimesFM and Chronos. In zero-shot forecasting, where the model was tested on datasets not included in the training data, MOIRAI-MoE's performance surpassed traditional models. In these tests, MOIRAI-MoE achieved a 3-14% improvement in continuous ranked probability score (CRPS) and an 8-16% improvement in mean absolute scaled error (MASE) over prior models. These results underscore the model's robust generalization ability without requiring task-specific training.
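For readers unfamiliar with the two metrics, the following sketch shows one common way to compute them. It assumes the textbook definitions: MASE scaled by an in-sample seasonal-naive error, and a sample-based CRPS estimate of the form E|X - y| - 0.5 E|X - X'|. The article does not state which exact variant or normalization the paper's evaluation uses, and the function names are hypothetical.

```python
def mase(actual, forecast, history, season=1):
    """Mean absolute scaled error against an in-sample seasonal-naive forecast."""
    # Mean absolute error of the forecast over the horizon.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    # Mean absolute error of the seasonal-naive forecast on the history.
    naive_errors = [abs(history[t] - history[t - season])
                    for t in range(season, len(history))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae / scale

def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    term_obs = sum(abs(x - observed) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread

# Toy data (made up for illustration only).
history = [1.0, 2.0, 3.0, 4.0, 5.0]
mase_value = mase(actual=[6.0, 7.0], forecast=[5.0, 9.0], history=history)
crps_value = crps_from_samples([4.0, 6.0], observed=5.0)
```

For both metrics, lower is better. With a single forecast sample, this CRPS estimate reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.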

This research presents key takeaways that highlight the advancements MOIRAI-MoE brings to time series forecasting:

Data-Driven Specialization: By achieving token-level specialization through a sparse mixture of experts, MOIRAI-MoE overcomes the limitations of human-defined frequency specialization, allowing for a more nuanced representation of time series diversity.
Computational Efficiency: The model's sparse expert activation drastically reduces computational demands, using up to 65 times fewer activated parameters than other leading models while maintaining high accuracy.
Performance Gains: Testing on diverse datasets confirmed that MOIRAI-MoE surpasses its dense counterpart and foundation models like TimesFM and Chronos, achieving up to a 17% improvement over the dense counterpart in in-distribution tests.
Scalability and Generalization: MOIRAI-MoE demonstrates strong zero-shot performance, making it highly applicable to real-world forecasting tasks without requiring specialized training for each application, which is critical in diverse applications like finance, healthcare, and climate modeling.
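To make the term "activated parameters" concrete, here is a rough sketch of how total and per-token activated parameters differ in a sparse MoE feed-forward layer. The layer sizes and the top-2 routing are hypothetical values chosen for illustration, not figures from the paper, and the count ignores the gating function and attention weights. The resulting ratio is unrelated to the 65x figure above, which compares MOIRAI-MoE against other models.

```python
def ffn_params(d_model, d_ff):
    """Parameters of one feed-forward expert: two linear layers with biases."""
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

def moe_param_counts(d_model, d_ff, num_experts, top_k):
    """Return (total, activated-per-token) expert parameters for one MoE layer."""
    per_expert = ffn_params(d_model, d_ff)
    total = num_experts * per_expert   # all experts are stored in memory
    activated = top_k * per_expert     # only top_k experts run for each token
    return total, activated

# Hypothetical sizes: 32 experts (as cited in the article), 2 active per token.
total, activated = moe_param_counts(d_model=384, d_ff=1024, num_experts=32, top_k=2)
ratio = total / activated
```

With these assumed sizes, each token touches only 2 of the 32 experts, so the per-token compute in the expert layers scales with the activated count rather than the total.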

In conclusion, MOIRAI-MoE represents a major advancement in time series forecasting by introducing a flexible, data-driven approach that overcomes the limitations of frequency-based specialization. With its sparse mixture-of-experts architecture, MOIRAI-MoE addresses the diverse and non-stationary nature of time series data and achieves significant computational efficiency and performance gains. This approach underscores the potential of token-level specialization, paving the way for future improvements in time series foundation models and expanding the utility of zero-shot forecasting across various industries and applications.

Check out the Paper. All credit for this research goes to the researchers of this project.

The post Salesforce AI Research Introduces Moirai-MoE: A MoE Time Series Foundation Model that Achieves Token-Level Model Specialization Autonomously appeared first on MarkTechPost.
