
    Microsoft Researchers Introduce Syntheseus: A Machine Learning Benchmarking Python Library for End-to-End Retrosynthetic Planning

    May 14, 2024

    Interest in computer-automated molecular design has resurged over the last five years, driven by advances in machine learning, especially generative models. While these methods help find compounds with the desired properties more quickly, they often propose molecules that are difficult to synthesize in a wet lab because they do not take synthesizability into account. This motivates efficient computer-aided synthesis planning (CASP) algorithms, which assess whether an input molecule can be synthesized by performing retrosynthesis, that is, by constructing synthesis routes for it.

    In recent years, the intersection of chemistry and machine learning has attracted considerable attention. In practice, however, state-of-the-art reaction models are difficult to run: they make differing assumptions about their inputs and outputs, and their codebases, which are designed primarily to reproduce benchmark results, rarely expose readily callable entry points.

    Researchers from Microsoft, the University of Cambridge, Jagiellonian University, and Johannes Kepler University examine the widely used metrics for both single-step and multi-step retrosynthesis. It is unclear how metrics used to benchmark single-step and multi-step methods in isolation relate to the performance of an end-to-end retrosynthesis pipeline, and previous research has been inconsistent in how it compares models and applies metrics. By thoroughly re-evaluating and analyzing previous work, this research aims to define best practices for evaluating retrosynthesis algorithms. The team introduces a Python library, SYNTHESEUS, that makes it easy for researchers to evaluate their methods consistently.

    Two main constraints limit evaluation in retrosynthesis. First, although experimental validation is vital, researchers developing algorithms should not be required to carry out syntheses in the lab, which is costly, time-consuming, and demands significant expertise. Second, because the field is split between single-step and multi-step work, most studies examine only one stage of the retrosynthesis pipeline rather than the whole. Real-world adoption, however, hinges on how well the pipeline works end to end.

    The team integrated eight free and open-source reaction models behind one consistent interface, with seven of them sharing the same conda environment. With the intricacies of the individual codebases hidden behind that interface, comparing different kinds of models is as simple as writing a for loop.
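To illustrate why a shared interface reduces model comparison to a loop, here is a minimal sketch. It does not use the actual Syntheseus API: the `SingleStepModel` protocol, the `LookupModel` stand-in, and the `compare_models` helper are all hypothetical names invented for this example, and the "models" simply return canned reactant SMILES.

```python
from typing import Protocol, Sequence


class SingleStepModel(Protocol):
    """Hypothetical common interface (an assumption, not the Syntheseus API)."""

    name: str

    def predict(self, product_smiles: str, num_results: int) -> Sequence[str]:
        """Return up to num_results candidate reactant sets as SMILES strings."""
        ...


class LookupModel:
    """Toy stand-in for a real reaction model: answers come from a fixed table."""

    def __init__(self, name: str, table: dict[str, list[str]]) -> None:
        self.name = name
        self.table = table

    def predict(self, product_smiles: str, num_results: int) -> Sequence[str]:
        return self.table.get(product_smiles, [])[:num_results]


def compare_models(
    models: Sequence[SingleStepModel], product_smiles: str, num_results: int = 3
) -> dict[str, list[str]]:
    """Query every model through the same call and collect the predictions."""
    results: dict[str, list[str]] = {}
    for model in models:  # the "for loop" that the shared interface makes possible
        results[model.name] = list(model.predict(product_smiles, num_results))
    return results


ETHYL_ACETATE = "CCOC(C)=O"
models = [
    LookupModel("toy_a", {ETHYL_ACETATE: ["CC(=O)O.CCO", "CC(=O)Cl.CCO"]}),
    LookupModel("toy_b", {ETHYL_ACETATE: ["CC(=O)Cl.CCO"]}),
]
results = compare_models(models, ETHYL_ACETATE, num_results=1)
```

The point of the sketch is the design choice rather than the toy models: once every model answers the same call, the evaluation code never needs to know which codebase sits behind it.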

    To compare published figures with those produced by this evaluation, the team used the USPTO-50K dataset, since all the models they investigate report results on it. Because of its modest size, USPTO-50K may not be representative of the full distribution of reactions. The team therefore also assessed out-of-distribution generalization by evaluating the model checkpoints trained on USPTO-50K on the proprietary Pistachio dataset, which contains over 15.6 million raw reactions and 3.4 million samples after preprocessing. For newcomers, Syntheseus automatically downloads and caches default weights trained on USPTO-50K, so there is no need to hunt for model weights to get started. Users can come back later to retrain the models on a larger or internal dataset.

    This work re-evaluates several well-established single-step models: Chemformer, GLN, Graph2Edits, LocalRetro, MEGAN, MHNreact, and RootAligned. For RetroKNN, the researchers obtained the code directly from its developers. For each model, they used the provided checkpoint when one with the proper data split was available, and otherwise trained a new model using the original training code.

    They evaluated every model with n = 100 outputs and computed top-k accuracy (for k up to 50) and Mean Reciprocal Rank (MRR). All models were run with a fixed batch size of 1. Although every model could easily handle larger batches, the batch size during search cannot be set freely: because search is usually not parallelized, it is typically one. Speed at batch size 1 therefore directly determines the maximum number of model calls that can be made during a search with a given time budget.
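For readers unfamiliar with the two metrics, the following sketch computes top-k accuracy and MRR from ranked prediction lists. It assumes that a prediction counts as correct when its string exactly equals the ground-truth string (real evaluations compare canonicalized reactant sets); the function names and the toy data are illustrative only.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]], ground_truths: Sequence[str], k: int
) -> float:
    """Fraction of test cases whose ground truth appears in the first k predictions."""
    hits = sum(
        truth in preds[:k] for preds, truth in zip(ranked_predictions, ground_truths)
    )
    return hits / len(ground_truths)


def mean_reciprocal_rank(
    ranked_predictions: Sequence[Sequence[str]], ground_truths: Sequence[str]
) -> float:
    """Average of 1/rank of the ground truth; a miss contributes 0."""
    total = 0.0
    for preds, truth in zip(ranked_predictions, ground_truths):
        for rank, pred in enumerate(preds, start=1):
            if pred == truth:
                total += 1.0 / rank
                break
    return total / len(ground_truths)


# Toy data: three test reactions with ranked predictions (placeholder labels).
predictions = [["A", "B", "C"], ["X", "Y", "Z"], ["P", "Q"]]
truths = ["A", "Z", "R"]

top1 = top_k_accuracy(predictions, truths, k=1)  # only the first case is a hit
top3 = top_k_accuracy(predictions, truths, k=3)  # first and second cases are hits
mrr = mean_reciprocal_rank(predictions, truths)  # (1 + 1/3 + 0) / 3
```

Top-k accuracy only asks whether the correct answer appears among the first k outputs, while MRR also rewards placing it higher in the list.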

    Two of the models (RootAligned and Chemformer) use a Transformer decoder to predict the reactants' SMILES from scratch, while the other models predict a graph rewrite that is applied to the product. The former type performs well on top-1 accuracy, but is outperformed at larger k by the graph-transformation-based models. The findings suggest that transformation-based models cover the data distribution more comprehensively because they are more explicitly grounded in the set of transformations that occur in the training data.

    Many of the re-evaluated USPTO-50K results are better than the figures reported in the literature, particularly top-k accuracy for k > 1, which is affected by deduplication of the model outputs. This also affects some of the model rankings; for instance, the re-evaluated top-1 ranking of GLN relative to LocalRetro differs from what was previously claimed.

    The model ranking on Pistachio is surprisingly similar to the one on USPTO-50K, even though all results are significantly worse. For example, no model exceeds 55% top-50 accuracy on Pistachio, whereas on USPTO-50K top-50 accuracy approaches 100%. For template-based models this can be attributed to insufficient template coverage, but some of the template-free models evaluated here do not generalize better than their template-based counterparts either.

    Overall, RetroKNN ranks first or near-first on all metrics across both datasets and is among the fastest models in the re-evaluation. However, current single-step metrics give a helpful but incomplete picture of how well single-step models perform, so the researchers caution readers against treating this as a definitive recommendation.
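The effect of deduplication on top-k accuracy for k > 1 can be seen in a small sketch. It assumes that predictions are already in a canonical form, so that duplicates are exact string matches; the `deduplicate` helper and the toy data are illustrative and are not taken from the paper or the library.

```python
from typing import Sequence


def deduplicate(predictions: Sequence[str]) -> list[str]:
    """Drop repeated predictions while keeping the original ranking order."""
    seen: set[str] = set()
    unique: list[str] = []
    for pred in predictions:
        if pred not in seen:
            seen.add(pred)
            unique.append(pred)
    return unique


# A model that emits the same (wrong) answer three times before the right one.
raw_predictions = ["A", "A", "A", "B", "C"]
ground_truth = "B"

hit_in_raw_top3 = ground_truth in raw_predictions[:3]
hit_in_dedup_top3 = ground_truth in deduplicate(raw_predictions)[:3]
```

Without deduplication the repeated output occupies the first three ranks and the correct answer falls outside the top 3; after deduplication it moves up to rank 2. Top-1 accuracy is unaffected, which is consistent with the effect showing up only for k > 1.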

    The researchers also conducted search experiments that combine several single-step models with several search algorithms. Because their main focus is correcting previously reported results, outlining best practices, and showcasing SYNTHESEUS, they present only preliminary multi-step results. The framework developed in this research is nonetheless intended to support future work on determining the best end-to-end pipeline.

    The team reports when the first solution is found and the maximum number of non-overlapping routes that can be recovered from the search graph. With the exception of Chemformer, GLN, and MHNreact, most models, combined with any of the search algorithms, find multiple independent routes for the majority of targets. RootAligned achieves encouraging results even though, because of its high computational cost, it averages fewer than 30 model calls.
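As a rough illustration of the "maximum number of non-overlapping routes" metric, the sketch below treats each route as a set of reaction identifiers and calls two routes non-overlapping when they share no reaction. Both the representation and this definition of overlap are assumptions made for the example; the exhaustive search is only practical for a handful of routes and is not the library's method.

```python
from itertools import combinations
from typing import Sequence


def max_non_overlapping_routes(routes: Sequence[frozenset[str]]) -> int:
    """Size of the largest subset of routes that are pairwise disjoint.

    Exhaustive search from the largest subset size downwards; exponential
    in the number of routes, so suitable for small toy inputs only.
    """
    for size in range(len(routes), 0, -1):
        for subset in combinations(routes, size):
            if all(a.isdisjoint(b) for a, b in combinations(subset, 2)):
                return size
    return 0


# Toy search result: four routes described by hypothetical reaction IDs.
routes = [
    frozenset({"r1", "r2"}),
    frozenset({"r2", "r3"}),  # shares r2 with the first route
    frozenset({"r4"}),
    frozenset({"r3", "r5"}),  # shares r3 with the second route
]
best = max_non_overlapping_routes(routes)
```

In this toy case the first, third, and fourth routes share no reaction, so three independent routes can be recovered even though four routes were found.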

    Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.

    The post Microsoft Researchers Introduce Syntheseus: A Machine Learning Benchmarking Python Library for End-to-End Retrosynthetic Planning appeared first on MarkTechPost.
