
    A Simple Open-loop Model-Free Baseline for Reinforcement Learning Locomotion Tasks without Using Complex Models or Computational Resources

    July 4, 2024

The field of deep reinforcement learning (DRL) is expanding the capabilities of robotic control, but there has been a growing trend toward algorithmic complexity. As a result, the latest algorithms depend on many implementation details to perform well, causing reproducibility issues. Moreover, even state-of-the-art DRL methods struggle on simple problems, such as the Mountain Car environment or the Swimmer task. Several works have therefore pushed in the opposite direction, seeking simpler baselines and scalable alternatives for RL tasks, and these efforts underscore the need for simplicity in the field. Complex RL algorithms also often require detailed task design in the form of slow reward engineering.

To address these issues, this paper builds on two lines of related work: the quest for simpler RL baselines and periodic policies for locomotion. The first line proposes simpler parametrizations, such as linear functions or radial basis functions (RBFs), highlighting the fragility of RL (a sketch of such a parametrization follows below). The second integrates rhythmic movements into robotic control; recent work has used oscillators to manage locomotion tasks in quadruped robots. However, no prior study has examined open-loop oscillators on RL locomotion benchmarks.
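For context, a "simple parametrization" in the first line of work can be as small as a single linear map from observation to action. The sketch below is illustrative, not the exact parametrization used in the cited works; W and b would typically be found by a gradient-free search such as an evolution strategy:

```python
import numpy as np

def linear_policy(observation, W, b):
    """A linear policy: action = W @ observation + b.

    W and b are the only parameters. With RBF features, the observation
    would first pass through fixed radial basis functions. Illustrative
    sketch only.
    """
    return W @ observation + b
```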

Researchers from the German Aerospace Center (DLR) RMC in Germany, Sorbonne Université CNRS in France, and TU Delft CoR in the Netherlands have proposed a simple, open-loop, model-free baseline that performs well on standard locomotion tasks without complex models or heavy computational resources. Although it does not beat RL algorithms in simulation, it offers multiple benefits for real-world applications: fast computation, easy deployment on embedded systems, smooth control outputs, and robustness to sensor noise. The method is designed for locomotion tasks, and its simplicity comes at the cost of versatility.
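As a rough illustration of the idea (a hedged sketch, not the authors' exact formulation), an open-loop oscillator policy can be as simple as one sine generator per joint, with the amplitudes, shared frequency, phase offsets, and biases as the only tunable parameters:

```python
import numpy as np

class OpenLoopOscillatorPolicy:
    """Open-loop policy: each joint tracks its own sine wave.

    Hypothetical sketch; the parameter names (amplitudes, omega, phases,
    biases) are illustrative, not the paper's exact formulation.
    """

    def __init__(self, amplitudes, omega, phases, biases):
        self.a = np.asarray(amplitudes)   # per-joint amplitude
        self.omega = omega                # shared angular frequency (rad/s)
        self.phi = np.asarray(phases)     # per-joint phase offset
        self.b = np.asarray(biases)       # per-joint bias (rest position)

    def desired_positions(self, t):
        # No observation is used: the output depends on time alone,
        # which is what makes the controller "open-loop".
        return self.a * np.sin(self.omega * t + self.phi) + self.b
```

Because the output never depends on the robot's state, the policy is trivially cheap to compute and immune to sensor noise, which is exactly the trade-off the paper examines.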

For the RL baselines, the JAX implementations from Stable-Baselines3 and the RL Zoo training framework are used, and the oscillator parameters are optimized over a defined search space. The effectiveness of the proposed method is tested on the MuJoCo v4 locomotion tasks included in the Gymnasium v0.29.1 library. The approach is compared against three established deep RL algorithms: (a) Proximal Policy Optimization (PPO), (b) Deep Deterministic Policy Gradient (DDPG), and (c) Soft Actor-Critic (SAC). Hyperparameter settings are taken from the original papers to ensure a fair comparison, except for the Swimmer task, where the discount factor (γ = 0.9999) is fine-tuned.
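A minimal evaluation loop on one of these Gymnasium tasks might look like the sketch below, reusing the OpenLoopOscillatorPolicy sketch above. Note the simplifications: the oscillator output is clipped and fed directly as the action, whereas the paper passes positions through a PD controller, and naive random search stands in for the more capable black-box optimizer a real study would use:

```python
import gymnasium as gym
import numpy as np

def episode_return(env, policy, dt):
    """Roll out the time-indexed open-loop policy for one episode."""
    env.reset(seed=0)
    total, t, done = 0.0, 0.0, False
    while not done:
        # Simplification: treat desired positions directly as the action.
        action = np.clip(policy.desired_positions(t),
                         env.action_space.low, env.action_space.high)
        _, reward, terminated, truncated, _ = env.step(action)
        total += reward
        t += dt
        done = terminated or truncated
    return total

env = gym.make("Swimmer-v4")
n_joints = env.action_space.shape[0]
dt = env.unwrapped.dt  # control timestep of the MuJoCo env
rng = np.random.default_rng(0)
best, best_score = None, -np.inf
for _ in range(200):  # naive random search over the parameter space
    policy = OpenLoopOscillatorPolicy(
        amplitudes=rng.uniform(0.0, 1.0, n_joints),
        omega=rng.uniform(0.5, 10.0),
        phases=rng.uniform(0.0, 2 * np.pi, n_joints),
        biases=rng.uniform(-0.5, 0.5, n_joints),
    )
    score = episode_return(env, policy, dt)
    if score > best_score:
        best, best_score = policy, score
```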

The proposed baseline and associated experiments highlight the existing limitations of DRL for robotic applications, provide insights on how to address them, and encourage reflection on the costs of complexity and generality. DRL algorithms are compared to the baseline through experiments on locomotion tasks, both in simulation and in transfer to a real elastic quadruped. The paper aims to answer three key questions:

    How do open-loop oscillators fare against DRL methods in terms of performance, runtime, and parameter efficiency? 

    How resilient are RL policies to sensor noise, failures, and external disturbances compared to the open-loop baseline? 

How do learned policies transfer to a real robot when trained without randomization or reward engineering?

In conclusion, the researchers introduced an open-loop, model-free baseline that performs well on standard locomotion tasks without needing complex models or heavy computational resources. Two additional experiments using open-loop oscillators expose a current drawback of DRL algorithms: compared against the baseline, DRL is more prone to degraded performance under sensor noise or failure. By design, however, open-loop control is sensitive to disturbances and cannot recover from potential falls, which limits this baseline. Because the method produces joint positions without using the robot's state, a PD controller is needed in simulation to transform these positions into torque commands.
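The PD step is the standard feedback law τ = k_p(q_des − q) − k_d q̇. A minimal sketch, with illustrative gains rather than the paper's tuned values:

```python
import numpy as np

def pd_torques(q_desired, q, q_dot, kp=5.0, kd=0.5):
    """Convert desired joint positions into torque commands via PD control.

    tau = kp * (q_desired - q) - kd * q_dot
    kp and kd are illustrative gains, not the paper's tuned values.
    """
    return kp * (np.asarray(q_desired) - np.asarray(q)) - kd * np.asarray(q_dot)
```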

    Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. 

    Join our Telegram Channel and LinkedIn Group.

If you like our work, you will love our newsletter.

Don’t forget to join our 46k+ ML SubReddit.

    The post A Simple Open-loop Model-Free Baseline for Reinforcement Learning Locomotion Tasks without Using Complex Models or Computational Resources appeared first on MarkTechPost.

