
    Efficient Function Calling in Small-Scale LLMs: A Game-Changer for AI Reasoning Tasks

    November 3, 2024

    Recent advances in Large Language Models (LLMs) have demonstrated exceptional natural-language understanding and generation capabilities, and research has explored abilities that emerge beyond their primary training task of text prediction. These models have shown promise in function calling for software APIs, accelerated by the launch of GPT-4's plugin features; integrated tools include web browsers, translation systems, Dialogue State Tracking (DST), and robotics. While LLMs show promising results in general complex reasoning, they still struggle with mathematical problem-solving and logic. To address this, researchers have proposed techniques such as function calling, in which an LLM executes provided functions and uses their outputs to complete a task. These functions range from basic tools, such as calculators that perform arithmetic operations, to more advanced methods. However, tasks that exercise only a small portion of the available APIs highlight the inefficiency of relying solely on large models, which demand substantial computational power, and cost, for both training and inference. This situation calls for smaller, task-specific LLMs that retain core functionality while reducing operational costs. While promising, the trend toward smaller models introduces new challenges.
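
    To make the function-calling pattern concrete, here is a minimal, self-contained Python sketch: a tool's description is injected into the prompt, and the model replies with a structured call that the host program parses and executes. The JSON reply format and helper names are illustrative assumptions, not any specific vendor's API.

```python
import json

# The kind of basic tool the article mentions: a calculator that performs
# arithmetic the model itself is unreliable at.
def calculator(expression: str) -> str:
    # eval() is unsafe on untrusted input; acceptable in a toy sketch.
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

# Function description injected into the prompt so the model knows what
# it may call and how (hypothetical format).
TOOL_DESCRIPTIONS = 'calculator(expression): evaluate arithmetic, e.g. calculator("3 * (4 + 5)")'

def execute_call(model_output: str) -> str:
    """Parse a structured reply like {"tool": ..., "args": ...} and run
    the named function, returning its output for the next prompt turn."""
    call = json.loads(model_output)
    return TOOLS[call["tool"]](call["args"])

# Pretend the LLM answered a math question by emitting a call:
fake_llm_reply = '{"tool": "calculator", "args": "12 * (7 - 3)"}'
print(execute_call(fake_llm_reply))  # -> 48
```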

    Current methods rely on large-scale LLMs for reasoning tasks, which is resource-intensive and costly, and, due to their generalized nature, these models often struggle with specific logical and mathematical problem-solving.

    The proposed research introduces a novel framework for training smaller LLMs in function calling, focused on specific reasoning tasks. An agent queries a large LLM by injecting descriptions and examples of the usable functions into the prompt, producing a dataset of correct and incorrect reasoning-chain completions.

    In more detail: to address the drawbacks of oversized LLMs, which incur excessive training and inference costs, a group of researchers introduced a framework that transfers the function-calling abilities of large models to smaller language models for specific logical and mathematical reasoning tasks. Given a problem and a set of functions useful for its solution, an agent queries a large-scale LLM, injecting function descriptions and examples into the prompt and managing the function calls needed to reach the solution, all within a step-by-step reasoning chain. This procedure is used to build a dataset of correct and incorrect completions. The dataset then trains a smaller model via Direct Preference Optimization (DPO), a preference-based alternative to classic Reinforcement Learning from Human Feedback (RLHF). The researchers tested the methodology on two reasoning tasks, First-Order Logic (FOL) and math, using a custom-built set of FOL problems inspired by a HuggingFace dataset.
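
    The dataset-construction step can be sketched as follows: sample several reasoning chains from the large model for each problem, grade them against the known answer, and pair correct with incorrect chains in the prompt/chosen/rejected format DPO expects. The stubbed teacher model and all names below are illustrative assumptions, not the paper's actual code.

```python
# Sketch of the dataset-generation stage described above.
import random

def query_teacher(problem: str) -> str:
    """Stand-in for prompting a large LLM with the problem plus injected
    function descriptions; returns one reasoning chain ending in
    'ANSWER: <value>'. Stubbed with random correct/incorrect outputs."""
    answer = "48" if random.random() < 0.6 else "50"
    return f'CALL calculator("12 * (7 - 3)")\nRESULT 48\nANSWER: {answer}'

def is_correct(chain: str, gold: str) -> bool:
    return chain.strip().endswith(f"ANSWER: {gold}")

def build_preference_pairs(problem: str, gold: str, n_samples: int = 16):
    chains = [query_teacher(problem) for _ in range(n_samples)]
    good = [c for c in chains if is_correct(c, gold)]
    bad = [c for c in chains if not is_correct(c, gold)]
    # DPO expects triples: prompt, preferred ("chosen"), dispreferred ("rejected").
    return [{"prompt": problem, "chosen": g, "rejected": b}
            for g, b in zip(good, bad)]

pairs = build_preference_pairs("What is 12 * (7 - 3)?", gold="48")
print(len(pairs), "preference pairs")
```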

    The proposed framework’s pipeline comprises four stages. First, tasks and problems are defined to assess the abilities of large language models (LLMs) in various reasoning domains. Next, task-specific functions are set up that allow the LLM to solve reasoning steps, manage the chain flow, and verify results. A pre-trained, large-scale LLM is then chosen to generate a dataset of correct and incorrect completions using a chain-of-thought prompting approach. Finally, a smaller LLM is fine-tuned on the resulting dataset using the Direct Preference Optimization (DPO) algorithm. Experimentation involved testing the model on first-order logic (FOL) and mathematical problems, with results generated using Microchain, an agent-based library that facilitates LLM querying with predefined functions to create a chain-of-thought dataset.
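
    A minimal outline of the final fine-tuning stage, assuming Hugging Face TRL's DPOTrainer (argument names vary across TRL versions; treat this as a sketch, not the paper's training script):

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # the small model named in the article
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference dataset from the previous stage: prompt / chosen / rejected.
train_dataset = Dataset.from_list([
    {"prompt": "What is 12 * (7 - 3)?",
     "chosen": 'CALL calculator("12 * (7 - 3)")\nRESULT 48\nANSWER: 48',
     "rejected": "ANSWER: 50"},
])

config = DPOConfig(
    output_dir="dpo-function-calling",
    beta=0.1,                       # strength of the KL penalty toward the reference model
    per_device_train_batch_size=1,  # sized for a single GPU, as in the paper
    gradient_accumulation_steps=8,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```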

    Data augmentation was conducted to extend the dataset, and fine-tuning was performed on Mistral-7B using a single GPU. The fine-tuned model showed clear accuracy improvements on FOL tasks and moderate gains on mathematical tasks, with statistical significance confirmed through a Wilcoxon test.
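
    The significance check might look like the following sketch, comparing paired per-task accuracies before and after fine-tuning with scipy.stats.wilcoxon; the numbers are invented for illustration.

```python
from scipy.stats import wilcoxon

# Per-task accuracy, base Mistral-7B vs. after DPO fine-tuning (made-up values).
base_scores  = [0.42, 0.55, 0.38, 0.61, 0.47, 0.52, 0.40, 0.58]
tuned_scores = [0.81, 0.93, 0.77, 0.95, 0.84, 0.90, 0.72, 0.97]

# Wilcoxon test on the paired differences; a small p-value indicates
# the improvement is unlikely to be due to chance.
stat, p_value = wilcoxon(base_scores, tuned_scores)
print(f"W={stat}, p={p_value:.4f}")
```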

    In conclusion, the researchers proposed a new framework for improving the function-calling abilities of small-scale LLMs on specific logical and mathematical reasoning tasks. The method reduces the need for large models and boosts performance on logic- and math-related tasks. Experimental results show significant improvements for the small-scale model on FOL tasks, achieving near-perfect accuracy in most cases. Future work could apply the framework to a broader range of reasoning tasks and function types.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    This article originally appeared on MarkTechPost.
