Advancing Machine Learning with KerasCV and KerasNLP: A Comprehensive Overview

Keras is a widely used machine learning library known for its high-level abstractions and ease of use, which enable rapid experimentation. Recent advances in computer vision (CV) and natural language processing (NLP) have introduced new challenges, chief among them the prohibitive cost of training large, state-of-the-art models, which makes access to open-source pretrained models crucial. Preprocessing and metrics computation have also grown more complex, given the variety of techniques involved and the need to support multiple frameworks such as JAX, TensorFlow, and PyTorch. Improving NLP training performance is difficult as well: tools like the XLA compiler offer speedups but add complexity to tensor operations.

Researchers from the Keras Team at Google LLC introduce KerasCV and KerasNLP, extensions of the Keras API for CV and NLP. These packages support JAX, TensorFlow, and PyTorch, emphasizing ease of use and performance. They feature a modular design, offering building blocks for models and data preprocessing at a low level and pretrained task models for popular architectures like Stable Diffusion and GPT-2 at a high level. These models include built-in preprocessing, pretrained weights, and fine-tuning capabilities. The libraries support XLA compilation and utilize TensorFlow's tf.data API for efficient preprocessing. They are open-source and available on GitHub.
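As a rough illustration of the high-level end of this design, the sketch below loads a pretrained GPT-2 task model and generates text from a raw string. It assumes the `keras_nlp` package is installed, that the preset name `gpt2_base_en` is available for download, and that JAX is the chosen backend; none of these details come from the article, and the helper name `generate_with_gpt2` is hypothetical.

```python
import os

# Keras 3 reads the backend once, at import time, so it is chosen first.
# "jax" is an arbitrary pick; "tensorflow" and "torch" are the alternatives.
os.environ.setdefault("KERAS_BACKEND", "jax")


def generate_with_gpt2(prompt: str, max_length: int = 40) -> str:
    """Generate text with a pretrained GPT-2 task model (downloads weights)."""
    # Imported lazily so that defining this helper needs neither the
    # package nor network access.
    import keras_nlp

    # Assumed preset name. The task model bundles its tokenizer and
    # preprocessing, so it accepts a plain Python string.
    model = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
    return model.generate(prompt, max_length=max_length)
```

Calling `generate_with_gpt2("Keras is")` would download the weights on first use and return the generated text as a string.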

The HuggingFace Transformers library parallels KerasNLP and KerasCV, offering pretrained model checkpoints for many transformer architectures. While HuggingFace uses a "repeat yourself" approach, KerasNLP adopts a layered approach to reimplement large language models with minimal code. Both methods have their pros and cons. KerasCV and KerasNLP publish all pretrained models on Kaggle Models, which are accessible in Kaggle competition notebooks even in Internet-off mode. Table 1 compares the average time per training or inference step for models like SAM, Gemma, BERT, and Mistral across different versions and frameworks of Keras.

The Keras Domain Packages API adopts a layered design with three main abstraction levels. Foundational Components offer composable modules for building preprocessing pipelines, models, and evaluation logic, which are usable independently of the Keras ecosystem. Pretrained Backbones provide fine-tuning-ready models with matching tokenizers for NLP. Task Models are specialized for tasks like text generation or object detection, combining lower-level modules for a unified training and inference interface. These models can be used with PyTorch, TensorFlow, and JAX frameworks. KerasCV and KerasNLP support the Keras Unified Distribution API for seamless model and data parallelism, simplifying the transition from single-device to multi-device training.
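The three levels can be sketched with KerasNLP's BERT classes. This is a minimal illustration rather than code from the paper: it assumes `keras_nlp` is installed, uses the preset name `bert_base_en_uncased` as an assumption, picks arbitrary layer sizes, and wraps everything in a hypothetical helper, `build_three_levels`.

```python
def build_three_levels(num_classes: int = 2):
    """Return one object from each abstraction level (downloads BERT weights)."""
    # Imported lazily so that defining this helper does not require the package.
    import keras_nlp

    # 1. Foundational component: a standalone, composable layer.
    #    The sizes here are arbitrary illustration values.
    encoder_block = keras_nlp.layers.TransformerEncoder(
        intermediate_dim=512, num_heads=4
    )

    # 2. Pretrained backbone together with its matching tokenizer.
    preset = "bert_base_en_uncased"  # assumed preset name
    tokenizer = keras_nlp.models.BertTokenizer.from_preset(preset)
    backbone = keras_nlp.models.BertBackbone.from_preset(preset)

    # 3. Task model: backbone, preprocessing, and a classification head.
    classifier = keras_nlp.models.BertClassifier.from_preset(
        preset, num_classes=num_classes
    )
    return encoder_block, (tokenizer, backbone), classifier
```

Because the task model carries its own preprocessing, the returned `classifier` can be trained or queried with raw strings, while the lower-level pieces remain available for custom pipelines.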

Framework performance varies with the specific model. Keras 3 lets users choose the fastest backend for their task and, as Table 1 shows, consistently outperforms Keras 2. Benchmarks were conducted on a single NVIDIA A100 GPU with 40GB of memory on a Google Cloud Compute Engine instance (a2-highgpu-1g) with 12 vCPUs and 85GB of host memory. For a given model and task (fit or predict), the same batch size was used across frameworks. Batch sizes differed between models and between fit and predict in order to optimize memory usage and GPU utilization. Gemma and Mistral used the same batch size because they have similar parameter counts.
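A metric like "average time per step" can be approximated with a small timing helper. The sketch below is not the authors' benchmark harness; `average_step_ms`, its warm-up count, and its defaults are assumptions made for illustration. The warm-up calls keep one-time costs such as compilation out of the measurement.

```python
import time
from typing import Callable


def average_step_ms(
    step: Callable[[], object], num_steps: int = 10, warmup: int = 2
) -> float:
    """Return the mean wall-clock time per call of `step`, in milliseconds."""
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")

    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        step()

    start = time.perf_counter()
    for _ in range(num_steps):
        step()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / num_steps
```

With a hypothetical `model` and `batch`, a call such as `average_step_ms(lambda: model.predict_on_batch(batch))` would time inference steps. The timed callable should return concrete results (for example NumPy arrays), since backends that dispatch work asynchronously would otherwise be under-measured.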

In conclusion, there are plans to enhance the project's capabilities in the future, particularly by broadening the range of multimodal models to support diverse applications. Additionally, efforts will focus on refining integrations with backend-specific large model serving solutions to ensure smooth deployment and scalability. KerasCV and KerasNLP present versatile toolkits featuring modular components for quick model prototyping and a variety of pretrained backbones and task models for computer vision and natural language processing tasks. These resources cater to JAX, TensorFlow, or PyTorch users, offering state-of-the-art training and inference performance. Comprehensive user guides for KerasCV and KerasNLP are available on Keras.io.

Check out the Paper. All credit for this research goes to the researchers of this project.

The post Advancing Machine Learning with KerasCV and KerasNLP: A Comprehensive Overview appeared first on MarkTechPost.
