
    Steps to Build an Interactive Text-to-Image Generation Application using Gradio and Hugging Face’s Diffusers

    February 20, 2025

    In this tutorial, we will build an interactive text-to-image generation application that runs in Google Colab and is shared through a public link, using Hugging Face’s Diffusers library and Gradio. You’ll learn how to transform simple text prompts into detailed images by leveraging the state-of-the-art Stable Diffusion model and GPU acceleration. We’ll walk through setting up the environment, installing dependencies, caching the model, and creating an intuitive interface that allows real-time parameter adjustments.

    !pip install diffusers transformers accelerate gradio

    First, we install four essential Python packages using pip. Diffusers provides tools for working with diffusion models, Transformers offers pretrained models for various tasks, Accelerate optimizes performance on different hardware setups, and Gradio enables the creation of interactive machine learning interfaces. These libraries form the backbone of our text-to-image generation demo in Google Colab, so set the Colab runtime type to GPU before running the cells that follow.
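    Optionally, you can confirm which GPU Colab has assigned before loading anything. This quick check is not part of the original tutorial; nvidia-smi is available on Colab GPU runtimes.

    !nvidia-smi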

    import torch
    from diffusers import StableDiffusionPipeline
    import gradio as gr
    
    
    # Global variable to cache the pipeline
    pipe = None

    Next, we import the necessary libraries: torch for tensor computations and GPU acceleration, StableDiffusionPipeline from the Diffusers library for loading and running the Stable Diffusion model, and gradio for building interactive demos. A global variable pipe is also initialized to None so the loaded model pipeline can be cached later, which avoids reloading the model on every inference call.

    print("CUDA available:", torch.cuda.is_available())

    The above line reports whether a CUDA-enabled GPU is available. PyTorch’s torch.cuda.is_available() function returns True if a GPU is detected and ready for computation and False otherwise, helping ensure that your code can leverage GPU acceleration.
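    If you want the notebook to degrade gracefully on a CPU-only runtime, a common pattern is to pick the device and data type dynamically. This is an optional sketch rather than part of the original code, which assumes a GPU is present.

    # Optional fallback: FP16 on GPU when available, FP32 on CPU otherwise
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    print(f"Using device: {device}, dtype: {dtype}")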

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")

    The above code snippet loads the Stable Diffusion pipeline with the pretrained “runwayml/stable-diffusion-v1-5” weights, casting them to 16-bit floating point (torch.float16) to reduce memory usage and improve performance. It then moves the entire pipeline to the GPU (“cuda”) to leverage hardware acceleration for faster image generation.
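    On smaller Colab GPUs, you can optionally trade a little speed for lower peak VRAM by enabling attention slicing. This tweak is not part of the original tutorial; enable_attention_slicing() is part of the Diffusers pipeline API.

    # Optional: reduce peak GPU memory usage during generation
    pipe.enable_attention_slicing()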

    def generate_sd_image(prompt, num_inference_steps=50, guidance_scale=7.5):
        """
        Generate an image from a text prompt using Stable Diffusion.
    
    
        Args:
            prompt (str): Text prompt to guide image generation.
            num_inference_steps (int): Number of denoising steps (more steps can improve quality).
            guidance_scale (float): Controls how strongly the prompt is followed.
           
        Returns:
            PIL.Image: The generated image.
        """
        global pipe
        if pipe is None:
            print("Loading Stable Diffusion model... (this may take a while)")
            pipe = StableDiffusionPipeline.from_pretrained(
                "runwayml/stable-diffusion-v1-5",
                torch_dtype=torch.float16,
                revision="fp16"
            )
            pipe = pipe.to("cuda")
       
        # Use autocast for faster inference on GPU
        with torch.autocast("cuda"):
            image = pipe(prompt, num_inference_steps=num_inference_steps, guidance_scale=guidance_scale).images[0]
       
        return image
    

    The generate_sd_image function above takes a text prompt along with parameters for the number of inference steps and the guidance scale and generates an image using Stable Diffusion. It checks whether the model pipeline is already loaded in the global pipe variable; if not, it loads and caches the model in half precision (FP16) and moves it to the GPU. It then uses torch.autocast for efficient mixed-precision inference and returns the generated image.
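    Before wiring the function into Gradio, you can sanity-check it directly in a notebook cell. The prompt below is only an illustrative example.

    # Quick manual test of the generation function (example prompt)
    test_image = generate_sd_image(
        "a watercolor painting of a lighthouse at sunset",
        num_inference_steps=30,
        guidance_scale=7.5
    )
    test_image.save("test_output.png")  # returns a PIL.Image, so it can also be displayed inline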

    # Define the Gradio interface
    demo = gr.Interface(
        fn=generate_sd_image,
        inputs=[
            gr.Textbox(lines=2, placeholder="Enter your prompt here...", label="Text Prompt"),
            gr.Slider(minimum=10, maximum=100, step=5, value=50, label="Inference Steps"),
            gr.Slider(minimum=1, maximum=20, step=0.5, value=7.5, label="Guidance Scale")
        ],
        outputs=gr.Image(type="pil", label="Generated Image"),
        title="Stable Diffusion Text-to-Image Demo",
        description="Enter a text prompt to generate an image using Stable Diffusion. Adjust the parameters to fine-tune the result."
    )
    
    
    # Launch the interactive demo
    demo.launch()
    

    Here, we define a Gradio interface that connects the generate_sd_image function to an interactive web UI. It provides three input widgets: a textbox for entering the text prompt and sliders for adjusting the number of inference steps and the guidance scale, while the output widget displays the generated image. The interface also includes a title and descriptive text to guide users, and the interactive demo is then launched.

    App Interface Generated by Code on Public URL

    You can also access the web app through a public URL: https://7dc6833297cf83b160.gradio.live/ (active for 72 hours). A similar link will be generated when you run the code yourself.
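    If a public link is not created automatically in your environment, you can request one explicitly; share=True is a standard Gradio launch option, shown here as an optional sketch.

    # Ask Gradio to create a temporary public *.gradio.live URL
    demo.launch(share=True)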

    In conclusion, this tutorial demonstrated how to integrate Hugging Face’s Diffusers with Gradio to create a powerful, interactive text-to-image web application in Google Colab. From setting up the GPU-accelerated environment and caching the Stable Diffusion model to building an interface for dynamic user interaction, you now have a solid foundation to experiment with and further develop advanced generative models.


    Here is the Colab Notebook for the above project.


    The post Steps to Build an Interactive Text-to-Image Generation Application using Gradio and Hugging Face’s Diffusers appeared first on MarkTechPost.
