GLIGEN

GLIGEN is an open-source tool for grounded text-to-image generation that extends frozen, pre-trained text-to-image models. It adds spatial control by conditioning generation on grounding inputs such as bounding boxes, keypoints, and reference images.

At a glance

Pricing

Open Source

Free tier

Yes

API

Skill level

Technical

About

What is GLIGEN?

GLIGEN is an open-source tool for open-set grounded text-to-image generation. It builds on existing pre-trained text-to-image models such as Stable Diffusion, keeping the base model frozen. Beyond a text prompt, users can supply grounding inputs such as bounding boxes, keypoints, and reference images, which gives more precise control over where and how objects appear. In zero-shot evaluation on COCO and LVIS, GLIGEN outperforms existing supervised layout-to-image baselines. It supports both grounded generation and grounded inpainting, with separate checkpoints for different modalities: box+text, keypoint, HED map, Canny map, depth map, and semantic map. It is aimed at researchers and developers working in AI and computer vision.

Best used for

Ideal for researchers and developers who need to generate images with precise spatial control, perform grounded inpainting, and explore advanced image synthesis techniques. Especially valuable for academic projects and applications requiring fine-grained control over generated visual content.

Common actions

generate images with control

ground image generation

inpaint images

research image synthesis

Capabilities

Key features

Grounded text-to-image generation
Box, keypoint, image grounding
Inpainting capabilities
Multiple pre-trained checkpoints
Integration with Grounding DINO
Supports various modalities

Target Audience

AI researchers, machine learning engineers, computer vision developers

Integrations

Not yet documented

Pricing & Plans

Open Source

Free

FAQs

What types of grounding inputs does GLIGEN support?

GLIGEN supports various grounding inputs to control image generation, including bounding boxes, keypoints, and other images. It also works with modalities like HED maps, Canny maps, depth maps, and semantic maps for diverse applications.
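
To make the box-grounding idea concrete, here is a minimal sketch of how a caption might be paired with phrases and bounding boxes before being passed to a grounded generator. The names `GroundedPhrase`, `normalize_box`, and `build_grounding` are hypothetical helpers written for this illustration, not part of GLIGEN itself, and the normalized (x_min, y_min, x_max, y_max) layout in the range [0, 1] is an assumption about the expected box format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroundedPhrase:
    """One grounding entity: a text phrase tied to a normalized box (hypothetical type)."""
    phrase: str
    box: tuple  # (x_min, y_min, x_max, y_max), each assumed to lie in [0, 1]


def normalize_box(pixel_box, width, height):
    """Convert a pixel-space (x0, y0, x1, y1) box to [0, 1] coordinates."""
    x0, y0, x1, y1 = pixel_box
    if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
        raise ValueError(f"box {pixel_box} does not fit a {width}x{height} canvas")
    return (x0 / width, y0 / height, x1 / width, y1 / height)


def build_grounding(caption, entities, width=512, height=512):
    """Bundle a caption with its grounded phrases and normalized boxes."""
    grounded = [
        GroundedPhrase(phrase, normalize_box(box, width, height))
        for phrase, box in entities
    ]
    return {
        "prompt": caption,
        "phrases": [g.phrase for g in grounded],
        "boxes": [g.box for g in grounded],
    }


# Two objects placed side by side in the lower half of a 512x512 canvas.
request = build_grounding(
    "a dog and a cat on a sofa",
    [("a dog", (0, 256, 256, 512)), ("a cat", (256, 256, 512, 512))],
)
print(request["phrases"])
print(request["boxes"])
```

Keeping boxes in normalized coordinates means the same layout can be reused at different output resolutions; the dictionary keys here are illustrative and would need to be mapped onto whatever interface a given GLIGEN implementation exposes.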

Can GLIGEN be used for image inpainting?

Yes, GLIGEN supports inpainting tasks in addition to grounded generation. It allows users to modify existing images by providing grounding conditions, offering more control over the editing process.
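
As a rough illustration of how a grounding box can define the region to edit during inpainting, the sketch below turns a normalized box into a binary mask. The function `box_to_mask` is a hypothetical helper, and the convention used here (1 marks pixels to regenerate, 0 marks pixels to keep) is an assumption for this example; an actual GLIGEN implementation may build or interpret its mask differently.

```python
def box_to_mask(box, width, height):
    """Return a height x width grid: 1 inside the box (regenerate), 0 outside (keep).

    `box` is (x_min, y_min, x_max, y_max) in normalized [0, 1] coordinates.
    The 1 = regenerate convention is an assumption made for this sketch.
    """
    x_min, y_min, x_max, y_max = box
    # Map normalized coordinates to pixel indices.
    left, right = round(x_min * width), round(x_max * width)
    top, bottom = round(y_min * height), round(y_max * height)
    return [
        [1 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


# A tiny 8x4 canvas so the mask is easy to read when printed.
mask = box_to_mask((0.25, 0.5, 0.75, 1.0), width=8, height=4)
for row in mask:
    print("".join(str(v) for v in row))
```

Only the masked region would be regenerated under the grounding condition, while the rest of the original image is preserved.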

Is GLIGEN integrated with other AI models?

GLIGEN has been combined with Grounding DINO, which helps localize concepts with bounding boxes from a language prompt. This integration streamlines the process of generating images based on conceptual and spatial information.
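
The hand-off between a detector and a grounded generator can be sketched as a small conversion step. The example below assumes detections arrive as dictionaries with a phrase, a confidence score, and a normalized center-format box (cx, cy, w, h); that input shape, the `min_score` threshold of 0.35, and the helper names `cxcywh_to_xyxy` and `detections_to_grounding` are all assumptions for illustration rather than the actual Grounding DINO or GLIGEN interfaces.

```python
def cxcywh_to_xyxy(box):
    """Convert a normalized (cx, cy, w, h) box to (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    corners = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    # Clamp to the unit square in case the detector box spills past the edge.
    return tuple(min(1.0, max(0.0, v)) for v in corners)


def detections_to_grounding(detections, min_score=0.35):
    """Keep confident detections and return parallel lists of phrases and corner boxes."""
    kept = [d for d in detections if d["score"] >= min_score]
    phrases = [d["phrase"] for d in kept]
    boxes = [cxcywh_to_xyxy(d["box"]) for d in kept]
    return phrases, boxes


# Hypothetical detector output: one confident detection, one weak one.
detections = [
    {"phrase": "dog", "box": (0.25, 0.75, 0.5, 0.5), "score": 0.9},
    {"phrase": "cat", "box": (0.75, 0.75, 0.5, 0.5), "score": 0.2},
]
phrases, boxes = detections_to_grounding(detections)
print(phrases, boxes)
```

Filtering by score before grounding keeps low-confidence boxes from steering the layout; the right threshold depends on the detector and the use case.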
