GLIGEN
GLIGEN is an open-source tool for grounded text-to-image generation that extends frozen text-to-image models. It enables new capabilities by grounding on various prompts, including boxes, keypoints, and images.
About
GLIGEN is an open-source tool for open-set grounded text-to-image generation that builds on existing text-to-image models such as Stable Diffusion. In addition to a text prompt, it accepts grounding inputs such as bounding boxes, keypoints, and reference images, which gives more precise control over where and how objects appear in the generated image. Its zero-shot performance on COCO and LVIS outperforms existing supervised layout-to-image baselines. GLIGEN supports both grounded generation and inpainting, and provides separate checkpoints for different grounding modalities: box+text, keypoint, HED map, Canny map, depth map, and semantic map. It is aimed at researchers and developers working in AI and computer vision.
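To make the box+text modality concrete, here is a minimal sketch of how a set of grounding inputs might be prepared before being passed to a model. It assumes boxes are supplied as [x0, y0, x1, y1] fractions of the image size, paired one-to-one with text phrases. The names GroundedPhrase, normalize_box, and build_grounding are hypothetical helpers written for this illustration and are not part of GLIGEN itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GroundedPhrase:
    """One grounding entry: a phrase tied to a pixel-space box (hypothetical helper)."""
    phrase: str
    box_px: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def normalize_box(box_px: Tuple[int, int, int, int], width: int, height: int) -> List[float]:
    """Convert a pixel-space box to fractions of the image size, clamped to [0, 1]."""
    x0, y0, x1, y1 = box_px
    if x1 <= x0 or y1 <= y0:
        raise ValueError(f"degenerate box: {box_px}")

    def clamp(v: float) -> float:
        return min(max(v, 0.0), 1.0)

    return [clamp(x0 / width), clamp(y0 / height), clamp(x1 / width), clamp(y1 / height)]


def build_grounding(entries: List[GroundedPhrase], width: int, height: int):
    """Split entries into parallel lists of phrases and normalized boxes."""
    phrases = [e.phrase for e in entries]
    boxes = [normalize_box(e.box_px, width, height) for e in entries]
    return phrases, boxes


# Assumed example layout on a 512x512 canvas.
entries = [
    GroundedPhrase("a golden retriever", (64, 128, 256, 448)),
    GroundedPhrase("a red ball", (320, 352, 448, 480)),
]
phrases, boxes = build_grounding(entries, width=512, height=512)
```

The resulting phrases and boxes lists line up index by index, so each phrase describes the content expected inside its box. How these lists are handed to a specific checkpoint depends on the inference code being used.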
Pricing & Plans
Open Source
Free