🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
Fast and memory-efficient exact attention
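The phrase "fast and memory-efficient exact attention" refers to computing standard softmax attention without ever materializing the full attention matrix. A minimal NumPy sketch of the idea (tiling over key/value blocks with an online softmax; this is an illustration of the technique, not the library's actual CUDA implementation):

```python
import numpy as np

def naive_attention(q, k, v):
    # Reference implementation: materializes the full (n, n) score
    # matrix, so memory grows quadratically with sequence length.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def tiled_attention(q, k, v, block=4):
    # Processes keys/values one block at a time, keeping only a
    # running row-max and softmax denominator, so the full score
    # matrix is never stored. The result is numerically identical
    # ("exact") to naive_attention.
    n, d = q.shape
    out = np.zeros((n, v.shape[-1]))
    m = np.full((n, 1), -np.inf)   # running row max
    l = np.zeros((n, 1))           # running softmax denominator
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T / np.sqrt(d)
        m_new = np.maximum(m, s.max(axis=-1, keepdims=True))
        p = np.exp(s - m_new)
        scale = np.exp(m - m_new)  # rescale previous accumulators
        l = l * scale + p.sum(axis=-1, keepdims=True)
        out = out * scale + p @ vb
        m = m_new
    return out / l
```

Both functions return the same output; only the peak memory differs, which is the core trade-off the library exploits.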
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
FlexEdit: an end-to-end image editing method that leverages both free-shape masks and language instructions for flexible editing.
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Official implementation of "Single Image Iterative Subject-driven Generation and Editing".
[ICLR2025] A versatile image-to-image visual assistant, designed for image generation, manipulation, and translation based on free-form user instructions.
A minimal and universal controller for FLUX.1.
[arXiv'25] BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[AAAI2025] Textualize Visual Prompt for Image Editing via Diffusion Bridge
Training-Free Text-Guided Image Editing Using Visual Autoregressive Model
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing
Official Implementation of Steering Rectified Flow Models in the Vector Field for Controlled Image Generation
MAGI-1: Autoregressive Video Generation at Scale
An 8-step inversion and 8-step editing process that works effectively with the FLUX-dev model (3× speedup, with results comparable or even superior to baseline methods).
Official implementations for paper: Zero-shot Image Editing with Reference Imitation
Implementation of paper EditCLIP: Representation Learning for Image Editing
Official code of SmartEdit [CVPR-2024 Highlight]