PhotoDoodle: an open-source AI art editing tool

Description:

PhotoDoodle is an open-source image editing tool developed by ShowLab that uses AI to add artistic doodle elements to photos. Users enter a simple text prompt, such as “Add halos and wings to the cat”, and the tool generates decorative elements that blend naturally with the original photo while leaving the background intact.

Features:

  • Free and open source: The code and datasets are public, so developers can freely explore and improve them.
  • Multi-style support: Supports more than six artistic styles, including cartoon and watercolor.
  • Precise editing: Handles edits ranging from fine local adjustments to overall style conversion while keeping the image consistent.
  • Technology integration: Combines LoRA, EditLoRA, and positional encoding cloning for efficient learning and precise edits.
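To make the LoRA part concrete, here is a minimal sketch of the general low-rank adaptation idea: a frozen weight matrix W is adjusted by a small trainable product B·A, which can later be merged into W. It is plain Python for illustration only. The function names (matmul, merge_lora), the toy matrices, and the alpha/rank scaling are assumptions made for this example and are not taken from the PhotoDoodle code base.

```python
# Illustrative LoRA sketch in pure Python (hypothetical helpers, not PhotoDoodle code).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), i.e. the adapter merged into W."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Frozen 3x3 base weight (identity, for readability).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Rank-1 adapter: B is 3x1 and A is 1x3, so 6 numbers are trained instead of 9.
B = [[1.0], [0.0], [2.0]]
A = [[0.5, 0.0, 1.0]]

W_merged = merge_lora(W, B, A, alpha=1.0, rank=1)
for row in W_merged:
    print(row)
```

The saving is small in this toy case, but it grows quickly with matrix size: for a d×d weight and rank r, the adapter stores 2·d·r values instead of d².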

Highlights:

AI-driven artistic creation: Using a diffusion model (FLUX.1) and LoRA, PhotoDoodle turns ordinary photos into creative artwork.
Seamless element fusion: With EditLoRA, the system learns an artist's style and applies it to new images, so added elements transition naturally into the scene.
Pixel-level accuracy control: Through positional encoding cloning, PhotoDoodle keeps track of pixel positions in the original image, so new elements line up with the background.
Varied application scenarios: From adding playful effects to pet photos to designing fantasy scenes, PhotoDoodle handles a wide range of edits.
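One way to picture positional encoding cloning is sketched below. It assumes the technique means that tokens of the original (condition) image reuse the same position indices as the tokens of the image being generated, so the model can match each output location to the same location in the source photo. This is an interpretation of the description above, not code from the repository; make_position_ids and clone_position_ids are hypothetical names.

```python
# Conceptual sketch of position-id cloning (hypothetical helpers, toy sizes).

def make_position_ids(rows, cols):
    """One (row, col) position id per latent token, in row-major order."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def clone_position_ids(target_ids):
    """Give the condition-image tokens an exact copy of the target's ids."""
    return list(target_ids)


# A tiny 2x3 latent grid for the image being generated.
target_ids = make_position_ids(2, 3)
# The condition image gets identical ids instead of a shifted or separate grid.
condition_ids = clone_position_ids(target_ids)

# Both token groups are concatenated into one sequence for the model.
sequence_ids = target_ids + condition_ids
print(sequence_ids)
```

Under this assumption, token i of the output and token i of the condition image carry the same position, which is what would let edits stay aligned with the untouched background.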

Usage Guide:

Environment setup:

Make sure Git, Conda, and Python 3.11.10 are installed on your computer.
Open a terminal, clone the project, and enter its directory:

git clone https://github.com/showlab/PhotoDoodle.git
cd PhotoDoodle

Create and activate a virtual environment:

conda create -n doodle python=3.11.10
conda activate doodle
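With the environment active, a quick sanity check can confirm the prerequisites before installing anything heavy. The script below is a minimal sketch: check_environment is a hypothetical helper, and it compares only the major and minor Python version (3.11) rather than the exact 3.11.10 patch release named in this guide.

```python
import shutil
import sys

# The guide pins Python 3.11.10; only major.minor is compared here (an assumption).
REQUIRED_PYTHON = (3, 11)


def check_environment(tools=("git", "conda")):
    """Report whether the Python version matches and the tools are on PATH."""
    report = {"python_ok": sys.version_info[:2] == REQUIRED_PYTHON}
    for tool in tools:
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or mismatched'}")
```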

Install dependencies:

Install PyTorch (a CUDA-enabled build is recommended for GPU acceleration):

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124

Install other dependencies:

pip install --upgrade -r requirements.txt
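After installation, it can help to verify that PyTorch imports and can see a GPU before loading a large model. The following is a small sketch; torch_report is a hypothetical helper name, and it deliberately returns a result instead of failing when PyTorch is not installed.

```python
import importlib.util


def torch_report():
    """Summarize whether PyTorch is installed and whether CUDA is usable."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "cuda": False}
    import torch  # imported lazily so the check works without PyTorch

    return {
        "installed": True,
        "version": str(torch.__version__),
        "cuda": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    print(torch_report())
```

If "cuda" comes back False on a machine with an NVIDIA GPU, the usual suspects are a CPU-only PyTorch build or a driver that does not match the CUDA version of the wheel.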

Download the pre-trained model:

Visit PhotoDoodle’s GitHub Releases or Hugging Face dataset page to download the required pre-trained model files (such as OmniEditor and EditLoRA).

Place the downloaded model files in the designated folder inside the project directory (usually checkpoints/).
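To confirm the files landed where expected, a short listing script can help. This is a sketch under assumptions: find_weight_files is a hypothetical helper, the .safetensors suffix is inferred from the weight names used in the inference code in this guide, and the demo runs against a temporary folder rather than a real checkpoints/ directory.

```python
import tempfile
from pathlib import Path


def find_weight_files(folder="checkpoints", suffix=".safetensors"):
    """Return sorted relative paths of weight files under folder (empty if absent)."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(str(p.relative_to(root)) for p in root.rglob(f"*{suffix}"))


# Demo on a throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "pretrain.safetensors").touch()
    Path(tmp, "notes.txt").touch()
    demo = find_weight_files(tmp)

print(demo)
```

In a real checkout, calling find_weight_files() with no arguments looks in checkpoints/ relative to the current directory.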

Run inference:

Use the following code to run inference:

from src.pipeline_pe_clone import FluxPipeline
import torch
from PIL import Image

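# Load the FLUX.1-dev base model in bfloat16 and move it to the GPU.
pretrained_model_name_or_path = "black-forest-labs/FLUX.1-dev"
pipeline = FluxPipeline.from_pretrained(
    pretrained_model_name_or_path,
    torch_dtype=torch.bfloat16,
).to('cuda')

# Load the pretrained LoRA weights, fuse them into the base model, then unload the adapter.
pipeline.load_lora_weights("nicolaus-huang/PhotoDoodle", weight_name="pretrain.safetensors")
pipeline.fuse_lora()
pipeline.unload_lora_weights()

# Load the style-specific LoRA on top of the fused model.
pipeline.load_lora_weights("nicolaus-huang/PhotoDoodle", weight_name="sksmagiceffects.safetensors")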

height = 768
width = 512

validation_image = "assets/1.png"
validation_prompt = "add a halo and wings for the cat by sksmagiceffects"
# PIL's resize() expects (width, height), so the order must match the pipeline's height/width.
condition_image = Image.open(validation_image).resize((width, height)).convert("RGB")

result = pipeline(
    prompt=validation_prompt,
    condition_image=condition_image,
    height=height,
    width=width,
    guidance_scale=3.5,
    num_inference_steps=20,
    max_sequence_length=512
).images[0]

result.save("output.png")
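When adapting the script to other image sizes, the argument order is easy to get wrong: PIL's Image.resize takes a (width, height) tuple, while the pipeline call takes height and width as separate keyword arguments. The helper below is a small sketch that returns the tuple in PIL's order. Rounding each side down to a multiple of 16 is an assumption about what FLUX-style pipelines prefer, and pil_size is a hypothetical name.

```python
def pil_size(height, width, multiple=16):
    """Return (width, height) for PIL, each side rounded down to a multiple."""

    def snap(value):
        # Never go below one full multiple, even for very small inputs.
        return max(multiple, (value // multiple) * multiple)

    return snap(width), snap(height)


# The sizes used in this guide are already multiples of 16.
size = pil_size(height=768, width=512)
print(size)  # (512, 768)
```

The returned tuple can be passed straight to Image.resize, and its two values reused as width and height in the pipeline call.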

Or run the inference script directly:

python inference.py

With these steps, you can use PhotoDoodle to add artistic doodle elements to your photos and create unique artwork.

Resources:

YouTube: