AI fitting model: Leffa
Project function: AI fitting model
Project Information: An open-source AI fitting (virtual try-on) model that can precisely control a person's appearance and pose.
(Translated from the original text)
Controllable person image generation aims to produce a person image conditioned on a reference image, allowing precise control over the person's appearance or pose. However, while existing methods achieve high overall image quality, they often distort the fine-grained texture details of the reference image. We attribute these distortions to insufficient attention to the corresponding regions of the reference image. To address this, we propose Learning Flow Fields in Attention (Leffa), which explicitly guides target queries to attend to the correct reference keys in the attention layers during training. Specifically, this is realized through a regularization loss on top of the attention maps within a diffusion-based baseline. Our extensive experiments show that Leffa achieves state-of-the-art performance in controlling appearance (virtual try-on) and pose (pose transfer), significantly reducing fine-grained detail distortion while maintaining high image quality. In addition, we show that our loss is model-agnostic and can be used to improve the performance of other diffusion models.
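The idea of reading an attention map as a flow field and regularizing it can be sketched as follows. This is a simplified illustration only, not the authors' implementation: the function name `attention_flow_loss` and the assumption that ground-truth correspondences between target and reference positions are available are ours; the exact formulation in the paper may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_flow_loss(queries, keys, ref_coords, target_coords):
    """Hypothetical sketch of an attention-map regularization in the spirit
    of Leffa: the attention of each target query over the reference keys is
    interpreted as a soft flow field that looks up reference positions, and
    we penalize its deviation from known correspondences.

    queries:       (n_tgt, d)  target query vectors
    keys:          (n_ref, d)  reference key vectors
    ref_coords:    (n_ref, 2)  2-D positions of the reference tokens
    target_coords: (n_tgt, 2)  ground-truth corresponding positions
    """
    d = queries.shape[-1]
    # Standard scaled dot-product attention map over reference keys.
    attn = softmax(queries @ keys.T / np.sqrt(d), axis=-1)  # (n_tgt, n_ref)
    # Soft lookup: each target query's expected reference position.
    predicted = attn @ ref_coords                            # (n_tgt, 2)
    # Mean-squared deviation from the true correspondences.
    loss = np.mean((predicted - target_coords) ** 2)
    return loss, attn
```

During training, a loss of this kind would be added to the usual diffusion objective, pushing each target query's attention mass toward the reference region it should copy texture from.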
We present Leffa, a unified framework for controllable person image generation that allows precise manipulation of both appearance (i.e., virtual try-on) and pose (i.e., pose transfer). The images generated by Leffa are high quality, retain fine details, and show minimal texture distortion. Please zoom in for the best view.
Further reading:
Paper: https://arxiv.org/abs/2412.08486
HTML version: https://arxiv.org/html/2412.08486v2
YouTube: