starsfriday/Qwen-Image-EVA-LoRA

starsfriday
Text-to-Image

A LoRA for Qwen-Image aimed at generating portraits and anime-style character images, in particular variations of the Japanese anime character Asuka. It is trained on Qwen/Qwen-Image and is used by loading the LoRA weights on top of the base model.

How to use

Installation and basic usage with Diffusers:

pip install -U diffusers transformers accelerate peft

import torch
from diffusers import DiffusionPipeline

# Switch to "mps" for Apple devices
pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", dtype=torch.bfloat16, device_map="cuda")
pipe.load_lora_weights("starsfriday/Qwen-Image-EVA-LoRA")

prompt = "mrx, The image is a digital illustration of an animated female character, likely from an anime or manga series. She has long, flowing orange hair with two small red horns on her head, which could suggest she is a fantasy or supernatural character. Her eyes are blue and expressive, adding to her lively demeanor. The character is wearing a form-fitting bodysuit that is predominantly red with black accents, including stripes along the legs and around the waist. The suit also features a green collar and cuffs, as well as some silver-colored buttons and fastenings, giving it a sleek and tactical appearance. She is posing playfully with one hand on her hip and the other near her face, winking at the viewer. This pose conveys confidence and a sense of fun. The background is minimalistic, featuring what appears to be a metallic wall with rivets, which complements the industrial aesthetic often found in sci-fi or mecha-related genres. There is no explicit context provided within the image itself, but the character's attire and design suggest themes of action, adventure, or science fiction."
image = pipe(prompt).images[0]

A more complete direct-use example:

from diffusers import DiffusionPipeline
import torch

model_name = "Qwen/Qwen-Image"

# Load the pipeline
if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
    device = "cuda"
else:
    torch_dtype = torch.float32
    device = "cpu"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
pipe = pipe.to(device)

# Load the LoRA weights from this repository; the repo ID and the file name are passed separately
pipe.load_lora_weights("starsfriday/Qwen-Image-EVA-LoRA", weight_name="qwen_image_eva.safetensors", adapter_name="lora")

prompt = '''mrx, The image depicts a young female character with long, reddish-brown hair tied in twin tails, wearing a traditional Japanese yukata. She is seated indoors, likely on a tatami mat, enjoying a slice of watermelon. The setting suggests a tranquil environment, possibly a ryokan or a home with Japanese architectural elements such as sliding doors and paper windows. The presence of a lantern indicates it might be late afternoon or early evening. In the background, through the open door, one can see a serene outdoor scene with greenery and a clear sky, which adds to the overall peaceful ambiance of the scene. This image could evoke feelings of relaxation and leisure, often associated with summer holidays in Japan.
'''
negative_prompt = " "

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    num_inference_steps=50,
    true_cfg_scale=5,
    generator=torch.Generator(device=device).manual_seed(123456),  # seed on the selected device, not a hard-coded "cuda"
)
image = image.images[0]
image.save("output.png")
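
The first snippet notes that Apple devices should switch to "mps", while the fuller example only distinguishes CUDA from CPU. The following is a minimal sketch of how the two could be combined. `choose_device` is a hypothetical helper, not part of Diffusers, and using bfloat16 on "mps" is an assumption that simply mirrors the first snippet; some setups may need a different dtype.

```python
def choose_device(cuda_available: bool, mps_available: bool = False) -> tuple[str, str]:
    """Return (device, dtype_name) for loading the pipeline.

    Hypothetical helper for illustration only. The dtype is returned as a
    string so this function does not need PyTorch to be imported.
    """
    if cuda_available:
        # Same choice as the fuller example: bfloat16 on CUDA.
        return "cuda", "bfloat16"
    if mps_available:
        # Assumption: keep bfloat16 on Apple devices, as the first snippet does.
        return "mps", "bfloat16"
    # Same fallback as the fuller example: float32 on CPU.
    return "cpu", "float32"
```

With PyTorch installed, this could be called as `choose_device(torch.cuda.is_available(), torch.backends.mps.is_available())`, and the dtype resolved with `getattr(torch, dtype_name)` before passing it to `from_pretrained`.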

In ComfyUI, the model is used with a modified version of the Qwen-Image workflow that adds a LoRA node connected to the base model.

Features

LoRA adapter for the Qwen/Qwen-Image base model
Specialized in generating portraits and anime characters
Uses the trigger phrase `mrx`
Compatible with Diffusers
Compatible with a modified ComfyUI workflow for Qwen-Image that includes a LoRA node
Weights available in Safetensors format
Apache 2.0 license
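
Both example prompts begin with the trigger phrase `mrx` followed by a comma, so it may be convenient to add it automatically. The sketch below assumes that the "mrx, " prefix format is what the adapter expects. `with_trigger` and `TRIGGER` are hypothetical names introduced here for illustration, not part of Diffusers or of this repository.

```python
TRIGGER = "mrx"  # trigger phrase listed in the features above


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Prefix the prompt with the trigger phrase unless it already leads with it."""
    text = prompt.strip()
    # Compare the first comma-separated token against the trigger, ignoring case.
    first_token = text.split(",", 1)[0].strip()
    if first_token.lower() == trigger.lower():
        return text
    return f"{trigger}, {text}"
```

The result can then be passed as the `prompt` argument of the pipeline call, for example `pipe(prompt=with_trigger("a girl in a yukata eating watermelon"))`.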

Use cases

Generating anime character portraits with an Asuka aesthetic
Creating illustrations of female anime or manga characters
Producing anime scenes from prompts that begin with the trigger `mrx`
Using Qwen-Image with LoRA weights in Diffusers or ComfyUI
Generating 1024x1024 anime-style images from long descriptions