Latent argument in Stable Diffusion Pipeline (Huggingface Diffusers Library) working unexpectedly


I'm using the Stable Diffusion Pipeline from the Hugging Face Diffusers library and have been trying to start the diffusion process from a custom latent via the latents parameter, but the result is unexpected.

I took the output (a PIL image) of a first Stable Diffusion pipeline and ran it through the pil_to_latents() function (shown at the end) to get its latent representation, then called a second Stable Diffusion pipeline with that latent passed as the latents argument:

    pipe(
        latents = pil_to_latents(result_image_from_first_pipe)[0],
        *** other pipe arguments
    )
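
For reference, here is a minimal sketch of the overall flow. The model ID and prompt are placeholders for illustration, not my exact setup, and the helpers from the supporting code below are assumed to be in scope, with self.vae bound to the pipeline's VAE:

    import torch
    from diffusers import StableDiffusionPipeline

    # Placeholder model ID and prompt, assumed for illustration only.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    prompt = "a photo of an astronaut riding a horse"

    # First call: generate an image the normal way.
    result_image_from_first_pipe = pipe(prompt).images[0]

    # Second call: encode that image with the helper shown below
    # (here self.vae would be pipe.vae) and start from that latent.
    init_latents = pil_to_latents(result_image_from_first_pipe)[0]
    blurry_image = pipe(prompt, latents=init_latents).images[0]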

But in the end I get a weird, blurry output. If I run the pipeline without this argument, the result looks normal. Does anybody have an idea why this happens? Thanks!

Supporting code:

    import torch
    from torchvision import transforms as tfms
    from PIL import Image

    def pil_to_latents(self, image):
        '''
        Function to convert image to latents
        '''
        # Scale the PIL image from [0, 1] to [-1, 1] and add a batch dimension
        init_image = tfms.ToTensor()(image).unsqueeze(0) * 2.0 - 1.0
        init_image = init_image.to(device="cuda", dtype=torch.float16)
        # Sample from the VAE posterior and apply the SD latent scaling factor
        init_latent_dist = self.vae.encode(init_image).latent_dist.sample() * 0.18215
        return init_latent_dist
    
    def latents_to_pil(self, latents):
        '''
        Function to convert latents to images
        '''
        # Undo the scaling factor applied during encoding
        latents = (1 / 0.18215) * latents
        with torch.no_grad():
            image = self.vae.decode(latents).sample

        # Map from [-1, 1] back to [0, 1], then to uint8 HWC arrays
        image = (image / 2 + 0.5).clamp(0, 1)
        image = image.detach().cpu().permute(0, 2, 3, 1).numpy()
        images = (image * 255).round().astype("uint8")
        pil_images = [Image.fromarray(image) for image in images]
        return pil_images
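
As a sanity check, encoding an image and immediately decoding it back should reproduce the input almost exactly; if the round trip below looks clean, the blur presumably comes from the pipeline call itself rather than from these conversion helpers. This sketch assumes the two functions are callable as in the snippet above:

    # Hypothetical round-trip check: encode, then immediately decode.
    # A near-identical reconstruction would rule out the VAE helpers
    # as the source of the blur.
    roundtrip = latents_to_pil(pil_to_latents(result_image_from_first_pipe))[0]
    roundtrip.save("roundtrip_check.png")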
