
Ridiculed Stable Diffusion 3 release features AI-generated body horror

AI-generated image created using Stable Diffusion 3 of a girl lying in the grass.

On Wednesday, Stability AI released weights for Stable Diffusion 3 Medium, an AI image synthesis model that turns text prompts into AI-generated images. Its arrival has been derided online, however, because it generates images of people in a way that seems like a step back from other state-of-the-art image synthesis models such as Midjourney or DALL-E 3. As a result, it can churn out wild, anatomically incorrect visual abominations with ease.

A Reddit thread titled “Is this release supposed to be a joke? [SD3-2B]” details SD3 Medium’s spectacular failures at rendering people, especially human limbs like arms and legs. Another thread, titled “Why is SD3 so bad at generating girls lying on grass?”, shows similar problems, but for entire human bodies.

Hands have traditionally been a challenge for AI image generators due to the lack of good examples in early training datasets, but recently several image synthesis models seem to have overcome the problem. In that sense, SD3 appears to be a huge step back for the image synthesis enthusiasts who flock to Reddit, especially compared to recent Stability releases like November’s SDXL Turbo.

“It wasn’t too long ago that StableDiffusion was competing with Midjourney, now it just seems like a joke in comparison. At least our datasets are safe and ethical!” wrote one Reddit user.

AI imaging fans have so far blamed Stable Diffusion 3’s anatomy failures on Stability’s insistence on filtering out adult content (often referred to as “NSFW” content) from the SD3 training data that teaches the model how to generate images. “Believe it or not, heavily censoring a model also removes human anatomy, so… this happened,” one Reddit user wrote in the thread.

Basically, whenever the user prompts for a concept that is not well represented in the AI model’s training dataset, the image synthesis model will come up with its best interpretation of what the user wants. And sometimes it can be downright terrifying.

The release of Stable Diffusion 2.0 in 2022 suffered from similar problems in rendering humans well, and AI researchers soon discovered that censoring adult content that contained nudity could severely hamper an AI model’s ability to generate accurate human anatomy. Stability AI later reversed course with SD 2.1 and SDXL, regaining some of the abilities lost to heavy filtering of NSFW content.

Another problem can arise during model pre-training: the NSFW filter that researchers use to remove adult images from the dataset is sometimes too aggressive, accidentally removing images that are not offensive at all and depriving the model of depictions of people in certain situations. “[SD3] works fine as long as there are no people in the picture. I think their improved nsfw filter for filtering training data decided that anything humanoid is nsfw,” wrote one Redditor in the thread.
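As a rough illustration of that failure mode, the sketch below shows the kind of threshold-based pre-filtering step being described. The `nsfw_score` classifier, the threshold value, and the helper name are hypothetical stand-ins for this example, not Stability’s actual pipeline.

```python
# Hypothetical sketch of threshold-based NSFW pre-filtering of a training set.
# `nsfw_score` stands in for a trained safety classifier; the threshold is invented.
# The point: an aggressive threshold also drops ordinary, benign photos of people,
# starving the model of human-anatomy examples.

def filter_training_images(images, nsfw_score, threshold=0.3):
    """Keep images whose NSFW score is below the threshold; drop the rest."""
    kept, dropped = [], []
    for image in images:
        if nsfw_score(image) < threshold:
            kept.append(image)
        else:
            dropped.append(image)  # can include harmless pictures of people
    return kept, dropped
```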

Using Hugging Face’s free online SD3 demo, we ran prompts and saw results similar to those reported by others. For example, the prompt “man showing his hands” returned an image of a man holding two giant-sized backwards hands, even though each hand had at least five fingers.

Stability announced Stable Diffusion 3 in February, and the company plans to make it available in a variety of model sizes. Today’s release is the “Medium” version, a 2-billion-parameter model. In addition to the weights being available on Hugging Face, the model can also be experimented with through the company’s Stability Platform. The weights are free to download and use under a non-commercial license only.
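For readers who want to reproduce these experiments locally rather than through the web demo, something like the following sketch should work, assuming a recent version of Hugging Face’s diffusers library with SD3 support and prior acceptance of the model’s license on Hugging Face; the exact model ID and sampler defaults may differ from what is shown here.

```python
# Minimal sketch: generating an image with the SD3 Medium weights via diffusers.
# Assumes a recent diffusers release with StableDiffusion3Pipeline support,
# a CUDA GPU, and acceptance of the gated model's license on Hugging Face.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# The same sort of prompt Reddit users have been testing.
image = pipe(
    "a woman lying in the grass",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("sd3_medium_test.png")
```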

Soon after its announcement in February, a delay in releasing the SD3 model weights inspired rumors that the release was being held up by technical problems or mismanagement. Stability AI as a company recently went into a tailspin with the resignation of its founder and CEO, Emad Mostaque, in March, followed by a series of layoffs. Just before that, three key engineers, Robin Rombach, Andreas Blattmann, and Dominik Lorenz, left the company. And its troubles go back even further, with news of the company’s dire financial situation lingering since 2023.

For some Stable Diffusion fans, the failures of Stable Diffusion 3 Medium are a visual manifestation of the company’s mismanagement and a clear sign that things are falling apart. Although the company hasn’t filed for bankruptcy, some users made dark jokes about the possibility after seeing SD3 Medium:

“I guess now they can go bankrupt in a safe and ethically [sic] way, after all.”
