Published in AI

Stability AI messed up its own AI again

13 June 2024

Keeping US puritans happy creates monsters 

Stability AI’s Stable Diffusion 3 Medium has been borked by its desperate attempt to keep US puritans happy.

The much-awaited AI image-synthesis model that turns text prompts into AI-generated images has been ridiculed online because it will not think about human bodies in case they lead to erotic thoughts.

This 19th-century approach to the human body is a step backwards from other state-of-the-art image-synthesis models like Midjourney or DALL-E 3. As a result, SD3 Medium easily produces wild, anatomically incorrect visual abominations.

A thread on Reddit titled, "Is this release supposed to be a joke? [SD3-2B]" details the spectacular failures of SD3 Medium at rendering humans, especially human limbs like hands and feet. Another thread titled, "Why is SD3 so bad at generating girls lying on the grass?" shows similar issues, but for entire human bodies.

AI image fans blame Stable Diffusion 3's anatomy failures on Stability's insistence on filtering adult content (often called "NSFW" content) from the SD3 training data that teaches the model how to generate images.

While this “censorship first” approach satisfies the moral codes of nuns, retired colonials, and religious loonies, it also prevents the model from understanding any human anatomy.

It is not as if Stability AI did not know this. The release of Stable Diffusion 2.0 in late 2022 suffered from similar problems in depicting humans accurately. AI researchers soon discovered that censoring adult content that contains nudity also severely hampers an AI model's ability to generate accurate human anatomy.

At the time, Stability AI reversed course with SD 2.1 and SD XL, regaining some of the abilities lost by excluding NSFW content.

"It works fine as long as there are no humans in the picture. I think their improved NSFW filter for filtering training data decided anything humanoid is NSFW," wrote a Redditor.

Any time a prompt homes in on a concept that isn't represented well in its training dataset, the image model will confabulate its best interpretation of what the user is asking for. And sometimes, that can be completely terrifying. Using a free online demo of SD3 on Hugging Face, we ran prompts and saw similar results to those reported by others.

For example, the prompt "a man showing his hands" returned an image of a man holding up two giant-sized backward hands, although each hand had at least five fingers.
