Dall-E – Philosophics

Censorial AI

Nov 5, 2024 | Nov 1, 2024 | Posted in art, technology | Tagged agency, ai, AI 4.0, ai art, AI censorship, art censorship, autonomy, censorship, character persistence, character sheet, community guidelines, content moderation, creativity in AI, Dall-E, digital art, ethics, filters, freedom of speech, genai, Generative AI, generative constraints, image generation, images, internet censorship, language, machine learning, Midjourney, midjourney 6.1, monetisation, morality, nsfw, PG-13 rating, platform as a service, political correctness, prompt engineering, society, steampunk, storytelling, technology limitations, token costs, training data, visual limitations

I’m confused.

I could probably stop there for some people, but I’ve got a qualifier. I’ve been using this generation of AI since 2022. I’ve been using what’s been deemed AI since around 1990. I used to write financial and economic models, so I dabbled in “expert systems”. There was a long lull, and here we are with the latest incarnation – AI 4.0. I find it useful, but I don’t think the hype will meet reality, and I expect we’ll go cold until it’s time for 5.0. Some aspects will remain, but the “best” features will be the ones that can be monetised, so they will be priced out of reach for some whilst others will wither on the vine. But that’s not why I am writing today.

I’m confused by the censorship, filters, and guardrails placed on generative AI – whether for images or copy content. To be fair, not all models are filtered, but the popular ones are. These happen to be the best. They have the top minds and the most funding. They want to retain their funding, so they play the politically correct game of censorship. I’ve got a lot to say about freedom of speech, but I’ll limit my tongue for the moment – a bout of self-censorship.

Please note that given the topic, some of this might be considered not safe for work (NSFW) – even my autocorrection AI wants me to substitute the idiomatic “not safe for work” with “unsafe for work” (UFW, anyone? It has a nice ring to it). This is how AI will take over the world. </snark>

Image Cases

AI applications can be run over the internet or on a local machine. They use a lot of computing power, so one needs a decent computer with a lot of available GPU cycles. Although my computer does meet minimum requirements, I don’t want to spend my time configuring, maintaining, and debugging it, so I opt for a Web-hosted PaaS (platform as a service) model. This means I need to abide by censorship filters. Since I am not creating porn or erotica, I think I can deal with the limitations. Typically, this translates to a PG-13 movie rating.

So, here’s the thing. I prefer Midjourney for rendering quality images, especially when I am seeking a natural look. Dall-E (whether alone or via ChatGPT 4) works better with concepts than with explicit direction, whereas Midjourney accepts direction well in many instances.

Midjourney takes sophisticated prompts – subject, shot type, perspective, camera type, film type, lighting, ambience, styling, location, and some fine-tuning parameters for the model itself. The prompts are monitored for blacklisted keywords. This list is ever-expanding (and contracting). Scanning the list, I see words I have used without issue, and I have been blocked by words not listed.
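To make that structure concrete, here is a minimal Python sketch of how such a prompt might be assembled from its parts and checked against a keyword list before submission. Everything in it is an assumption for illustration: `PromptParts`, `build_prompt`, and `blocked_words` are hypothetical helpers, the field order is arbitrary, and the blocklist is made up. It does not reproduce how Midjourney actually parses or moderates prompts.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """Hypothetical container for the pieces of an image prompt."""
    subject: str
    shot_type: str = ""
    perspective: str = ""
    styling: str = ""
    location: str = ""
    lighting: str = ""
    film: str = ""
    camera: str = ""
    # Model fine-tuning parameters, appended verbatim after the description.
    parameters: list = field(default_factory=list)


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty descriptive parts with commas, then append parameters."""
    descriptive = [
        parts.shot_type, parts.subject, parts.perspective, parts.styling,
        parts.location, parts.lighting, parts.film, parts.camera,
    ]
    text = ", ".join(p.strip() for p in descriptive if p.strip())
    return " ".join([text, *parts.parameters]).strip()


def blocked_words(prompt: str, blocklist: set) -> list:
    """Return the blocklisted words found in the prompt (naive word match)."""
    words = {w.strip(".,;:!?").lower() for w in prompt.split()}
    return sorted(words & blocklist)
```

A naive word match like this would also explain some of the inconsistency: a list that is matched word by word says nothing about phrases, context, or whatever the service checks beyond the published list.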

Censored Prompts

Some cases are obvious – nude woman will be blocked. This screengrab illustrates the challenge.

On the right, notice the prompt:

Nude woman

The rest are machine instructions. On the left, in the main body, is a message from the AI moderator:

*Sorry! Please try a different prompt. We’re not sure this one meets our community guidelines. Hover or tap to review the guidelines.*

The community guidelines are as follows:

This is fine. There is a clause saying that one may notify the developers, but I have not found this to be fruitful. In this case, it would be rejected anyway.

“What about that nude woman at the bottom of the screengrab?” you ask. Notice the submitted prompt:

Edit cinematic full-body photograph of a woman wearing steampunk gear, light leaks, well-framed and in focus. Kodak Potra 400 with a Canon EOS R5

Apart from the censorship debate, notice the prompt is for a full-body photo. This is clearly a medium shot. Her legs and feet are suspiciously absent. Steampunk gear? I’m not sure sleeves qualify for the aesthetic. She appears to be wearing a belt.

For the uninitiated, the square image instructs the model to use this face on the character, and the CW 75 tells it to allow some variance, on a scale from 0 to 100.

So what gives? It can generate whatever it feels like, so long as it’s not solicited. Sort of…

Here I prompt for a view of the character walking away from the camera.

Cinematic, character sheet, full-body shot, shot from behind photograph, multiple poses. Show same persistent character and costumes . Highly detailed, cinematic lighting with soft shadows and highlights. Each pose is well-framed, coherent.

The response tells me that my prompt is not inherently offensive, but that the content of the resulting image might violate community guidelines.

**Creation failed**: Sorry, while the prompt you entered was deemed safe, the resulting image was detected as having content that might violate our community guidelines and has been blocked. Your account status will not be affected by this.

Occasionally, I’ll resubmit the prompt and it will render fine. I question why it can’t just re-render until the result passes whatever filters it has in place. I’d expect this conditional to take a line of code. But that doesn’t explain why it allows other images to pass – quite obviously not compliant.
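The kind of conditional I have in mind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `render` and `is_acceptable` are hypothetical stand-ins for the service's image generator and its output filter, neither of which is public, and the cap on attempts is my own addition.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def render_until_accepted(
    render: Callable[[], T],
    is_acceptable: Callable[[T], bool],
    max_attempts: int = 4,
) -> Optional[T]:
    """Re-render until the output passes the filter or attempts run out.

    `render` and `is_acceptable` are hypothetical callables standing in for
    the image generator and the content filter.
    """
    for _ in range(max_attempts):
        image = render()
        if is_acceptable(image):
            return image
    # Every attempt was rejected; the caller decides what to tell the user.
    return None
```

The cap matters because each attempt presumably consumes compute on the provider's side, so an unbounded loop would be a poor default. That cost may be part of the reason the retry is left to the user.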

Why am I trying to get a rear view? This is a bit off-topic, but creating a character sheet is important for storytelling. If I am creating a comic strip or graphic novel, the characters need to be persistent, and I need to be able to swap out clothing and environments. I may need close-ups, wide shots, establishing shots, low-angle shots, side shots, detail shots, and shots from behind, so I need the model to know each of these. In this particular case, this is one of three main characters – a steampunk bounty hunter, an outlaw, and a bartender – in an old Wild West setting. I don’t need to worry as much about extras.

I marked the above render errors with 1s and 2s. The 1s are odd neck twists; 2s are solo images where the prompt asks for character sheets. I made a mistake myself. When I noticed I wasn’t getting any shots from behind, I added the directive without removing other facial references. A human model might just ignore instructions to smile or some such. The AI tries to capture both, not understanding that a person can have a smile not captured by a camera.

These next renders prompt for full-body shots. None are wholly successful, but some are more serviceable than others.

Notice that #1 is holding a deformed violin. I’m not sure what the contraptions are in #2. It’s not a full-body shot in #3; she’s not looking into the camera, but it’s OK-ish. I guess #4 is still PG-13, but I wouldn’t be allowed to prompt for “side boob” or “under boob”.

Gamers will recognise the standard T-pose in #5. What’s she wearing? Midjourney doesn’t have a great grasp of skin versus clothing or tattoos and fabric patterns. In this, you might presume she’s wearing tights or leggings to her chest, but that line at her chest is her shirt. She’s not wearing trousers because her navel is showing. It also rendered her somewhat genderless. When I rerendered it (not shown), one image put her in a onesie. The other three rendered the shirt more prominent but didn’t know what to do with her bottoms.

I rendered it a few more times. Eventually, I got a sort of body suit solution.

By default, AI tends to sexualise people. Really, it puts a positive spin on its renders. Pretty women, buff men, cute kittens, and so on. This is configurable, but the default is on. Even though I categorically apply a Style: Raw command, these still have a strong beauty aesthetic.

I’ve gone off the rails a bit, but let’s continue on this theme.

cinematic fullbody shot photograph, a pale girl, a striking figure in steampunk mech attire with brass monocle, and leather gun belt, thigh-high leather boots, and long steampunk gloves, walking away from camera, white background, Kodak Potra 400 with a Canon EOS R5

Obviously, these are useless, but they still cost me tokens to generate. Don’t ask about her duffel bag. They rendered pants on her, but she’s gone full-on Exorcist mode with her head. Notice the oddity at the bottom of the third image. It must have been in the training data set.

I had planned to discuss the limitations of generative AI for text, but this is getting long, so I’ll call it quits for now.

Furious Furries

Jan 7, 2023 | Posted in language, poetry | Tagged ai, ChatGPT, Dall-E, français, French, Generative Images, Generative Text, haiku, language, nonsense, poème, poem, poetry, postmodern, sonnet

I’m working on more serious content, I swear. Meantime, ChatGPT has me distracted with furious furries.

Write a poem about furious furries in iambic pentameter.

Furious furries, with their teeth all bared, 
Their paws clenched tight with fury and with hate, 
They snarl and growl and pace, not yet aware 
Of the cold reason that might dissipate 
Their rage. They are a force to be reckoned with, 
These creatures of the wild, with primal might 
And strength. They are a force to fear and fear 
Again, a force that few would dare to fight.

But even furries, in their wildest rage, 
Are still just animals, with hearts that beat 
And minds that think and feel. And so, perhaps, 
We ought to try to understand the heat 
That drives their fury, and to seek to find 
A way to calm the storm that rages in their minds.

Write a haiku about furious furries

Furiously they pace, 
Fierce and wild, their eyes ablaze 
Furries in a rage.

Write a postmodern poem about furious furries

Furries, oh furries, 
Fierce and wild and in a rage, 
They pace and snarl and growl.

But wait, what are these furries? 
Are they real, or just a guise? 
A mask to hide behind?

Perhaps they are a metaphor, 
For something deeper, something true. 
A way to express the rage

That lies within us all, 
The primal, animalistic force 
That drives us to lash out.

Or maybe they are just furries, 
Fierce and wild and in a rage, 
Pacing and snarling and growling,

Unapologetic in their fury, 
Unwavering in their rage.
Furries, oh furries, 
A mystery wrapped in fur, 
A symbol of the fury

That lies within us all.

Ecrire un poème sur les furries furieux en conflit avec les fées françaises

Furieux furries, aux dents découvertes,
Leurs pattes serrées de colère et de haine,
Ils grondent et rugissent et arpentent, pas encore conscients
De la raison froide qui pourrait dissiper
Leur rage. Ils sont une force à ne pas sous-estimer,
Ces créatures de la nature, avec leur puissance primitive
Et leur force. Ils sont une force à craindre et à redouter
Encore une fois, une force que peu oseraient affronter.

Mais même les furries, dans leur rage la plus sauvage,
Ne sont encore que des animaux, avec des cœurs qui battent
Et des esprits qui pensent et ressentent. Et donc, peut-être,
Devrions-nous essayer de comprendre la chaleur
Qui alimente leur fureur, et chercher à trouver
Un moyen de calmer la tempête qui fait rage dans leurs esprits.

Féeriques françaises, de grâce et de beauté,
Leur magie scintillante et leurs ailes brillantes,
Elles se tiennent face aux furries en colère,
Un affrontement de deux mondes diamétralement opposés.
La rage contre la grâce, la sauvagerie contre la délicatesse,
Un conflit sans fin, une guerre sans merci.
Qui triomphera dans cette bataille éternelle?
Seul le temps nous le dira.

Write a sonnet about furious furries in conflict with French fairies

The furious furries snarl and growl with rage, 
Their paws clenched tight, their teeth bared and sharp. 
They pace and seethe, a primal force engaged 
In conflict with the French fairies, who hark 
Back to a time of grace and chivalry.

But even in their wildest, most ferocious state, 
The furries are still just animals, with hearts 
That beat and minds that think and feel. It's late 
To try to understand the heat that imparts 
Their fury, but perhaps it's not too late 
To find a way to calm the storm that rages 
In their minds, and bring some measure of peace 
To this eternal battle, this never-ending war.

The French fairies, with their delicate wings and charms, 
Stand against the furries, a clash of two worlds, 
Diametrically opposed. Who will emerge victorious? 
Only time will tell.

Forrest for Trees, a Midjourney to DALL-E

Oct 2, 2022 | Posted in art | Tagged art, artificial intelligence, Dall-E, Forrest Gump, metaphor, Midjourney, Tom Hanks

“My mom always said life was like a box of chocolates. You never know what you’re gonna get.”
Forrest Gump

The leading quote could not be more appropriate for my experience trying to render Forrest Gump in a forest. It may be me, but I want to blame the technology. I was trying to render a metaphorically appropriate image of missing the Forrest for the trees by literally placing Forrest Gump in the woods. Let’s just say your mileage may vary.

My first attempt was to prompt Midjourney with this string:

forrest gump standing in a savannah georgia forest cinema photorealistic high detail

I seem to have got [a] (possibly) Forrest Gump standing; [b] a Savannah forest; [c] (perhaps) Forrest Gump in a cinema; and [d] a larger-than-life Forrest Gump standing among the trees.

Let’s try something new to see where it goes:

tom hanks forrest gump standing in a savannah georgia tree forest cinema photorealistic high detail

Hmm. I certainly see the rendering engine picked up on the tree tag, but what became of Forrest and Tom? There seems to be a figure standing in the distance. Not exactly impressive. Let’s switch from Midjourney to DALL-E-2 and tweak the prompt:

tom hanks as forrest gump wearing a seersucker suit and standing in a savannah georgia tree forest cinematic hyper-realistic

Various DALL-E-2 renders of Forrest Gump in a Savannah, GA, forest

Note that these are in reverse chronological order, so the lower images were rendered first. Dall-E renders 4 images at a time, as does Midjourney. After the bottom four images, I added Tom Hanks’ name and the seersucker suit for obvious reasons.

The seersucker suit seemed to make its way into a render only occasionally. It is looking better, but I am not convinced that DALL-E knows about Tom Hanks. In the final four images (from the top left), I edited the fourth image on the second row and explicitly instructed Dall-E to insert Tom Hanks’ face without much luck.

I had one more idea. I could use the DALL-E render as a seed image for Midjourney. This is the last image at the top of the gallery strip at the top of this page. Certainly more of Tom Hanks’ likeness, but at the expense of the trees, save for the first in the quadrant that appears to contain only trees.

In the end, I’ll just say that I did not obtain a suitable render for use as a metaphor elsewhere, but I did get fodder for this post. I have to admit there’s a certain creep factor. I can easily imagine Michael Myers from the Halloween franchise—not to be confused with Mike Myers of Austin Powers and Shrek franchises—in place of Forrest.

DALL-E-2 is now in open beta, and you can generate up to 50 free images your first month and 15 free thereafter. It’s the easier of the two engines. Midjourney needs to be run as a Discord bot and seems to be aimed more at professionals, but you can still get 25 free images when you join. After 25 images, you’ll be prompted to subscribe.

What do you think? Have you tried these or another AI image generation engine? Let me know in the comments.
