Many of my readers know that I use AI often. I have been using it to create content for an in-depth book review for The Blind Owl. For those less aware of the foibles of generative AI, I share some insights—or low-lights. For this, I used Midjourney v6.1.

Prompt: a young woman gives a flower to an old man, who is crouched under a large cypress tree by a river
I issued this prompt, and as per usual, it rendered four options. Notice that in some instances, the tree is not a key element.
Given enough time, one can slowly improve to obtain the desired result.

Here, an old man indeed crouches under a prominent cypress tree and by a river. A young woman hands him some flowers—though not so much blue morning glories. On balance, I like this output, but it still needs work.
Some other problems:
- The man is looking away—neither at her nor her flowers.
- Her (right) eye is deformed.
- Her left hand is deformed.
- I didn’t ask for jewellery—an earring.
At least I can in-paint out these imperfections—perhaps.
Here’s another render using the same image prompt.

Notice that it ignored the man altogether. My point is that for every awesome image you see, there may have been hundreds of iterations to get there. There are ways to get persistent characters and scenes, but this takes a bit of up-front effort and iteration that one can then leverage going forward.
On the topic of Midjourney model 6.0 versus 6.1, I share this comparison—front-facing faces for a character sheet for this old man. Here, I prefer the earlier model as displayed in the top row.

In some cases, there are minor improvements over v6.0. In other cases, they stepped back. v6.1 renders less realistic human images, making them look more computer-generated and less natural. It also over-applies sexual stereotypes, traditional beauty archetypes, smoother skin, and so on. But that’s not the main topic for today.
DISCLAIMER: This post has little to do with philosophy, but it ties into a philosophical novella.
Great post!!
I’ve been experimenting with AI art generators to help express what it feels like to live with chronic migraine (I occasionally post the results on my blog, otherwise on Insta).
Like you, I find there are a huge number of ‘failures’ before I get close to something ‘decent’, and, as you note, the stereotypes are relentless. I worry sometimes that by running the request I’m somehow accidentally feeding the machine and making things worse… as a PhD student, there is a philosophical dimension to every image that is produced.
Linda 🙂
Thanks for commenting. You make some strong points. Firstly, if/when AI becomes a major contributor to content generation, the originality will tend to regress to zero.
There are switches and parameters to reduce stereotypes. In Midjourney, I use --style raw to disable their style. As I mentioned, for human characters, I use --v 6 (instead of the default --v 6.1), because the new version renders characters that look “better”, but imperfections are what make humans human.
As for feeding the machine, I only wish. One session doesn’t even know what the next session is doing, so the learning only occurs in the training. Unless you are training your own models, the feedback loop is nonexistent.
I know my experiments aren’t enabling learning, but if we’re all plugging data in around the world, perhaps all of us collectively are improving the system… although, thinking about the state of the world right now, and the amount of conflict… I guess that’s a tad delusional…