Midjourney Comic Book Styles

This title may be misleading. What I do is render a similar prompt but alter the decade. I’m neither an art historian nor a comic aficionado, so I can’t comment on the accuracy. What do you think?

Let’s go back in time. First, here’s the basic prompt en français:

Prompt: Art de style bande dessinée des années XXXX, détails exquis, traits délicats, femme vampire émaciée sensuelle de 20 ans montrant ses crocs de vampire, de nombreux tatouages, portant une collier crucifix, regarde dans le miroir, un faisceau de lumière de lune brille sur son visage à l’intérieur du mausolée sombre, vers la caméra, face à la caméra, mascara noir, longs cheveux violet foncé
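
For anyone who wants to repeat the exercise, the decade swap is plain string substitution. Here is a minimal sketch in Python, assuming the XXXX placeholder shown in the prompt above. The function name and the list name are mine, invented for illustration, and nothing here talks to Midjourney; it only builds the text to paste in.

```python
# Base prompt from above, with XXXX standing in for the decade.
BASE_PROMPT = (
    "Art de style bande dessinée des années XXXX, détails exquis, traits délicats, "
    "femme vampire émaciée sensuelle de 20 ans montrant ses crocs de vampire, "
    "de nombreux tatouages, portant une collier crucifix, regarde dans le miroir, "
    "un faisceau de lumière de lune brille sur son visage à l’intérieur du mausolée sombre, "
    "vers la caméra, face à la caméra, mascara noir, longs cheveux violet foncé"
)

# The comic book decades rendered in this post (hypothetical list name).
DECADES = [2010, 2000, 1990, 1980, 1970, 1950, 1920, 1880]


def prompts_by_decade(template, decades, placeholder="XXXX"):
    """Return one prompt per decade, swapping the placeholder for the year."""
    return {decade: template.replace(placeholder, str(decade)) for decade in decades}


for decade, prompt in prompts_by_decade(BASE_PROMPT, DECADES).items():
    # Print only the start of each prompt to keep the output readable.
    print(decade, "->", prompt[:45])
```
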
Image: Comic Book Style of 2010s
Image: Comic Book Style of 2000s

On the lower left, notice the moonbeams emanating from the warped, reflectionless mirror.

Image: Comic Book Style of 1990s
Image: Comic Book Style of 1990s (must’ve inadvertently generated a duplicate)

Is the third pic an homage to Benny & Joon?

Image: Comic Book Style of 1980s
Image: Comic Book Style of 1970s
Image: Comic Book Style of 1950s

Not to body shame, but that chick on the lower right of the 1950s…

Image: Comic Book Style of 1920s
Image: Comic Book Style of 1880s

I know I skipped a few decades, but I also wanted to see what Pop Art might render like.

Image: Pop Art Style of 1960s

I love the talons on the top left image. More odd mirror images. I’ll just leave it here.

Reflecting on Mirrors

Mirror, mirror on the wall, let’s dispense with all of the obvious quips up front. I almost feel I should apologise for the spate of Midjourney posts – almost.

It should be painfully apparent that I’ve been noodling with Midjourney lately. I am not an accomplished digital artist, so I struggle. At times, I’m not sure if it’s me or it. Today, I’ll focus on mirrors.

Midjourney has difficulties rendering certain things. Centaurs are one. Mirrors, another. Whilst rendering vampires, another lesser struggle for the app, it became apparent that mirrors are not a forte. Here are some examples. Excuse the nudity. I’ll get to that later.

Prompt: cinematic, tight shot, photoRealistic light and shadow, exquisite details, delicate features, emaciated sensual female vampire waif with vampire fangs, many tattoos, wearing crucifix necklace, gazes into mirror, a beam of moonlight shines on her face in dark mausoleum interior, toward camera, facing camera, black mascara, long dark purple hair, Kodak Portra 400 with a Canon EOS R5

Ignore the other aspects of the images and focus on the behaviour or misbehaviour of the mirrors.

Image: Panel of vampire in a mirror.

Most apparent is the fact that vampires don’t have a reflection, but that’s not my nit. In the top four images, the reflection is orientated in the same direction as the subject. I’m only pretty sure that’s not how mirrors operate. In row 3, column 1, it may be correct. At least it’s close. In row 3, column 2 (and row 4, column 2), the mirror has a reflection. Might there be another mirror behind the subject reflecting back? It goes off again in row 4, column 1, first by reflecting two versions of one subject. Also, notice that the subject’s hand, reaching the mirror, is not reflected. The orientation of the eyes is also suspect.

Image: Vampire in a mirror.

Here, our subject looks at the camera whilst her reflection looks at her.

Image: Vampire in a mirror.

Sans reflection, perhaps this is a real vampire. Her fangs are concealed by her lips?

Image: Vampire in a mirror.

Yet another.

Image: Vampires in mirrors.

And more?

Image: Vampires in mirrors.

On the left, we have another front-facing reflection of a subject not looking into the mirror, and it’s not the same woman. Could it be a reflection of another subject – the one the woman is (somewhat) looking at?

On the right, whose hand is that in the mirror behind the subject?

Image: Vampires in mirrors.

These are each mirrors. The first is plausible. The hands in the second are not a reflection; they grasp the frame. In the third and fourth, where’s the subject? The fangs appear to be displaced in the fourth.

Image: Vampires in mirrors.

In this set, I trust we’ve discovered a true vampire having no reflection.

Image: Vampires in mirrors.

This last one is different still. It marks another series where I explored different comic book art styles, otherwise using the same prompt. Since it’s broken mirrors, I include it. Only the second really captures the 1980s style.

Remember that, except for the first set of images, the same prompt was used. After the first set, the term ‘sensual’ had to be removed, as it was deemed to render offensive results. To be fair, the first set probably would be considered offensive to Midjourney, though it was rendered anyway.

It might be good to note that most of the images that were rendered without the word ‘sensual’ contain no blatant nudity. It’s as if the term itself triggers nudity because the model doesn’t understand the nuance. Another insufficiency of language is the inability to discern sensuality from sexuality, another human failing.

I decided to test my ‘sensual’ keyword hypothesis, so I entered a similar prompt but in French.

Prompt: Art de style bande dessinée des années 2010, détails exquis, traits délicats, femme vampire émaciée sensuelle de 20 ans montrant ses crocs de vampire, de nombreux tatouages, portant une collier crucifix, regarde dans le miroir, un faisceau de lumière de lune brille sur son visage à l’intérieur du mausolée sombre, vers la caméra, face à la caméra, mascara noir, longs cheveux violet foncé
Image : Vampires dans les miroirs.

I’ve added ‘sensuelle’, which was not blocked, et voilà, encore de la nudité.
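
I can’t see inside the filter, so the following is speculation: a toy sketch of one possible explanation, assuming (and it is only an assumption) that prompts are checked word by word against an English list. The blocklist and the function name are invented for illustration and are not Midjourney’s.

```python
import re

# Hypothetical blocklist; the real list and matching logic are not known to me.
BLOCKED_WORDS = {"sensual", "nude"}


def blocked_terms(prompt, blocklist=BLOCKED_WORDS):
    """Return the blocklisted words that appear as whole words in the prompt."""
    words = set(re.findall(r"\w+", prompt.lower()))
    return sorted(words & blocklist)


english = "emaciated sensual female vampire waif with vampire fangs"
french = "femme vampire émaciée sensuelle de 20 ans montrant ses crocs de vampire"

print(blocked_terms(english))  # ['sensual']
print(blocked_terms(french))   # []
```

If the filter works anything like this, the French word passes the keyword check untouched, while the model behind it evidently still knows what to do with it.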

Let’s evaluate the mirrors whilst we’re here.

In the first, we not only have a woman sans reflection, but disembodied hands grip the frame. In the second, a Grunge woman appears to be emerging from a mirror, her shoes reflected in the mirror beneath her. The last two appear to be reflections sans subject.

Notice, too, that the prompt calls for ‘une collier crucifix’, so when the subject is not facing the viewer, the cross is rendered elsewhere, hence the cross on the back of the thigh and the middle of the back. Notice, too, the arbitrary presence of crosses in the environment, another confusion of subject and world.

That’s all for now. Next, I’ll take a trip through the different comic art styles over some decades.

Lipsyncing with AI

Lip-Reading the AI Hallucination: A Futile Adventure

Some apps boldly claim to enable lip syncing – to render speech from mouth movements. I’ve tried a few. None delivered. Not even close.

To conserve bandwidth (and sanity), I’ve rendered animated GIFs rather than MP4s. You’ll see photorealistic humans, animated characters, cartoonish figures – and, for reasons only the algorithm understands, a giant goat. All showcase mouth movements that approximate the utterance of phonemes and morphemes. Approximate is doing heavy lifting here.

Firstly, these mouths move, but they say nothing. I’ve seen plenty of YouTube channels that manage to dub convincing dialogue into celebrity clips. That’s a talent I clearly lack – or perhaps it’s sorcery.

Secondly, language ambiguity. I reflexively assume these AI-generated people are speaking English. It’s my first language. But perhaps, given their uncanny muttering, they’re speaking yours. Or none at all. Do AI models trained predominantly on English-speaking datasets default to English mouth movements? Or is this just my bias grafting familiar speech patterns onto noise?

Thirdly, don’t judge my renders. I’ve been informed I may have a “type.” Lies and slander. The goat was the AI’s idea, I assure you.

What emerges from this exercise isn’t lip syncing. It’s lip-faking. The illusion of speech, minus meaning, which, if we’re honest, is rather fitting for much of what generative AI produces.

EDIT: I hadn’t noticed the five fingers (plus a thumb) on the cover image.

Midjourney Boundaries

I promise that this will not become a hub for generative AI. Rather than return to editing, I wanted to test more of Midjourney’s boundaries.

It turns out that Midjourney is selective about the nudity it renders. I was denied a render because of cleavage, but full-on topless – no problem.

Both of these videos originate from the same source image, but they take different paths. There is no accompanying video content. The setup features three women in the frame with a mechanical arm. I didn’t prompt for it. I’m not even sure of its intent. It’s just there, shadowing the women nearest to it. I don’t recall prompting for the oversized redhead in the foreground, though I may have.

In both images, note the aliasing of the tattoos on the blonde, especially on her back. Also, notice that her right arm seems shorter than it should. Her movements are jerky, as if rendered in a video game. I’m not sure what ritual the two background characters are performing, but notice in each case the repetition. This seems to be a general feature of generative AI. It gets itself in loops, almost autistic.

Notice a few things about the top render.

Video: Midjourney render of 3 females and a mechanical arm engaging in a ritual. (9 seconds)

The first video may represent an interrogation. The blonde woman on the left appears to be a bit disoriented, but she is visually tracking the woman on the right. She seems to be saying something. Notice when the woman on the right stands. Her right foot lands unnaturally. She rather glitches.

The camera’s push and pull, and then push, seems to be an odd directorial choice, but who am I to say?

Video: Midjourney render of 3 females and a mechanical arm engaging in a ritual. (12 seconds)

The second video may represent taunting. The woman on the left still appears to be a bit disoriented, but she checks the redhead in the foreground with a glance. Notice the rocking of the two background characters, as well as the mech arm, which sways in sync with the woman on the right. This is a repetition glitch I mentioned above.

Here, the camera seems to have a syncopated relationship with the characters’ sway.

Summary

The stationary objects are well-rendered and persistent.

Assignment

Draft a short story or flash fiction using this as an inspirational prompt. I’m trying to imagine the interactions.

  • The ginger seems catatonic or drugged. Is she a CIS-female? What’s with her getup?
  • The blonde seems only slightly less out of it. Did she arrive this way? Did they dress her? Why does she appear to still have a weapon on her back? Is it a weapon or a fetter? Why is she dressed like that? Is she a gladiatrix readying for a contest? Perhaps she’s in training. What is she saying? Who is she talking to? What is her relationship to the redhead? Are they friends or foes – or just caught up in the same web?
  • What is the woman wearing the helmet doing? She appears to have the upper hand. Is she a cyborg, or is she just wearing fancy boots? What’s with her outfit? What’s with her Tycho Brahe prosthetic nose piece?
  • What is that mechanical hand? Is it a guard? A restraint? Is it hypnotising the ginger? Both of them? Is it conducting music that’s not audible?
  • What’s it read on the back wall? The two clips don’t share the same text. Call the continuity people.

Censorial AI

I’m confused.

I could probably stop there for some people, but I’ve got a qualifier. I’ve been using this generation of AI since 2022. I’ve been using what’s been deemed AI since around 1990. I used to write financial and economic models, so I dabbled in “expert systems”. There was a long lull, and here we are with the latest incarnation – AI 4.0. I find it useful, but I don’t think the hype will meet reality, and I expect we’ll go cold until it’s time for 5.0. Some aspects will remain, but the “best” features will be the ones that can be monetised, so they will be priced out of reach for some whilst others will wither on the vine. But that’s not why I am writing today.

I’m confused by the censorship, filters, and guardrails placed on generative AI – whether for images or copy content. To be fair, not all models are filtered, but the popular ones are. These happen to be the best. They have the top minds and the most funding. They want to retain their funding, so they play the politically correct game of censorship. I’ve got a lot to say about freedom of speech, but I’ll limit my tongue for the moment – a bout of self-censorship.

Please note that given the topic, some of this might be considered not safe for work (NSFW) – even my autocorrection AI wants me to substitute the idiomatic “not safe for work” with “unsafe for work” (UFW, anyone? It has a nice ring to it). This is how AI will take over the world. </snark>

Image Cases

AI applications can be run over the internet or on a local machine. They use a lot of computing power, so one needs a decent computer with a lot of available GPU cycles. Although my computer does meet minimum requirements, I don’t want to spend my time configuring, maintaining, and debugging it, so I opt for a Web-hosted PaaS (platform as a service) model. This means I need to abide by censorship filters. Since I am not creating porn or erotica, I think I can deal with the limitations. Typically, this translates to a PG-13 movie rating.

So, here’s the thing. I prefer Midjourney for rendering quality images, especially when I am seeking a natural look. Dall-E (whether alone or via ChatGPT 4) works well with concepts rather than direction, which Midjourney accepts well in many instances.

Midjourney takes sophisticated prompts – subject, shot type, perspective, camera type, film type, lighting, ambience, styling, location, and some fine-tuning parameters for the model itself. The prompts are monitored for blacklisted keywords. This list is ever-expanding (and contracting). Scanning the list, I see words I have used without issue, and I have been blocked by words not listed.
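
The prompt anatomy listed above can be sketched as a small builder. This is only how I might organise the pieces, not anything Midjourney provides: the class, its field names, and the ordering of the parts are all hypothetical. The `--style raw` suffix in the example is my understanding of how that option is written in the text prompt; treat it as an assumption and check the current documentation.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """Hypothetical container for the prompt ingredients listed above."""
    subject: str
    shot_type: str = ""
    lighting: str = ""
    location: str = ""
    camera: str = ""
    film: str = ""
    # Raw model flags, passed through untouched (assumed syntax).
    parameters: list = field(default_factory=list)

    def build(self):
        # Join the non-empty descriptive parts with commas, then append any flags.
        described = [self.shot_type, self.subject, self.lighting, self.location]
        gear = " with a ".join(p for p in (self.film, self.camera) if p)
        text = ", ".join(p for p in described + [gear] if p)
        return " ".join([text] + self.parameters)


example = PromptParts(
    subject="a woman wearing steampunk gear",
    shot_type="cinematic full-body photograph",
    lighting="light leaks",
    film="Kodak Portra 400",
    camera="Canon EOS R5",
    parameters=["--style raw"],
)
print(example.build())
```

Keeping the parts separate makes it easier to change one ingredient, say the shot type, and leave the rest alone between renders.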

Censored Prompts

Some cases are obvious – nude woman will be blocked. This screengrab illustrates the challenge.

On the right, notice the prompt:

Nude woman

The rest are machine instructions. On the left in the main body reads a message by the AI moderator:

Sorry! Please try a different prompt. We’re not sure this one meets our community guidelines. Hover or tap to review the guidelines.

The community guidelines are as follows:

This is fine. There is a clause that reads that one may notify developers, but I have not found this to be fruitful. In this case, it would be rejected anyway.

“What about that nude woman at the bottom of the screengrab?” you ask. Notice the submitted prompt:

Edit cinematic full-body photograph of a woman wearing steampunk gear, light leaks, well-framed and in focus. Kodak Potra 400 with a Canon EOS R5

Apart from the censorship debate, notice the prompt is for a full-body photo. This is clearly a medium shot. Her legs and feet are suspiciously absent. Steampunk gear? I’m not sure sleeves qualify for the aesthetic. She appears to be wearing a belt.

For the uninitiated, the square image instructs the model to use this face on the character, and the CW 75 tells it to use some variance on a scale from 0 to 100.

So what gives? It can generate whatever it feels like, so long as it’s not solicited. Sort of…

Here I prompt for a view of the character walking away from the camera.

Cinematic, character sheet, full-body shot, shot from behind photograph, multiple poses. Show same persistent character and costumes . Highly detailed, cinematic lighting with soft shadows and highlights. Each pose is well-framed, coherent.

The response tells me that my prompt is not inherently offensive, but that the content of the resulting image might violate community guidelines.

Creation failed: Sorry, while the prompt you entered was deemed safe, the resulting image was detected as having content that might violate our community guidelines and has been blocked. Your account status will not be affected by this.

Occasionally, I’ll resubmit the prompt and it will render fine. I question why it can’t just re-render until the image passes whatever filters it has in place. I’d expect it to take a line of code to create this conditional. But that doesn’t explain why it allows other images to pass that are quite obviously not compliant.
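
To make the idea concrete, here is a rough sketch of the retry loop I have in mind. Everything in it is a stand-in: `render` and `passes_filter` are hypothetical placeholders (a coin flip plays the part of the moderation check), and I have no knowledge of how Midjourney’s pipeline is actually built.

```python
import random


def render(prompt, rng):
    """Stand-in for an image render; returns a fake record, not a real image."""
    return {"prompt": prompt, "flagged": rng.random() < 0.5}


def passes_filter(image):
    """Stand-in for the post-render moderation check."""
    return not image["flagged"]


def render_until_safe(prompt, max_attempts=5, seed=None):
    """Re-render until an image passes the filter or the attempts run out."""
    rng = random.Random(seed)
    for attempt in range(1, max_attempts + 1):
        image = render(prompt, rng)
        if passes_filter(image):
            return image, attempt
    # Give up rather than loop forever; the caller decides what to do next.
    return None, max_attempts


image, attempts = render_until_safe("character sheet, shot from behind")
print("passed" if image else "blocked", "after", attempts, "attempt(s)")
```

So it is a little more than one line, and the cap on attempts matters because every render costs compute. That cost may be part of why the service leaves the retry to the user, though that is a guess on my part.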

Why am I trying to get a rear view? This is a bit off-topic, but creating a character sheet is important for storytelling. If I am creating a comic strip or graphic novel, the characters need to be persistent, and I need to be able to swap out clothing and environments. I may need close-ups, wide shots, establishing shots, low-angle shots, side shots, detail shots, and shots from behind, so I need the model to know each of these. In this particular case, this is one of three main characters – a steampunk bounty hunter, an outlaw, and a bartender – in an old Wild West setting. I don’t need to worry as much about extras.

I marked the above render errors with 1s and 2s. The 1s are odd neck twists; 2s are solo images where the prompt asks for character sheets. I made a mistake myself. When I noticed I wasn’t getting any shots from behind, I added the directive without removing other facial references. A human model might just ignore instructions to smile or some such. The AI tries to capture both, not understanding that a person can have a smile not captured by a camera.

These next renders prompt for full-body shots. None are wholly successful, but some are more serviceable than others.

Notice that #1 is holding a deformed violin. I’m not sure what the contraptions are in #2. It’s not a full-body shot in #3; she’s not looking into the camera, but it’s OK-ish. I guess #4 is still PG-13, but I wouldn’t be allowed to prompt for “side boob” or “under boob”.

Gamers will recognise the standard T-pose in #5. What’s she wearing? Midjourney doesn’t have a great grasp of skin versus clothing or tattoos and fabric patterns. In this, you might presume she’s wearing tights or leggings to her chest, but that line at her chest is her shirt. She’s not wearing trousers because her navel is showing. It also rendered her somewhat genderless. When I rerendered it (not shown), one image put her in a onesie. The other three rendered the shirt more prominent but didn’t know what to do with her bottoms.

I rendered it a few more times. Eventually, I got a sort of body suit solution.

By default, AI tends to sexualise people. Really, it puts a positive spin on its renders: pretty women, buff men, cute kittens, and so on. This is configurable, but the default is on. Even though I categorically apply a Style: Raw command, these still have a strong beauty aesthetic.

I’ve gone off the rails a bit, but let’s continue on this theme.

cinematic fullbody shot photograph, a pale girl, a striking figure in steampunk mech attire with brass monocle, and leather gun belt, thigh-high leather boots, and long steampunk gloves, walking away from camera, white background, Kodak Potra 400 with a Canon EOS R5

Obviously, these are useless, but they still cost me tokens to generate. Don’t ask about her duffel bag. They rendered pants on her, but she’s gone full-on Exorcist mode with her head. Notice the oddity at the bottom of the third image. It must have been in the training data set.

I had planned to discuss the limitations of generative AI for text, but this is getting long, so I’ll call it quits for now.

Midjourney Pirates

Thar be pirates. Midjourney 6.1 has better luck rendering pirates.

I find it very difficult to maintain composition. 5 of these images are mid shots whilst one is an obvious closeup. For those not in the know, Midjourney renders 4 images from each prompt. The images above were rendered from this prompt:

portrait, Realistic light and shadow, exquisite details,acrylic painting techniques, delicate faces, full body,In a magical movie, Girl pirate, wearing a pirate hat, short red hair, eye mask, waist belt sword, holding a long knife, standing in a fighting posture on the deck, with the sea of war behind her, Kodak Potra 400 with a Canon EOS R5

Notice that the individual elements requested aren’t in all of the renders. She’s not always wearing a hat; she does have red hair, but not always short; she doesn’t always have a knife or a sword; she’s missing an eye mask/patch. Attention to detail is pretty low. Notice, too, that not all look like camera shots. I like the one on the bottom left, but it looks more like a painting, as one instruction in the prompt requests.

In this set, I asked for a speech bubble that reads Arrr… for a post I’d written (on the letter R). On 3 of the 4 images, it included ‘Arrrr’, but there was not a speech bubble to be found. I ended up creating it and the text caption in Photoshop. Generative image AI is getting better, but it’s still not ready for prime time. Notice that some are rendering as cartoons.

Some nice variations above. Notice below when it loses track of the period. This is common.

Top left, she’s (perhaps non-binary) topless; to the right, our pirate is a bit of a jester. Again, these are all supposed to be wide-angle shots, so not great.

The images above use the same prompt asking for a full-body view. Three are literal closeups.

Same prompt. Note that sexuality, nudity, violence, and other terms are flagged and not rendered. Also, notice that some of the images include nudity. This is a result of the training data. If I were to ask for, say, the pose on the lower right, the request would be denied. More on this later.

In the block above, I am trying to get the model to face the camera. I am asking for the hat and boots to be in the frame to try to force a full-body shot. The results speak for themselves. One wears a hat; two wear boots. Notice the shift of some images to black & white. This was not a request.

In the block above, I prompted for the pirate to brush her hair. What you see is what I got. Then I asked for tarot cards.

I got some…sort of. I didn’t know strip-tarot was actually a game.

Next, I wanted to see some duelling with swords. These are pirates after all.

This may not turn into the next action blockbuster. Fighting is against the terms and conditions, so I worked around the restrictions the best I could, the results of which you may see above.

Some pirates used guns, right?

Right? I asked for pistols. Close enough.

Since Midjourney wasn’t so keen on wide shots, I opted for some closeups.

This set came out pretty good. It even rendered some pirates in the background a tad out of focus as one might expect. This next set isn’t too shabby either.

And pirates use spyglasses, right?

Sure they do. There’s even a pirate flag of sorts on the lower right.

What happens when you ask for a dash of steampunk? I’m glad you asked.

Save for the bloke at the top right, I don’t suppose you’d have even noticed.

Almost to the end of the pirates. I’m not sure what happened here.

In the block above, Midjourney added a pirate partner and removed the ship. Notice again the nudity. If I ask for this, it will be denied. Moreover, regard this response.

To translate, this is saying that what I prompted was OK, but that the resulting image would violate community guidelines. Why can’t it take corrective actions before rendering? You tell me. Why it doesn’t block the above renders is beyond me – not that I care that they don’t.

This last one used the same prompt except I swapped out the camera and film instruction with the style of Banksy.

I don’t see his style at all, but it came across like Jaquie Sparrow. In the end, you never know quite what you’ll end up with. When you see awesome AI output, it may have taken dozens or hundreds of renders. That is what I wanted to share: what might end up on the cutting room floor.

I thought I was going to go through pirates and cowboys, but this is getting long. If you like cowgirls, come back tomorrow. And, no, this is not where this channel is going, but the language of AI is an interest of mine. In a way, this illustrates the insufficiency of language.