AI, Round Two

 

In my first article, I reflected on the fascination and dangers of AI-generated images. Now I've started a small experiment: using AI to create better AI images. That may sound a bit strange at first, so here is how it works:


To generate an image with AI, you describe the image you want in a text known as the "prompt." On Midjourney, the platform where I enter my prompts, I wrote the following text (English prompts tend to give better results):

Photograph of old buses in a busstation with a woman sitting in one of the buses looking out of the window with waterdrops on the window. The woman has black hair and a face with a sad expression, industrial area, night, cold colors except inside the bus with the woman, the photograph should look like a painting of edward hopper but ultrarealistic. The picture is taken from outside the bus. In the picture you can see the whole bus and other buses in the background. drone shot on dji mavic 3.

Midjourney generated four images from this text:

The results were visually appealing, but they didn't match my intention. So I enlisted the help of AI and asked Microsoft's Copilot (many thanks to Jannik for the introduction) to formulate a more detailed, ten-sentence version of my text. The result was the following text:

The photograph is an overview of an old bus station taken from outside the bus. The station is located in an industrial area and the photograph was taken at night. The photograph has a cold color palette except for the inside of the bus where the woman is sitting. The woman has black hair and a sad expression on her face. She is sitting in one of the buses and looking out of the window. The window has water droplets on it, which adds to the melancholic mood of the photograph. The photograph is taken from a drone using a DJI Mavic 3. The photograph is ultra-realistic and looks like a painting by Edward Hopper. The photograph captures the whole bus and other buses in the background. The photograph is a stunning example of how a photograph can capture the mood of a place and the emotions of a person.

With this description, Midjourney generated new, different images:

I tried again:

The photograph is an overview of an old bus station taken from an aerial perspective. There are five busses in the busstation which is located in an industrial area and the photograph was taken at a rainy night. All buses are empty, except one bus in which a lady is sitting inside the bus. The photograph has a cold color palette except for the inside of the one bus where the woman is sitting and looking out of the window. The window has water droplets on it, which adds to the melancholic mood of the photograph. In this one bus, a warm light enlights the inside of the bus. The woman has black hair and a sad expression on her face. The photograph is taken from a drone using a DJI Mavic 3. The photograph is ultra-realistic andhas a style like an Edward Hopper painting.

Now the bus station came out better, but the scene was shown from so far away that the woman was no longer visible:

I gave up on the idea of an aerial photograph and wrote a new text together with Copilot:

The photograph is an ultra-photorealistic image captured with a Leica M camera and a 21mm lens. The photograph is a strong wide-angle shot that captures the entire bus station. In the foreground, you can see a bus that is illuminated from the inside. A young woman with black hair is sitting at a window in the middle of the bus, looking sadly out of the window. The photograph was taken during the blue hour, which gives the photograph a cool color palette. However, the interior of the bus where the woman is sitting is bathed in warm light, creating a stark contrast with the cool colors outside. In the background, you can see other buses parked in the bus station. They are all empty, and no other person can be seen in the whole picture except for the one woman sitting in the illuminated bus. It is raining, and you can see drops of water on the windows of the buses.

Here is the new result:

Still not what I saw in my mind's eye, but better. I liked the bottom-right image best and generated four variations of it in Midjourney:

I never quite managed to get what I had in mind. The image below at least comes close to the mood I was after:

Addendum

Midjourney has just been updated, and version 6.0 is now online. With this version, I managed for the first time to generate an image that is much closer to my text description. Most importantly, the system now correctly understands that the picture is taken from outside the bus. The technology is advancing rapidly, and in a few months we will probably see even more accurate results.
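
For anyone who wants to repeat this kind of experiment more systematically: looking back, each of my prompts varied only a few elements (camera and perspective, setting, subject, lighting, style). The following is a minimal Python sketch of keeping those elements as separate fields, so that one of them can be swapped while the rest stays fixed. It is not part of my actual workflow, and it does not talk to Midjourney. The class `ScenePrompt`, its field names, and the shortened wording of each field are my own assumptions for illustration. The resulting text would still be pasted into Midjourney by hand.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ScenePrompt:
    """Hypothetical container for the prompt elements varied in this article."""
    camera: str
    setting: str
    subject: str
    lighting: str
    style: str

    def to_text(self) -> str:
        # Join the non-empty parts into sentences, each ending in exactly one period.
        parts = [self.camera, self.setting, self.subject, self.lighting, self.style]
        return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())


# Shortened paraphrase of the aerial version of the prompt (an assumption, not the exact text).
base = ScenePrompt(
    camera="Photograph taken from a drone using a DJI Mavic 3",
    setting="An old bus station in an industrial area on a rainy night",
    subject="A woman with black hair and a sad expression sits in one bus, "
            "looking out of a window covered in water droplets",
    lighting="Cold color palette, except for warm light inside the bus with the woman",
    style="Ultra-realistic, in the style of an Edward Hopper painting",
)

# Change only the camera, as in the switch from the drone shot to the Leica shot.
ground_level = replace(
    base,
    camera="Wide-angle shot captured with a Leica M camera and a 21mm lens",
)

print(base.to_text())
print(ground_level.to_text())
```

Changing one field at a time makes it easier to see which part of the description caused a change in the generated images, which was hard to judge when I rewrote the whole text between attempts.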


 