AI, second round

 

In my first article I reflected on the fascination and dangers of AI-generated images. Now I have started a small experiment: using AI to create better AI images. That sounds a bit strange at first. Here is how it works:


To generate an image with AI, you have to write an image description - the so-called "prompt". On Midjourney, the platform where I enter my prompts, I wrote the following text (English produces better results):

Photograph of old buses in a bus station with a woman sitting in one of the buses looking out of the window with waterdrops on the window. The woman has black hair and a face with a sad expression, industrial area, night, cold colors except inside the bus with the woman, the photograph should look like a painting of Edward Hopper but ultrarealistic. The picture is taken from outside the bus. In the picture you can see the whole bus and other buses in the background. drone shot on dji mavic 3.

Midjourney generated four images from this text:

The results are visually appealing, but they don't match my intention. So I called on AI for help and asked Microsoft's Copilot (many thanks to Jannik for the introduction) to formulate a more detailed, ten-sentence version of my text. This is what came out:

The photograph is an overview of an old bus station taken from outside the bus. The station is located in an industrial area and the photograph was taken at night. The photograph has a cold color palette except for the inside of the bus where the woman is sitting. The woman has black hair and a sad expression on her face. She is sitting in one of the buses and looking out of the window. The window has water droplets on it, which adds to the melancholic mood of the photograph. The photograph is taken from a drone using a DJI Mavic 3. The photograph is ultra-realistic and looks like a painting by Edward Hopper. The photograph captures the whole bus and other buses in the background. The photograph is a stunning example of how a photograph can capture the mood of a place and the emotions of a person.

With this description, Midjourney generated new, different images:

I tried again:

The photograph is an overview of an old bus station taken from an aerial perspective. There are five buses in the bus station which is located in an industrial area and the photograph was taken at a rainy night. All buses are empty, except one bus in which a lady is sitting inside the bus. The photograph has a cold color palette except for the inside of the one bus where the woman is sitting and looking out of the window. The window has water droplets on it, which adds to the melancholic mood of the photograph. In this one bus, a warm light illuminates the inside of the bus. The woman has black hair and a sad expression on her face. The photograph is taken from a drone using a DJI Mavic 3. The photograph is ultra-realistic and has a style like an Edward Hopper painting.

Now the bus station came out better, but the scene was shown from so far away that the woman was no longer visible:

I gave up the idea of an aerial photograph and formulated a new text together with Copilot:

The photograph is an ultra-photorealistic image captured with a Leica M camera and a 21mm lens. The photograph is a strong wide-angle shot that captures the entire bus station. In the foreground, you can see a bus that is illuminated from the inside. A young woman with black hair is sitting at a window in the middle of the bus, looking sadly out of the window. The photograph was taken during the blue hour, which gives the photograph a cool color palette. However, the interior of the bus where the woman is sitting is bathed in warm light, creating a stark contrast with the cool colors outside. In the background, you can see other buses parked in the bus station. They are all empty, and no other person can be seen in the whole picture except for the one woman sitting in the illuminated bus. It is raining, and you can see drops of water on the windows of the buses.

Here is the new result:

Still not what I had in my mind's eye, but better. I liked the image at the bottom right best and had Midjourney generate four variations of it:

I didn't really succeed in realizing what I had pictured. The picture below at least somewhat captures the mood I had in mind:

Addendum

Midjourney has just been updated, and version 6.0 is now online. With this version I managed for the first time to generate an image that is much closer to my text description. Above all, the system has now correctly understood that the picture is taken from outside the bus. The technology is advancing rapidly, and in a few months we will probably see even more accurate results.


 