Cinematography with Midjourney, Part 1: Anatomy Of A Cinematic Prompt (AI Image Generation)
Creating Cinematic AI Images with Midjourney
This is the first part of a 3-part series on cinematic prompts. Here’s what we’ll cover in this post:
- Anatomy of AI prompts for film stills
- Referencing cinematic style by era
- Referencing cinematic style by genre
- Referencing cinematic style by combinations of styles
- (Indirect) referencing of directors and cinematographers
Anatomy of a cinematographic image prompt
Cinematographic style refers to the visual techniques used by a cinematographer or director to create a particular look or aesthetic in a film. This includes visual elements such as camera angle, lens choice, lighting, color grading, and composition.
Film Stills
As is so often the case when creating images using AI, there is more than one way to do this. In this post we will focus on Midjourney and a basic prompt that helps us to explore how we can control certain aspects of rendering in a cinematographic style. Since words at the beginning of a Midjourney prompt seem to have more “weight” than those at the end, I usually start a cinematographic prompt with the prefix “film still”. The anatomy is like this:
/imagine prompt: film still, [scene description], [style description] --ar 3:2 [options]
(Note: the “--ar 3:2” part sets the aspect ratio to a more cinematic one. Unfortunately, Midjourney version 4 does not yet allow the use of 16:9 or 21:9.)
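If you build many prompts of this shape, it can help to assemble them programmatically. The following is only a minimal sketch of the anatomy above: the function name build_prompt, its parameters, and the example scene and style are my own assumptions for illustration and are not part of Midjourney. It produces a text string that you would still paste into Midjourney yourself.

```python
def build_prompt(scene, style, prefix="film still", aspect_ratio="3:2", options=""):
    """Assemble a prompt string following the anatomy:
    /imagine prompt: [prefix], [scene], [style] --ar [ratio] [options]

    build_prompt is a hypothetical helper, not a Midjourney API.
    """
    # Join the descriptive parts with commas; empty parts are skipped,
    # so passing prefix="" drops the "film still" prefix entirely.
    parts = [p.strip() for p in (prefix, scene, style) if p and p.strip()]
    prompt = ", ".join(parts)

    # Parameters such as --ar are appended after the descriptive text.
    prompt += f" --ar {aspect_ratio}"
    if options.strip():
        prompt += f" {options.strip()}"

    return f"/imagine prompt: {prompt}"


# Example with a made-up scene and style description.
print(build_prompt("a detective waiting in a rainy alley", "1940s film noir"))
```

Because the prefix comes first in the joined string, it keeps its position at the start of the prompt, where words seem to carry the most weight.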
The prefix “film still” instructs Midjourney to work towards a cinematic image composition right from the beginning without the need to explicitly describe camera angles, how people are arranged in the frame, depth of field, lighting, etc.
Of course, you could also drop the “film still” prefix altogether and still create great cinematic images. In fact, doing so would give you more freedom in terms of scene composition, but you would then have to state explicitly how you want Midjourney to establish a cinematic look.
A third option I see quite often is “footage from XY”, e.g. “footage from a 1973 science fiction film”. This can yield amazing results if you know exactly what…