AI Prompts for Generative Graphics
What is a prompt?
A prompt is a text query that a neural network uses to produce content we want. Image generation is one of the most viral ways of using generative neural networks. This is due to the relatively low barrier to entry, the development of relevant tools, and the impressive results that can be seen within minutes.
Many designers, sensing the potential opportunities (or threats), have tried generating their first neural art using networks like DALL·E 2 or MidJourney. However, many were disappointed because their images turned out less impressive than the examples they had seen online.
The root of the disappointment lies in the prompts the designers used to interact with the neural networks. For example, SayMulator (which uses the DALL·E 2 neural network for image generation) produces perhaps the most boring image of a cat I've ever encountered when given the prompt "cat".
In this article, I will try to explain what constitutes a high-quality prompt for a neural network to help you generate images that, at minimum, you won't be embarrassed to share on social media and, at most, can be used in your daily work. All images in this article are generated using SayMulator (DALL·E 2), but the described techniques and principles also apply to other generative neural networks.
Our first prompt
Let's return to our boring cat. Using this example, I will gradually extend our prompt and demonstrate how each addition affects the final image.
The main principle: the more context we provide to the neural network, the more predictable the result will be.
Let's start with describing an action. For example, I want my cat to drink coffee. I state it in the prompt: “cat drinks coffee”.
Add location
Describing the location helps the neural network understand the environment in which our object will be situated. This can influence both the background and how the object interacts with it. In my case, the prompt became "cat drinks coffee in forest", and the "forest" context placed the cat on a tree stump, a detail I hadn't asked for or even thought about.
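The steps so far amount to joining a subject, an action, and a location into one string. Here is a minimal Python sketch of that idea. The `build_prompt` helper and its parameter names are hypothetical and exist only for illustration; they are not part of SayMulator or DALL·E 2.

```python
def build_prompt(subject, action=None, location=None):
    """Join the non-empty parts into one space-separated prompt.

    Hypothetical helper for illustration only.
    """
    parts = [subject]
    if action:
        parts.append(action)
    if location:
        parts.append(f"in {location}")
    return " ".join(parts)


print(build_prompt("cat"))
print(build_prompt("cat", action="drinks coffee"))
print(build_prompt("cat", action="drinks coffee", location="forest"))
```

Keeping each part as a separate argument makes it easy to change one piece of context at a time and see what it does to the image.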
Choose your visual style
One of the most influential parameters on the final result is the visual style, as it helps the neural network understand how to "draw" your neural art. It's similar to the collaboration between an art director and a junior designer, where the former comes up with the idea and describes it based on their experience and background, while the latter directly implements it. The neural network understands a lot of visual styles:
- All artistic styles from ancient times to the present: Roman mosaic, Renaissance, Baroque, Art Nouveau, Cubism, Pop Art, etc.
- Cultural movements: Gothic, Cyberpunk, Memphis, Sci-fi, Glassmorphism, Skeuomorphism, Neumorphism, the 50s, 70s, 30s, etc.
- Analog illustration styles: Watercolor, Street Art, Stencil, Oil Painting, Pastels, Tattoo, Ukiyo-e, Layered Paper, Manuscript, etc.
- Digital illustration styles: Storybook, Digital Painting, Pixel Art, Vector Art, Sticker Art, Magazine Collage, Low Poly, etc.
- Animated films and movements: Adventure Time, Anime, Pixar, Studio Ghibli, Vintage Disney, The Simpsons, South Park, etc.
- 3D graphics: Octane 3D render, Houdini 3D render, ZBrush, Cinema 4D, Blender, etc.
- Artists/illustrators/photographers: Salvador Dali, Norman Rockwell, Terry Richardson, Keith Haring, Vincent van Gogh, etc.
I suggested the neural network imagine what "cat drinks coffee in forest, Studio Ghibli style" might look like.
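Because the style has such a strong effect, it can help to generate the same scene in several styles and compare the results. The sketch below builds one prompt per style. The function name `style_variations` and the "<prompt>, <style> style" pattern are assumptions modeled on the example prompt above, not a documented syntax of any particular network.

```python
def style_variations(base_prompt, styles):
    """Return one prompt per style, for a side-by-side comparison.

    Hypothetical helper: appends ", <style> style" to the base prompt.
    """
    return [f"{base_prompt}, {style} style" for style in styles]


# Styles picked from the lists above.
candidates = ["Studio Ghibli", "Pixel Art", "Ukiyo-e"]

for prompt in style_variations("cat drinks coffee in forest", candidates):
    print(prompt)
```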
Color & lighting
With clarifications in the text prompt, we can adjust the color scheme of the final image. This can be done in several ways:
- Color palettes: neon colors, pastel colors, dark, greens, colorful, black and white, etc.
- Outdoor lighting: golden hour, blue hour, midday, shadow & silhouette, etc.
- Studio lighting: warm lighting, cold lighting, pink and yellow lighting, studio lighting, flash photography, etc.
- Techniques and tools in digital and analog photography: Lomography, double exposure, infrared, Polaroid, etc.
I decided that calm colors don't suit my cat, so the prompt became "cat drinks coffee in forest, Studio Ghibli style, neon colors".
It’s already apparent that, from a design perspective, we have created something unique, as we added bright neon colors to the Studio Ghibli style, which is atypical for their animation.
Camera setup
The previous two parameters strongly influence the look of the image, while describing the camera angle and type lets us control its composition. The neural network understands, but doesn't always accurately reproduce, the following:
- proximity: extreme close-up shot, close-up shot, medium shot, long shot, extreme long shot, etc.
- angles: overhead view, aerial view, low angle, Dutch angle, over-the-shoulder shot, etc.
- lenses and focal lengths: fisheye, 50 mm, 25 mm, macro, etc.
Let's take a closer look at our cat — "cat drinks coffee in forest, Studio Ghibli style, neon colors, medium shot."
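Color and camera clarifications are added the same way: as comma-separated modifiers after the scene description. The following sketch shows the prompt growing step by step. The `add_modifiers` helper is hypothetical and does nothing beyond string concatenation.

```python
def add_modifiers(prompt, *modifiers):
    """Append non-empty modifiers to a prompt, separated by commas.

    Hypothetical helper for illustration only.
    """
    cleaned = [m.strip() for m in modifiers if m and m.strip()]
    return ", ".join([prompt] + cleaned)


base = "cat drinks coffee in forest, Studio Ghibli style"
step_color = add_modifiers(base, "neon colors")
step_camera = add_modifiers(step_color, "medium shot")

print(step_color)
print(step_camera)
```

Adding one modifier per step makes it easier to see which word caused which change in the image.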
More adjectives
The neural network understands the meanings of various adjectives and can interpret them visually. In everyday life, I had not noticed how much the image and meaning of an adjective change depending on the context, but the neural network lets you feel it. This is a subtle parameter that takes additional time to master.
To move even further away from the Studio Ghibli style, I decided to scare my cat, so the prompt became "cat drinks coffee in forest, Studio Ghibli style, neon colors, medium shot, frightful".
There are rumors that a certain set of adjectives makes any prompt's result more attractive. I can neither confirm nor refute this, but the rough list is as follows: highly detailed, 4k, 8k, high resolution, award-winning, cinematic…
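One way to check the rumor yourself is to generate the same prompt with and without these adjectives and compare the images. The sketch below prepares such a pair. The name `ab_pair` is hypothetical, and the code only builds the two strings; whether the "boosted" version actually looks better remains unverified.

```python
# Adjectives the article lists as rumored "quality boosters" (unverified).
BOOSTERS = ["highly detailed", "4k", "8k", "high resolution",
            "award-winning", "cinematic"]


def ab_pair(prompt, boosters=BOOSTERS):
    """Return (control, boosted) prompts for a side-by-side comparison.

    Boosters already present in the prompt are not added a second time.
    Hypothetical helper for illustration only.
    """
    existing = {part.strip().lower() for part in prompt.split(",")}
    extra = [b for b in boosters if b.lower() not in existing]
    boosted = ", ".join([prompt] + extra)
    return prompt, boosted


control, boosted = ab_pair("cat drinks coffee in forest, cinematic")
print(control)
print(boosted)
```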
Movie Magic
The neural network can also draw on popular culture, including movies and TV series. We can ask it to stylize our prompt or character based on a movie. The network will stylize the background, costumes, hairstyles, and many other parameters that are difficult to predict.
It seems that "cat, from Tron: Legacy (2010)" and "cat, from Mad Max: Fury Road (2015)" would look exactly like this.
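The two movie prompts follow the same "subject, from Title (Year)" pattern, which can be sketched as a small template. The `from_movie` helper is hypothetical. Including the release year mirrors the examples above, presumably to tell apart films that share a title.

```python
def from_movie(subject, title, year):
    """Format a prompt as '<subject>, from <title> (<year>)'.

    Hypothetical helper for illustration only.
    """
    return f"{subject}, from {title} ({year})"


print(from_movie("cat", "Tron: Legacy", 2010))
print(from_movie("cat", "Mad Max: Fury Road", 2015))
```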
Conclusion
Let's compare the results for the queries "cat" and "cat drinks coffee in forest, Studio Ghibli style, neon colors, medium shot, frightful".
Short prompts can be effective and helpful if you are conducting visual research on a particular topic or you are a very lucky person. However, if your goal is to generate an illustration or collage with specific requirements, a long prompt is the better choice because it allows you to find a common language with the neural network and use its potential to the fullest. To sum up, a high-quality prompt would typically include at least the object and action, location, visual style, color & lighting, and camera setup, but you can also add additional adjectives and other preferences.
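The checklist above can be captured as a small data structure that reports which recommended parts are still missing and renders the final prompt. This is a sketch under assumptions: the `PromptSpec` class, its field names, and the ordering of the parts are illustrative choices modeled on the example prompt in this article, not a format required by any neural network.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt parts discussed in the article."""
    subject: str
    action: Optional[str] = None
    location: Optional[str] = None
    style: Optional[str] = None
    color_lighting: Optional[str] = None
    camera: Optional[str] = None
    adjectives: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """List the recommended parts that have not been filled in yet."""
        recommended = ["action", "location", "style", "color_lighting", "camera"]
        return [name for name in recommended if not getattr(self, name)]

    def render(self) -> str:
        """Build the prompt: the scene first, then comma-separated modifiers."""
        scene = self.subject
        if self.action:
            scene += f" {self.action}"
        if self.location:
            scene += f" in {self.location}"
        modifiers = []
        if self.style:
            modifiers.append(f"{self.style} style")
        for extra in (self.color_lighting, self.camera):
            if extra:
                modifiers.append(extra)
        modifiers.extend(self.adjectives)
        return ", ".join([scene] + modifiers)


short = PromptSpec("cat")
full = PromptSpec(
    subject="cat",
    action="drinks coffee",
    location="forest",
    style="Studio Ghibli",
    color_lighting="neon colors",
    camera="medium shot",
    adjectives=["frightful"],
)

print(short.render(), "| missing:", short.missing())
print(full.render(), "| missing:", full.missing())
```

For the short prompt, `missing()` names every recommended part. For the full one it returns an empty list, and `render()` reproduces the final prompt used in this article.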