Basics of Prompt Engineering

Prompt engineering is the process of structuring words so that a text-to-image model can interpret and understand them. Think of it as the language you need to speak in order to tell an AI model what to draw. It is a great way to push the limits of text-to-image models: a good prompt can take an image from good to great.

The secret for generating good images:

  • A well-written prompt consisting of modifiers and a good sentence structure.
  • Well-adjusted Stable Diffusion parameters. You can always use the defaults, but fine-tuned parameters can sometimes generate better results.

Definitions

Here is an example of a prompt and the parameters used with it:

Prompt: Funko pop superman figurine, made of plastic, product studio shot, on a white background, diffused lighting, centered.
Model: Elldreth’s Vivid
Seed: -1
CFG Scale: 4
Steps: 30

Prompts are the words or tags that the AI works from when it generates an image. More or less anything mentioned in the prompt is something the AI will attempt to generate, drawing on what it learned during training rather than on a stored database of images. This is why a detailed prompt is important.

Negative prompts are what the AI will intentionally avoid. In general, the more you tell the AI not to do, the more it will cut out of the result.

Sampler is the algorithm the model uses to produce the image.

CFG Scale, or ‘Classifier Free Guidance Scale’, is how strongly the image should conform to the text. Lower values produce more creative but less accurate results. Higher values follow the prompt more literally, but the model stops filling in missing details on its own, which can result in odd images.
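To make the CFG Scale a little more concrete, here is a minimal sketch of the blend that classifier-free guidance is usually described as: the model makes one prediction without the prompt and one with it, and the scale decides how far to push from the first toward the second. The function name apply_cfg and the toy numbers are assumptions made up for illustration; real implementations work on large tensors, not two-element lists.

```python
def apply_cfg(unconditional, conditional, cfg_scale):
    """Blend two predictions: uncond + scale * (cond - uncond)."""
    return [u + cfg_scale * (c - u) for u, c in zip(unconditional, conditional)]

# Toy values standing in for the model's two predictions (invented for this example).
without_prompt = [0.0, 1.0]
with_prompt = [1.0, 3.0]

# A scale of 1 simply returns the prediction made with the prompt.
low = apply_cfg(without_prompt, with_prompt, 1.0)

# A larger scale pushes further in the direction the prompt suggests.
high = apply_cfg(without_prompt, with_prompt, 4.0)

print(low)
print(high)
```

The further the scale moves the result away from the prediction made without the prompt, the less room the model has to fill in details by itself, which is one way to read the "odd images" warning.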

Seed sets the starting point of the random number generator. With all other settings unchanged, the same seed will generate the same image. The default value of -1 uses a random seed. It’s like Minecraft: if you have the exact same settings with the exact same seed, you will get the exact same result.
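The seed behaviour can be sketched with Python's built-in random module. This is only an analogy, not how any particular Stable Diffusion tool is written: resolve_seed and fake_noise are hypothetical helpers, and the short list of numbers stands in for the random noise an image starts from.

```python
import random

def resolve_seed(seed):
    """Return the seed to use; -1 means 'pick one at random'."""
    if seed == -1:
        return random.randrange(2**32)
    return seed

def fake_noise(seed, count=3):
    """Stand-in for an image's starting noise (hypothetical helper)."""
    rng = random.Random(seed)  # a generator that depends only on the seed
    return [rng.random() for _ in range(count)]

same_a = fake_noise(resolve_seed(1234))
same_b = fake_noise(resolve_seed(1234))
different = fake_noise(resolve_seed(1235))

print(same_a == same_b)     # same seed, same numbers
print(same_a == different)  # different seed, different numbers
```

If you like an image that came out of a random seed, note down the seed that was actually used so you can reproduce it later.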

Structure is also very important for getting the image you want. Models weigh content closer to the beginning of the prompt more heavily, so put the main subject at the front and separate every modifier with a comma. For example, "woman sunbathing on a beach, sunny, windy, blue sky," and so on.
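If you build prompts in a script, the structure described here (the main content first, then each modifier on its own) could be assembled along these lines. The helper build_prompt is a hypothetical name used only for this sketch.

```python
def build_prompt(subject, modifiers):
    """Put the subject first, then each modifier, separated by commas."""
    parts = [subject] + list(modifiers)
    # Strip stray whitespace and drop empty entries before joining.
    cleaned = [p.strip() for p in parts if p.strip()]
    return ", ".join(cleaned)

prompt = build_prompt("woman sunbathing on a beach", ["sunny", "windy", "blue sky"])
print(prompt)
```

Keeping the subject and the modifiers as separate values also makes it easy to reorder or swap modifiers while you experiment.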

Intensifiers and parentheses, like structure, make the AI focus on or prioritize particular tags. If it is important that the sky is blue, you can write (blue sky) to make the AI pay more attention to that element. Alternatively, you can write (blue sky:1.4) to set the level of intensity of that specific tag; the exact syntax depends on the interface you use. The prompt (((blue sky:1.4))) will make the AI put a lot of emphasis on that tag, but it can be overpowering.
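As a small sketch, the parenthesis forms mentioned here can be generated by a helper like the one below. It assumes an interface where an explicit weight is written inside the parentheses, as in (blue sky:1.4); check the documentation of your own tool, since emphasis syntax varies. The function name emphasize is hypothetical.

```python
def emphasize(tag, weight=None, nesting=1):
    """Wrap a tag in parentheses, optionally with an explicit weight."""
    body = tag if weight is None else f"{tag}:{weight}"
    return "(" * nesting + body + ")" * nesting

plain = emphasize("blue sky")                 # parentheses only
weighted = emphasize("blue sky", 1.4)         # explicit weight
stacked = emphasize("blue sky", 1.4, nesting=3)  # weight plus extra parentheses

print(plain)
print(weighted)
print(stacked)
```

Starting with a single pair of parentheses or a modest weight and raising it only if the element keeps getting lost is usually easier to control than stacking several layers at once.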