Using prompts to modify face and body in Stable Diffusion 1.5


There is a new version of this article available here. It’s based on SDXL, but it’s still relevant for SD 1.5. It’s more detailed and contains more examples.

Preparations

The key to success lies in three things: the Model (Checkpoint), Positive Prompt, and Negative Prompt (or Negative Embedding).

The community has created many models, and several popular, realistic ones are currently worth using. Just visit CivitAI.com and find your favourite. For the purposes of this article, I will use my own model, AddictiveFuture Realistic SemiAsian. This model focuses on creating realistic-looking semi-Asian females. However, you will soon see that it’s possible to achieve slightly different end results.

Using a good model and a well-chosen Negative Prompt is enough to achieve good results. Below you can see the Negative Prompt I’ve been using to generate realistic-looking people.

(low quality:2), (normal quality:2), (worst quality:2), anime, bad anatomy, bad proportions, blurry, cloned face,
cropped, deformed, dehydrated, disfigured, dot, drawing, duplicate, error, extra arms, extra fingers, extra legs,
extra limbs, fused fingers, grayscale, gross proportions, illustration, jpeg artifacts, long neck, low quality,
lowres, malformed limbs, manga, missing arms, missing legs, mole, monochrome, morbid, mutated hands, mutation,
mutilated, normal quality, out of frame, painting, paintings, poorly drawn face, poorly drawn hands, signature,
sketches, text, too many fingers, ugly, username, watermark, worst quality

There are also Negative Embeddings. The nice thing about them is that you only need to specify one keyword (the name of the embedding file) to use one. They work similarly to Negative Prompts, but each of them may have a different impact on the final result. You will need to find the right one for your needs.
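If you assemble prompts in a script, combining a Negative Embedding keyword with a list of negative terms could look like the sketch below. The embedding name `my-negative-embedding` and the helper `build_negative_prompt` are hypothetical, and the term list is only a short excerpt of the Negative Prompt shown above.

```python
# Sketch only: joins negative terms into one comma-separated string.
# "my-negative-embedding" is a hypothetical embedding file name.

NEGATIVE_TERMS = [
    "(low quality:2)",
    "(normal quality:2)",
    "(worst quality:2)",
    "blurry",
    "watermark",
]

EMBEDDING_KEYWORD = "my-negative-embedding"  # hypothetical


def build_negative_prompt(terms, embedding_keyword=None):
    """Return a Negative Prompt string, optionally led by an embedding keyword."""
    parts = list(terms)
    if embedding_keyword:
        # The embedding is referenced by its keyword, like any other term.
        parts.insert(0, embedding_keyword)
    # dict.fromkeys drops exact duplicates while keeping the original order.
    return ", ".join(dict.fromkeys(parts))


negative_prompt = build_negative_prompt(NEGATIVE_TERMS, EMBEDDING_KEYWORD)
print(negative_prompt)
```

Whether an embedding replaces the long term list or complements it depends on the embedding, so treat the combination here as one option to test rather than a recommendation.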

Prompting age

Just specify whether you want to see a young or an old person. These prompts affect the entire body: the older the person, the more the body shape changes and the more the skin shows wrinkles and sagging. If you want your prompt to have the most impact on the final result, set Clip Skip to 1 (its default value) and CFG Scale to around 6. Keep in mind that a lot depends on the model you are using, but these values should work in most cases. OK, let’s use these three prompts and a fixed seed (3575497043):

photography of girl, 20 years old
photography of midlife woman, 40 years old
photography of very old woman, elderly woman
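If you drive generation from a script, the settings above could be captured as in the following sketch. It assumes the request format of the AUTOMATIC1111 web UI txt2img API (the `cfg_scale` and `seed` fields and the `CLIP_stop_at_last_layers` override); the article does not say which interface was used, and `build_payload` is a hypothetical helper. The code only builds request bodies and does not contact any server.

```python
# Sketch only: builds request bodies, sends nothing.
# Assumption: field names follow the AUTOMATIC1111 web UI txt2img API.

AGE_PROMPTS = [
    "photography of girl, 20 years old",
    "photography of midlife woman, 40 years old",
    "photography of very old woman, elderly woman",
]

FIXED_SEED = 3575497043  # the seed quoted in the article


def build_payload(prompt, negative_prompt="", seed=FIXED_SEED):
    """Return one txt2img request body (hypothetical helper)."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "seed": seed,
        "cfg_scale": 6,  # "around 6", as suggested above
        "override_settings": {"CLIP_stop_at_last_layers": 1},  # Clip Skip 1
    }


payloads = [build_payload(p) for p in AGE_PROMPTS]
for payload in payloads:
    print(payload["seed"], payload["cfg_scale"], payload["prompt"])
```

Keeping the seed and the other settings identical across the three payloads means the age wording is the only thing that changes between images.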

Prompting nationality

I started experimenting with nationalities after finding a post on Reddit.

Prompts specifying a particular nationality affect the overall appearance of the person. They change the shape of the face, alter the skin color, influence the hairstyle and hair color, and affect eye color, clothing, and the background of the location where the person is situated. If we don’t specify these aspects precisely, they will be influenced by the country we mentioned. Moreover, even if we specify the clothing the person should wear, it may still have colors associated with the country (white and blue often appear with Israel) or include accessories, such as the beads frequently seen with African countries. Head coverings (scarves, straw hats) can also appear.

In the pictures below, you can see the results of the prompt “photography of midlife X woman, 40 years old”, where X is replaced with one of the following nationalities: Grenadian, Chinese, Bangladeshi, Sri Lankan, Israeli, and Turkish. The seed was set to random (if you’re wondering why, the explanation can be found at the end of the article).

Images: Grenadian woman, Chinese woman, Bangladeshi woman, Sri Lankan woman, Israeli woman, Turkish woman
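The substitution of X described above can be sketched as a simple string template. The helper name `nationality_prompts` is hypothetical, and using `-1` to request a random seed is an assumption based on the AUTOMATIC1111 web UI convention; other tools may expect something else.

```python
# Sketch only: fills the article's prompt template with each nationality.

TEMPLATE = "photography of midlife {nationality} woman, 40 years old"

NATIONALITIES = [
    "Grenadian",
    "Chinese",
    "Bangladeshi",
    "Sri Lankan",
    "Israeli",
    "Turkish",
]

RANDOM_SEED = -1  # assumption: -1 asks the UI to pick a random seed


def nationality_prompts(template=TEMPLATE, nationalities=NATIONALITIES):
    """Return one prompt per nationality (hypothetical helper)."""
    return [template.format(nationality=n) for n in nationalities]


prompts = nationality_prompts()
for prompt in prompts:
    print(RANDOM_SEED, prompt)
```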

Mixing age and nationality

Of course, you can mix age and nationality together.

Below, you can see some example mixes using random seeds (without using ControlNet this time). You can observe how the clothing and surroundings of the photographed person have changed. This is particularly evident in the case of the Dutch girl (last photo).
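One way to enumerate such mixes is to cross a few age phrasings with a few nationalities. In this sketch the age templates reuse the wording from the age section, but the position of the nationality word in the young and elderly variants is my assumption, and the choice of nationalities is purely illustrative (only the Dutch example is mentioned above).

```python
# Sketch only: builds every age/nationality combination as a prompt string.
from itertools import product

# Assumption: the nationality goes directly before "girl"/"woman".
AGE_TEMPLATES = {
    "young": "photography of {nat} girl, 20 years old",
    "midlife": "photography of midlife {nat} woman, 40 years old",
    "elderly": "photography of very old {nat} woman, elderly woman",
}

NATIONALITIES = ["Dutch", "Chinese", "Turkish"]  # illustrative selection

mixed_prompts = [
    template.format(nat=nationality)
    for nationality, template in product(NATIONALITIES, AGE_TEMPLATES.values())
]

for prompt in mixed_prompts:
    print(prompt)
```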

Prompting body types

Actually, a simple prompt describing the body type is all you need. You can try prompts like:

Images: skinny body type, fat body type, muscular body type

Of course, you can also specify individual body parts precisely: wide hips, narrow waist, big ass, large breasts… whatever else you can come up with.

If the results are not satisfactory

Problems you may encounter

The models you use may not be able to generate certain things because they lack information on how those things look. Most of the time, Stable Diffusion either does not generate things it doesn’t know or cleverly conceals them. A simple example is photos of naked, old women. Usually, models do not know what an old naked woman looks like. If we prompt for something like that, we will get a photo of a naked woman, but she won’t appear old. This happens because Stable Diffusion tries to generate the body as best it can, and since the body won’t look old, the face won’t either.

Let’s stay with older women. Suppose we want to create a photo of an older woman wearing some sexy clothes. Maybe a skirt? Or perhaps a crop top that shows the belly? We probably won’t get an older woman, because the exposed body parts cause the same problem. Again, Stable Diffusion will have to compromise. It will generate those body parts as best it can and adjust the face accordingly.

Just try not to combine things that shouldn’t go together.

If you significantly change your prompt, you should also use a new seed, preferably a random one at first. For example, if you previously generated a light-skinned person and now want a dark-skinned person, the old seed may carry the earlier look over (the seed determines the starting noise, which can influence composition and tones), so you may not be able to fully achieve a dark-skinned person. If you’re unable to get the desired results, the cause may be a wrong configuration, limitations of the model, a poorly constructed prompt, or a bad seed.
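The seed advice above can be expressed as a small rule: keep the seed while refining the same subject, and draw a fresh one after a major prompt change. This is only a sketch; `next_seed` is a hypothetical helper, the 32-bit seed range is an assumption, and deciding what counts as a significant change is left to you.

```python
# Sketch only: decides whether to reuse the current seed or draw a new one.
import random

MAX_SEED = 2**32 - 1  # assumption: seeds are 32-bit unsigned integers


def next_seed(current_seed, prompt_changed_significantly):
    """Return the seed to use for the next generation (hypothetical helper)."""
    if prompt_changed_significantly:
        # Start from fresh noise so the old image does not bias the new one.
        return random.randint(0, MAX_SEED)
    # Small tweaks: keep the seed so results stay comparable.
    return current_seed


seed = 3575497043  # the fixed seed used earlier in the article
seed_after_tweak = next_seed(seed, prompt_changed_significantly=False)
seed_after_rewrite = next_seed(seed, prompt_changed_significantly=True)
print(seed_after_tweak, seed_after_rewrite)
```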

All images were created on Vast.ai servers (using an RTX 4090).