In general, the software is Stable Diffusion running locally on my computer (big graphics card). I use SDXL models - yes, I know Flux is better, but I only have the LORAs I need in SDXL. I use my own LORA for the female character at weight 1, the Wowifier LORA at weight 0.4 for better partial nudity, and crowd_notrigger at weight 0.3 for the background people.
LORA = a kind of model plugin that teaches the base AI model about new concepts, e.g. that a crowd of people is more than just two persons.
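For anyone who prefers code over a GUI, here is a rough sketch of what that LORA stack looks like with the Hugging Face diffusers library. This is only an illustration of the idea, not my actual setup, and the file names are placeholders:

```python
# Minimal sketch: SDXL base model plus several LORAs mixed at different weights.
# File names are placeholders, not the actual LORA files.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Load the LORAs and give each one its own weight.
pipe.load_lora_weights("my_character_lora.safetensors", adapter_name="character")
pipe.load_lora_weights("wowifier.safetensors", adapter_name="wowifier")
pipe.load_lora_weights("crowd_notrigger.safetensors", adapter_name="crowd")
pipe.set_adapters(["character", "wowifier", "crowd"],
                  adapter_weights=[1.0, 0.4, 0.3])
```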
Stable Diffusion needs a prompt, i.e. a description of what should be in the picture. For instance: "Amazing artwork, captivating digital art, oil painting, full body view, 1girl ..."
Stable Diffusion has different modes, e.g. you can generate an image from the prompt alone (txt2img) or generate a similar image based on a previous image (img2img).
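In diffusers terms, the two modes look roughly like this (again just a sketch; the prompt is shortened and the image file name is a placeholder):

```python
# Illustrative sketch of txt2img vs. img2img with diffusers.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

prompt = "Amazing artwork, captivating digital art, oil painting, full body view, 1girl"

# txt2img: generate an image from the prompt alone.
txt2img = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
image = txt2img(prompt=prompt).images[0]

# img2img: generate a similar image based on a previous image.
img2img = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
init = load_image("renaissance_painting.jpg")  # placeholder starting image
variant = img2img(prompt=prompt, image=init, strength=0.6).images[0]
```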
I use img2img a lot. This means I start with a picture found on the net (for instance a heroic renaissance oil painting) and tell the AI what I want changed, e.g. that the main subject is female, and generate a new image. The first shot is seldom satisfying, so I take the new image as the new input and generate a second-generation image ... and so on ... until I get what I want. Sometimes I help the AI by drawing a sketch directly into the image and letting the AI regenerate my sketch in a more artistic style. Sometimes I do a batch generation of 100 images just to find the one where, say, the hands or the face look best.
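The iterative loop and the batch trick can be sketched like this, reusing the img2img pipeline and prompt from above (the number of rounds, the strength value, and the batch size are just example values, not an exact recipe):

```python
# Sketch of the refine-and-reuse loop: each result becomes the next input.
image = load_image("starting_painting.jpg")  # placeholder starting image

for step in range(5):
    image = img2img(prompt=prompt, image=image, strength=0.5).images[0]
    image.save(f"generation_{step}.png")

# Batch generation: many variants of the same input, then pick the one
# where the hands or the face look best.
candidates = img2img(prompt=prompt, image=image, strength=0.5,
                     num_images_per_prompt=4).images
for i, candidate in enumerate(candidates):
    candidate.save(f"candidate_{i}.png")
```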
At the end, I have plenty of generated images of the same scene. I merge them together in a photo editor: from one picture I take the face, from another the left hand, right hand, feet, and so on. The blended image goes through the AI one final time with a low denoise ratio (< 0.3) to give it the final touch. As the last step, the image is sharpened and enlarged to the desired size.
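The finishing steps could look roughly like this, again reusing the img2img pipeline from above. The post doesn't tie the sharpening and enlarging to any specific tool, so the Pillow unsharp mask and Lanczos resize here are just one possible way to do it, with example values:

```python
# Sketch of the finishing pass: one low-denoise run over the blended image,
# then sharpen and enlarge with Pillow. Values are examples only.
from PIL import Image, ImageFilter

blended = Image.open("blended_in_photo_editor.png").convert("RGB")  # placeholder

# Low denoise ratio (strength < 0.3) so the AI only gives the final touch
# without changing the composition.
final = img2img(prompt=prompt, image=blended, strength=0.25).images[0]

# Sharpen, then enlarge to the desired output size.
final = final.filter(ImageFilter.UnsharpMask(radius=2, percent=120, threshold=3))
final = final.resize((final.width * 2, final.height * 2), Image.LANCZOS)
final.save("final.png")
```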