Customizing Image Prompts for Text‑to‑Image Models

Introduction

Popular text‑to‑image models such as Midjourney, DALL‑E 3 and Stable Diffusion generate images based on natural‑language prompts. Each model allows a user to tailor the generation through a combination of descriptive content (subject, environment, mood) and formal parameters (style, aspect ratio, camera details). This research summarises the main categories of settings supported by the three models and provides a practical “menu” of options for end‑users to craft advanced prompts. Citations are provided from credible sources.

Core Prompt Elements (common across models)

Even without special parameters, all models respond to descriptive cues. A good prompt typically includes the following elements:

| Element / category | Short description and examples |
| --- | --- |
| Subject | Primary focus of the image: person, object or scene (e.g., “a futuristic cityscape”, “portrait of an elderly fisherman”). Describe the number and position of subjects to reduce randomness (e.g., “three cats” instead of “cats”). Ork Digital’s prompt guide emphasises being clear about the subject and context. |
| Medium / Art style | The artistic medium or genre (photo, oil painting, watercolor, doodle, tapestry, low‑poly 3‑D, etc.). The same guide suggests specifying a medium to guide the appearance. Mediums can include traditional materials (watercolor, charcoal, risograph), digital media (pixel art, 3D render), or art movements (cubism, impressionism). |
| Environment / Setting | Where the scene takes place: outdoors, indoors, underwater, on the moon, etc. Listing the setting helps the model know what to include. |
| Lighting | Qualitative description of the light (soft, ambient, neon, studio lights, golden hour, backlighting, rim lighting). Lighting influences mood and contrast; the Stable Diffusion photography guide notes that lighting conditions like golden hour, midday contrast and nighttime ambience change the mood. |
| Color palette | Use adjectives such as vibrant, muted, monochromatic, pastel, warm or cool. |
| Mood / Tone | Words like calm, whimsical, dark, energetic, surreal or dystopian to guide the emotional feel. |
| Composition / Perspective | Describe framing (portrait, landscape, square), shot type (full shot, medium shot, close‑up, extreme close‑up) and perspective (bird’s‑eye view, eye level, low angle, high angle). Stable Diffusion’s photography guide explains that shot types determine how much of a subject is seen and that camera angles (low, high, normal) change the narrative. |

The descriptive categories above can be combined with model‑specific parameters to gain further control.

Midjourney Settings

Midjourney is accessed via Discord; prompts take the form /imagine <description> [parameters]. Midjourney V5/V6 emphasises photorealism but also supports many artistic styles. Useful parameters and descriptive settings include:

| Setting (syntax) | Purpose / options |
| --- | --- |
| Aspect Ratio (--aspect or --ar) | Changes the width:height ratio of the generated image. The default is 1:1. Any integer ratio is permitted (e.g., 16:9, 4:3, 7:4). A guide recommends altering the aspect ratio to achieve widescreen or portrait compositions. |
| Style switch (--style) | Chooses a pre‑defined style variation of the model. V5.1 and V5.2 support --style raw (minimal default styling). V4 supports --style 4a, 4b or 4c, each with a different aesthetic. The Niji V5 anime model supports --style cute, expressive, original or scenic. |
| Stylize (--stylize or --s) | Controls how strongly the model’s default artistic style is applied. Higher numbers produce more artistic, expressive images while lower numbers stick closer to the prompt. |
| Quality (--quality or --q) | Controls rendering time and detail. Typical values are 0.25, 0.5 or 1 (default). Higher values spend more GPU time and produce finer details; lower values are faster. |
| Seed (--seed) | Sets the random seed used to initialise the noise. Using the same seed with the same prompt produces similar compositions. |
| Chaos (--chaos) | Adjusts variability. Values range 0–100; high values create more varied and unexpected results. |
| Weird (--weird) | Explores unusual aesthetics. Accepts values 0–3000; higher values yield more surreal results. |
| Negative prompt (--no) | Excludes specific subjects, e.g., --no text or --no plants. |
| Model version (--version or --v) | Specifies the Midjourney version (1–6). Later versions yield higher quality but may respond differently. |
| Niji model (--niji) | Switches to an anime‑focused model. |
| Other parameters | --tile creates seamless patterns; --stop ends generation early; --repeat generates multiple jobs; --iw sets image weight for image prompts; upscaler parameters such as --uplight or --upbeta change the upscaling algorithm. |

Camera and lens details in Midjourney: Midjourney has no dedicated camera parameters, but descriptive phrases influence the result. For example, adding “35 mm prime lens”, “50 mm lens, f/1.8, ISO 100” or “shot on Kodak Portra 400” tends to produce images with a matching field of view, depth of field or film look.
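
The parameters listed above can also be appended to a description programmatically. The Python sketch below is illustrative only: the helper name `build_midjourney_prompt` and its keyword arguments are hypothetical, and the range checks simply mirror the values quoted in this section (chaos 0–100, weird 0–3000, quality 0.25, 0.5 or 1) rather than any official validation.

```python
from typing import List, Optional


def build_midjourney_prompt(
    description: str,
    ar: Optional[str] = None,
    raw: bool = False,
    stylize: Optional[int] = None,
    quality: Optional[float] = None,
    chaos: Optional[int] = None,
    weird: Optional[int] = None,
    seed: Optional[int] = None,
    no: Optional[List[str]] = None,
) -> str:
    """Assemble a description plus parameters into one prompt string (hypothetical helper)."""
    # Range checks mirror the values quoted in this article, not an official spec.
    if chaos is not None and not 0 <= chaos <= 100:
        raise ValueError("--chaos is described here as ranging 0-100")
    if weird is not None and not 0 <= weird <= 3000:
        raise ValueError("--weird is described here as ranging 0-3000")
    if quality is not None and quality not in (0.25, 0.5, 1):
        raise ValueError("--quality values quoted here are 0.25, 0.5 or 1")

    parts = [description.strip()]
    if raw:
        parts.append("--style raw")
    options = [
        ("--ar", ar),
        ("--stylize", stylize),
        ("--quality", quality),
        ("--chaos", chaos),
        ("--weird", weird),
        ("--seed", seed),
    ]
    for flag, value in options:
        if value is not None:
            parts.append(f"{flag} {value}")
    if no:
        # Excluded items are joined with commas after a single --no flag.
        parts.append("--no " + ", ".join(no))
    return " ".join(parts)


prompt = build_midjourney_prompt(
    "portrait of an elderly fisherman, 50 mm lens, f/1.8, golden hour",
    ar="4:5",
    stylize=100,
    chaos=10,
    seed=1234,
    no=["text"],
)
print(prompt)
```

The resulting string is what would be pasted after /imagine in Discord. Keeping the description and the parameters separate makes it easier to vary one parameter at a time while holding the seed fixed.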

DALL‑E 3 Settings

DALL‑E 3 is available through the OpenAI API and ChatGPT. It offers fewer manual parameters than Midjourney but still allows control over style, quality and size:

| Parameter | Purpose / options |
| --- | --- |
| Model (model) | Choose dall-e-2 or dall-e-3. If unspecified, the API defaults to DALL‑E 2. |
| Style (style) | Sets the overall aesthetic. Accepts vivid (hyper‑real, dramatic) or natural (more natural, less hyper‑real). The API defaults to vivid. |
| Quality (quality) | Specifies image detail. Options are standard or hd. HD produces finer details and more consistent compositions but costs more and takes longer. |
| Image size (size) | DALL‑E 3 accepts three fixed sizes: 1024×1024 (square), 1792×1024 (landscape) and 1024×1792 (portrait). These correspond to aspect ratios of 1:1, 7:4 and 4:7. |
| Number of images (n) | The API parameter accepts values from 1 to 10, but DALL‑E 3 currently supports only n=1. |
| Prompt | Up to 4000 characters for DALL‑E 3 (1000 for DALL‑E 2). DALL‑E 3 rewrites prompts internally to improve results. However, descriptive details about subject, environment, medium, camera type and mood still influence the generation, and specifying “vertical” or “horizontal” helps the model choose the appropriate composition (source: cookbook.openai.com). |

DALL‑E 3 does not support explicit negative prompts or seeds. To influence composition, one must describe camera angles, film types or aspect ratio in the textual prompt. For example, stating “shot on a 35 mm film camera” can produce film‑like results.
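
As a rough illustration of how these options fit together, the Python sketch below builds a request dictionary and checks each value against the options listed in this section. The helper names `build_dalle3_request` and `aspect_ratio` are hypothetical; the keys (model, prompt, size, quality, style, n) follow the parameter names in the table. No network call is made here.

```python
from math import gcd

# The three fixed sizes listed for DALL-E 3 in this section.
DALLE3_SIZES = ("1024x1024", "1792x1024", "1024x1792")


def aspect_ratio(size: str) -> str:
    """Reduce a 'WIDTHxHEIGHT' string to its simplest width:height ratio."""
    width, height = (int(value) for value in size.split("x"))
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def build_dalle3_request(prompt, size="1024x1024", quality="standard", style="vivid"):
    """Return keyword arguments for an image request (hypothetical helper)."""
    if size not in DALLE3_SIZES:
        raise ValueError(f"size must be one of {DALLE3_SIZES}")
    if quality not in ("standard", "hd"):
        raise ValueError("quality must be 'standard' or 'hd'")
    if style not in ("vivid", "natural"):
        raise ValueError("style must be 'vivid' or 'natural'")
    # n is fixed at 1 because DALL-E 3 returns one image per request.
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "style": style,
        "n": 1,
    }


request = build_dalle3_request(
    "horizontal photo of a lighthouse at dusk, shot on a 35 mm film camera",
    size="1792x1024",
    quality="hd",
    style="natural",
)
print(request["size"], "->", aspect_ratio(request["size"]))
```

With the official openai Python package installed and an API key configured, a dictionary like this could be passed as `client.images.generate(**request)`. That call is left out of the sketch because it needs credentials and incurs cost.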

Stable Diffusion (SDXL and SD1.5)

Stable Diffusion models are widely available via open‑source interfaces. They do not enforce fixed parameters like Midjourney but rely heavily on prompt structure and optional negative prompts. The Segmind guide for SDXL emphasises constructing prompts using traditional photography concepts.

  1. Subject & Type of Image – describe the main subject and genre (portrait, landscape, macro, street, architectural).
  2. Details & Shot Type – specify attributes and shot type (full shot, medium shot, close‑up or extreme close‑up).
  3. Environment & Camera Angle – describe the surroundings and choose a camera angle (low, normal or high).
  4. Mood & Lighting – set the ambiance (joyful, dramatic, tranquil) and lighting conditions (golden hour, midday contrast, nighttime).
  5. Equipment – define camera type (mirrorless, DSLR, full frame), lens and eye line (normal, low, high). Common lenses include 24 mm wide‑angle, 35 mm prime, 50 mm standard, 85 mm portrait, 105 mm macro and 70–200 mm telephoto. Eye line influences the emotional feel.
  6. Exposure & Style – optionally specify aperture (f‑stop), shutter speed and ISO. A large aperture (e.g., f/1.8) creates a shallow depth of field; small apertures (f/16) keep more of the scene in focus. Fast shutter speeds freeze motion; slow speeds introduce blur. ISO sets sensor sensitivity, and higher values add visible noise.
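
The six parts above can be kept as separate fields and joined into one prompt. The following Python sketch assumes a simple comma‑separated layout; the class name `PhotoPrompt` and its field names are illustrative, not part of the Segmind guide or of any Stable Diffusion interface.

```python
from dataclasses import dataclass


@dataclass
class PhotoPrompt:
    """One field per part of the six-part structure (names are illustrative)."""
    subject_and_type: str
    details_and_shot: str = ""
    environment_and_angle: str = ""
    mood_and_lighting: str = ""
    equipment: str = ""
    exposure_and_style: str = ""

    def render(self) -> str:
        """Join the non-empty parts, in order, into a comma-separated prompt."""
        parts = [
            self.subject_and_type,
            self.details_and_shot,
            self.environment_and_angle,
            self.mood_and_lighting,
            self.equipment,
            self.exposure_and_style,
        ]
        return ", ".join(part.strip() for part in parts if part.strip())


photo = PhotoPrompt(
    subject_and_type="street portrait of a violinist",
    details_and_shot="weathered hands, medium shot",
    environment_and_angle="rainy city square, low angle",
    mood_and_lighting="tranquil, golden hour",
    equipment="mirrorless camera, 85 mm portrait lens",
    exposure_and_style="f/1.8, 1/1000 s, ISO 100",
)
print(photo.render())
```

Optional parts can be left empty, so the same structure works for a bare subject as well as a fully specified photographic prompt.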

Stable Diffusion also supports negative prompts to suppress unwanted elements, seeds to reproduce images, and image size choices (e.g., 512×512, 768×768 or custom ratios). Many UIs additionally expose the CFG (classifier‑free guidance) scale, which sets how closely the image follows the prompt, and the number of denoising (sampling) steps; these are two separate settings. Like Midjourney, descriptive style keywords (impressionist, anime, cyberpunk) guide the aesthetic. SDXL‑based checkpoints such as RealVis XL and DreamShaper XL specialise in photorealistic or artistic outputs.
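
These generation settings can be collected in one place before they are handed to whichever interface is in use. The sketch below is a hypothetical container, not a real API: the field names `negative_prompt`, `num_inference_steps`, `guidance_scale`, `width` and `height` are chosen to match the keyword names used by Hugging Face diffusers pipelines, the default values are arbitrary, and the multiple‑of‑8 check reflects a common requirement of such pipelines that may not hold for every UI.

```python
from dataclasses import asdict, dataclass


@dataclass
class SDSettings:
    """Hypothetical bundle of Stable Diffusion generation settings."""
    prompt: str
    negative_prompt: str = ""
    seed: int = 0
    num_inference_steps: int = 30  # number of denoising steps
    guidance_scale: float = 7.0    # CFG scale: higher follows the prompt more closely
    width: int = 512
    height: int = 512

    def validate(self) -> None:
        """Reject values that most pipelines would refuse (assumed limits)."""
        if self.num_inference_steps < 1:
            raise ValueError("at least one denoising step is required")
        if self.guidance_scale < 0:
            raise ValueError("CFG scale must not be negative")
        if self.width % 8 or self.height % 8:
            raise ValueError("width and height should be multiples of 8")

    def to_kwargs(self) -> dict:
        """Return pipeline-style keyword arguments, leaving the seed out."""
        self.validate()
        kwargs = asdict(self)
        # The seed is usually supplied through a random generator object,
        # so it is not passed as a plain keyword argument.
        kwargs.pop("seed")
        return kwargs


settings = SDSettings(
    prompt="close-up of a red fox in snow, golden hour",
    negative_prompt="text, watermark, logo",
    seed=42,
    width=768,
    height=768,
)
kwargs = settings.to_kwargs()
print(sorted(kwargs))
```

In diffusers, the seed would typically be passed separately as `generator=torch.Generator().manual_seed(settings.seed)`; that step is omitted here so the sketch runs without a model download.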

Menu of Settings for Prompt Crafting

The following menu summarises common settings. Users can mix elements from each category to build detailed prompts. Long explanations are provided in the text; the table contains concise choices.

| Category | Midjourney options | DALL‑E 3 options | Stable Diffusion options |
| --- | --- | --- | --- |
| Model | --version 1–6, --niji for anime model, style switch (--style raw, 4a, 4b, 4c, cute, expressive, original, scenic) | model=dall-e-3 or dall-e-2; style=vivid (default) or style=natural | Choose checkpoint: SDXL 1.0, RealVis XL, DreamShaper XL or SD1.5 |
| Aspect ratio / size | --ar width:height, e.g., 1:1, 16:9, 9:16, 4:3, custom ratios | Sizes: 1024×1024 (square), 1792×1024 (landscape, 7:4) or 1024×1792 (portrait, 4:7). Describe orientation (horizontal/vertical) in prompt. | Set resolution in UI (e.g., 512×512 or 768×1024). Custom aspect ratios are supported by some UIs; describe composition (portrait, landscape, square) in prompt. |
| Style / aesthetic | Use style switch for built‑in variations (raw, 4a, 4b, 4c, cute, expressive, etc.); adjust --stylize to control artistic flair. Also specify art styles (cyberpunk, anime, baroque, minimalist). | Choose style=vivid or style=natural via API; further style should be described in prompt (e.g., “oil painting”, “watercolor”, “digital art”, “cyberpunk”). | Specify art movements or mediums (e.g., watercolor, oil painting, Pixar style). Negative prompts can suppress unwanted aesthetics. The CFG scale influences fidelity to the prompt. |
| Quality / detail | --quality 0.25, 0.5 or 1 to trade off speed vs detail; --stylize for creativity; --seed for reproducibility | quality=standard or quality=hd | Number of denoising steps, set in the UI |
| Camera type & lens | Described in prompt: “shot on a DSLR, 50 mm lens”; include aperture and film type (Kodak Portra 400) for film look. | No explicit parameter; include camera descriptors (e.g., “35 mm photograph”, “iPhone photo”) in prompt. | Part of the “Equipment” field. Choose camera type (mirrorless, DSLR, full frame) and lens (24 mm wide‑angle, 35 mm prime, 50 mm standard, 85 mm portrait, 105 mm macro, 70–200 mm telephoto). Specify aperture (f‑stop), shutter speed and ISO for photographic realism. |
| Shot type / perspective | Describe shot type (full shot, medium shot, close‑up, extreme close‑up) and perspective (eye level, low angle, high angle). Use --ar to reinforce composition. | Describe composition (full body portrait, close‑up of object, wide landscape) and orientation in prompt. | Use the structured prompt: [Subject & Type], [Details & Shot Type], [Environment & Angle], [Mood & Lighting], [Equipment], [Exposure] |
| Lighting & mood | Include lighting descriptors (golden hour, rim lighting, backlighting, silhouette, cinematic lighting); adjust --stylize for mood. | Describe lighting and mood in prompt (soft light, neon glow, chiaroscuro). Optionally choose style=natural for more subdued colours. | Use the Mood & Lighting Conditions part of the structured prompt (golden hour, midday, nighttime). Negative prompts can exclude certain lighting effects. |
| Randomness & reproducibility | --chaos controls variation; --seed fixes noise for repeatability; --weird explores unusual aesthetics. | No explicit randomness parameter; results vary slightly each call. | Fix the seed value to reproduce results; set CFG scale (lower values allow creative variation, higher values adhere closely to prompt). |
| Negative prompts | Use --no to exclude elements (e.g., --no text). | Not officially supported; rephrase the prompt to avoid unwanted objects. | Provide a negative prompt string (e.g., “text, watermark, logo”) to discourage certain elements; some UIs support separate fields for negative prompts. |

Practical Workflow for Crafting Advanced Prompts

  1. Choose the model (Midjourney, DALL‑E 3 or Stable Diffusion) and note the parameters available for that model.
  2. Define the subject and genre: decide what should be in the image and whether it is a portrait, landscape, macro, product photo, illustration or fantasy scene.
  3. Select an art style: pick the medium or aesthetic (photorealistic, oil painting, watercolor, anime, cyberpunk, surrealism). For Midjourney, you can also choose one of the built‑in style switches (raw, 4a, 4b, 4c, cute, expressive, etc.). For DALL‑E 3, set style=vivid for dramatic colours or style=natural for a more natural, less hyper‑real look.
  4. Decide on aspect ratio/size: choose square (1:1), landscape (16:9 or 7:4), portrait (4:5 or 4:7), or a custom ratio. Midjourney uses the --ar parameter; DALL‑E 3 uses fixed sizes; Stable Diffusion uses the resolution settings of your UI.
  5. Specify camera settings (for photographic realism):
    • Camera type: mirrorless, DSLR, full frame
    • Lens: 24 mm (wide), 35 mm (standard prime), 50 mm (natural perspective), 85 mm (portrait), 105 mm (macro), 70–200 mm (telephoto).
    • Camera eye line: normal, low or high.
    • Exposure: aperture (e.g., f/1.8, f/8, f/16), shutter speed (e.g., 1/1000 s, 1/30 s) and ISO (e.g., 100, 800, 3200).
      These settings can be embedded into the prompt to give more photographic fidelity.
  6. Define mood, lighting and colour: pick emotional tone (joyful, dramatic, serene), lighting conditions (golden hour, soft ambient light, rim lighting, silhouette) and colour palette (warm, pastel, neon, monochromatic). These details strongly influence the output.
  7. Use model‑specific parameters: apply --quality, --stylize, --chaos, --seed and other Midjourney parameters to control detail and variation; choose quality and style in DALL‑E 3; set CFG scale, sampling steps, seeds and negative prompts in Stable Diffusion.
  8. Iterate: run the prompt, evaluate the output and adjust parameters or descriptive elements accordingly. Use --seed to reproduce desirable results or --chaos/--weird to explore alternatives.
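
Step 4 is where the three models differ most in syntax, so a small lookup can help keep one intended orientation consistent across them. The Python sketch below is only an illustration: the table `SIZE_OPTIONS` and the function `size_setting` are hypothetical, the Midjourney ratios and Stable Diffusion resolutions are example choices rather than required values, and the DALL‑E 3 entries are the three fixed sizes described earlier.

```python
# Example size settings per orientation and model (illustrative choices).
SIZE_OPTIONS = {
    "square": {
        "midjourney": "--ar 1:1",
        "dalle3": "1024x1024",
        "stable_diffusion": (768, 768),
    },
    "landscape": {
        "midjourney": "--ar 16:9",
        "dalle3": "1792x1024",
        "stable_diffusion": (1024, 768),
    },
    "portrait": {
        "midjourney": "--ar 4:5",
        "dalle3": "1024x1792",
        "stable_diffusion": (768, 1024),
    },
}


def size_setting(model: str, orientation: str):
    """Look up the size setting for a model and orientation (hypothetical helper)."""
    try:
        return SIZE_OPTIONS[orientation][model]
    except KeyError:
        raise ValueError(
            f"unknown model or orientation: {model!r}, {orientation!r}"
        ) from None


for model in ("midjourney", "dalle3", "stable_diffusion"):
    print(model, size_setting(model, "portrait"))
```

The Midjourney value is a flag to append to the prompt, the DALL‑E 3 value is the size string for the API, and the Stable Diffusion value is a (width, height) pair to enter in the UI.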


Conclusion

Crafting advanced prompts requires a mix of descriptive detail and parameter knowledge. All three models respond well to clear descriptions of subject, environment, style, lighting and composition. Midjourney provides a rich set of parameters for aspect ratio, style variations, stylization, randomness and upscaling. DALL‑E 3 simplifies control through preset styles (natural or vivid), HD quality and fixed image sizes. Stable Diffusion rewards users who structure prompts like a photographer, choosing camera type, lens, angle, shot type and exposure. By combining these settings thoughtfully, users can reliably guide AI models toward the desired artistic vision.
