Imagine
POST /api/v1/imagine (1-8 credits)
Generate images from a text prompt. Supports 30 models across 9 families. Midjourney is the default and returns a 2×2 grid of four candidates that you refine with the upscale/variation/editing endpoints. All other models (nano-banana, seedream, gpt-image, grok, ideogram, flux-2, flux-kontext, qwen, z-image) are synchronous and return a single image directly, with no polling required.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Required | The text prompt for image generation. Midjourney supports parameters like --ar, --sref, --style, --chaos. Flux models use plain natural language. |
| model | string | Optional | Which AI model to use. Models in the `gpt-image-*`, `nano-banana-*`, `seedream-*`, `grok-image-*`, `ideogram-*`, `recraft`, `flux-2-*`, `flux-kontext-*`, `qwen-*`, and `z-image` families are synchronous: the task completes immediately without polling. Default: `midjourney`. Allowed values: `midjourney`, `flux-pro`, `flux-schnell`, `flux-dev`, `nano-banana`, `nano-banana-2`, `nano-banana-hd`, `seedream-3`, `seedream-4`, `seedream-4.5`, `seedream-5-lite`, `seededit`, `gpt-image-1`, `gpt-image-1.5`, `gpt-4o-image`, `grok-image`, `grok-image-2`, `z-image`, `qwen-image`, `qwen-image-edit`, `ideogram-v3`, `ideogram-v3-edit`, `ideogram-v3-remix`, `ideogram-v3-reframe`, `recraft`, `flux-2-dev`, `flux-2-flex`, `flux-2-max`, `flux-kontext-pro`, `flux-kontext-max`. |
| aspect_ratio | string | Optional | Aspect ratio for the generated image. For Midjourney, this field is ignored if the prompt already contains --ar. Default: `1:1`. Allowed values: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`. |
| output_format | string | Optional | Output image format. Flux.1 models only. Allowed values: `png`, `jpg`. |
| resolution | string | Optional | Output resolution. Flux.1 and nano-banana-2 only. Allowed values: `1K`, `2K`, `4K`. |
| image_url | string | Optional | Reference image URL for image-to-image or editing. Required for edit models: seededit, qwen-image-edit, ideogram-v3-edit, ideogram-v3-remix, ideogram-v3-reframe, flux-kontext-pro, flux-kontext-max. Optional for I2I on grok-image, nano-banana-2, gpt-image-1, gpt-image-1.5. |
| negative_prompt | string | Optional | What to avoid in the generation. Flux models only. Max 4000 characters. |
| webhook_url | string | Optional | URL where we POST the completed task result. |
| webhook_secret | string | Optional | Sent as x-webhook-secret header in webhook delivery for verification. |
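Because webhook_secret is echoed back verbatim in the x-webhook-secret header, a receiver can reject forged deliveries with a constant-time comparison. A minimal sketch, assuming a plain dict of incoming headers; the function name and secret value are illustrative:

```python
import hmac

WEBHOOK_SECRET = "my-shared-secret"  # the value you passed as webhook_secret

def is_authentic(headers: dict) -> bool:
    """Check the x-webhook-secret header against the registered secret."""
    received = headers.get("x-webhook-secret", "")
    # hmac.compare_digest avoids leaking where the strings differ via timing
    return hmac.compare_digest(received, WEBHOOK_SECRET)
```

Any delivery failing this check should be dropped before the payload is processed.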
Example Request
# Midjourney (default)
curl -X POST https://api.journeyapi.com/api/v1/imagine \
-H "Authorization: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "a cyberpunk cityscape at night, neon lights reflecting on wet streets --ar 16:9",
"webhook_url": "https://your-server.com/webhook"
}'
# Flux Pro
curl -X POST https://api.journeyapi.com/api/v1/imagine \
-H "Authorization: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "flux-pro",
"prompt": "a cyberpunk cityscape at night, neon lights reflecting on wet streets",
"aspect_ratio": "16:9",
"output_format": "png",
"webhook_url": "https://your-server.com/webhook"
}'
Immediate Response
Returns immediately with a task ID. Use /fetch or webhooks to get the result.
{
"task_id": "550e8400-e29b-41d4-a716-446655440000"
}
Webhook / Fetch Response
Delivered to your webhook_url or returned by /fetch when the task completes.
// Midjourney result (2×2 grid)
{
"task_id": "550e8400-e29b-41d4-a716-446655440000",
"task_type": "imagine",
"model": "midjourney",
"status": "finished",
"percentage": "100",
"original_image_url": "https://cdn.midjourney.com/abc123/grid_0.png",
"image_urls": [
"https://cdn.midjourney.com/abc123/0_0.png",
"https://cdn.midjourney.com/abc123/0_1.png",
"https://cdn.midjourney.com/abc123/0_2.png",
"https://cdn.midjourney.com/abc123/0_3.png"
],
"seed": "3847291056"
}
// Flux result (single image)
{
"task_id": "661f9511-f30c-52e5-b827-557766551111",
"task_type": "imagine",
"model": "flux-pro",
"status": "finished",
"percentage": "100",
"image_url": "https://images.journeyapi.com/flux/abc123.png"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| task_id | string | UUID of the created task. |
| task_type | string | Always "imagine". |
| model | string | The model used for generation. |
| status | string | Task status: "pending", "processing", "finished", or "failed". |
| percentage | string | Progress from "0" to "100". |
| original_image_url | string | URL of the full 2×2 grid image (Midjourney only). |
| image_urls | string[] | Array of 4 individual image URLs (Midjourney only). |
| image_url | string | Single output image URL (Flux models). |
| seed | string | Seed value, if available. |
Tips
- Omit the model field to default to Midjourney; existing integrations remain fully backward compatible.
- Flux models use plain natural-language prompts. Midjourney supports special parameters like --ar, --sref, --style, --chaos.
- Credit costs vary widely by model: flux-2-dev and flux-schnell = 1 credit, flux-dev = 2, midjourney and flux-pro = 4, ideogram-v3 = 6, flux-kontext-max = 8. See the Models reference page for the full cost table.
- Midjourney returns a 2×2 grid (image_urls array). All other models return a single image (image_url).
- The response returns immediately with a task_id. Use /fetch or webhooks to get the result.
- Most new models (nano-banana-2, seedream-*, gpt-image-*, etc.) are **synchronous**: the task finishes in one API call, so your first call to /fetch already returns `status: 'finished'` and `image_url`, with no polling needed.
- Edit and image-to-image models (seededit, qwen-image-edit, ideogram-v3-edit, flux-kontext-*) require `image_url` to be set. Without it, the generation ignores the editing intent.
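For Midjourney and other asynchronous tasks, the task_id flow can be sketched as a small polling loop. This is a hypothetical sketch: the result shape follows the Webhook / Fetch Response above, but the timing values are assumptions, and the fetch call is injected so it can be swapped for whatever HTTP client you use against /fetch.

```python
import time

def poll_until_done(fetch, task_id, interval=2.0, timeout=120.0):
    """Call fetch(task_id) until status is finished/failed or timeout expires.

    `fetch` is any callable returning a dict shaped like the
    Webhook / Fetch Response documented above.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch(task_id)
        if result.get("status") in ("finished", "failed"):
            return result
        time.sleep(interval)  # still pending/processing; wait and retry
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```

Synchronous models never take this path: their first fetch already reports a terminal status, so the loop exits on the first iteration.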
Quirks & Gotchas
Midjourney grid images in image_urls are 0-indexed (0_0.png through 0_3.png) even though Midjourney displays them as positions 1-4.
Flux.1-specific parameters (output_format, negative_prompt) are ignored when using Midjourney or synchronous models.
If the prompt triggers a model's content filter, the task will fail with an error message.
Flux Kontext only supports English prompts. Non-English prompts will produce poor results unless the upstream translation layer is active.
The `gpt-4o-image` model is a reverse-proxy of ChatGPT image generation and may be subject to stricter content moderation or occasional rate limits.
All new models return a single `image_url`, not a grid. `image_urls` and `original_image_url` will be absent on non-Midjourney results.
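Because Midjourney results carry image_urls (four grid entries) while every other model returns a single image_url, client code can normalize both shapes into one list. A small sketch based on the response fields above; the helper name is illustrative:

```python
def extract_images(result: dict) -> list[str]:
    """Return all output image URLs for both grid and single-image results."""
    if "image_urls" in result:      # Midjourney: 2x2 grid, four candidates
        return list(result["image_urls"])
    if "image_url" in result:       # every other model: one image
        return [result["image_url"]]
    return []                       # failed or still-processing tasks
```

This keeps downstream code model-agnostic: it always iterates over a list, whether it got one image or four.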
Expert Tips & Best Practices
Prompt structure for Midjourney
Structure prompts as: subject → style → lighting → mood. For example: "a lone samurai standing in fog, watercolor painting style, soft diffused morning light, melancholic atmosphere --ar 3:2". Midjourney weighs earlier tokens more heavily, so lead with the most important subject.
Essential parameters to know
Use --ar (aspect ratio) to control framing: 16:9 for cinematic, 9:16 for mobile/portrait, 3:2 for photography. Use --chaos 0-100 to increase variety between the four grid images — higher values produce more unexpected results. Use --stylize 0-1000 to control how strongly Midjourney applies its aesthetic (lower = more literal, higher = more artistic).
Style references with --sref
Append --sref <image_url> to use an existing image as a style reference without copying its content. You can chain multiple URLs: --sref url1 url2. Use --sw 0-1000 to control style weight. This is distinct from using image_url (which influences composition, not just style).
--v 7 vs --niji
The default model (--v 7) excels at photorealistic, cinematic, and detailed renders. Use --niji 6 for anime, manga, or 2D illustration styles — it has been trained on a much wider range of illustration styles and handles character art significantly better.
Google Nano-Banana — Best for photorealism and multi-reference consistency
nano-banana-2 supports up to 14 reference images (10 objects + 4 characters) via the image_url field, making it ideal for character-consistent generation. Pass resolution: '2K' or '4K' for higher resolution output (nano-banana-2 only). The model excels at text rendering inside images — useful for infographics, marketing posters, and banners. Supports extreme aspect ratios (1:4, 4:1) for long-format content.
Seedream — ByteDance's creative image suite
seedream-5-lite is the latest generation with the best quality. The seededit model is a general-purpose image editor: use it for retouching, clothing replacement, style transfer, or adding/removing elements using natural language. For seedream-3, reference images can be passed inline in the prompt as space-separated URLs rather than via image_url.
GPT Image — OpenAI's instruction-following image models
gpt-image-1 and gpt-image-1.5 are the official OpenAI image API. They excel at following detailed text instructions and rendering legible text in images. gpt-4o-image is a proxy of ChatGPT's native image generation — higher throughput but less predictable content moderation. Use gpt-image-1.5 for consistency-sensitive work.
Grok — xAI's photorealistic image generation
Grok image models produce high-quality photorealistic results. grok-image-2 is the newer and higher-quality version. The models support very long, detailed prompts. For image-to-image use, pass a reference image URL in image_url — the model will transform it according to your prompt.
Ideogram v3 — Best-in-class for text in images
Ideogram v3 is the strongest model for generating images with legible typography — logos, signage, labels, infographics. ideogram-v3-remix preserves the composition of a reference image while changing the style. ideogram-v3-reframe extends a square image to any aspect ratio without distortion. ideogram-v3-edit uses a white mask (white = regenerate, black = preserve) — same convention as Midjourney's inpaint. Credit cost is 6 per call.
Flux Kontext — Context-aware image editing
Flux Kontext maintains subject identity across edits, making it ideal for product photography adjustments, background replacement, and style changes that preserve the main subject. To edit: set image_url to the source image and describe the change in the prompt. To generate fresh: omit image_url. English-only prompts. flux-kontext-max handles complex multi-element scenes; flux-kontext-pro is sufficient for most editing tasks. Generated images expire after 14 days.
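The edit-vs-generate distinction for Flux Kontext comes down to whether image_url is present in the request body. A small helper sketch; the field names follow the request table above, while the function itself is illustrative:

```python
def kontext_payload(prompt, source_image=None, model="flux-kontext-pro"):
    """Build an /imagine body: include image_url only when editing."""
    body = {"model": model, "prompt": prompt}
    if source_image is not None:
        body["image_url"] = source_image  # edit mode: transform this image
    return body                           # generate mode: fresh image
```

POSTing the edit-mode body to /imagine asks the model to transform the source image per the prompt; omitting source_image produces a fresh generation instead.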
Flux.2 — Black Forest Labs second generation
Flux.2 models (flux-2-dev, flux-2-flex, flux-2-max) are the successor to Flux.1 (flux-pro, flux-dev, flux-schnell) and go through BLTCY's synchronous endpoint rather than polling. flux-2-dev is fastest and cheapest (1 credit) — good for drafts and testing. flux-2-max produces output comparable to Midjourney at 4 credits.
Qwen & Z-Image — Alibaba's image models
qwen-image has strong capabilities for Chinese text rendering and East Asian aesthetics. qwen-image-edit supports precise text editing inside images — useful for modifying labels, signs, or UI elements in a source image. z-image is a fast, budget-friendly option (2 credits) for general generation.