Generate Image


Generate Image is a full AI image studio with 20+ models in a single tool. Create from text prompts, edit existing images with natural language, compose new shots from multiple references, and upscale to 4K — all without switching between different apps or APIs.

Models range from fast drafts to photorealistic renders, with specialists for typography, editorial photography, product shots, and artistic styles. The default model handles most tasks well; call list_models when you need something specific.

Typography guidance: All models approximate named fonts (e.g. "Outfit Black") rather than rendering them exactly. For designs where legible or stylised text is the focus, use ideogram-v3 or gpt-image-2 — they produce the cleanest results. The default model is better suited to conceptual or illustrative output where precise text is not required. For pixel-exact font matching, generate the image without text and add typography in a design tool afterward.
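The guidance above can be read as a small decision rule. The sketch below is illustrative only: the function name `plan_generation` and the returned dictionary shape are hypothetical, and a `model` of `None` simply means "omit the model parameter and let the tool use its default".

```python
from typing import Any, Dict


def plan_generation(text_is_focus: bool, needs_exact_font: bool = False) -> Dict[str, Any]:
    """Hypothetical helper that turns the typography guidance into a plan."""
    if needs_exact_font:
        # No model renders a named font exactly, so generate without text
        # and add the typography in a design tool afterward.
        return {"model": None, "include_text_in_prompt": False}
    if text_is_focus:
        # ideogram-v3 and gpt-image-2 are the two models recommended for text.
        return {"model": "ideogram-v3", "include_text_in_prompt": True}
    # Conceptual or illustrative output: the default model is sufficient.
    return {"model": None, "include_text_in_prompt": True}
```

Swapping `"ideogram-v3"` for `"gpt-image-2"` in the second branch would be equally consistent with the guidance.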

What you can do

text_to_image — generate images from text prompts with full control over size, style, aspect ratio, and model
edit_image — modify an existing image using natural language: change backgrounds, add objects, remove elements, apply style transfers
image_to_image — compose a new image from up to 4 reference photos, combining scenes, personas, products, and outfits
check_image — poll for a pending result when a model is still processing
upscale_image — increase resolution up to 10x from inside the same workflow
list_models — browse all available models with pricing, capabilities, and supported parameters

Who it's for

Built for marketers producing campaign assets, product teams visualizing concepts, content creators building visual content, and developers adding image generation to AI workflows.

How to use it

Use text_to_image with a descriptive prompt — the default model works well without setting a model parameter
Use edit_image to modify an existing photo with a natural language instruction
Use image_to_image to combine reference images from your library (scenes, personas, products, outfits) into one new composition
Use list_models to pick a specialist model for typography, photorealism, or artistic styles
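As a rough illustration of the first step, the sketch below assembles arguments for a text_to_image call. The argument names `prompt`, `model`, and `aspect_ratio` and the outer `{"tool", "arguments"}` envelope are assumptions made for this example; the actual parameter names are whatever list_models and the tool schema report.

```python
from typing import Any, Dict, Optional


def build_text_to_image_call(
    prompt: str,
    model: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a hypothetical tool-call payload for text_to_image."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    arguments: Dict[str, Any] = {"prompt": prompt}
    # Leaving model out lets the tool fall back to its default model.
    if model is not None:
        arguments["model"] = model
    if aspect_ratio is not None:
        arguments["aspect_ratio"] = aspect_ratio
    return {"tool": "text_to_image", "arguments": arguments}
```

Calling `build_text_to_image_call("a red bicycle on a beach")` yields a payload with no model key, which matches the advice to start with the default model.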

Getting started

Works immediately with no configuration — the default model handles text-to-image, editing, and composition out of the box.

v0.16 (2026-04-28)

Added typography guidance to model descriptions — ideogram-v3 and gpt-image-2 are now clearly recommended for text-heavy designs; gallery previews now show the generation prompt instead of a generic label.

v0.15 (2026-04-22)

Quality-tier billing now exact across every model — low/medium/high calls on gpt-image-2 bill at $0.01 / $0.04 / $0.15 instead of a flat rate. Applies universally: every provider now returns the true per-request cost.
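To make the tiered pricing concrete, here is a minimal sketch that totals the cost of a batch of gpt-image-2 calls from the three per-call prices quoted above. The helper name `estimate_cost` is hypothetical, and the authoritative figure is always the per-request cost the provider returns.

```python
from decimal import Decimal
from typing import Iterable

# Per-call prices for gpt-image-2 as stated in this release note (USD).
GPT_IMAGE_2_PRICES = {
    "low": Decimal("0.01"),
    "medium": Decimal("0.04"),
    "high": Decimal("0.15"),
}


def estimate_cost(qualities: Iterable[str]) -> Decimal:
    """Sum the listed price for each call's quality tier."""
    total = Decimal("0")
    for quality in qualities:
        if quality not in GPT_IMAGE_2_PRICES:
            raise ValueError(f"unknown quality tier: {quality!r}")
        total += GPT_IMAGE_2_PRICES[quality]
    return total
```

For example, one call at each tier totals 0.01 + 0.04 + 0.15 = 0.20 USD, whereas ten low-quality drafts total 0.10 USD.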

v0.14 (2026-04-21)

Added OpenAI GPT Image 2 — excellent prompt adherence and typography. Model key "gpt-image-2" works for text-to-image, editing, and mask-based inpainting; add image_urls to trigger edit mode. Also available via OpenRouter as "openrouter-gpt-image-2".
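The edit-mode trigger can be pictured as follows. This is a sketch of the behavior described in the note, not the gateway's code: `gpt_image_2_arguments` and `mode_for` are hypothetical helpers, and only the model key and the `image_urls` parameter come from the release note.

```python
from typing import Any, Dict, Optional, Sequence


def gpt_image_2_arguments(prompt: str, image_urls: Optional[Sequence[str]] = None) -> Dict[str, Any]:
    """Assemble arguments for the gpt-image-2 model key."""
    args: Dict[str, Any] = {"model": "gpt-image-2", "prompt": prompt}
    if image_urls:
        # Supplying image_urls is what switches the request into edit mode.
        args["image_urls"] = list(image_urls)
    return args


def mode_for(args: Dict[str, Any]) -> str:
    """Assumed rule: a non-empty image_urls list means edit, otherwise generate."""
    return "edit" if args.get("image_urls") else "text_to_image"
```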

v0.13 (2026-04-14)

Now accepts personas, scenes, products, and outfits from your file library

v0.12 (2026-04-13)

Added image_to_image skill — compose new images from up to 4 reference URLs or saved file IDs (scenes, personas, products, outfits) plus a prompt. Pass scene_file_id, persona_file_id, product_file_ids, outfit_file_ids, or raw image_urls and the gateway resolves and merges them automatically.
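One way to picture the merge step is a helper that gathers every reference into a single list and enforces the limit of 4. The parameter names match those in the note; the `(kind, value)` tuple representation, the ordering, and the error behavior are assumptions for illustration, since the gateway's actual resolution logic is not documented here.

```python
from typing import List, Optional, Sequence, Tuple

MAX_REFERENCES = 4  # "up to 4 reference URLs or saved file IDs"


def collect_references(
    scene_file_id: Optional[str] = None,
    persona_file_id: Optional[str] = None,
    product_file_ids: Sequence[str] = (),
    outfit_file_ids: Sequence[str] = (),
    image_urls: Sequence[str] = (),
) -> List[Tuple[str, str]]:
    """Merge all reference inputs into one ordered list of (kind, value) pairs."""
    refs: List[Tuple[str, str]] = []
    if scene_file_id:
        refs.append(("file_id", scene_file_id))
    if persona_file_id:
        refs.append(("file_id", persona_file_id))
    refs.extend(("file_id", fid) for fid in product_file_ids)
    refs.extend(("file_id", fid) for fid in outfit_file_ids)
    refs.extend(("url", url) for url in image_urls)
    if not refs:
        raise ValueError("at least one reference image is required")
    if len(refs) > MAX_REFERENCES:
        raise ValueError(f"got {len(refs)} references, the limit is {MAX_REFERENCES}")
    return refs
```

Validating the count before sending the request gives the caller a clear error instead of a rejected or silently truncated composition.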

v0.11 (2026-04-06)

Style references now supported — pass a style reference when generating or editing images for consistent visual direction

v0.10 (2026-04-04)

Ideogram style values are now case-insensitive — pass "realistic" or "REALISTIC", both work (valid: REALISTIC, GENERAL, DESIGN, AUTO)
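Case-insensitive handling of this kind usually amounts to upper-casing the input before validating it. The sketch below shows that idea with the four valid values listed above; the function name `normalize_style` is hypothetical and the real implementation may differ.

```python
VALID_IDEOGRAM_STYLES = {"REALISTIC", "GENERAL", "DESIGN", "AUTO"}


def normalize_style(style: str) -> str:
    """Upper-case a style value and check it against the valid set."""
    canonical = style.upper()
    if canonical not in VALID_IDEOGRAM_STYLES:
        valid = ", ".join(sorted(VALID_IDEOGRAM_STYLES))
        raise ValueError(f"invalid style {style!r}; valid values: {valid}")
    return canonical
```

With this rule, `"realistic"`, `"Realistic"`, and `"REALISTIC"` all map to the same canonical value, while anything outside the four listed styles is rejected.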

v0.09 (2026-03-31)

Added Google as a direct provider — 6 new models: Nano Banana, Nano Banana 2, Nano Banana Pro, Imagen 4 Fast, Imagen 4, Imagen 4 Ultra. Direct Google API access with token-based billing for Nano Banana and flat-rate for Imagen 4.

v0.08 (2026-03-27)

Added Photalabs as a direct provider — phota, phota-edit, and phota-enhance models now route via the Photalabs API instead of fal.ai

v0.07 (2026-03-24)

Default model changed to nano-banana-2
Removed hardcoded model names from descriptions — agents discover models dynamically via list_models

v0.06 (2026-03-24)

Added Higgsfield Soul model — realistic human-centric UGC photos (model key: "soul")

v0.05 (2026-03-23)

Added Prodia as an alternative provider — 10 new models (prodia-flux-2-pro, prodia-sdxl, prodia-recraft-v4, prodia-gemini-3-pro, etc.)

v0.04 (2026-03-23)

upscale_image now delegates to the image-upscale tool via composition

v0.03 (2026-03-22)

Fuzzy model resolution — natural names like "nano banana 2" now resolve correctly
Broadened live model discovery to catch more fal.ai categories and tags
Added subtitle, expanded description, and agent instructions
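The fuzzy model resolution mentioned in this release could work along the following lines. This is a sketch of one plausible approach, not the tool's actual algorithm: it normalizes spacing and case, tries an exact match, then falls back to Python's standard `difflib.get_close_matches`. The `resolve_model` helper and the 0.8 cutoff are assumptions; the example keys are ones named elsewhere in this changelog.

```python
import difflib
import re
from typing import Iterable, Optional


def normalize(name: str) -> str:
    """Lower-case and turn runs of spaces or underscores into hyphens."""
    return re.sub(r"[\s_]+", "-", name.strip().lower())


def resolve_model(name: str, known_keys: Iterable[str], cutoff: float = 0.8) -> Optional[str]:
    """Map a natural-language model name to a known key, or None if nothing is close."""
    keys = list(known_keys)
    candidate = normalize(name)
    if candidate in keys:
        return candidate
    # Fall back to the closest key by similarity ratio (handles small typos).
    matches = difflib.get_close_matches(candidate, keys, n=1, cutoff=cutoff)
    return matches[0] if matches else None


EXAMPLE_KEYS = ["nano-banana-2", "ideogram-v3", "gpt-image-2", "soul", "phota-edit"]
print(resolve_model("nano banana 2", EXAMPLE_KEYS))  # nano-banana-2
```

Returning `None` for unknown names, rather than guessing, leaves the caller free to fall back to list_models.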

v0.02 (2026-03-20)

Initial release

v0.01 (2026-04-03)

Expanded Nano Banana aspect-ratio support to include auto, ultrawide, and extreme ratios
fal-backed image generation now returns recoverable pending results with check_image instead of silently degrading on long queue waits
fal live-pricing lookup now falls back to static registry pricing, so pricing-endpoint outages no longer break image generation
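The two reliability changes in this release, recoverable pending results and the pricing fallback, follow common patterns that can be sketched as below. Everything here is illustrative: the `{"status": "pending"}` result shape, the `request_id` argument, and both helper names are assumptions, and `check_image` and `live_lookup` are passed in as plain callables rather than real client functions.

```python
import time
from typing import Any, Callable, Dict, Mapping


def wait_for_image(
    check_image: Callable[[str], Dict[str, Any]],
    request_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll check_image until the result is no longer pending."""
    for _ in range(max_attempts):
        result = check_image(request_id)
        if result.get("status") != "pending":
            return result
        sleep(interval_s)
    raise TimeoutError(f"image {request_id!r} still pending after {max_attempts} checks")


def price_for(
    model: str,
    live_lookup: Callable[[str], float],
    static_prices: Mapping[str, float],
) -> float:
    """Prefer live pricing, but fall back to the static registry on any failure."""
    try:
        return live_lookup(model)
    except Exception:
        # A pricing-endpoint outage should not block generation.
        return static_prices[model]
```

Injecting `sleep` and the lookup callables keeps both helpers easy to test without network access or real delays.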