February 20, 2026 · 4 min read
From Prompt to Lookbook: A Session in 21 Images
21 images, 72 messages, 8 locations across two continents — a single session that became a complete fashion lookbook.

21 images. 72 messages. 8 locations. One session.
I ran this session while building aether studio. It started as a test — could the tool hold a creative thread across dozens of generations? It ended as a complete fashion lookbook spanning two continents.
Every prompt, every image, every branch point is preserved in the session graph. Here's how it unfolded.
Starting point
One uploaded reference image. One prompt:
"create a full body shot of her, iphone pro photo, in nyc streets near greenwich, august, standing naturally, facing the camera"
That's art direction, not prompting. iPhone aesthetic. Greenwich Village. August light. Natural pose. The AI needed to execute a vision, not interpret a wish.


The session arc
What followed was a conversational tour of New York. Each prompt built on the last — I never re-described the model, the outfit, or the shooting style. The context carried forward.
"similarly, but in upper west side"
One sentence. The AI understood "similarly" because it could see everything that came before — the Greenwich shot and the full creative context behind it.
"let's have her stand naturally with a friend in front of guggenheim"
A second model enters. The lookbook evolves from solo shots to scenes.
"ok how about in central park", "ok now let's have her in front of the vessel"
Then wardrobe changes — pink sweater, magenta sweater — each living as a parallel branch. The session kept building on itself.
Then this:
"ok now let's have her travel to 성수"
Seoul's Seongsu district. One prompt, and the lookbook crossed the Pacific. The model's identity and style persisted; only the world changed.
"let's have her enter one of the cafes and enjoy the coffee"
She wasn't just being placed in locations anymore. She was living a day — walking NYC, flying to Seoul, finding a cafe. The lookbook became a narrative because the session had continuity.
Why this matters
Three things happened in this session that I haven't been able to do in any other AI image tool I've used.
Context carries forward. I wrote "similarly, but in upper west side" (six words) and got a consistent continuation. Without the graph carrying context, I'd need to re-describe the model, the style, the lighting, and the composition every time. That's the difference between directing and prompting.
Branches preserve everything. Pink sweater and magenta sweater both exist. Solo shots and duo shots both exist. I never had to choose between trying something new and keeping what I had. This is the visual steering problem in practice — when exploration is free, you explore more.
The session is the documentation. At the end, I had a complete record: every prompt, every image, every branch. If someone asks "can we go back to Central Park but with the pink sweater?" — it's one click, not a reconstruction from memory. This is what happens when you solve the three problems that every other tool ignores.
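To make the three points above concrete, here is a minimal sketch of how a session graph could carry context and keep branches. It is an illustration under assumptions, not aether studio's actual implementation: `Node`, `SessionGraph`, `add`, `context`, and `branches` are hypothetical names, the sweater prompts are paraphrased, and the point where the sweater branches fork is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One turn in the session (hypothetical structure)."""
    node_id: int
    prompt: str
    parent_id: Optional[int] = None  # None marks the starting prompt


class SessionGraph:
    """Append-only tree of prompts; nothing is ever overwritten."""

    def __init__(self) -> None:
        self.nodes: list[Node] = []

    def add(self, prompt: str, parent_id: Optional[int] = None) -> int:
        # New turns are appended, so earlier turns and branches stay intact.
        node_id = len(self.nodes)
        self.nodes.append(Node(node_id, prompt, parent_id))
        return node_id

    def context(self, node_id: int) -> list[str]:
        # Walk parent links back to the root, then return oldest-first.
        chain: list[str] = []
        current: Optional[int] = node_id
        while current is not None:
            node = self.nodes[current]
            chain.append(node.prompt)
            current = node.parent_id
        chain.reverse()
        return chain

    def branches(self, node_id: int) -> list[int]:
        # Direct children of a node, i.e. the parallel directions tried from it.
        return [n.node_id for n in self.nodes if n.parent_id == node_id]


graph = SessionGraph()
root = graph.add("full body shot, nyc streets near greenwich")
uws = graph.add("similarly, but in upper west side", parent_id=root)
park = graph.add("ok how about in central park", parent_id=uws)
# Illustrative wardrobe branches; the fork point is an assumption.
pink = graph.add("pink sweater", parent_id=park)
magenta = graph.add("magenta sweater", parent_id=park)
```

In this sketch, `graph.context(pink)` returns the whole chain from the Greenwich prompt down to the pink sweater, which is the information a short prompt like "similarly" would lean on. `graph.branches(park)` returns both sweater nodes, so trying magenta never replaces pink.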
The numbers
| Metric | Count |
| --- | --- |
| Images | 21 |
| Messages | 72 |
| Locations | 8 (NYC + Seoul) |
| Characters | 2 |
| Images lost | 0 |
That last row is the point. Twenty-one images, zero lost. Every direction explored, every decision recorded.