Introducing AI Voxel Generation: From Prompt to 3D in Seconds
We built an AI pipeline that turns a single text prompt into a fully formed, game-ready voxel model. Here's how it works under the hood.
AI voxel generation has been in the works for a while, and today it’s live. Type a prompt like “a medieval tower with a wooden gate” and the system produces a 3D voxel model ready to drop into your scene or export to .vox.
How it works
The pipeline runs in three stages:
- Image generation — your prompt is sent to an image model tuned on voxel-style references, producing a consistent multi-angle sprite sheet
- Voxelization — a custom depth-estimation + voxel-projection step converts the 2D sheets into a 3D grid at your target resolution
- Post-processing — the raw voxel grid is cleaned, palette-reduced to your active color set, and surfaced as a live preview in the editor
What you can generate
Almost any object works well: characters, props, architecture, and environmental pieces. Abstract concepts and very fine text don’t voxelize cleanly — use the manual editor to refine those.
Generation tokens
Each generation consumes tokens from your account balance:
| Action | Token cost |
|---|---|
| Image generation | 1 token |
| Voxelization | 3 tokens |
| Save to cloud | 0 tokens |
Free accounts start with 50 tokens. Builder and Pro plans refill tokens monthly.
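To make the budget concrete, here is a small Python sketch of the arithmetic. It assumes a full run is one image generation followed by one voxelization and an optional cloud save; whether retries or regenerations are charged again is not covered here. The names `TOKEN_COSTS`, `run_cost`, and `runs_affordable` are illustrative, not part of any published API.

```python
# Token costs copied from the table above.
TOKEN_COSTS = {
    "image_generation": 1,
    "voxelization": 3,
    "save_to_cloud": 0,
}


def run_cost(actions):
    """Total tokens consumed by a sequence of actions."""
    return sum(TOKEN_COSTS[action] for action in actions)


def runs_affordable(balance, actions):
    """Number of complete runs of `actions` that `balance` covers."""
    cost = run_cost(actions)
    if cost == 0:
        return None  # free actions are not limited by the balance
    return balance // cost


# Assumed shape of one full prompt-to-model run.
FULL_RUN = ["image_generation", "voxelization", "save_to_cloud"]

print(run_cost(FULL_RUN))             # 4
print(runs_affordable(50, FULL_RUN))  # 12
```

Under that assumption, the 50 starting tokens on a free account cover 12 full runs, with 2 tokens left over.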
Under the hood
The voxelization step was the hardest part. Early versions produced hollow shells — the outer faces looked right but the interior was empty, which caused problems in game engines that rely on solid geometry for collision.
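As an illustration only (not necessarily the fix used in the pipeline), one common way to turn a hollow shell into solid geometry is to flood-fill the empty space starting from outside the model and treat every cell the flood cannot reach as interior. The Python sketch below assumes the grid is stored as a set of occupied `(x, y, z)` cells; `fill_interior` is a hypothetical helper name.

```python
from collections import deque

# The six face-adjacent neighbor offsets in a voxel grid.
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def fill_interior(filled, size):
    """Return `filled` plus every empty cell sealed off from the outside.

    filled: set of occupied (x, y, z) cells.
    size:   (sx, sy, sz) grid dimensions.
    """
    sx, sy, sz = size

    def in_padded_bounds(cell):
        # One cell of padding on every side guarantees the flood can
        # travel all the way around the model.
        x, y, z = cell
        return -1 <= x <= sx and -1 <= y <= sy and -1 <= z <= sz

    start = (-1, -1, -1)  # a padding cell, always empty
    outside = {start}
    queue = deque([start])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in NEIGHBORS:
            cell = (x + dx, y + dy, z + dz)
            if in_padded_bounds(cell) and cell not in filled and cell not in outside:
                outside.add(cell)
                queue.append(cell)

    # Any in-grid cell the flood never reached is either shell or interior.
    solid = set(filled)
    for x in range(sx):
        for y in range(sy):
            for z in range(sz):
                if (x, y, z) not in outside:
                    solid.add((x, y, z))
    return solid


# A 3x3x3 shell with an empty center cell.
shell = {(x, y, z) for x in range(3) for y in range(3) for z in range(3)} - {(1, 1, 1)}
print(len(fill_interior(shell, (3, 3, 3))))  # 27
```

A shell with any gap in it is left unchanged by this approach, because the flood reaches the inside through the gap. A real pipeline would likely need to close small holes first.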
The palette reduction step maps each generated color to the closest color in your active palette, measuring closeness with the CIEDE2000 perceptual color difference rather than Euclidean distance in RGB space. The results look significantly better, especially for skin tones and wood textures.
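For readers curious what CIEDE2000 matching involves, here is a minimal Python sketch. It follows the published CIEDE2000 formula with the standard weights kL = kC = kH = 1 and converts 8-bit sRGB to CIELAB with a D65 white point. The function names and sample palette are illustrative assumptions, not the production code.

```python
import math


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def ciede2000(lab1, lab2):
    """CIEDE2000 color difference between two CIELAB colors."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar**7 / (c_bar**7 + 25**7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360 if c1p else 0.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360 if c2p else 0.0

    d_l = L2 - L1
    d_c = c2p - c1p
    if c1p * c2p == 0:
        d_h_deg = 0.0
    else:
        d_h_deg = h2p - h1p
        if d_h_deg > 180:
            d_h_deg -= 360
        elif d_h_deg < -180:
            d_h_deg += 360
    d_h = 2 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_h_deg / 2))

    l_bar = (L1 + L2) / 2
    cp_bar = (c1p + c2p) / 2
    if c1p * c2p == 0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        h_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        h_bar = (h1p + h2p + 360) / 2
    else:
        h_bar = (h1p + h2p - 360) / 2

    t = (1
         - 0.17 * math.cos(math.radians(h_bar - 30))
         + 0.24 * math.cos(math.radians(2 * h_bar))
         + 0.32 * math.cos(math.radians(3 * h_bar + 6))
         - 0.20 * math.cos(math.radians(4 * h_bar - 63)))
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    r_c = 2 * math.sqrt(cp_bar**7 / (cp_bar**7 + 25**7))
    s_l = 1 + 0.015 * (l_bar - 50) ** 2 / math.sqrt(20 + (l_bar - 50) ** 2)
    s_c = 1 + 0.045 * cp_bar
    s_h = 1 + 0.015 * cp_bar * t
    r_t = -math.sin(math.radians(2 * d_theta)) * r_c

    return math.sqrt((d_l / s_l) ** 2 + (d_c / s_c) ** 2 + (d_h / s_h) ** 2
                     + r_t * (d_c / s_c) * (d_h / s_h))


def nearest_palette_color(rgb, palette):
    """Pick the palette entry with the smallest CIEDE2000 distance to `rgb`."""
    lab = srgb_to_lab(rgb)
    return min(palette, key=lambda p: ciede2000(lab, srgb_to_lab(p)))


# Hypothetical three-color palette, used only for illustration.
palette = [(139, 90, 43), (222, 184, 135), (40, 40, 40)]
print(nearest_palette_color((150, 100, 50), palette))
```

Converting every palette entry to Lab once up front, rather than on every lookup as this sketch does, would be the obvious optimization for large grids.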
What’s next
We’re working on:
- Scene generation — generate entire environments from a single description
- Style transfer — match the palette and style of an existing model in your project
- Animation hints — tag bone attachment points at generation time for rigging
Try it now in the editor. Feedback welcome on Discord.