
REST API

The Voxel-AI API is a two-step pipeline designed to create high-quality 3D assets from simple text prompts.

First, you generate candidate reference images from a text prompt. Then, you select one of those images to be processed into a 3D MagicaVoxel (.vox) model.

Generation runs on serverless GPU workers that scale to zero when idle. A request may wait for a cold start, and a textured voxelization job can take several minutes. Rather than holding one long HTTP request open, each step submits a job and returns a jobId immediately. You then poll for the result until the job completes.

Every generation therefore follows the same shape:

  1. Submit: POST the step’s endpoint to get a jobId.
  2. Poll: GET /api/generate/status?jobId=... every couple of seconds.
  3. Read: once state is completed, the images or .vox are in output.
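
The poll and read steps above could be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not an official client: BASE_URL is a placeholder host, the status response is assumed to be a JSON object carrying the state and output fields, and "failed" is an assumed name for the failure state. The helper names fetch_status and wait_for_job are hypothetical. The submit step is omitted because its endpoint paths are documented under Endpoints.

```python
import json
import time
import urllib.parse
import urllib.request

# Placeholder host: the real base URL is not given on this page.
BASE_URL = "https://example.invalid"


def fetch_status(job_id, token, base_url=BASE_URL):
    """GET /api/generate/status?jobId=... and return the decoded JSON body."""
    query = urllib.parse.urlencode({"jobId": job_id})
    request = urllib.request.Request(
        f"{base_url}/api/generate/status?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(job_id, get_status, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status(job_id) until state is 'completed', then return output.

    get_status is any callable that takes a jobId and returns the status dict,
    which keeps the loop independent of the HTTP layer.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        state = status.get("state")
        if state == "completed":
            return status.get("output")
        if state == "failed":  # assumed name for the failure state
            raise RuntimeError(f"job {job_id} failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not complete in {timeout} seconds")
        sleep(interval)  # wait a couple of seconds before the next poll
```

A caller would typically bind the token once and pass the result in, for example `wait_for_job(job_id, lambda j: fetch_status(j, token))`. The generous default timeout reflects the note above that a cold start plus a textured voxelization can take several minutes.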

See Endpoints for the full request and response shapes, and Authentication for the required Bearer token.

Generating candidate images is free. Voxelizing a chosen image costs 1 token, deducted when the job is submitted and automatically refunded if the job fails — so a failed or abandoned generation never costs a token.

You do not need to over-engineer your text prompts. Our backend automatically augments your input with custom styling tags (e.g., isometric framing, blocky/voxel styling keywords) to ensure the AI creates a clean asset optimized for 3D conversion.

A simple prompt like “A rustic wooden treasure chest” is all you need!