Inference API
The inference API provides an HTTP endpoint for running jobs that convert text to
images, images to images, text to video, and more. A single /v2/job endpoint
covers all supported inference workloads through a unified I/O interface. Job
requests are synchronous: the request carries the job configuration along with
the input data, and the response carries an updated job configuration along with
the output data.
Authentication
API requests require authentication via the Authorization header with a Bearer token scheme:
Authorization: Bearer xxxxxxxxxxxx
Tokens are created on the API Dashboard and are required to use the API.
Base URL
The base URL for all requests should be: https://inference.prodia.com
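As a minimal sketch of how a client might assemble these two pieces, the Python below builds the Authorization header and the job URL. The PRODIA_TOKEN environment variable name is borrowed from the curl example in this document; the helper names auth_headers and job_url are hypothetical and not part of any official client.

```python
import os

BASE_URL = "https://inference.prodia.com"


def auth_headers(token: str) -> dict:
    """Return the Authorization header for a Bearer token."""
    if not token:
        raise ValueError("an API token is required")
    return {"Authorization": f"Bearer {token}"}


def job_url(base_url: str = BASE_URL) -> str:
    """Join the base URL with the /v2/job path."""
    return base_url.rstrip("/") + "/v2/job"


# Reading the token from PRODIA_TOKEN is a convention assumed here,
# not something the API itself requires.
token = os.environ.get("PRODIA_TOKEN", "")
headers = auth_headers(token) if token else {}
```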
POST /v2/job Execute a Job
Jobs are executed by posting to the /v2/job endpoint with an appropriate job
configuration and input data.
Job Configuration
All requests minimally have a JSON job configuration that is differentiated on the type field:
{ "type": "inference.ping.v1" }
There are a variety of job types documented in the explorer, and they have the following in common:
- All jobs have a type field that indicates which job type is being requested.
- Jobs with configuration have a config field that contains the type-specific job configuration.
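The two rules above can be sketched in Python as a small builder that always sets type and only adds config when there is something to configure. The make_job helper is a hypothetical name used for illustration; only the type and config field names and the inference.ping.v1 job type come from this document.

```python
import json
from typing import Optional


def make_job(job_type: str, config: Optional[dict] = None) -> dict:
    """Build a job configuration with a required type and optional config."""
    if not job_type:
        raise ValueError("every job needs a type")
    job = {"type": job_type}
    if config is not None:
        # Only jobs with configuration carry a config field.
        job["config"] = config
    return job


ping = make_job("inference.ping.v1")
print(json.dumps(ping))  # {"type": "inference.ping.v1"}
```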
For example, this is a job configuration for a FLUX.1 [schnell] text to image generation:
{
  "type": "inference.flux-fast.schnell.txt2img.v2",
  "config": {
    "prompt": "grainy photograph of a space explorer",
    "loras": ["prodia/flux-lora-painting"]
  }
}
Job Input Data
Some jobs need data that is not well suited to JSON (e.g. binary PNG data).
This additional non-JSON input is sent as a multipart/form-data part named
input. Multiple input parts may be specified as long as they use different
filenames. When sending input data, the job configuration is sent in the part
named job with filename="job.json" (there must be only one part with this name).
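To make the part layout concrete, here is a rough sketch that assembles such a body with only the Python standard library. The encode_multipart helper and its inputs argument are hypothetical; keying the inputs by filename is simply one way to guarantee the filenames differ. Filenames containing quotes or line breaks are not handled.

```python
import json
import uuid
from typing import Dict, Tuple


def encode_multipart(job: dict, inputs: Dict[str, Tuple[bytes, str]]) -> Tuple[bytes, str]:
    """Return (body, content_type) for a job plus binary inputs.

    inputs maps a unique filename to (data, mime_type).
    """
    boundary = uuid.uuid4().hex
    chunks = []

    def add_part(name: str, filename: str, mime: str, data: bytes) -> None:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        chunks.extend([header.encode("utf-8"), data, b"\r\n"])

    # Exactly one part named job, carrying the JSON configuration.
    add_part("job", "job.json", "application/json", json.dumps(job).encode("utf-8"))
    # One part named input per file; dict keys keep the filenames distinct.
    for filename, (data, mime) in inputs.items():
        add_part("input", filename, mime, data)
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"
```

The returned content type string would be sent as the request Content-Type header alongside the body.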
For example, a FLUX.1 [schnell] image to image generation requires an input image.
This would result in a multipart/form-data request with 2 parts (one for the
job configuration and another for the input image). Given the following job
configuration and an input.jpg in the local directory:
{
  "type": "inference.flux.schnell.img2img.v1",
  "config": {
    "prompt": "grainy photograph of a space explorer",
    "loras": ["prodia/flux-lora-painting"]
  }
}
With the configuration saved as job.json, curl can be used to make the multipart request:
curl -H "Authorization: Bearer $PRODIA_TOKEN" \
     -H 'Accept: image/jpeg' \
     -F 'job=@job.json;type=application/json' \
     -F 'input=@input.jpg' \
     --output output.jpg \
     'https://inference.prodia.com/v2/job'
An HTTP trace of the request might look something like:
POST /v2/job HTTP/2
Host: inference.prodia.com
User-Agent: curl/8.11.1
Accept: image/jpeg
Authorization: Bearer $PRODIA_TOKEN
Content-Type: multipart/form-data; boundary=------------------------fYf8kg9C7fC50PWcYjMuWt

--------------------------fYf8kg9C7fC50PWcYjMuWt
Content-Disposition: form-data; name="job"; filename="job.json"
Content-Type: application/json

{
  "type": "inference.flux.schnell.img2img.v1",
  "config": {
    "prompt": "grainy photograph of a space explorer",
    "loras": ["prodia/flux-lora-painting"]
  }
}
--------------------------fYf8kg9C7fC50PWcYjMuWt
Content-Disposition: form-data; name="input"; filename="input.jpg"
Content-Type: image/jpeg

......JFIF.........more bytes...
--------------------------fYf8kg9C7fC50PWcYjMuWt--
Job Result
All jobs return a job result, which mirrors the original job configuration and adds information from the job execution. A job result includes all fields in the job configuration and the following:
- created_at is the UTC time the server created the job
- updated_at is the UTC time the job result was last updated
- expires_at is the UTC time after which the job is considered expired
- id is a UUID generated by the server to identify this job
- state has a single field current indicating the final state of the job
- metrics contains the elapsed inference time for the job and additional metrics when appropriate (e.g. iterations per second)
- error is an error message present if the final state of the job is “failed”
The job result may also update the config field to include default values
(e.g. random seed used) or even results themselves (e.g.
NSFW image class).
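A client will usually want to branch on state.current and surface error when a job fails. The sketch below shows one way to do that in Python; JobFailed, check_result, and elapsed_seconds are hypothetical names, and the sample dictionary is illustrative only.

```python
from typing import Optional


class JobFailed(Exception):
    """Raised when a job result reports the failed state."""


def check_result(result: dict) -> dict:
    """Return the result unchanged, or raise if the job failed."""
    state = result.get("state", {}).get("current")
    if state == "failed":
        # error is only expected to be present for failed jobs.
        raise JobFailed(result.get("error", "no error message provided"))
    return result


def elapsed_seconds(result: dict) -> Optional[float]:
    """Read the elapsed inference time from metrics, if present."""
    return result.get("metrics", {}).get("elapsed")


sample = {
    "type": "inference.ping.v1",
    "id": "c83d7027-240a-484a-aeba-96014e568711",
    "state": {"current": "completed"},
    "metrics": {"elapsed": 5.138920783996582},
}
print(elapsed_seconds(check_result(sample)))  # 5.138920783996582
```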
For example, using the job configuration above would render a job result similar to this:
{
  "type": "inference.flux.schnell.img2img.v1",
  "created_at": "2025-01-01T00:00:14.885Z",
  "updated_at": "2025-01-01T00:00:20.11Z",
  "expires_at": "2025-01-01T00:01:14.885Z",
  "id": "c83d7027-240a-484a-aeba-96014e568711",
  "state": { "current": "completed" },
  "config": {
    "prompt": "grainy photograph of a space explorer",
    "loras": ["prodia/flux-lora-painting"]
  },
  "metrics": {
    "elapsed": 5.138920783996582,
    "ips": 4.864834670706344
  }
}
Job Output Data
Similar to job input data, all jobs support returning a
multipart/form-data response that includes the job result JSON as the job
part and output data as the output parts.
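One way to split such a response into its job and output parts, without third-party dependencies, is to reuse the MIME parser in Python's standard email package. This is a sketch under that assumption rather than a reference client; parse_multipart is a hypothetical helper that takes the response Content-Type header value and the raw body.

```python
import json
from email import message_from_bytes
from typing import List, Optional, Tuple


def parse_multipart(content_type: str, body: bytes) -> Tuple[Optional[dict], List[tuple]]:
    """Split a multipart/form-data body into (job_result, outputs).

    outputs is a list of (filename, content_type, data) tuples.
    """
    # The email parser expects headers first, so prepend the Content-Type.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError("expected a multipart body")

    job_result = None
    outputs = []
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        data = part.get_payload(decode=True)  # raw bytes of the part
        if name == "job":
            job_result = json.loads(data)
        elif name == "output":
            outputs.append((part.get_filename(), part.get_content_type(), data))
    return job_result, outputs
```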
An HTTP trace of such a response might look something like:
HTTP/2 200
content-type: multipart/form-data; boundary=b3d4fb976dce6e5d036fa7fb3da645bcfcc56384abb2cb618f8c8bdfd360
x-request-id: 835b0174-5a7c-45e0-8256-050d0f2c47a1

--b3d4fb976dce6e5d036fa7fb3da645bcfcc56384abb2cb618f8c8bdfd360
Content-Disposition: form-data; name="job"; filename="job.json"
Content-Type: application/json

{
  "type": "inference.flux.schnell.img2img.v1",
  "created_at": "2025-01-01T00:00:14.885Z",
  "updated_at": "2025-01-01T00:00:20.11Z",
  "expires_at": "2025-01-01T00:01:14.885Z",
  "id": "c83d7027-240a-484a-aeba-96014e568711",
  "state": { "current": "completed" },
  "config": {
    "prompt": "grainy photograph of a space explorer",
    "loras": ["prodia/flux-lora-painting"]
  },
  "metrics": {
    "elapsed": 5.138920783996582,
    "ips": 4.864834670706344
  }
}
--b3d4fb976dce6e5d036fa7fb3da645bcfcc56384abb2cb618f8c8bdfd360
Content-Disposition: form-data; name="output"; filename="f9774fd02a85-8985-4d2a-a093-24bda78322b7"
Content-Type: image/jpeg

......JFIF.........more bytes...
--b3d4fb976dce6e5d036fa7fb3da645bcfcc56384abb2cb618f8c8bdfd360--
Content Negotiation
Request Body
The HTTP request format is specified via the request Content-Type header. All
jobs can accept a multipart/form-data request. If a job type does not take job
input data (or takes it only optionally), the Content-Type can be set to
application/json and the job configuration can be sent directly as the request body.
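For a job without input data, the JSON-only form might be prepared as in the sketch below, which uses urllib from the Python standard library. The request object is only constructed here, not sent. build_json_request is a hypothetical helper, and defaulting Accept to multipart/form-data is an assumption based on the statement that all jobs can negotiate that response type.

```python
import json
import urllib.request

JOB_URL = "https://inference.prodia.com/v2/job"


def build_json_request(job: dict, token: str,
                       accept: str = "multipart/form-data") -> urllib.request.Request:
    """Prepare a POST to /v2/job that sends the job configuration as JSON."""
    return urllib.request.Request(
        JOB_URL,
        data=json.dumps(job).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": accept,
        },
    )


request = build_json_request({"type": "inference.ping.v1"}, token="xxxxxxxxxxxx")
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```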
Response Body
The desired HTTP response format is specified via the request Accept header.
All jobs can negotiate a multipart/form-data response which works much like
job input data except that instead of input parts it has
output parts. If a job only outputs a single job output data file, then it
can be returned directly by setting the Accept header to one of the supported
output formats for the job type.
When negotiating a multipart/form-data response the default output content
type can be overridden by specifying a secondary type in the Accept header.
For example Accept: multipart/form-data; image/png would format the response
into a multipart/form-data where the first output part has Content-Type: image/png.
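The Accept variants described in this section can be summarized with a small, hypothetical helper. It only formats the header value in the way the text describes and does not check whether a given output type is supported by a job type.

```python
from typing import Optional


def accept_header(output_type: Optional[str] = None, multipart: bool = True) -> str:
    """Format the Accept header value for a job request."""
    if not multipart:
        # Direct output: only valid for jobs that return a single output file.
        if output_type is None:
            raise ValueError("a direct response needs an output type")
        return output_type
    if output_type is None:
        return "multipart/form-data"
    # Secondary type overrides the default output content type.
    return f"multipart/form-data; {output_type}"


print(accept_header("image/png"))                    # multipart/form-data; image/png
print(accept_header("image/jpeg", multipart=False))  # image/jpeg
```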
Status Codes
200 OK
A 200 status code
indicates the job was completed successfully.
400 Bad Request
A 400 status code
indicates that the request was malformed. If possible the server will respond
with a job result with the state set to failed and a message regarding the
error in the error field.
401 Unauthorized
A 401 status code
indicates the request requires authentication.
403 Forbidden
A 403 status code
indicates that the authentication provided does not have
sufficient privileges for the request. This can happen if the job type requires
additional permissions.
429 Too Many Requests
A 429 status code indicates that there is no idle capacity available at
request time. This is a normal part of load management. The client should
retry the request after a delay specified in the Retry-After
response header. The Retry-After header specifies the delay in seconds.
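The retry behaviour described for 429 responses could be sketched as below. The send callable is a hypothetical stand-in for whatever performs the HTTP request and returns a (status, headers, body) tuple; the attempt limit and the one-second fallback when Retry-After is missing are assumptions, not documented behaviour.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def send_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns something other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts:
            return response
        # Header names may arrive in any case (HTTP/2 lowercases them).
        lowered = {key.lower(): value for key, value in headers.items()}
        # Retry-After is a delay in seconds; 1 second is an assumed fallback.
        sleep(float(lowered.get("retry-after", 1)))
    return response  # not reached; kept so every path returns a Response
```

Passing sleep as a parameter keeps the helper easy to exercise in tests without actually waiting.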
5xx Server Errors
A status code in the 5xx range
indicates a server error. These errors are typically transient and will be
resolved soon.