Segmenting Images
The segmentation endpoint uses SAM2 (Segment Anything Model 2) from Meta to automatically detect and segment all objects in an image. Unlike background removal, which returns a single mask, segmentation returns multiple masks, one for each distinct object detected in the image.
This is useful for:
- Extracting individual objects from complex scenes
- Creating object-level masks for further processing
- Analyzing image composition
Segmenting an image
Let’s generate an image and then segment it to extract masks for all detected objects.
```javascript
import fs from "node:fs/promises";
import { createProdia } from "prodia/v2";

const prodia = createProdia({
  token: process.env.PRODIA_TOKEN,
});

// First generate an image to segment
console.log("Generating image...");
const imageJob = await prodia.job({
  type: "inference.flux-fast.schnell.txt2img.v1",
  config: {
    prompt: "a cute robot cat on a colorful background",
    resolution: "1024x1024",
  },
});

const imageBuffer = await imageJob.arrayBuffer();
await fs.writeFile("input.jpg", new Uint8Array(imageBuffer));
console.log("Saved input.jpg");

// Now segment it using SAM2
console.log("Segmenting image...");
const segmentJob = await prodia.job(
  { type: "inference.segment.v1" },
  {
    accept: "multipart/form-data",
    inputs: [new Uint8Array(imageBuffer)],
  }
);

// Get all mask outputs
const formData = await segmentJob.formData();
const masks = formData.getAll("output");

for (const [i, mask] of masks.entries()) {
  const buffer = await mask.arrayBuffer();
  await fs.writeFile(`mask_${i}.png`, new Uint8Array(buffer));
}

console.log(`Saved ${masks.length} mask files`);
```

```sh
node main.js
```

Understanding the output
The segmentation endpoint returns a multipart response containing multiple PNG mask images. Each mask corresponds to a distinct object or region detected by SAM2:
- White pixels (255) indicate the segmented object
- Black pixels (0) indicate everything else
The number of masks varies with image content: more complex scenes with multiple objects produce more masks.
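Because each mask is a binary image, it can serve directly as an alpha channel for cutting an object out of the source image. The following is a minimal sketch, not part of the Prodia SDK: `maskCoverage` and `applyMask` are hypothetical helpers, and they assume the mask and the source image have already been decoded into raw pixel arrays of the same dimensions (decoding PNG or JPEG data requires an image library and is not shown). The threshold of 128 is also an assumption, chosen so that values that are not exactly 0 or 255 are still classified.

```javascript
// Hypothetical helpers, not part of the Prodia SDK.
// Assumes `mask` is an 8-bit grayscale array (one byte per pixel) and
// `rgba` is a 4-bytes-per-pixel array for an image of the same dimensions.

// Fraction of the image covered by the segmented object (0 to 1).
function maskCoverage(mask, threshold = 128) {
  if (mask.length === 0) return 0;
  let white = 0;
  for (const value of mask) {
    if (value >= threshold) white++;
  }
  return white / mask.length;
}

// Returns a copy of `rgba` in which pixels outside the mask are fully transparent.
function applyMask(rgba, mask, threshold = 128) {
  if (rgba.length !== mask.length * 4) {
    throw new Error("Mask and image dimensions do not match");
  }
  const out = new Uint8ClampedArray(rgba);
  for (let i = 0; i < mask.length; i++) {
    if (mask[i] < threshold) out[i * 4 + 3] = 0; // zero the alpha byte
  }
  return out;
}

// Usage with a 2x2 example: the top row belongs to the object.
const exampleMask = new Uint8Array([255, 255, 0, 0]);
const exampleRgba = new Uint8ClampedArray([
  10, 20, 30, 255, 40, 50, 60, 255,
  70, 80, 90, 255, 100, 110, 120, 255,
]);
const cutout = applyMask(exampleRgba, exampleMask);
console.log(maskCoverage(exampleMask)); // 0.5
console.log(cutout[3], cutout[11]); // 255 0
```

Coverage can also be used to filter the results, for example to skip masks that cover only a tiny fraction of the image.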
Input requirements
| Constraint | Value |
|---|---|
| Accepted formats | PNG, JPEG, WebP |
| Minimum dimensions | 256 x 256 pixels |
| Maximum dimensions | 2048 x 2048 pixels |
| Maximum file size | 10 MB |
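Checking these constraints locally before submitting a job can avoid a wasted request. The sketch below is a hypothetical pre-flight check, not part of the Prodia SDK: `sniffFormat` and `checkInput` are illustrative names, the image width and height are assumed to be known already (reading them from the file is not shown), and treating "10 MB" as 10 * 1024 * 1024 bytes is an assumption. The format is identified from the standard PNG, JPEG, and WebP file signatures.

```javascript
// Hypothetical pre-flight check, not part of the Prodia SDK.
// Limits mirror the table above; the byte interpretation of "10 MB"
// is an assumption.
const LIMITS = { minSide: 256, maxSide: 2048, maxBytes: 10 * 1024 * 1024 };

// Identify the format from the file's leading bytes, or return null.
function sniffFormat(bytes) {
  const startsWith = (signature, offset = 0) =>
    signature.every((b, i) => bytes[offset + i] === b);

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (startsWith([0xff, 0xd8, 0xff])) return "jpeg";
  // WebP: "RIFF" at offset 0 and "WEBP" at offset 8.
  if (
    startsWith([0x52, 0x49, 0x46, 0x46]) &&
    startsWith([0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return "webp";
  }
  return null;
}

// Returns a list of problems; an empty list means the input looks acceptable.
function checkInput(bytes, width, height) {
  const problems = [];
  if (sniffFormat(bytes) === null) problems.push("unsupported format");
  if (bytes.length > LIMITS.maxBytes) problems.push("file larger than 10 MB");
  if (width < LIMITS.minSide || height < LIMITS.minSide) {
    problems.push("smaller than 256 x 256");
  }
  if (width > LIMITS.maxSide || height > LIMITS.maxSide) {
    problems.push("larger than 2048 x 2048");
  }
  return problems;
}

// Usage: a buffer that begins with the PNG signature.
const pngHeader = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
console.log(checkInput(pngHeader, 1024, 1024)); // []
console.log(checkInput(pngHeader, 128, 4096));
// [ 'smaller than 256 x 256', 'larger than 2048 x 2048' ]
```

Returning a list of problems rather than throwing on the first one lets a caller report every violated constraint at once.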