inference.sam3.segment.v1
The inference.sam3.segment.v1 job performs text-prompted image segmentation using
Meta’s SAM 3 (Segment Anything Model 3). Unlike SAM 2, SAM 3 accepts natural language
text prompts to identify and segment specific objects in images.
Basic Usage

    {
      "type": "inference.sam3.segment.v1",
      "config": {
        "prompt": "fish"
      }
    }

This returns one mask per detected instance matching the prompt.
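
The same job can be assembled programmatically. The following is a minimal Python sketch; the helper name `build_sam3_job` is hypothetical and not part of any documented client, and submitting the job is left out because this page does not describe how that is done.

```python
import json


def build_sam3_job(prompt, confidence_threshold=None):
    """Build a job payload dict (hypothetical helper, not a documented API)."""
    config = {"prompt": prompt}
    # Only include the threshold when the caller overrides the default of 0.5.
    if confidence_threshold is not None:
        config["confidence_threshold"] = confidence_threshold
    return {"type": "inference.sam3.segment.v1", "config": config}


job = build_sam3_job("fish")
print(json.dumps(job, indent=2))
```

Leaving `confidence_threshold` out of the payload relies on the documented default of 0.5 rather than repeating it in every job.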
Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | string | (required) | Text describing what to segment (e.g., “yellow school bus”, “person”, “cat”) |
| confidence_threshold | number | 0.5 | Confidence threshold (0.0-1.0). Lower values return more masks; higher values return only high-confidence masks |
Examples
Segment all fish in an image

    {
      "type": "inference.sam3.segment.v1",
      "config": {
        "prompt": "fish"
      }
    }

High confidence detection only

    {
      "type": "inference.sam3.segment.v1",
      "config": {
        "prompt": "person",
        "confidence_threshold": 0.9
      }
    }

Low confidence for more detections

    {
      "type": "inference.sam3.segment.v1",
      "config": {
        "prompt": "bird",
        "confidence_threshold": 0.3
      }
    }

Input Requirements
- Format: PNG, JPEG, or WebP
- Size: 256x256 minimum, 4096x4096 maximum
- Max file size: 10MB
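
These limits can be checked before a job is submitted. The sketch below is one possible pre-flight check, not part of the service. It assumes the caller already knows the image's format, dimensions, and byte size, that the minimum and maximum apply to each side independently, and that "10MB" means 10 * 1024 * 1024 bytes. The function name `input_problems` is hypothetical.

```python
MIN_SIDE = 256
MAX_SIDE = 4096
# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes; the service may use 10,000,000.
MAX_BYTES = 10 * 1024 * 1024
ALLOWED_FORMATS = {"png", "jpeg", "webp"}


def input_problems(fmt, width, height, size_bytes):
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    if fmt.lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    # Assumption: the limits apply to width and height independently.
    if width < MIN_SIDE or height < MIN_SIDE:
        problems.append(f"too small: {width}x{height}")
    if width > MAX_SIDE or height > MAX_SIDE:
        problems.append(f"too large: {width}x{height}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too big: {size_bytes} bytes")
    return problems


print(input_problems("png", 1024, 768, 2_000_000))
print(input_problems("gif", 100, 100, 1))
```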
Output
Returns one or more binary masks as PNG images. Each mask corresponds to a detected instance of the prompted object. Masks are grayscale images where:
- White (255) = object pixels
- Black (0) = background pixels
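
Given that encoding, per-instance measurements such as pixel area and bounding box follow directly from the mask values. The sketch below assumes the mask PNG has already been decoded into rows of integer pixel values (for example by an image library, not shown here). The function name `mask_stats` is hypothetical.

```python
def mask_stats(mask):
    """Return (pixel_count, bbox) for a mask given as rows of 0/255 values.

    bbox is (left, top, right, bottom), inclusive, or None for an empty mask.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == 255:  # white marks object pixels
                xs.append(x)
                ys.append(y)
    if not xs:
        return 0, None
    return len(xs), (min(xs), min(ys), max(xs), max(ys))


mask = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 255, 0, 0],
]
print(mask_stats(mask))  # (3, (1, 1, 2, 2))
```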
Performance
Tested on NVIDIA H100 80GB:
- Model load time: ~8.3s
- Average inference time: ~88ms per image
- Memory usage: ~12GB VRAM
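
As a rough planning aid, the load and inference figures can be combined into a back-of-the-envelope estimate. This sketch assumes images are processed one at a time on comparable hardware with a single model load, and it ignores upload, queueing, and any batching. The function name `estimated_seconds` is hypothetical.

```python
MODEL_LOAD_S = 8.3   # approximate model load time from the figures above
INFERENCE_S = 0.088  # approximately 88 ms per image


def estimated_seconds(num_images, cold_start=True):
    """Rough wall-clock estimate; ignores I/O, queueing, and any batching."""
    load = MODEL_LOAD_S if cold_start else 0.0
    return load + num_images * INFERENCE_S


print(round(estimated_seconds(1000), 1))  # 96.3
```

Under these assumptions the model load dominates for small jobs: with a cold start, 10 images take about 9.2 s, of which 8.3 s is loading.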
Confidence Threshold Effects
| Threshold | Typical Result |
|---|---|
| 0.3 | Many detections, may include false positives |
| 0.5 | Balanced detection (default) |
| 0.7 | Fewer, higher quality detections |
| 0.9 | Only very confident detections |
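
The effect of the threshold can be illustrated with a small filter over per-instance scores. The scores below are made up for illustration, and the inclusive `>=` comparison is an assumption, since this page does not say how a score exactly equal to the threshold is treated.

```python
# Hypothetical per-instance confidence scores, for illustration only.
scores = [0.95, 0.82, 0.64, 0.51, 0.38, 0.31, 0.12]


def kept(scores, threshold):
    """Scores that pass the threshold (assumes an inclusive >= comparison)."""
    return [s for s in scores if s >= threshold]


for threshold in (0.3, 0.5, 0.7, 0.9):
    print(threshold, len(kept(scores, threshold)))
```

With these example scores, the number of surviving detections drops from 6 at 0.3 to 4 at 0.5, 2 at 0.7, and 1 at 0.9.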
Schema

    {
      "type": "object",
      "required": [
        "type",
        "config"
      ],
      "additionalProperties": false,
      "properties": {
        "type": {
          "enum": [
            "inference.segment.v2",
            "inference.sam3.segment.v1"
          ]
        },
        "config": {
          "type": "object",
          "required": [
            "prompt"
          ],
          "additionalProperties": false,
          "properties": {
            "prompt": {
              "type": "string",
              "minLength": 1,
              "maxLength": 500,
              "description": "Text prompt describing what to segment (e.g., 'yellow school bus', 'person', 'cat')."
            },
            "confidence_threshold": {
              "type": "number",
              "default": 0.5,
              "minimum": 0,
              "maximum": 1,
              "description": "Confidence threshold for detections. Lower values return more masks, higher values only return high-confidence masks."
            }
          }
        }
      }
    }
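
The constraints in the schema can be mirrored in a small client-side check to catch malformed jobs early. The sketch below hand-codes the rules in plain Python rather than using a JSON Schema library, so it only approximates real schema validation. The function name `validate_job` is hypothetical.

```python
ALLOWED_TYPES = ("inference.segment.v2", "inference.sam3.segment.v1")


def validate_job(job):
    """Return a list of schema violations; an empty list means these checks pass."""
    if not isinstance(job, dict):
        return ["job must be an object"]
    errors = []
    # additionalProperties is false at the top level.
    extra = set(job) - {"type", "config"}
    if extra:
        errors.append(f"unexpected top-level keys: {sorted(extra)}")
    if job.get("type") not in ALLOWED_TYPES:
        errors.append("type must be one of the enum values")
    config = job.get("config")
    if not isinstance(config, dict):
        errors.append("config must be an object")
        return errors
    # additionalProperties is also false inside config.
    extra = set(config) - {"prompt", "confidence_threshold"}
    if extra:
        errors.append(f"unexpected config keys: {sorted(extra)}")
    prompt = config.get("prompt")
    if not isinstance(prompt, str) or not 1 <= len(prompt) <= 500:
        errors.append("prompt must be a string of 1 to 500 characters")
    if "confidence_threshold" in config:
        value = config["confidence_threshold"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0 <= value <= 1:
            errors.append("confidence_threshold must be a number between 0 and 1")
    return errors


print(validate_job({"type": "inference.sam3.segment.v1", "config": {"prompt": "fish"}}))
```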