Deployments
DeploymentsResource
Methods
Get a list of deployments associated with the current account, including the latest release configuration for each deployment.
Example cURL request:
curl -s \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
https://api.replicate.com/v1/deployments
The response will be a paginated JSON object whose results array contains deployment objects, sorted with the most recent deployment first:
{
"next": "http://api.replicate.com/v1/deployments?cursor=cD0yMDIzLTA2LTA2KzIzJTNBNDAlM0EwOC45NjMwMDAlMkIwMCUzQTAw",
"previous": null,
"results": [
{
"owner": "replicate",
"name": "my-app-image-generator",
"current_release": {
"number": 1,
"model": "stability-ai/sdxl",
"version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf",
"created_at": "2024-02-15T16:32:57.018467Z",
"created_by": {
"type": "organization",
"username": "acme",
"name": "Acme Corp, Inc.",
"avatar_url": "https://cdn.replicate.com/avatars/acme.png",
"github_url": "https://github.com/acme"
},
"configuration": {
"hardware": "gpu-t4",
"min_instances": 1,
"max_instances": 5
}
}
}
]
}
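Because the list is paginated, a client usually has to follow the next URL until it is null. The following Python sketch shows one way to do that with only the standard library. The function names fetch_page and iter_deployments are hypothetical, and the sketch assumes every page has the next and results fields shown above.

```python
import json
import os
import urllib.request
from typing import Callable, Iterator, Optional

API_URL = "https://api.replicate.com/v1/deployments"


def fetch_page(url: str) -> dict:
    """Fetch one page of results; the token is read from REPLICATE_API_TOKEN."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_deployments(
    fetch: Callable[[str], dict] = fetch_page, first_url: str = API_URL
) -> Iterator[dict]:
    """Yield every deployment, following "next" until it is null."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        # Each page carries its deployments in the "results" array.
        yield from page["results"]
        # "next" is null (None) on the last page, which ends the loop.
        url = page.get("next")
```

Calling iter_deployments() with no arguments would walk the live API. Passing a different fetch callable makes the loop easy to exercise without network access.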
Create a new deployment.
Example cURL request:
curl -s \
-X POST \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "my-app-image-generator",
"model": "stability-ai/sdxl",
"version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf",
"hardware": "gpu-t4",
"min_instances": 0,
"max_instances": 3
}' \
https://api.replicate.com/v1/deployments
The response will be a JSON object describing the deployment:
{
"owner": "acme",
"name": "my-app-image-generator",
"current_release": {
"number": 1,
"model": "stability-ai/sdxl",
"version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf",
"created_at": "2024-02-15T16:32:57.018467Z",
"created_by": {
"type": "organization",
"username": "acme",
"name": "Acme Corp, Inc.",
"avatar_url": "https://cdn.replicate.com/avatars/acme.png",
"github_url": "https://github.com/acme"
},
"configuration": {
"hardware": "gpu-t4",
"min_instances": 0,
"max_instances": 3
}
}
}
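The same create request can be assembled in Python. This is a minimal sketch using only the standard library; build_create_request is a hypothetical helper name, and the body fields mirror the cURL example above.

```python
import json
import urllib.request


def build_create_request(
    token: str,
    name: str,
    model: str,
    version: str,
    hardware: str,
    min_instances: int,
    max_instances: int,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a deployment."""
    body = {
        "name": name,
        "model": model,
        "version": version,
        "hardware": hardware,
        "min_instances": min_instances,
        "max_instances": max_instances,
    }
    return urllib.request.Request(
        "https://api.replicate.com/v1/deployments",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the JSON response body.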
Delete a deployment.
Deployment deletion has some restrictions:
- You can only delete deployments that have been offline and unused for at least 15 minutes.
Example cURL request:
curl -s -X DELETE \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
https://api.replicate.com/v1/deployments/acme/my-app-image-generator
The response will be an empty 204, indicating the deployment has been deleted.
Get information about a deployment by name, including the current release.
Example cURL request:
curl -s \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
https://api.replicate.com/v1/deployments/acme/my-app-image-generator
The response will be a JSON object describing the deployment:
{
"owner": "acme",
"name": "my-app-image-generator",
"current_release": {
"number": 1,
"model": "stability-ai/sdxl",
"version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf",
"created_at": "2024-02-15T16:32:57.018467Z",
"created_by": {
"type": "organization",
"username": "acme",
"name": "Acme Corp, Inc.",
"avatar_url": "https://cdn.replicate.com/avatars/acme.png",
"github_url": "https://github.com/acme"
},
"configuration": {
"hardware": "gpu-t4",
"min_instances": 1,
"max_instances": 5
}
}
}
Update properties of an existing deployment, including hardware, min/max instances, and the deployment's underlying model version.
Example cURL request:
curl -s \
-X PATCH \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"min_instances": 3, "max_instances": 10}' \
https://api.replicate.com/v1/deployments/acme/my-app-image-generator
The response will be a JSON object describing the deployment:
{
"owner": "acme",
"name": "my-app-image-generator",
"current_release": {
"number": 2,
"model": "stability-ai/sdxl",
"version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf",
"created_at": "2024-02-15T16:32:57.018467Z",
"created_by": {
"type": "organization",
"username": "acme",
"name": "Acme Corp, Inc.",
"avatar_url": "https://cdn.replicate.com/avatars/acme.png",
"github_url": "https://github.com/acme"
},
"configuration": {
"hardware": "gpu-t4",
"min_instances": 3,
"max_instances": 10
}
}
}
Updating any deployment property increments the number field of current_release.
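A PATCH body only needs the properties being changed. The Python sketch below builds such a request and leaves out anything not supplied. build_update_request is a hypothetical helper name, and rejecting an empty update on the client side is an assumption of this sketch rather than documented API behavior.

```python
import json
import urllib.request
from typing import Optional


def build_update_request(
    token: str,
    owner: str,
    name: str,
    *,
    hardware: Optional[str] = None,
    min_instances: Optional[int] = None,
    max_instances: Optional[int] = None,
    version: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a PATCH request containing only changed fields."""
    candidates = {
        "hardware": hardware,
        "min_instances": min_instances,
        "max_instances": max_instances,
        "version": version,
    }
    # Compare against None so that a legitimate value of 0 is still sent.
    changes = {key: value for key, value in candidates.items() if value is not None}
    if not changes:
        # Assumption: an empty update is treated as a caller error here.
        raise ValueError("no properties to update")
    return urllib.request.Request(
        f"https://api.replicate.com/v1/deployments/{owner}/{name}",
        data=json.dumps(changes).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

For example, build_update_request(token, "acme", "my-app-image-generator", min_instances=3, max_instances=10) produces the same body as the cURL request above.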
- hardware: The SKU for the hardware used to run the model. Possible values can be retrieved from the hardware.list endpoint.
- max_instances: The maximum number of instances for scaling.
- min_instances: The minimum number of instances for scaling.
- version: The ID of the model version that you want to deploy.
- name: The name of the deployment.
- owner: The owner of the deployment.
Predictions
DeploymentsResource.PredictionsResource
Methods
Create a prediction for the deployment and inputs you provide.
Example cURL request:
curl -s -X POST -H 'Prefer: wait' \
-d '{"input": {"prompt": "A photo of a bear riding a bicycle over the moon"}}' \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H 'Content-Type: application/json' \
https://api.replicate.com/v1/deployments/acme/my-app-image-generator/predictions
The request will wait up to 60 seconds for the model to run. If this time is exceeded, the prediction will be returned in a "starting" state and will need to be retrieved using the predictions.get endpoint.
For a complete overview of the deployments.predictions.create API, see our documentation on creating a prediction, which covers a variety of use cases.
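When the 60-second wait is exceeded, the caller has to keep checking the prediction until it finishes. The Python sketch below shows one possible polling loop. It assumes the prediction object carries id and status fields and that succeeded, failed, and canceled are the terminal statuses. The get_prediction argument is a hypothetical callable standing in for a call to the predictions.get endpoint.

```python
import time
from typing import Callable

# Assumption: these are the statuses after which a prediction no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_prediction(
    prediction: dict,
    get_prediction: Callable[[str], dict],
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the prediction reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while prediction["status"] not in TERMINAL_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {prediction['id']} did not finish in time")
        sleep(interval)
        # Re-fetch the prediction by ID to pick up its latest status.
        prediction = get_prediction(prediction["id"])
    return prediction
```

A prediction that already finished within the initial wait is returned unchanged, so the same helper can be applied to every response from the create call.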