Models

Methods

client.models.list(, ):

GET /models

Get a paginated list of public models.

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models

The response will be a pagination object containing a list of model objects.

See the models.get docs for more details about the model object.

Sorting

You can sort the results using the sort_by and sort_direction query parameters.

For example, to get the most recently created models:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  "https://api.replicate.com/v1/models?sort_by=model_created_at&sort_direction=desc"

Available sorting options:

model_created_at: Sort by when the model was first created
latest_version_created_at: Sort by when the model's latest version was created (default)

Sort direction can be asc (ascending) or desc (descending, default).
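
As a minimal sketch of how the two query parameters combine, the following TypeScript helper builds the list URL. The name buildModelsListUrl is hypothetical and not part of any SDK; the parameter names and allowed values are the ones listed above.

```typescript
// Allowed values, taken from the sorting options listed above.
type SortBy = 'model_created_at' | 'latest_version_created_at';
type SortDirection = 'asc' | 'desc';

const MODELS_URL = 'https://api.replicate.com/v1/models';

// Hypothetical helper: adds only the parameters that were provided,
// so omitted ones fall back to the API defaults.
function buildModelsListUrl(sortBy?: SortBy, sortDirection?: SortDirection): string {
  const query: string[] = [];
  if (sortBy !== undefined) {
    query.push(`sort_by=${encodeURIComponent(sortBy)}`);
  }
  if (sortDirection !== undefined) {
    query.push(`sort_direction=${encodeURIComponent(sortDirection)}`);
  }
  return query.length > 0 ? `${MODELS_URL}?${query.join('&')}` : MODELS_URL;
}
```

Calling buildModelsListUrl('model_created_at', 'desc') produces the same URL as the cURL example above.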

client.models.create(, ): ModelCreateResponse

POST /models

Create a model.

Example cURL request:

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"owner": "alice", "name": "hot-dog-detector", "description": "Detect hot dogs in images", "visibility": "public", "hardware": "cpu"}' \
  https://api.replicate.com/v1/models

The response will be a model object in the following format:

{
  "url": "https://replicate.com/alice/hot-dog-detector",
  "owner": "alice",
  "name": "hot-dog-detector",
  "description": "Detect hot dogs in images",
  "visibility": "public",
  "github_url": null,
  "paper_url": null,
  "license_url": null,
  "run_count": 0,
  "cover_image_url": null,
  "default_example": null,
  "latest_version": null
}

Note that there is a limit of 1,000 models per account. For most purposes, we recommend using a single model and pushing new versions of the model as you make changes to it.

client.models.delete(, ): void

DELETE /models/{model_owner}/{model_name}

Delete a model.

Model deletion has some restrictions:

You can only delete models you own.
You can only delete private models.
You can only delete models that have no versions associated with them. Currently you'll need to delete the model's versions before you can delete the model itself.

Example cURL request:

curl -s -X DELETE \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/replicate/hello-world

The response will be an empty 204, indicating the model has been deleted.
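
The restrictions above can be checked on the client before sending the request. This is a sketch only: deletionBlockers is a hypothetical helper, it assumes that a null latest_version means the model has no versions, and it compares owner against a single username without handling organization membership. The API remains the source of truth.

```typescript
// Subset of the model object used by the check.
interface ModelSummary {
  owner: string;
  visibility: 'public' | 'private';
  latest_version: object | null;
}

// Returns the reasons a delete request would be expected to fail.
// An empty array means none of the documented restrictions apply.
function deletionBlockers(model: ModelSummary, currentUser: string): string[] {
  const blockers: string[] = [];
  if (model.owner !== currentUser) {
    blockers.push('model is owned by someone else');
  }
  if (model.visibility !== 'private') {
    blockers.push('model is not private');
  }
  // Assumption: latest_version is null only when the model has no versions.
  if (model.latest_version !== null) {
    blockers.push('model still has versions');
  }
  return blockers;
}
```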

client.models.get(, ): ModelGetResponse

GET /models/{model_owner}/{model_name}

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/replicate/hello-world

The response will be a model object in the following format:

{
  "url": "https://replicate.com/replicate/hello-world",
  "owner": "replicate",
  "name": "hello-world",
  "description": "A tiny model that says hello",
  "visibility": "public",
  "github_url": "https://github.com/replicate/cog-examples",
  "paper_url": null,
  "license_url": null,
  "run_count": 5681081,
  "cover_image_url": "...",
  "default_example": {...},
  "latest_version": {...}
}

The model object includes the input and output schema for the latest version of the model.

Here's an example showing how to fetch the model with cURL and display its input schema with jq:

curl -s \
    -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
    https://api.replicate.com/v1/models/replicate/hello-world \
    | jq ".latest_version.openapi_schema.components.schemas.Input"

This will return the following JSON object:

{
  "type": "object",
  "title": "Input",
  "required": [
    "text"
  ],
  "properties": {
    "text": {
      "type": "string",
      "title": "Text",
      "x-order": 0,
      "description": "Text to prefix with 'hello '"
    }
  }
}
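
Once you have the Input schema, you can summarize it in code. The sketch below assumes the schema has the shape shown above (required, properties, x-order); describeInputs is a hypothetical helper and not part of the API or SDK.

```typescript
// Minimal shape of the Input schema shown above; real schemas may carry more keys.
interface InputProperty {
  type?: string;
  title?: string;
  description?: string;
  'x-order'?: number;
}

interface InputSchema {
  type?: string;
  title?: string;
  required?: string[];
  properties: Record<string, InputProperty>;
}

// Returns one summary line per input, sorted by x-order.
function describeInputs(schema: InputSchema): string[] {
  const required = schema.required ?? [];
  return Object.entries(schema.properties)
    .sort(([, a], [, b]) => (a['x-order'] ?? 0) - (b['x-order'] ?? 0))
    .map(([name, prop]) => {
      const flag = required.includes(name) ? 'required' : 'optional';
      return `${name} (${prop.type ?? 'unknown'}, ${flag}): ${prop.description ?? ''}`;
    });
}

// The hello-world Input schema from the example above.
const helloWorldInput: InputSchema = {
  type: 'object',
  title: 'Input',
  required: ['text'],
  properties: {
    text: {
      type: 'string',
      title: 'Text',
      'x-order': 0,
      description: "Text to prefix with 'hello '",
    },
  },
};

console.log(describeInputs(helloWorldInput).join('\n'));
```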

The cover_image_url string is an HTTPS URL for an image file. This can be:

An image uploaded by the model author.
The output file of the example prediction, if the model author has not set a cover image.
The input file of the example prediction, if the model author has not set a cover image and the example prediction has no output file.
A generic fallback image.
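
The precedence described above can be read as a simple fallback chain. The sketch below only illustrates that order: the field names and the fallback URL are hypothetical, and the API resolves cover_image_url for you on the server.

```typescript
// Hypothetical inputs; these field names are illustrative and not part of the API.
interface CoverImageSources {
  uploadedCoverUrl?: string | null;
  exampleOutputFileUrl?: string | null;
  exampleInputFileUrl?: string | null;
}

// Placeholder value standing in for the generic fallback image.
const GENERIC_FALLBACK_URL = 'https://example.com/fallback.png';

// Picks the first available source in the documented order.
function resolveCoverImage(sources: CoverImageSources): string {
  return (
    sources.uploadedCoverUrl ??
    sources.exampleOutputFileUrl ??
    sources.exampleInputFileUrl ??
    GENERIC_FALLBACK_URL
  );
}
```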

The default_example object is a prediction created with this model.

The latest_version object is the model's most recently pushed version.

client.models.search(, ): CursorURLPage<ModelSearchResponse>

QUERY /models

Get a list of public models matching a search query.

Example cURL request:

curl -s -X QUERY \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: text/plain" \
  -d "hello" \
  https://api.replicate.com/v1/models

The response will be a paginated JSON object containing an array of model objects.

See the models.get docs for more details about the model object.

client.models.update(, ): ModelUpdateResponse

PATCH /models/{model_owner}/{model_name}

Update select properties of an existing model.

You can update the following properties:

description - Model description
readme - Model README content
github_url - GitHub repository URL
paper_url - Research paper URL
weights_url - Model weights URL
license_url - License URL

Example cURL request:

curl -X PATCH \
  https://api.replicate.com/v1/models/your-username/your-model-name \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Detect hot dogs in images",
    "readme": "# Hot Dog Detector\n\n🌭 Ketchup, mustard, and onions...",
    "github_url": "https://github.com/alice/hot-dog-detector",
    "paper_url": "https://arxiv.org/abs/2504.17639",
    "weights_url": "https://huggingface.co/alice/hot-dog-detector",
    "license_url": "https://choosealicense.com/licenses/mit/"
  }'

The response will be the updated model object with all of its properties.
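
If you build the request body from a larger object, it can help to keep only the updatable properties. The following is a sketch under assumptions: toUpdatePayload is a hypothetical helper, and it forwards string values only, since this page does not say how other values (such as null) are treated.

```typescript
// The updatable properties listed above.
const UPDATABLE_FIELDS = [
  'description',
  'readme',
  'github_url',
  'paper_url',
  'weights_url',
  'license_url',
] as const;

type UpdatableField = (typeof UPDATABLE_FIELDS)[number];
type ModelUpdate = Partial<Record<UpdatableField, string>>;

// Copies only updatable, string-valued properties into the PATCH body.
function toUpdatePayload(changes: Record<string, unknown>): ModelUpdate {
  const payload: ModelUpdate = {};
  for (const field of UPDATABLE_FIELDS) {
    const value = changes[field];
    if (typeof value === 'string') {
      payload[field] = value;
    }
  }
  return payload;
}
```

The result can be passed to JSON.stringify and sent as the body of the PATCH request shown above.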

Models.Examples

Methods

client.models.examples.list(, ): CursorURLPage<Prediction>

GET /models/{model_owner}/{model_name}/examples

List example predictions made using the model. These are predictions that were saved by the model author as illustrative examples of the model's capabilities.

If you want all the examples for a model, use this operation.

If you just want the model's default example, you can use the models.get operation instead, which includes a default_example object.

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/replicate/hello-world/examples

The response will be a pagination object containing a list of example predictions:

{
  "next": "https://api.replicate.com/v1/models/replicate/hello-world/examples?cursor=...",
  "previous": "https://api.replicate.com/v1/models/replicate/hello-world/examples?cursor=...",
  "results": [...]
}

Each item in the results list is a prediction object.
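
To gather every example, you can keep following next until it is null. The sketch below is transport-agnostic: collectAll and its fetchPage parameter are hypothetical, and the maxPages cap is a safety assumption rather than an API limit. It assumes next is null on the last page.

```typescript
// Shape of the pagination object shown above.
interface Page<T> {
  next: string | null;
  previous: string | null;
  results: T[];
}

// Follows `next` links until they run out. `fetchPage` is injected so the
// caller decides how a page is fetched (fetch, an SDK, or a test stub).
async function collectAll<T>(
  firstUrl: string,
  fetchPage: (url: string) => Promise<Page<T>>,
  maxPages = 100, // safety cap (an assumption, not an API limit)
): Promise<T[]> {
  const items: T[] = [];
  let url: string | null = firstUrl;
  for (let count = 0; url !== null && count < maxPages; count++) {
    const page: Page<T> = await fetchPage(url);
    items.push(...page.results);
    url = page.next;
  }
  return items;
}
```

In real use, fetchPage would send a GET request with the Authorization header from the cURL example and parse the JSON response.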

Models.Predictions

Methods

client.models.predictions.create(, ): Prediction

POST /models/{model_owner}/{model_name}/predictions

Create a prediction using an official model.

If you're not running an official model, use the predictions.create operation instead.

Example cURL request:

curl -s -X POST -H 'Prefer: wait' \
  -d '{"input": {"prompt": "Write a short poem about the weather."}}' \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  https://api.replicate.com/v1/models/meta/meta-llama-3-70b-instruct/predictions

The request will wait up to 60 seconds for the model to run. If this time is exceeded, the prediction will be returned in a "starting" state and will need to be retrieved using the predictions.get endpoint.
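
A client can decide whether it still needs to poll by looking at the returned status. This is a minimal sketch: nextPollUrl is a hypothetical helper, and the terminal states it checks (succeeded, failed, canceled) are the ones this page lists for the completed webhook event.

```typescript
// Subset of the prediction object used here.
interface PredictionLike {
  status: string;
  urls: { get: string };
}

const TERMINAL_STATUSES = ['succeeded', 'failed', 'canceled'];

// Returns the URL to poll next, or null once the prediction has finished.
function nextPollUrl(prediction: PredictionLike): string | null {
  return TERMINAL_STATUSES.includes(prediction.status) ? null : prediction.urls.get;
}
```

A prediction that comes back as "starting" yields its urls.get value, which you can request again after a short delay.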

For a complete overview of the models.predictions.create API, check out our documentation on creating a prediction, which covers a variety of use cases.

Parameters

params:

PredictionCreateParams

model_owner: string

Path param: The name of the user or organization that owns the model.

model_name: string

Path param: The name of the model.

input: unknown

Body param: The model's input as a JSON object. The input schema depends on what model you are running. To see the available inputs, click the "API" tab on the model you are running or get the model version and look at its openapi_schema property. For example, stability-ai/sdxl takes prompt as an input.

Files should be passed as HTTP URLs or data URLs.

Use an HTTP URL when:

you have a large file (> 256 KB)
you want to be able to use the file multiple times
you want to be able to associate your prediction metadata with your input files

Use a data URL when:

you have a small file (<= 256 KB)
you don't want to upload and host the file somewhere
you don't need to use the file again (Replicate will not store it)
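
The guidance above can be turned into a small decision helper. This is a sketch under assumptions: chooseFileTransport is hypothetical, and it reads the 256 kilobyte threshold as 256 * 1024 bytes, which this page does not state explicitly.

```typescript
// Assumption: the documented threshold is interpreted as 256 * 1024 bytes.
const DATA_URL_MAX_BYTES = 256 * 1024;

type FileTransport = 'http_url' | 'data_url';

// Picks how to pass a file input: hosted HTTP URL for large or reusable
// files, inline data URL for small one-off files.
function chooseFileTransport(sizeBytes: number, willReuse = false): FileTransport {
  if (sizeBytes > DATA_URL_MAX_BYTES || willReuse) {
    return 'http_url';
  }
  return 'data_url';
}
```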

stream?: boolean

Body param: This field is deprecated.

Request a URL to receive streaming output using server-sent events (SSE).

This field is no longer needed as the returned prediction will always have a stream entry in its urls property if the model supports streaming.

webhook?: string

Body param: An HTTPS URL for receiving a webhook when the prediction has new output. The webhook will be a POST request where the request body is the same as the response body of the get prediction operation. If there are network problems, we will retry the webhook a few times, so make sure it can be safely called more than once. Replicate will not follow redirects when sending webhook requests to your service, so be sure to specify a URL that will resolve without redirecting.
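
Because a webhook may be delivered more than once, a handler should make its side effects idempotent. One possible approach, sketched here, is to act on a finished prediction only the first time its id is seen. CompletionTracker is a hypothetical name, and the in-memory Set is an assumption that would need to become shared storage if you run more than one server process.

```typescript
// Subset of the prediction payload a webhook receives.
interface WebhookPayload {
  id: string;
  status: string;
}

const FINISHED_STATUSES = ['succeeded', 'failed', 'canceled'];

class CompletionTracker {
  private readonly handled = new Set<string>();

  // Returns true only the first time a prediction id arrives in a terminal state.
  shouldHandle(payload: WebhookPayload): boolean {
    if (!FINISHED_STATUSES.includes(payload.status)) {
      return false;
    }
    if (this.handled.has(payload.id)) {
      return false; // duplicate delivery, already processed
    }
    this.handled.add(payload.id);
    return true;
  }
}
```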

webhook_events_filter?: Array<"start" | "output" | "logs" | "completed">

Body param: By default, we will send requests to your webhook URL whenever there are new outputs or the prediction has finished. You can change which events trigger webhook requests by specifying webhook_events_filter in the prediction request:

start: immediately on prediction start
output: each time a prediction generates an output (note that predictions can generate multiple outputs)
logs: each time log output is generated by a prediction
completed: when the prediction reaches a terminal state (succeeded/canceled/failed)

For example, if you only wanted requests to be sent at the start and end of the prediction, you would provide:

{
  "input": {
    "text": "Alice"
  },
  "webhook": "https://example.com/my-webhook",
  "webhook_events_filter": ["start", "completed"]
}

Requests for event types output and logs will be sent at most once every 500ms. If you request start and completed webhooks, then they'll always be sent regardless of throttling.
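
If the event names come from configuration or user input, you may want to validate them before sending the request. The helper below is a hypothetical sketch, not part of the SDK; it accepts only the four event types listed above, requires an HTTPS webhook URL, and drops duplicate events.

```typescript
// The four event types documented above.
const WEBHOOK_EVENTS = ['start', 'output', 'logs', 'completed'] as const;
type WebhookEvent = (typeof WEBHOOK_EVENTS)[number];

interface WebhookFields {
  webhook: string;
  webhook_events_filter: WebhookEvent[];
}

function isWebhookEvent(value: string): value is WebhookEvent {
  return (WEBHOOK_EVENTS as readonly string[]).includes(value);
}

// Builds the webhook-related fields of the request body.
function buildWebhookFields(webhook: string, events: string[]): WebhookFields {
  if (!webhook.startsWith('https://')) {
    throw new Error('webhook must be an HTTPS URL');
  }
  const filter: WebhookEvent[] = [];
  for (const event of events) {
    if (!isWebhookEvent(event)) {
      throw new Error(`Unknown webhook event: ${event}`);
    }
    if (!filter.includes(event)) {
      filter.push(event); // ignore duplicates
    }
  }
  return { webhook, webhook_events_filter: filter };
}
```

The returned fields can be spread into the request body next to input.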

cancelAfter?: string

Header param: The maximum time the prediction can run before it is automatically canceled. The lifetime is measured from when the prediction is created.

The duration can be specified as a string with an optional unit suffix:

s for seconds (e.g., 30s, 90s)
m for minutes (e.g., 5m, 15m)
h for hours (e.g., 1h, 2h30m)
defaults to seconds if no unit suffix is provided (e.g. 30 is the same as 30s)

You can combine units for more precision (e.g., 1h30m45s).

The minimum allowed duration is 5 seconds.
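
To validate a duration before sending it, you could parse it on the client. The following is a sketch under assumptions: parseCancelAfter is a hypothetical helper, it accepts whole numbers only, and it expects combined units in h, m, s order. The API itself may accept forms this parser rejects.

```typescript
const MIN_CANCEL_AFTER_SECONDS = 5;

// Parses strings such as "30", "90s", "5m", or "1h30m45s" into total seconds.
// Throws if the string is malformed or shorter than the documented minimum.
function parseCancelAfter(value: string): number {
  const text = value.trim();
  let seconds: number;
  if (/^\d+$/.test(text)) {
    seconds = parseInt(text, 10); // no unit suffix means seconds
  } else {
    const match = /^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$/.exec(text);
    if (text === '' || match === null) {
      throw new Error(`Invalid duration: "${value}"`);
    }
    const [, hours, minutes, secs] = match;
    seconds =
      parseInt(hours || '0', 10) * 3600 +
      parseInt(minutes || '0', 10) * 60 +
      parseInt(secs || '0', 10);
  }
  if (seconds < MIN_CANCEL_AFTER_SECONDS) {
    throw new RangeError(`Duration must be at least ${MIN_CANCEL_AFTER_SECONDS} seconds`);
  }
  return seconds;
}
```

For example, "2h30m" parses to 9000 seconds and "1h30m45s" to 5445 seconds, while "4s" is rejected as too short.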

Prefer?: string

Header param: Leave the request open and wait for the model to finish generating output. Set to wait=n where n is a number of seconds between 1 and 60.

See https://replicate.com/docs/topics/predictions/create-a-prediction#sync-mode for more information.
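
As a small sketch, the header value can be built and range-checked in one place. preferWaitHeader is a hypothetical helper, and it assumes n must be a whole number, which this page does not state explicitly.

```typescript
// Builds the Prefer header for sync mode, enforcing the documented 1 to 60 range.
function preferWaitHeader(seconds: number): Record<string, string> {
  // Assumption: only whole seconds are accepted.
  if (Math.floor(seconds) !== seconds || seconds < 1 || seconds > 60) {
    throw new RangeError('wait must be a whole number of seconds from 1 to 60');
  }
  return { Prefer: `wait=${seconds}` };
}
```

The returned object can be merged into the headers of the POST request alongside Authorization and Content-Type.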

Returns

Prediction

Request example

import Replicate from 'replicate';

const replicate = new Replicate({
  bearerToken: process.env['REPLICATE_API_TOKEN'], // This is the default and can be omitted
});

const prediction = await replicate.models.predictions.create({
  model_owner: 'model_owner',
  model_name: 'model_name',
  input: { prompt: 'Tell me a joke', system_prompt: 'You are a helpful assistant' },
});

console.log(prediction.id);

200 Example

{
  "id": "id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "data_removed": true,
  "error": "error",
  "input": {
    "foo": "bar"
  },
  "model": "model",
  "output": {},
  "status": "starting",
  "urls": {
    "cancel": "https://example.com",
    "get": "https://example.com",
    "web": "https://example.com",
    "stream": "https://example.com"
  },
  "version": "string",
  "completed_at": "2019-12-27T18:11:19.117Z",
  "deadline": "2019-12-27T18:11:19.117Z",
  "deployment": "deployment",
  "logs": "logs",
  "metrics": {
    "total_time": 0
  },
  "started_at": "2019-12-27T18:11:19.117Z"
}

Models.Readme

Methods

client.models.readme.get(, ): ReadmeGetResponse

GET /models/{model_owner}/{model_name}/readme

Get the README content for a model.

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/replicate/hello-world/readme

The response will be the README content as plain text in Markdown format:

# Hello World Model

This is an example model that...

Models.Versions

Methods

client.models.versions.list(, ): CursorURLPage<VersionListResponse>

GET /models/{model_owner}/{model_name}/versions

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/replicate/hello-world/versions

The response will be a pagination object containing a list of model version objects, sorted with the most recent version first:

{
  "next": null,
  "previous": null,
  "results": [
    {
      "id": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
      "created_at": "2022-04-26T19:29:04.418669Z",
      "cog_version": "0.3.0",
      "openapi_schema": {...}
    }
  ]
}

client.models.versions.delete(, ): void

DELETE /models/{model_owner}/{model_name}/versions/{version_id}

Delete a model version and all associated predictions, including all output files.

Model version deletion has some restrictions:

You can only delete versions from models you own.
You can only delete versions from private models.
You cannot delete a version if someone other than you has run predictions with it.
You cannot delete a version if it is being used as the base model for a fine-tune or training.
You cannot delete a version if it has an associated deployment.
You cannot delete a version if another model version is overridden to use it.

Example cURL request:

curl -s -X DELETE \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/replicate/hello-world/versions/5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa

The response will be an empty 202, indicating the deletion request has been accepted. It might take a few minutes to be processed.

client.models.versions.get(, ): VersionGetResponse

GET /models/{model_owner}/{model_name}/versions/{version_id}

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/replicate/hello-world/versions/5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa

The response will be the version object:

{
  "id": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
  "created_at": "2022-04-26T19:29:04.418669Z",
  "cog_version": "0.3.0",
  "openapi_schema": {...}
}

Every model describes its inputs and outputs with OpenAPI Schema Objects in the openapi_schema property.

The openapi_schema.components.schemas.Input property for the replicate/hello-world model looks like this:

{
  "type": "object",
  "title": "Input",
  "required": [
    "text"
  ],
  "properties": {
    "text": {
      "x-order": 0,
      "type": "string",
      "title": "Text",
      "description": "Text to prefix with 'hello '"
    }
  }
}

The openapi_schema.components.schemas.Output property for the replicate/hello-world model looks like this:

{
  "type": "string",
  "title": "Output"
}

For more details, see the docs on Cog's supported input and output types.
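
With the Input schema in hand, you can run a light check on an input object before creating a prediction. This is a rough sketch rather than a full JSON Schema validator: checkInput is a hypothetical helper that looks only at required names, unknown names, and string-typed properties.

```typescript
// Only the parts of the Input schema this check relies on.
interface InputSchemaLike {
  required?: string[];
  properties?: Record<string, { type?: string }>;
}

// Returns a list of problems; an empty list means the basic checks passed.
function checkInput(schema: InputSchemaLike, input: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const name of schema.required ?? []) {
    if (!(name in input)) {
      problems.push(`missing required input: ${name}`);
    }
  }
  const properties = schema.properties ?? {};
  for (const [name, value] of Object.entries(input)) {
    const property = properties[name];
    if (property === undefined) {
      problems.push(`unknown input: ${name}`);
    } else if (property.type === 'string' && typeof value !== 'string') {
      problems.push(`${name} should be a string`);
    }
  }
  return problems;
}
```

For the replicate/hello-world schema above, an input of { text: "Alice" } passes, while an empty object reports that text is missing.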