Documentation

Trainings

Methods

cancel(, ):
post/trainings/{training_id}/cancel

Cancel a training
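
As a minimal sketch, the cancel call could be prepared from Python as shown here. The base URL and the Bearer header follow the cURL examples in this reference; the helper name `build_cancel_request` and the token value are hypothetical, and the request is only built, not sent.

```python
import urllib.request

# Base URL taken from the cURL examples in this reference.
API_BASE = "https://api.replicate.com/v1"


def build_cancel_request(training_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that cancels a training.

    Hypothetical helper for illustration; it is not part of any client library.
    """
    return urllib.request.Request(
        f"{API_BASE}/trainings/{training_id}/cancel",
        headers={"Authorization": f"Bearer {api_token}"},
        method="POST",
    )


req = build_cancel_request("zz4ibbonubfz7carwiefibzgga", "example-token")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.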

create(, ):
post/models/{model_owner}/{model_name}/versions/{version_id}/trainings

Start a new training of the model version you specify.

Example request body:

{
  "destination": "{new_owner}/{new_name}",
  "input": {
    "train_data": "https://example.com/my-input-images.zip"
  },
  "webhook": "https://example.com/my-webhook"
}

Example cURL request:

curl -s -X POST \
  -d '{"destination": "{new_owner}/{new_name}", "input": {"input_images": "https://example.com/my-input-images.zip"}}' \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  https://api.replicate.com/v1/models/stability-ai/sdxl/versions/da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf/trainings

The response will be the training object:

{
  "id": "zz4ibbonubfz7carwiefibzgga",
  "model": "stability-ai/sdxl",
  "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf",
  "input": {
    "input_images": "https://example.com/my-input-images.zip"
  },
  "logs": "",
  "error": null,
  "status": "starting",
  "created_at": "2023-09-08T16:32:56.990893084Z",
  "urls": {
    "web": "https://replicate.com/p/zz4ibbonubfz7carwiefibzgga",
    "get": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga",
    "cancel": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga/cancel"
  }
}

Because models can take several minutes or more to train, the result will not be available immediately. To get the final result of the training, either provide a webhook HTTPS URL for us to call when the results are ready, or poll the "get a training" endpoint (get/trainings/{training_id}) until the training has finished.

When a training completes, it creates a new version of the model at the specified destination.
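
The polling approach described above could be sketched as follows. This is an illustration under assumptions, not a prescribed client: `wait_for_training`, its parameters, and the default interval are hypothetical, and the `fetch` callable stands in for whatever code performs the GET request. The terminal statuses are the ones listed in this reference (succeeded, failed, canceled).

```python
import time
from typing import Any, Callable, Dict

# Terminal statuses as listed in this reference.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_training(
    fetch: Callable[[str], Dict[str, Any]],
    training_id: str,
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `fetch(training_id)` until the training reaches a terminal status.

    `fetch` is expected to return the training object as a dict.
    Raises TimeoutError if no terminal status is seen within `max_attempts`.
    """
    for _ in range(max_attempts):
        training = fetch(training_id)
        if training.get("status") in TERMINAL_STATUSES:
            return training
        sleep(interval_seconds)
    raise TimeoutError(f"training {training_id} did not finish in time")


# Demonstration with a fake fetch function, so no network access is needed.
_states = iter(["starting", "processing", "succeeded"])


def fake_fetch(training_id: str) -> Dict[str, Any]:
    return {"id": training_id, "status": next(_states)}


final = wait_for_training(fake_fetch, "zz4ibbonubfz7carwiefibzgga", sleep=lambda s: None)
print(final["status"])
```

Injecting `fetch` and `sleep` keeps the loop testable; in real use `fetch` would call the get endpoint with the Authorization header.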

To find some models to train on, check out the trainable language models collection.

Parameters
model_owner: string

Path param: The name of the user or organization that owns the model.

model_name: string

Path param: The name of the model.

version_id: string

Path param: The ID of the version.

destination: string

Body param: A string representing the desired model to push to in the format {destination_model_owner}/{destination_model_name}. This should be an existing model owned by the user or organization making the API request. If the destination is invalid, the server will return an appropriate 4XX response.

input: unknown

Body param: An object containing inputs to the Cog model's train() function.

webhook?: string

Body param: An HTTPS URL for receiving a webhook when the training completes. The webhook will be a POST request where the request body is the same as the response body of the get training operation. If there are network problems, we will retry the webhook a few times, so make sure it can be safely called more than once. Replicate will not follow redirects when sending webhook requests to your service, so be sure to specify a URL that will resolve without redirecting.
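
Because a webhook may be delivered more than once, a receiver might guard its side effects as in the sketch below. The class name and the in-memory set are assumptions made for illustration (a real service would likely use persistent storage), and the sketch only deduplicates deliveries for trainings in a terminal state, since output and logs events can legitimately repeat.

```python
from typing import Any, Dict, List, Set

# Terminal statuses as listed in this reference.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


class CompletedTrainingHandler:
    """Hypothetical receiver that acts at most once per finished training."""

    def __init__(self) -> None:
        self._seen_ids: Set[str] = set()
        self.processed: List[Dict[str, Any]] = []

    def handle(self, payload: Dict[str, Any]) -> bool:
        """Return True if the payload was acted on, False if it was ignored."""
        if payload.get("status") not in TERMINAL_STATUSES:
            return False  # not a completion event; nothing to do in this sketch
        training_id = payload["id"]
        if training_id in self._seen_ids:
            return False  # retried delivery; the work was already done
        self._seen_ids.add(training_id)
        self.processed.append(payload)
        return True


handler = CompletedTrainingHandler()
event = {"id": "zz4ibbonubfz7carwiefibzgga", "status": "succeeded"}
print(handler.handle(event))  # first delivery is processed
print(handler.handle(event))  # retried delivery is ignored
```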

webhook_events_filter?: Array<"start" | "output" | "logs" | "completed">

Body param: By default, we will send requests to your webhook URL whenever there are new outputs or the training has finished. You can change which events trigger webhook requests by specifying webhook_events_filter in the training request:

  • start: immediately on training start
  • output: each time a training generates an output (note that trainings can generate multiple outputs)
  • logs: each time log output is generated by a training
  • completed: when the training reaches a terminal state (succeeded/canceled/failed)

For example, if you only wanted requests to be sent at the start and end of the training, you would provide:

{
  "destination": "my-organization/my-model",
  "input": {
    "text": "Alice"
  },
  "webhook": "https://example.com/my-webhook",
  "webhook_events_filter": ["start", "completed"]
}

Requests for event types output and logs will be sent at most once every 500ms. If you request start and completed webhooks, then they'll always be sent regardless of throttling.
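
A request body like the one above could be assembled and checked before sending, as in this sketch. The function name `build_training_body` and its validation rules are assumptions for illustration; the server remains the authority on what it accepts. The allowed event names are the four listed above.

```python
import json
from typing import Any, Dict, Iterable, Optional

# Event names listed in this reference.
ALLOWED_EVENTS = ("start", "output", "logs", "completed")


def build_training_body(
    destination: str,
    training_input: Dict[str, Any],
    webhook: Optional[str] = None,
    events: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body for a create-training request (hypothetical helper)."""
    owner, sep, name = destination.partition("/")
    if not (owner and sep and name) or "/" in name:
        raise ValueError("destination must look like {owner}/{name}")
    body: Dict[str, Any] = {"destination": destination, "input": dict(training_input)}
    if webhook is not None:
        if not webhook.startswith("https://"):
            raise ValueError("webhook must be an HTTPS URL")
        body["webhook"] = webhook
    if events is not None:
        events = list(events)
        unknown = [e for e in events if e not in ALLOWED_EVENTS]
        if unknown:
            raise ValueError(f"unknown webhook events: {unknown}")
        body["webhook_events_filter"] = events
    return body


body = build_training_body(
    "my-organization/my-model",
    {"text": "Alice"},
    webhook="https://example.com/my-webhook",
    events=["start", "completed"],
)
print(json.dumps(body, indent=2))
```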

Returns
TrainingCreateResponse{
id?: string

The unique ID of the training

completed_at?: string
(format: date-time)

The time when the training completed

created_at?: string
(format: date-time)

The time when the training was created

error?: string | null

Error message if the training failed

input?: Record<string, unknown>

The input parameters used for the training

logs?: string

The logs from the training process

metrics?:

Metrics about the training process

model?: string

The name of the model in the format owner/name

output?:

The output of the training process

source?: "web" | "api"

How the training was created

started_at?: string
(format: date-time)

The time when the training started

status?: "starting" | "processing" | "succeeded" | "failed" | "canceled"

The current status of the training

urls?:

URLs for interacting with the training

version?: string

The ID of the model version used for training

get(, ):
get/trainings/{training_id}

Get the current state of a training.

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga

The response will be the training object:

{
  "completed_at": "2023-09-08T16:41:19.826523Z",
  "created_at": "2023-09-08T16:32:57.018467Z",
  "error": null,
  "id": "zz4ibbonubfz7carwiefibzgga",
  "input": {
    "input_images": "https://example.com/my-input-images.zip"
  },
  "logs": "...",
  "metrics": {
    "predict_time": 502.713876
  },
  "output": {
    "version": "...",
    "weights": "..."
  },
  "started_at": "2023-09-08T16:32:57.112647Z",
  "status": "succeeded",
  "urls": {
    "web": "https://replicate.com/p/zz4ibbonubfz7carwiefibzgga",
    "get": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga",
    "cancel": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga/cancel"
  },
  "model": "stability-ai/sdxl",
  "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf"
}

status will be one of:

  • starting: the training is starting up. If this status lasts longer than a few seconds, then it's typically because a new worker is being started to run the training.
  • processing: the train() method of the model is currently running.
  • succeeded: the training completed successfully.
  • failed: the training encountered an error during processing.
  • canceled: the training was canceled by its creator.

In the case of success, output will be an object containing the output of the model. Any files will be represented as HTTPS URLs. You'll need to pass the Authorization header to request them.

In the case of failure, error will contain the error encountered during the training.

Terminated trainings (with a status of succeeded, failed, or canceled) will include a metrics object with a predict_time property showing the amount of CPU or GPU time, in seconds, that the training used while running. It won't include time waiting for the training to start. The metrics object will also include a total_time property showing the total time, in seconds, that the training took to complete.
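
To relate these times to the timestamps on a training object, the sketch below derives the wait before start and the wall-clock run duration from created_at, started_at, and completed_at. The helper names are hypothetical, and the values are copied from the example response above; these derived figures are an illustration and are not claimed to be how the service computes its metrics.

```python
from datetime import datetime
from typing import Any, Dict


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2023-09-08T16:32:57.018467Z.

    The trailing "Z" is rewritten so datetime.fromisoformat accepts it on
    older Python versions. Timestamps with more than six fractional digits
    may need truncating first on those versions.
    """
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def training_timings(training: Dict[str, Any]) -> Dict[str, float]:
    """Return seconds spent waiting to start and seconds spent running."""
    created = parse_timestamp(training["created_at"])
    started = parse_timestamp(training["started_at"])
    completed = parse_timestamp(training["completed_at"])
    return {
        "queue_seconds": (started - created).total_seconds(),
        "run_seconds": (completed - started).total_seconds(),
    }


# Timestamps from the example response in this section.
example = {
    "created_at": "2023-09-08T16:32:57.018467Z",
    "started_at": "2023-09-08T16:32:57.112647Z",
    "completed_at": "2023-09-08T16:41:19.826523Z",
}
print(training_timings(example))
```

For this particular example, the run duration derived from the timestamps equals the predict_time of 502.713876 shown in the response.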

list(): <>
get/trainings

Get a paginated list of all trainings created by the user or organization associated with the provided API token.

This will include trainings created from the API and the website. It will return 100 records per page.

Example cURL request:

curl -s \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/trainings

The response will be a JSON object containing one page of results. Its results array holds training objects, sorted with the most recent training first:

{
  "next": null,
  "previous": null,
  "results": [
    {
      "completed_at": "2023-09-08T16:41:19.826523Z",
      "created_at": "2023-09-08T16:32:57.018467Z",
      "error": null,
      "id": "zz4ibbonubfz7carwiefibzgga",
      "input": {
        "input_images": "https://example.com/my-input-images.zip"
      },
      "metrics": {
        "predict_time": 502.713876
      },
      "output": {
        "version": "...",
        "weights": "..."
      },
      "started_at": "2023-09-08T16:32:57.112647Z",
      "source": "api",
      "status": "succeeded",
      "urls": {
        "web": "https://replicate.com/p/zz4ibbonubfz7carwiefibzgga",
        "get": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga",
        "cancel": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga/cancel"
      },
      "model": "stability-ai/sdxl",
      "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf"
    }
  ]
}
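
Walking through every page of such a response could look like the sketch below. It assumes that a non-null next value is a URL that can be requested directly to obtain the following page; the example above only shows null, so treat that as an assumption. `iter_trainings` and `fetch_page` are hypothetical names, and `fetch_page` stands in for the code that performs the authenticated GET.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Base URL taken from the cURL examples in this reference.
API_BASE = "https://api.replicate.com/v1"


def iter_trainings(
    fetch_page: Callable[[str], Dict[str, Any]],
    first_url: str = f"{API_BASE}/trainings",
) -> Iterator[Dict[str, Any]]:
    """Yield training objects from every page, following `next` until it is null.

    Assumes `next` holds the URL of the following page (an assumption).
    """
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("results", [])
        url = page.get("next")


# Demonstration with two fake pages, so no network access is needed.
fake_pages = {
    f"{API_BASE}/trainings": {
        "next": "https://example.com/page-2",
        "previous": None,
        "results": [{"id": "first"}],
    },
    "https://example.com/page-2": {
        "next": None,
        "previous": f"{API_BASE}/trainings",
        "results": [{"id": "second"}],
    },
}

ids = [t["id"] for t in iter_trainings(lambda url: fake_pages[url])]
print(ids)
```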

id will be the unique ID of the training.

source will indicate how the training was created. Possible values are web or api.

status will be the status of the training. Refer to the get method (get/trainings/{training_id}) for possible values.

urls will be a convenience object that can be used to construct new API requests for the given training.

version will be the unique ID of the model version used to create the training.