Deploy Model API
The deploy model operation reads the model’s chunks from the model index and then creates an instance of the model to cache in memory. This operation requires the model_id of the model that you want to deploy.
Starting with OpenSearch version 2.13, externally hosted models are deployed automatically by default when you send a Predict API request for the first time. To disable automatic deployment for an externally hosted model, set plugins.ml_commons.model_auto_deploy.enable to false:
PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.model_auto_deploy.enable": "false"
  }
}
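As an illustrative sketch, you can apply the same setting programmatically. The following example uses the Python requests library and assumes a local, unauthenticated cluster at http://localhost:9200; adjust the host, port, and authentication for your environment:

import requests

# Assumed endpoint; change the host, port, and authentication for your cluster.
CLUSTER_URL = "http://localhost:9200"

# Disable automatic deployment of externally hosted models.
response = requests.put(
    f"{CLUSTER_URL}/_cluster/settings",
    json={
        "persistent": {
            "plugins.ml_commons.model_auto_deploy.enable": "false"
        }
    },
)
response.raise_for_status()
print(response.json())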
For information about user access for this API, see Model access control considerations.
Endpoints
POST /_plugins/_ml/models/<model_id>/_deploy
Example request: Deploying to all available ML nodes
In this example request, OpenSearch deploys the model to any available OpenSearch ML node:
POST /_plugins/_ml/models/WWQI44MBbzI2oUKAvNUt/_deploy
Example request: Deploying to a specific node
If you want to reserve the memory of other ML nodes within your cluster, you can deploy your model to one or more specific nodes by specifying their node_ids in the request body:
POST /_plugins/_ml/models/WWQI44MBbzI2oUKAvNUt/_deploy
{
  "node_ids": ["4PLK7KJWReyX0oWKnBA8nA"]
}
Example response
The Deploy Model API returns a task_id that you can use to monitor the deployment progress:
{
  "task_id": "hA8P44MBhyWuIwnfvTKP",
  "task_type": "DEPLOY_MODEL",
  "status": "CREATED"
}
Monitoring deployment status
To check the status of your model deployment and retrieve the model ID when deployment completes, use the Get ML Task API and provide the returned task_id as a path parameter:
GET /_plugins/_ml/tasks/hA8P44MBhyWuIwnfvTKP
The Get ML Task API returns different response formats depending on whether the deployment is in progress or completed. For detailed information about all possible response formats, see Get ML Task API.
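As an end-to-end sketch, the following example deploys a model and polls the task until it reaches a terminal state. It uses the Python requests library and assumes a local, unauthenticated cluster at http://localhost:9200; the model ID is the placeholder from the examples above, and the state and model_id fields of the completed task follow the Get ML Task API documentation:

import time
import requests

# Assumptions: local unauthenticated cluster; placeholder model ID from the examples above.
CLUSTER_URL = "http://localhost:9200"
MODEL_ID = "WWQI44MBbzI2oUKAvNUt"

# Start the deployment. The response body contains a task_id to monitor.
deploy = requests.post(f"{CLUSTER_URL}/_plugins/_ml/models/{MODEL_ID}/_deploy")
deploy.raise_for_status()
task_id = deploy.json()["task_id"]

# Poll the Get ML Task API until the deployment finishes.
while True:
    task = requests.get(f"{CLUSTER_URL}/_plugins/_ml/tasks/{task_id}").json()
    state = task.get("state")
    if state == "COMPLETED":
        # The completed task response includes the ID of the deployed model.
        print("Deployed model:", task.get("model_id"))
        break
    if state in ("FAILED", "COMPLETED_WITH_ERROR"):
        raise RuntimeError(f"Deployment failed: {task}")
    time.sleep(2)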
If a cluster or node is restarted, you must redeploy the model. To learn how to set up automatic redeployment, see Enable auto redeploy.