Querying a Mistral Model
This page explains how to query a Mistral model via API once you have deployed your LLM (large language model) service from OUTSCALE Marketplace. For more information about deploying a Mistral LLM service, see Deploying a Mistral LLM Service.
Deployed models expose a REST API that you can query using plain HTTP calls. To run your queries, you must set the following environment variables:
- OUTSCALE_SERVER_URL, which is the URL of the virtual machine (VM) hosting your Mistral model. Your server's URL follows this pattern: http://${server}:5000.
- OUTSCALE_MODEL_NAME, which is the name of the model to query:
  - For the Small model, the model name to specify is small-2409.
  - For the Codestral model, the model name to specify is codestral-2405.
  - For the Ministral 8B model, the model name to specify is ministral-8b-2410.
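For example, you can export both variables in your shell before running the queries below. The IP address and model name here are placeholder values; replace them with the actual IP of your VM and the name of the model you deployed:

```shell
# Replace 192.0.2.10 with the public IP of the VM hosting your model (placeholder value).
export OUTSCALE_SERVER_URL="http://192.0.2.10:5000"
# Use the model name matching your deployment, for example small-2409.
export OUTSCALE_MODEL_NAME="small-2409"
```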
The following examples use cURL and Python commands to query a model. For more information about the different types of queries that you can send to a Mistral model, see the official Mistral documentation.
Querying a Model in Chat Completion Mode
You can use the following commands to query your model for text generation tasks:
cURL (Completion)
$ echo $OUTSCALE_SERVER_URL/v1/chat/completions
$ echo $OUTSCALE_MODEL_NAME
$ curl --location $OUTSCALE_SERVER_URL/v1/chat/completions \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"model": "'"$OUTSCALE_MODEL_NAME"'",
"temperature": 0,
"messages": [
{"role": "user", "content": "Who is the best French painter? Answer in one short sentence."}
],
"stream": false
}'
Python (Completion)
import os
from mistralai import Mistral
client = Mistral(server_url=os.environ["OUTSCALE_SERVER_URL"])
resp = client.chat.complete(
    model=os.environ["OUTSCALE_MODEL_NAME"],
    messages=[
        {
            "role": "user",
            "content": "Who is the best French painter? Answer in one short sentence.",
        }
    ],
    temperature=0,
)
print(resp.choices[0].message.content)
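The requests above set "stream": false, so the full answer is returned in one response. To receive tokens incrementally instead, you can use the mistralai client's streaming method. A minimal sketch, assuming the same environment variables are set and a model is reachable at that URL:

```python
import os

from mistralai import Mistral

client = Mistral(server_url=os.environ["OUTSCALE_SERVER_URL"])

# Stream the response and print each text chunk as it arrives.
stream = client.chat.stream(
    model=os.environ["OUTSCALE_MODEL_NAME"],
    messages=[
        {
            "role": "user",
            "content": "Who is the best French painter? Answer in one short sentence.",
        }
    ],
    temperature=0,
)
for event in stream:
    # Each event carries a delta containing the newly generated text.
    content = event.data.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()
```

Streaming is useful for interactive applications, where displaying partial output reduces perceived latency.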
Querying a Codestral Model in FIM Mode
You can use the following commands to query your Codestral model in FIM (fill-in-the-middle) mode, for code generation tasks:
cURL (FIM)
$ curl --location $OUTSCALE_SERVER_URL/v1/fim/completions \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"model": "'"$OUTSCALE_MODEL_NAME"'",
"prompt": "def count_words_in_file(file_path: str) -> int:",
"suffix": "return n_words",
"stream": false
}'
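For parity with the chat completion section, here is a Python sketch of the same FIM query using the mistralai client's fim.complete method, assuming the mistralai package is installed and the environment variables above are set:

```python
import os

from mistralai import Mistral

client = Mistral(server_url=os.environ["OUTSCALE_SERVER_URL"])

# Ask the model to generate the code between the prompt and the suffix.
resp = client.fim.complete(
    model=os.environ["OUTSCALE_MODEL_NAME"],
    prompt="def count_words_in_file(file_path: str) -> int:",
    suffix="return n_words",
)
print(resp.choices[0].message.content)
```

The model returns only the middle portion, i.e. the function body that connects the given signature (prompt) to the given return statement (suffix).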