MLflow AI Gateway Migration Guide

The MLflow AI Gateway is deprecated and has been replaced by MLflow Deployments for LLMs. This page explains how to migrate from the MLflow AI Gateway to the new APIs.

Configuration YAML file

Deprecated:

routes:
  - name: chat
    route_type: llm/v1/chat
    model:
      provider: openai
      name: gpt-4o-mini
      config:
        openai_api_key: $OPENAI_API_KEY

New:

endpoints:  # Renamed to "endpoints"
  - name: chat
    endpoint_type: llm/v1/chat  # Renamed to "endpoint_type"
    model:
      provider: openai
      name: gpt-4o-mini
      config:
        openai_api_key: $OPENAI_API_KEY
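
For configurations that are generated or stored as dictionaries, the two renames above can be applied programmatically. The following is a minimal sketch, not part of MLflow: the function name migrate_gateway_config is hypothetical, and it assumes the configuration has already been loaded into a plain Python dict (for example with a YAML parser).

```python
import copy


def migrate_gateway_config(old_config: dict) -> dict:
    """Return a copy of an AI Gateway config that uses the new key names.

    Hypothetical helper for illustration; it only performs the two renames
    described in this guide and leaves everything else untouched.
    """
    new_config = copy.deepcopy(old_config)

    # The top-level "routes" list is renamed to "endpoints".
    if "routes" in new_config:
        new_config["endpoints"] = new_config.pop("routes")

    # Within each entry, "route_type" is renamed to "endpoint_type".
    for endpoint in new_config.get("endpoints", []):
        if "route_type" in endpoint:
            endpoint["endpoint_type"] = endpoint.pop("route_type")

    return new_config


# The deprecated configuration from this section, expressed as a dict.
old = {
    "routes": [
        {
            "name": "chat",
            "route_type": "llm/v1/chat",
            "model": {
                "provider": "openai",
                "name": "gpt-4o-mini",
                "config": {"openai_api_key": "$OPENAI_API_KEY"},
            },
        }
    ]
}

new = migrate_gateway_config(old)
print(sorted(new.keys()))
print(sorted(new["endpoints"][0].keys()))
```

The helper works on a deep copy, so the original dictionary is left unchanged and can be kept for comparison.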

Launching the server

Deprecated:

mlflow gateway start --config-path path/to/config.yaml

New:

mlflow deployments start-server --config-path path/to/config.yaml
#      ^^^^^^^^^^^^^^^^^^^^^^^^
#      Renamed to "deployments start-server"
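
If the old command is embedded in launch scripts as an argument list, the rename can be applied mechanically. This is a small sketch under that assumption; translate_gateway_command is a hypothetical name, and only the subcommand shown in this section is handled.

```python
def translate_gateway_command(argv: list[str]) -> list[str]:
    """Rewrite a deprecated `mlflow gateway start` argv to the new subcommand.

    Hypothetical helper for illustration. Arguments after the subcommand
    (such as --config-path) are passed through unchanged.
    """
    old_prefix = ["mlflow", "gateway", "start"]
    new_prefix = ["mlflow", "deployments", "start-server"]
    if argv[:3] == old_prefix:
        return new_prefix + argv[3:]
    # Anything else is returned as an unmodified copy.
    return list(argv)


old_cmd = ["mlflow", "gateway", "start", "--config-path", "path/to/config.yaml"]
new_cmd = translate_gateway_command(old_cmd)
print(" ".join(new_cmd))
```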

Querying the server

The fluent APIs have been replaced by the mlflow.deployments.MlflowDeploymentClient APIs. See the table below for the mapping between the deprecated and new APIs.

Deprecated                           New

mlflow.gateway.get_route(name)       client.get_endpoint(name)
mlflow.gateway.search_routes()       client.list_endpoints()
mlflow.gateway.query(name, data)     client.predict(endpoint=name, inputs=data)

Deprecated:

import mlflow

mlflow.gateway.set_gateway_uri("http://localhost:5000")

route = mlflow.gateway.get_route("chat")
routes = mlflow.gateway.search_routes()
response = mlflow.gateway.query(
    route="chat",
    data={
        "messages": [
            {"role": "user", "content": "Hello"},
        ]
    },
)

New:

from mlflow.deployments import get_deploy_client

client = get_deploy_client("http://localhost:5000")
endpoint = client.get_endpoint("chat")
endpoints = client.list_endpoints()
response = client.predict(
    endpoint="chat",
    inputs={
        "messages": [
            {"role": "user", "content": "Hello"},
        ]
    },
)
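
While call sites are being migrated, a thin adapter can keep the deprecated names working on top of the new client. The sketch below follows the mapping in the table above; GatewayCompat and FakeClient are hypothetical names introduced here, and FakeClient is only a stand-in so the example runs without a server.

```python
class GatewayCompat:
    """Hypothetical adapter: deprecated gateway names backed by a deployments client."""

    def __init__(self, client):
        self._client = client

    def get_route(self, name):
        # mlflow.gateway.get_route(name) -> client.get_endpoint(name)
        return self._client.get_endpoint(name)

    def search_routes(self):
        # mlflow.gateway.search_routes() -> client.list_endpoints()
        return self._client.list_endpoints()

    def query(self, route, data):
        # mlflow.gateway.query(name, data) -> client.predict(endpoint=name, inputs=data)
        return self._client.predict(endpoint=route, inputs=data)


class FakeClient:
    """Stand-in for a real deployments client; it only echoes its arguments."""

    def get_endpoint(self, name):
        return {"name": name}

    def list_endpoints(self):
        return [{"name": "chat"}]

    def predict(self, endpoint, inputs):
        return {"endpoint": endpoint, "inputs": inputs}


compat = GatewayCompat(FakeClient())
route = compat.get_route("chat")
routes = compat.search_routes()
response = compat.query(route="chat", data={"messages": [{"role": "user", "content": "Hello"}]})
print(route, routes, response["endpoint"])
```

Against a running server, the adapter would wrap the object returned by get_deploy_client("http://localhost:5000") instead of FakeClient.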

Databricks

The fluent APIs have been replaced by the mlflow.deployments.DatabricksDeploymentClient APIs. See the table below for the mapping between the deprecated and new APIs.

Deprecated                                 New

mlflow.gateway.create_route(name, ...)     client.create_endpoint(name, ...)
mlflow.gateway.get_route(name)             client.get_endpoint(name)
mlflow.gateway.search_routes()             client.list_endpoints()
mlflow.gateway.delete_route(name)          client.delete_endpoint(name)
mlflow.gateway.get_limits(name)            client.get_endpoint(name)["rate_limits"]
mlflow.gateway.set_limits(name, limits)    client.update_endpoint(name, limits)
mlflow.gateway.query(name, data)           client.predict(endpoint=name, inputs=data)

Deprecated:

import mlflow

mlflow.gateway.set_gateway_uri("databricks")

name = "chat"
mlflow.gateway.create_route(name, ...)
route = mlflow.gateway.get_route(name)
routes = mlflow.gateway.search_routes()
limits = mlflow.gateway.get_limits(name)
mlflow.gateway.set_limits(name, limits)
response = mlflow.gateway.query(
    route=name,
    data={
        "messages": [
            {"role": "user", "content": "Hello"},
        ]
    },
)
mlflow.gateway.delete_route(name)

New:

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

name = "chat"
client.create_endpoint(name, ...)
endpoint = client.get_endpoint(name)
endpoints = client.list_endpoints()
limits = client.get_endpoint(name)["rate_limits"]
client.update_endpoint(name, {"rate_limits": limits})
response = client.predict(
    endpoint=name,
    inputs={
        "messages": [
            {"role": "user", "content": "Hello"},
        ]
    },
)
client.delete_endpoint(name)
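
Because get_limits and set_limits have no direct one-to-one replacements, it can help to wrap the two calls from the example above in small functions. This is a sketch only: the function names are hypothetical, FakeDatabricksClient is a stand-in so the code runs without a workspace, and the rate limit entry shown is an illustrative value rather than a documented schema.

```python
def get_limits(client, name):
    """Read the rate limits of an endpoint, as in the example above."""
    return client.get_endpoint(name)["rate_limits"]


def set_limits(client, name, limits):
    """Write rate limits by updating the endpoint with a "rate_limits" key."""
    return client.update_endpoint(name, {"rate_limits": limits})


class FakeDatabricksClient:
    """Stand-in for a real client; it stores endpoints in a local dict."""

    def __init__(self):
        self._endpoints = {"chat": {"name": "chat", "rate_limits": []}}

    def get_endpoint(self, name):
        return self._endpoints[name]

    def update_endpoint(self, name, config):
        self._endpoints[name].update(config)
        return self._endpoints[name]


client = FakeDatabricksClient()
before = list(get_limits(client, "chat"))

# Illustrative limit entry; the field names here are an assumption.
set_limits(client, "chat", [{"key": "user", "renewal_period": "minute", "calls": 10}])
after = get_limits(client, "chat")
print(before, after)
```

With a real workspace, client would instead be the object returned by get_deploy_client("databricks").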