Track Application Versions with Externally Managed Code

This guide demonstrates MLflow's primary approach for versioning GenAI applications when your core application code resides in an external version control system (VCS) like Git. In this workflow, an MLflow LoggedModel acts as a metadata hub, linking a conceptual application version to its specific external code (e.g., a Git commit), configurations, and associated MLflow entities like traces and evaluation results.

The mlflow.set_active_model() function is key to this process, establishing a context so that subsequent traces and evaluations are automatically associated with the correct application version.

Prerequisites

This guide requires the following packages:

mlflow>=3.1: Core MLflow functionality with GenAI features. MLflow 2.x is not supported as it lacks the version tracking features with LoggedModel.
openai>=1.0.0: Required for the OpenAI-based examples in this guide (if using other LLM providers, install their respective SDKs instead)

Install the required packages:

pip install --upgrade "mlflow>=3.1" openai>=1.0.0

Environment Setup

Before running the examples below, configure your environment:

Open Source MLflow
Databricks

import os
import mlflow

# Optional: Set MLflow tracking URI if using a local or remote tracking server
mlflow.set_tracking_uri("http://localhost:5000")

# Set experiment (creates it if it doesn’t exist)
mlflow.set_experiment("my_genai_app_versions")

# Set your LLM provider API key (e.g., OpenAI)
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

Configure Databricks Authentication

note

If you're running this code inside a Databricks notebook, you can skip this authentication setup. MLflow will automatically use your notebook's authentication context.

If you're running this code outside of Databricks (e.g., in your local IDE), you need to set up authentication:

import os

# Set Databricks authentication (only needed when running outside Databricks)
os.environ[
    "DATABRICKS_HOST"
] = "https://your-workspace.databricks.com"  # Your workspace URL
os.environ["DATABRICKS_TOKEN"] = "your-databricks-token"  # Your access token

Configure API Keys

Set your LLM provider API keys as environment variables:

import os

# Set your OpenAI API key (required for this guide)
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

tip

You can also set these environment variables in your shell before running your script:

# For Databricks authentication (if running outside Databricks)
export DATABRICKS_HOST="https://your-workspace.databricks.com"
export DATABRICKS_TOKEN="your-databricks-token"

# For LLM providers
export OPENAI_API_KEY="your-api-key-here"

Configure MLflow Tracking

Set up MLflow to connect to your tracking server and experiment:

import mlflow

# Set MLflow tracking URI and experiment
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("my_genai_app_versions")

Step 1: Create a LoggedModel (Metadata Hub)

A LoggedModel version serves as a central record (metadata hub) for a specific iteration of your application. It doesn't need to store the application code itself; instead, it points to where your code is managed (e.g., a Git commit hash).

We use mlflow.set_active_model() to declare which LoggedModel we are currently working with, or to create a new one if it doesn't already exist. This function returns an ActiveModel object containing the model_id which is useful for subsequent operations.

import subprocess

# Define your application and its version identifier
app_name = "customer_support_agent"

# Get current git commit hash for versioning
try:
    git_commit = (
        subprocess.check_output(["git", "rev-parse", "HEAD"])
        .decode("ascii")
        .strip()[:8]
    )
    version_identifier = f"git-{git_commit}"
except subprocess.CalledProcessError:
    version_identifier = "local-dev"  # Fallback if not in a git repo

logged_model_name = f"{app_name}-{version_identifier}"

# Set the active model context
active_model_info = mlflow.set_active_model(name=logged_model_name)
print(
    f"Active LoggedModel: '{active_model_info.name}', Model ID: '{active_model_info.model_id}'"
)

This LoggedModel (customer_support_agent-git-xxxxxxx or customer_support_agent-local-dev) now acts as the reference for this specific version of your externally managed code.

Step 2: Record Parameters for the Model Version

You can log key configuration parameters that define this version of your application directly to the LoggedModel using mlflow.log_model_params(). This is useful for recording things like LLM names, temperature settings, or retrieval strategies that are tied to this code version.

# Log key parameters defining this application version
app_params = {
    "llm": "gpt-4o-mini",
    "temperature": 0.7,
    "retrieval_strategy": "vector_search_v3",
}

mlflow.log_model_params(model_id=active_model_info.model_id, params=app_params)
print(f"Logged parameters to LoggedModel ID: {active_model_info.model_id}")

Step 3: Link Traces to the Application Version

With an active LoggedModel set, any traces generated by MLflow's autologging (or manual tracing) will be automatically associated with this specific LoggedModel version.

import mlflow
import openai

# Enable autologging for OpenAI
mlflow.openai.autolog()


# Define your agent function
def simple_agent_call(user_query):
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # Matches "llm" param
        messages=[{"role": "user", "content": user_query}],
        temperature=0.7,  # Matches "temperature" param
        max_tokens=150,
    )
    return response.choices[0].message.content


# Example agent invocation
print("\nSimulating agent call...")
agent_response = simple_agent_call("How can I track my package?")
print(f"Agent response: {agent_response}")
print(
    f"Trace for this call is automatically linked to LoggedModel '{active_model_info.name}'."
)

The trace generated by this openai call is now linked to customer_support_agent-git-xxxxxxx.

Step 4: Retrieve and Inspect the LoggedModel

You can fetch the LoggedModel back using mlflow.get_logged_model() to inspect its properties, parameters, and associated metadata. This is useful for verifying what was logged or for retrieving configuration information.

# Fetch the LoggedModel using its ID
logged_model = mlflow.get_logged_model(model_id=active_model_info.model_id)

# Inspect basic properties
print(f"\n=== LoggedModel Information ===")
print(f"Model ID: {logged_model.model_id}")
print(f"Name: {logged_model.name}")
print(f"Experiment ID: {logged_model.experiment_id}")
print(f"Status: {logged_model.status}")
print(f"Model Type: {logged_model.model_type}")

# Access the parameters
print(f"\n=== Model Parameters ===")
for param_name, param_value in logged_model.params.items():
    print(f"{param_name}: {param_value}")

# Check creation timestamp
from datetime import datetime

creation_time = datetime.fromtimestamp(logged_model.creation_timestamp / 1000)
print(f"\n=== Timestamps ===")
print(f"Created at: {creation_time}")

# Access tags if any were set
if logged_model.tags:
    print(f"\n=== Model Tags ===")
    for tag_key, tag_value in logged_model.tags.items():
        print(f"{tag_key}: {tag_value}")

# You can also search for all LoggedModels in the experiment
from mlflow import search_logged_models

all_models = search_logged_models(
    experiment_ids=[logged_model.experiment_id],
    filter_string=f"name = '{app_name}'",
    output_format="list",
)
print(f"\n=== All '{app_name}' versions in experiment ===")
for model in all_models:
    print(
        f"- {model.name} (ID: {model.model_id}, Created: {datetime.fromtimestamp(model.creation_timestamp / 1000)})"
    )

The LoggedModel object provides access to all the metadata about your application version, making it easy to programmatically retrieve configuration details or build automated workflows around your versioned applications.

Step 5: Link Evaluation Results to the Application Version

You can evaluate your application and link the results to this LoggedModel version using mlflow.genai.evaluate() to assess your application's performance and automatically associate the metrics, evaluation tables, and traces with your specific LoggedModel version.

Next Steps

Compare Application Versions: Learn how to compare different LoggedModel versions.

Prerequisites​

Environment Setup​

Configure Databricks Authentication​

Configure API Keys​

Configure MLflow Tracking​

Step 1: Create a LoggedModel (Metadata Hub)​

Step 2: Record Parameters for the Model Version​

Step 3: Link Traces to the Application Version​

Step 4: Retrieve and Inspect the LoggedModel​

Step 5: Link Evaluation Results to the Application Version​

Next Steps​