Track Application Versions with Externally Managed Code
This guide demonstrates MLflow's primary approach for versioning GenAI applications when your core application code resides in an external version control system (VCS) like Git. In this workflow, an MLflow LoggedModel
acts as a metadata hub, linking a conceptual application version to its specific external code (e.g., a Git commit), configurations, and associated MLflow entities like traces and evaluation results.
The mlflow.set_active_model()
function is key to this process, establishing a context so that subsequent traces and evaluations are automatically associated with the correct application version.
Prerequisitesβ
This guide requires the following packages:
- mlflow>=3.1: Core MLflow functionality with GenAI features. MLflow 2.x is not supported as it lacks the version tracking features with
LoggedModel
. - openai>=1.0.0: Required for the OpenAI-based examples in this guide (if using other LLM providers, install their respective SDKs instead)
Install the required packages:
pip install --upgrade "mlflow>=3.1" openai>=1.0.0
Environment Setupβ
Before running the examples below, configure your environment:
- Open Source MLflow
- Databricks
import os
import mlflow
# Optional: Set MLflow tracking URI if using a local or remote tracking server
mlflow.set_tracking_uri("http://localhost:5000")
# Set experiment (creates it if it doesnβt exist)
mlflow.set_experiment("my_genai_app_versions")
# Set your LLM provider API key (e.g., OpenAI)
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
Configure Databricks Authenticationβ
If you're running this code inside a Databricks notebook, you can skip this authentication setup. MLflow will automatically use your notebook's authentication context.
If you're running this code outside of Databricks (e.g., in your local IDE), you need to set up authentication:
import os
# Set Databricks authentication (only needed when running outside Databricks)
os.environ[
"DATABRICKS_HOST"
] = "https://your-workspace.databricks.com" # Your workspace URL
os.environ["DATABRICKS_TOKEN"] = "your-databricks-token" # Your access token
Configure API Keysβ
Set your LLM provider API keys as environment variables:
import os
# Set your OpenAI API key (required for this guide)
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
You can also set these environment variables in your shell before running your script:
# For Databricks authentication (if running outside Databricks)
export DATABRICKS_HOST="https://your-workspace.databricks.com"
export DATABRICKS_TOKEN="your-databricks-token"
# For LLM providers
export OPENAI_API_KEY="your-api-key-here"
Configure MLflow Trackingβ
Set up MLflow to connect to your tracking server and experiment:
import mlflow
# Set MLflow tracking URI and experiment
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("my_genai_app_versions")
Step 1: Create a LoggedModel (Metadata Hub)β
A LoggedModel
version serves as a central record (metadata hub) for a specific iteration of your application. It doesn't need to store the application code itself; instead, it points to where your code is managed (e.g., a Git commit hash).
We use mlflow.set_active_model()
to declare which LoggedModel
we are currently working with, or to create a new one if it doesn't already exist. This function returns an ActiveModel
object containing the model_id
which is useful for subsequent operations.
import subprocess
# Define your application and its version identifier
app_name = "customer_support_agent"
# Get current git commit hash for versioning
try:
git_commit = (
subprocess.check_output(["git", "rev-parse", "HEAD"])
.decode("ascii")
.strip()[:8]
)
version_identifier = f"git-{git_commit}"
except subprocess.CalledProcessError:
version_identifier = "local-dev" # Fallback if not in a git repo
logged_model_name = f"{app_name}-{version_identifier}"
# Set the active model context
active_model_info = mlflow.set_active_model(name=logged_model_name)
print(
f"Active LoggedModel: '{active_model_info.name}', Model ID: '{active_model_info.model_id}'"
)
This LoggedModel
(customer_support_agent-git-xxxxxxx
or customer_support_agent-local-dev
) now acts as the reference for this specific version of your externally managed code.
Step 2: Record Parameters for the Model Versionβ
You can log key configuration parameters that define this version of your application directly to the LoggedModel
using mlflow.log_model_params()
. This is useful for recording things like LLM names, temperature settings, or retrieval strategies that are tied to this code version.
# Log key parameters defining this application version
app_params = {
"llm": "gpt-4o-mini",
"temperature": 0.7,
"retrieval_strategy": "vector_search_v3",
}
mlflow.log_model_params(model_id=active_model_info.model_id, params=app_params)
print(f"Logged parameters to LoggedModel ID: {active_model_info.model_id}")
Step 3: Link Traces to the Application Versionβ
With an active LoggedModel
set, any traces generated by MLflow's autologging (or manual tracing) will be automatically associated with this specific LoggedModel
version.
import mlflow
import openai
# Enable autologging for OpenAI
mlflow.openai.autolog()
# Define your agent function
def simple_agent_call(user_query):
client = openai.OpenAI()
response = client.chat.completions.create(
model="gpt-4o-mini", # Matches "llm" param
messages=[{"role": "user", "content": user_query}],
temperature=0.7, # Matches "temperature" param
max_tokens=150,
)
return response.choices[0].message.content
# Example agent invocation
print("\nSimulating agent call...")
agent_response = simple_agent_call("How can I track my package?")
print(f"Agent response: {agent_response}")
print(
f"Trace for this call is automatically linked to LoggedModel '{active_model_info.name}'."
)
The trace generated by this openai
call is now linked to customer_support_agent-git-xxxxxxx
.
Step 4: Retrieve and Inspect the LoggedModelβ
You can fetch the LoggedModel back using mlflow.get_logged_model()
to inspect its properties, parameters, and associated metadata. This is useful for verifying what was logged or for retrieving configuration information.
# Fetch the LoggedModel using its ID
logged_model = mlflow.get_logged_model(model_id=active_model_info.model_id)
# Inspect basic properties
print(f"\n=== LoggedModel Information ===")
print(f"Model ID: {logged_model.model_id}")
print(f"Name: {logged_model.name}")
print(f"Experiment ID: {logged_model.experiment_id}")
print(f"Status: {logged_model.status}")
print(f"Model Type: {logged_model.model_type}")
# Access the parameters
print(f"\n=== Model Parameters ===")
for param_name, param_value in logged_model.params.items():
print(f"{param_name}: {param_value}")
# Check creation timestamp
from datetime import datetime
creation_time = datetime.fromtimestamp(logged_model.creation_timestamp / 1000)
print(f"\n=== Timestamps ===")
print(f"Created at: {creation_time}")
# Access tags if any were set
if logged_model.tags:
print(f"\n=== Model Tags ===")
for tag_key, tag_value in logged_model.tags.items():
print(f"{tag_key}: {tag_value}")
# You can also search for all LoggedModels in the experiment
from mlflow import search_logged_models
all_models = search_logged_models(
experiment_ids=[logged_model.experiment_id],
filter_string=f"name = '{app_name}'",
output_format="list",
)
print(f"\n=== All '{app_name}' versions in experiment ===")
for model in all_models:
print(
f"- {model.name} (ID: {model.model_id}, Created: {datetime.fromtimestamp(model.creation_timestamp / 1000)})"
)
The LoggedModel
object provides access to all the metadata about your application version, making it easy to programmatically retrieve configuration details or build automated workflows around your versioned applications.
Step 5: Link Evaluation Results to the Application Versionβ
You can evaluate your application and link the results to this LoggedModel
version using mlflow.genai.evaluate()
to assess your application's performance and automatically associate the metrics, evaluation tables, and traces with your specific LoggedModel
version.
Next Stepsβ
- Compare Application Versions: Learn how to compare different
LoggedModel
versions.