Use Prompts in Apps
Once you have created and versioned your prompts in the MLflow Prompt Registry, the next crucial step is to integrate them into your GenAI applications. This page shows how to load prompts, bind variables, handle versioning, and maintain complete lineage by linking prompt versions to your logged MLflow Models.
You will learn:

- How to load specific prompt versions or aliases using `mlflow.genai.load_prompt()`.
- The process of formatting prompts with dynamic data using `prompt.format()`.
- Strategies for managing prompt versions within your application code.
- How to use MLflow prompts with other popular frameworks like LangChain or LlamaIndex via `prompt.to_single_brace_format()`.
- The importance of logging the specific prompt version used when logging an MLflow Model.
- How this linkage aids reproducibility, debugging, and inspection of models.
- Best practices for error handling.
Loading Prompts from the Registry
The primary way to access a registered prompt in your application code is the `mlflow.genai.load_prompt()` function. It retrieves a specific prompt version (or the version an alias points to) from the registry, addressed by a special URI: `prompts:/<prompt_name>/<version>` for a version number, or `prompts:/<prompt_name>@<alias>` for an alias.

- `<prompt_name>`: The unique name of the prompt in the registry.
- `<version>` or `<alias>`: A specific version number (e.g., `1`, `2`) or an alias (e.g., `production`, `latest-dev`).
import mlflow

prompt_name = "my-sdk-prompt"

# Load by specific version (assuming version 1 exists)
prompt_v1 = mlflow.genai.load_prompt(name_or_uri=f"prompts:/{prompt_name}/1")

# Load by alias (assuming the alias 'staging' points to a version of this prompt)
prompt_staging = mlflow.genai.load_prompt(name_or_uri=f"prompts:/{prompt_name}@staging")
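The returned prompt object carries its registry metadata, which is handy for logging and debugging. A quick check (attribute names follow the MLflow prompt entity; verify them against your MLflow version):

# Inspect the loaded prompt
print(prompt_v1.name)      # "my-sdk-prompt"
print(prompt_v1.version)   # 1
print(prompt_v1.template)  # raw template text, with {{variables}} intact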
Formatting Prompts with Variables
Once you have loaded a prompt object (a `Prompt` instance), you can populate its template variables using the `prompt.format()` method. This method takes keyword arguments whose keys match the variable names in your prompt template (without the `{{ }}` braces).
import mlflow
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Define a prompt template with a {{question}} variable
prompt_template = """\
You are an expert AI assistant. Answer the user's question with clarity, accuracy, and conciseness.

## Question:
{{question}}

## Guidelines:
- Keep responses factual and to the point.
- If relevant, provide examples or step-by-step instructions.
- If the question is ambiguous, clarify before answering.

Respond below:
"""

# Register the template as the first version of the prompt
prompt = mlflow.genai.register_prompt(
    name="ai_assistant_prompt",
    template=prompt_template,
    commit_message="Initial version of AI assistant",
)

question = "What is MLflow?"

# Fill in the {{question}} variable and send the result to the LLM
response = (
    client.chat.completions.create(
        messages=[{"role": "user", "content": prompt.format(question=question)}],
        model="gpt-4o-mini",
        temperature=0.1,
        max_tokens=2000,
    )
    .choices[0]
    .message.content
)
Using Prompts with Other Frameworks (LangChain, LlamaIndex)
MLflow prompts use the `{{variable}}` double-brace syntax for templating. Some other popular frameworks, like LangChain and LlamaIndex, expect single-brace syntax (e.g., `{variable}`). To facilitate seamless integration, the MLflow `Prompt` object provides the `prompt.to_single_brace_format()` method, which returns the prompt template converted to single-brace format, ready to be used by these frameworks.
import mlflow
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Load the registered prompt (assumes version 2 of 'summarization-prompt' exists
# and that its template uses the {{num_sentences}} and {{sentences}} variables)
prompt = mlflow.genai.load_prompt("prompts:/summarization-prompt/2")

# Create a LangChain prompt object
langchain_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            # IMPORTANT: Convert the template from double- to single-brace format
            prompt.to_single_brace_format(),
        ),
        ("placeholder", "{messages}"),
    ]
)

# Define the LangChain chain
llm = ChatOpenAI()
chain = langchain_prompt | llm

# Invoke the chain
response = chain.invoke({"num_sentences": 1, "sentences": "This is a test sentence."})
print(response.content)
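LlamaIndex works the same way. A minimal sketch, assuming the llama-index package is installed and that the registered template defines `num_sentences` and `sentences` variables (import paths differ across llama-index versions):

import mlflow
from llama_index.core import PromptTemplate

# Load the registered prompt and convert it to single-brace syntax
prompt = mlflow.genai.load_prompt("prompts:/summarization-prompt/2")
li_prompt = PromptTemplate(prompt.to_single_brace_format())

# Populate the template variables to produce the final prompt string
text = li_prompt.format(num_sentences=1, sentences="This is a test sentence.")
print(text)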
Linking Prompts to Logged Models for Full Lineage
For complete reproducibility and traceability, it is crucial to log which specific prompt versions your application (or model) uses. When you log your GenAI application as an MLflow Model (e.g., using `mlflow.pyfunc.log_model()`, `mlflow.langchain.log_model()`, etc.), you should include information about the registry prompts it utilizes.

MLflow is designed to facilitate this. When a model is logged and that model's code loads prompts via `mlflow.genai.load_prompt()` with a `prompts:/` URI, MLflow can automatically record these prompt dependencies as part of the logged model's metadata.
Benefits of this linkage:
- Reproducibility: Knowing the exact prompt versions used by a model version allows you to reproduce its behavior precisely.
- Debugging: If a model version starts behaving unexpectedly, you can easily check if a change in an underlying prompt (even if updated via an alias) is the cause.
- Auditing and Governance: Maintain a clear record of which prompts were used for any given model version.
- Impact Analysis: Understand which models might be affected if a particular prompt version is found to be problematic.
How to Ensure Linkage (Conceptual)
While automatic detection is the goal, it is good practice to explicitly verify that your model logging code or serving environment captures these dependencies; a sketch follows the list below.

- Consistent URI usage: Always use the `prompts:/<name>/<version_or_alias>` URI with `mlflow.genai.load_prompt()` within your model's code.
- MLflow Model Logging: When you log your model (e.g., your agent or LangChain chain) using an appropriate `mlflow.<flavor>.log_model()` function, MLflow flavors designed for GenAI will often inspect the model for these `prompts:/` URIs and record them.
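As a concrete illustration, here is a minimal pyfunc sketch. The model class, its predict body, and the prompt name are hypothetical, the `log_model()` parameter names vary across MLflow versions, and whether prompt dependencies are captured automatically also depends on your MLflow version:

import mlflow

class SummarizerModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Loading via a prompts:/ URI inside the model's own code is what
        # allows MLflow to associate the prompt version with this model.
        self.prompt = mlflow.genai.load_prompt("prompts:/summarization-prompt/2")

    def predict(self, context, model_input):
        # Hypothetical inference step: return the formatted prompts;
        # a real model would pass them to an LLM.
        return [
            self.prompt.format(num_sentences=1, sentences=text)
            for text in model_input
        ]

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="summarizer",  # named 'name' in newer MLflow releases
        python_model=SummarizerModel(),
    )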
Error Handling
When integrating prompts, implement robust error handling, as sketched after this list:

- Prompt loading: Handle cases where a prompt name, version, or alias doesn't exist.
- Formatting: Catch errors when variables are missing or mismatched during `prompt.format()`.
- Logging: Ensure that failures during model logging (including prompt linkage) are caught and addressed.
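A minimal sketch of the first two checkpoints, assuming `MlflowException` as the catch-all for registry errors (the prompt name and local fallback template here are hypothetical, and the exact exception raised by `prompt.format()` for missing variables can vary by MLflow version):

import mlflow
from mlflow.exceptions import MlflowException

# Hypothetical local fallback, used when the registry is unavailable
FALLBACK_TEMPLATE = "Answer the user's question concisely: {question}"

def build_prompt_text(question: str) -> str:
    try:
        # Prompt loading: fails if the name, version, or alias doesn't exist
        prompt = mlflow.genai.load_prompt("prompts:/ai_assistant_prompt@production")
        # Formatting: fails if required template variables are missing
        return prompt.format(question=question)
    except MlflowException as e:
        print(f"Prompt registry error, using local fallback: {e}")
        return FALLBACK_TEMPLATE.format(question=question)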
Key Takeaways
- Use `mlflow.genai.load_prompt(name_or_uri="prompts:/<name>/<version_or_alias>")` to fetch prompts from the registry.
- Populate template variables with `prompt.format(**kwargs)`.
- Leverage `prompt.to_single_brace_format()` for compatibility with frameworks like LangChain.
- Logging your application as an MLflow Model should capture the specific prompt versions used, providing crucial lineage and reproducibility.
- Using aliases (e.g., `prompts:/my-prompt@production`) in your application code allows you to update the underlying prompt version in the registry without redeploying your application (see the sketch after this list).
- Robust error handling is essential for reliable application behavior.
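Promoting a new prompt version is then a single registry operation. A minimal sketch, assuming your MLflow release exposes `mlflow.genai.set_prompt_alias()` (in older releases the equivalent function lives at the top-level `mlflow` module):

import mlflow

# Point the 'production' alias at version 3. Applications that load
# prompts:/my-prompt@production pick up the new version on their next
# load_prompt() call, with no redeployment required.
mlflow.genai.set_prompt_alias(name="my-prompt", alias="production", version=3)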
Prerequisites
- Application codebase where you intend to use prompts.
- Access to an MLflow Prompt Registry with registered prompts.
- Familiarity with MLflow Model Logging concepts if you plan to link prompts to models.
By following these guidelines, you can effectively integrate version-controlled prompts from the MLflow Prompt Registry into your applications, enhancing their maintainability, reliability, and traceability.