{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Retriever Evaluation with MLflow" ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "f8938de9-7fae-41cd-ad6b-7ee26c288eab", "showTitle": false, "title": "" } }, "source": [ "In MLflow 2.8.0, we introduced a new model type \"retriever\" to the `mlflow.evaluate()` API. It helps you evaluate the retriever in a RAG application. It provides two built-in metrics, `precision_at_k` and `recall_at_k`; `ndcg_at_k` is available as of MLflow 2.9.0.\n", "\n", "This notebook illustrates how to use `mlflow.evaluate()` to evaluate the retriever in a RAG application. It has the following steps:\n", "\n", "* Step 1: Install and Load Packages\n", "* Step 2: Evaluation Dataset Preparation\n", "* Step 3: Calling `mlflow.evaluate()`\n", "* Step 4: Result Analysis and Visualization" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: Install and Load Packages" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "5bf12edb-2498-4edd-aeff-b4844451850f", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "%pip install mlflow==2.9.0 langchain==0.0.339 openai faiss-cpu gensim nltk pyLDAvis tiktoken" ] },
{ "cell_type": "code", "execution_count": 1, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "414eb948-7f7a-411b-8308-facadb0bdde8", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "import ast\n", "import os\n", "import pprint\n", "from typing import List\n", "\n", "import pandas as pd\n", "from langchain.docstore.document import Document\n", "from langchain.embeddings.openai import OpenAIEmbeddings\n", "from langchain.text_splitter import CharacterTextSplitter\n", "from langchain.vectorstores import FAISS\n", "\n", "import mlflow\n", "\n", "os.environ[\"OPENAI_API_KEY\"] = \"\"\n", "\n", "CHUNK_SIZE = 1000\n", "\n", "# Assume running from https://github.com/mlflow/mlflow/blob/master/examples/llms/rag\n", "OUTPUT_DF_PATH = \"question_answer_source.csv\"\n", "SCRAPPED_DOCS_PATH = \"mlflow_docs_scraped.csv\"\n", "EVALUATION_DATASET_PATH = \"static_evaluation_dataset.csv\"\n", "DB_PERSIST_DIR = \"faiss_index\"" ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "eebcf0d9-6634-47d9-808d-e79c5a50fbbf", "showTitle": false, "title": "" } }, "source": [ "## Step 2: Evaluation Dataset Preparation\n", "The evaluation dataset should contain three columns: questions, ground-truth doc IDs, and retrieved relevant doc IDs. A \"doc ID\" is a unique string identifier for a document in your RAG application. For example, it could be the URL of a documentation web page, or the file path of a PDF document.\n", "\n", "If you have a list of questions that you would like to evaluate, please see the \"Manual Preparation\" section below. If you do not have a question list yet, please see the \"Generate the Evaluation Dataset\" section.\n" ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "f8a690cc-7672-4f24-8518-8faabfc9afea", "showTitle": false, "title": "" } }, "source": [ "### Manual Preparation\n", "\n", "When evaluating a retriever, it's recommended to save the retrieved document IDs into a static dataset, represented by a pandas DataFrame or an MLflow Pandas Dataset, that contains the input queries, the retrieved relevant document IDs, and the ground-truth document IDs for the evaluation.\n", "\n", "#### Concepts\n", "\n", "A \"document ID\" is a string that identifies a document.\n", "\n", "A list of \"retrieved relevant document IDs\" is the output of the retriever for a specific input query and a `k` value.\n", "\n", "A list of \"ground-truth document IDs\" contains the labeled relevant documents for a specific input query.\n", "\n", "#### Expected Data Format\n", "\n", "For each row, the retrieved relevant document IDs and the ground-truth relevant document IDs should be provided as a tuple of document ID strings (the examples in this notebook use Python lists).\n", "\n", "The column name of the retrieved relevant document IDs should be specified by the `predictions` parameter, and the column name of the ground-truth relevant document IDs should be specified by the `targets` parameter.\n", "\n", "Here is a simple example dataset that illustrates the expected data format. The doc IDs are the paths of the documentation pages."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "1a61b1b2-582e-49d5-864d-b58d2b6c3392", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "data = pd.DataFrame(\n", " {\n", " \"questions\": [\n", " \"What is MLflow?\",\n", " \"What is Databricks?\",\n", " \"How to serve a model on Databricks?\",\n", " \"How to enable MLflow Autologging for my workspace by default?\",\n", " ],\n", " \"retrieved_context\": [\n", " [\n", " \"mlflow/index.html\",\n", " \"mlflow/quick-start.html\",\n", " ],\n", " [\n", " \"introduction/index.html\",\n", " \"getting-started/overview.html\",\n", " ],\n", " [\n", " \"machine-learning/model-serving/index.html\",\n", " \"machine-learning/model-serving/model-serving-intro.html\",\n", " ],\n", " [],\n", " ],\n", " \"ground_truth_context\": [\n", " [\"mlflow/index.html\"],\n", " [\"introduction/index.html\"],\n", " [\n", " \"machine-learning/model-serving/index.html\",\n", " \"machine-learning/model-serving/llm-optimized-model-serving.html\",\n", " ],\n", " [\"mlflow/databricks-autologging.html\"],\n", " ],\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "f740b47c-71ee-4633-944c-172887ff5081", "showTitle": false, "title": "" } }, "source": [ "### Generate the Evaluation Dataset\n", "There are two steps to generate the evaluation dataset: generate questions with ground truth doc IDs and retrieve relevant doc IDs. " ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "f6beddae-85e2-44e7-8ec6-7ca2f02bc16b", "showTitle": false, "title": "" } }, "source": [ "\n", "#### Generate Questions with Ground Truth Doc IDs\n", "If you don't have a list of questions to evaluate, you can generate them using LLMs. 
The [Question Generation Notebook](https://mlflow.org/docs/latest/llms/rag/notebooks/question-generation-retrieval-evaluation.html) provides an example way to do it. Here is the result of running that notebook." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "98bf55c7-3e58-4fff-bc0e-1af58d64839f", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "generated_df = pd.read_csv(OUTPUT_DF_PATH)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "17baa097-457f-46df-9e25-56061972785f", "showTitle": false, "title": "" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>question</th><th>answer</th><th>chunk</th><th>chunk_id</th><th>source</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>What is the purpose of the MLflow Model Registry?</td><td>The purpose of the MLflow Model Registry is to...</td><td>Documentation MLflow Model Registry MLflow Mod...</td><td>0</td><td>model-registry.html</td></tr>\n",
"    <tr><th>1</th><td>What is the purpose of registering a model wit...</td><td>The purpose of registering a model with the Mo...</td><td>logged, this model can then be registered with...</td><td>1</td><td>model-registry.html</td></tr>\n",
"    <tr><th>2</th><td>What can you do with registered models and mod...</td><td>With registered models and model versions, you...</td><td>associate with registered models and model ver...</td><td>2</td><td>model-registry.html</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " question \\\n", "0 What is the purpose of the MLflow Model Registry? \n", "1 What is the purpose of registering a model wit... \n", "2 What can you do with registered models and mod... \n", "\n", " answer \\\n", "0 The purpose of the MLflow Model Registry is to... \n", "1 The purpose of registering a model with the Mo... \n", "2 With registered models and model versions, you... \n", "\n", " chunk chunk_id \\\n", "0 Documentation MLflow Model Registry MLflow Mod... 0 \n", "1 logged, this model can then be registered with... 1 \n", "2 associate with registered models and model ver... 2 \n", "\n", " source \n", "0 model-registry.html \n", "1 model-registry.html \n", "2 model-registry.html " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generated_df.head(3)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "93165dc5-aff9-46f9-83ab-e6dbfcbbc32b", "showTitle": false, "title": "" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>question</th><th>source</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>What is the purpose of the MLflow Model Registry?</td><td>[model-registry.html]</td></tr>\n",
"    <tr><th>1</th><td>What is the purpose of registering a model wit...</td><td>[model-registry.html]</td></tr>\n",
"    <tr><th>2</th><td>What can you do with registered models and mod...</td><td>[model-registry.html]</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " question source\n", "0 What is the purpose of the MLflow Model Registry? [model-registry.html]\n", "1 What is the purpose of registering a model wit... [model-registry.html]\n", "2 What can you do with registered models and mod... [model-registry.html]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Prepare dataframe `data` with the required format\n", "data = pd.DataFrame({})\n", "data[\"question\"] = generated_df[\"question\"].copy(deep=True)\n", "data[\"source\"] = generated_df[\"source\"].apply(lambda x: [x])\n", "data.head(3)" ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "3eabe651-28be-45bb-94ad-58e6bc582137", "showTitle": false, "title": "" } }, "source": [ "#### Retrieve Relevant Doc IDs\n", "\n", "Once we have a list of questions with ground-truth doc IDs from the previous step, we can collect the retrieved relevant doc IDs. In this tutorial, we use a LangChain retriever. You can plug in your own retriever as needed." ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "9817f671-f2fd-4b2e-abe9-3bc9afd9ce3c", "showTitle": false, "title": "" } }, "source": [ "First, we build a FAISS retriever from the docs saved at https://github.com/mlflow/mlflow/blob/master/examples/llms/question_generation/mlflow_docs_scraped.csv. See the [Question Generation Notebook](https://mlflow.org/docs/latest/llms/rag/notebooks/question-generation-retrieval-evaluation.html) for how to create this csv file."
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "178b45b4-11f9-47ca-9564-c8caa32d2504", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "embeddings = OpenAIEmbeddings()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "e5a113bb-11b8-4d1a-a21b-b59b523f3525", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "scrapped_df = pd.read_csv(SCRAPPED_DOCS_PATH)\n", "list_of_documents = [\n", " Document(page_content=row[\"text\"], metadata={\"source\": row[\"source\"]})\n", " for i, row in scrapped_df.iterrows()\n", "]\n", "text_splitter = CharacterTextSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=0)\n", "docs = text_splitter.split_documents(list_of_documents)\n", "db = FAISS.from_documents(docs, embeddings)\n", "\n", "# Save the db to local disk\n", "db.save_local(DB_PERSIST_DIR)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "bace7c63-e3d5-42f3-bf6a-00ef1842baae", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "# Load the db from local disk\n", "db = FAISS.load_local(DB_PERSIST_DIR, embeddings)\n", "retriever = db.as_retriever()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "c06bcb3c-58c8-454c-bf5b-e29ec227991f", "showTitle": false, "title": "" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Test the retriever with a query\n", "retrieved_docs = 
retriever.get_relevant_documents(\n", " \"What is the purpose of the MLflow Model Registry?\"\n", ")\n", "len(retrieved_docs)" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "2ec7b458-c248-4ec0-9d85-0e447d6b4ecd", "showTitle": false, "title": "" } }, "source": [ "After building a retriever, we define a function that takes a question string as input and returns a list of relevant doc ID strings." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "bc688e4b-3389-4804-b7bf-159bce4f9db8", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "# Define a function to return a list of retrieved doc ids\n", "def retrieve_doc_ids(question: str) -> List[str]:\n", " docs = retriever.get_relevant_documents(question)\n", " return [doc.metadata[\"source\"] for doc in docs]" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "330a3336-6ca2-455f-a1ae-5bd842a4d2bb", "showTitle": false, "title": "" } }, "source": [ "We can store the retrieved doc IDs in the dataframe as a column \"retrieved_doc_ids\"." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "f96ec69b-bea3-4023-8cd3-6bee1e327ff0", "showTitle": false, "title": "" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>question</th><th>source</th><th>retrieved_doc_ids</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>What is the purpose of the MLflow Model Registry?</td><td>[model-registry.html]</td><td>[model-registry.html, introduction/index.html,...</td></tr>\n",
"    <tr><th>1</th><td>What is the purpose of registering a model wit...</td><td>[model-registry.html]</td><td>[model-registry.html, models.html, introductio...</td></tr>\n",
"    <tr><th>2</th><td>What can you do with registered models and mod...</td><td>[model-registry.html]</td><td>[model-registry.html, models.html, deployment/...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " question source \\\n", "0 What is the purpose of the MLflow Model Registry? [model-registry.html] \n", "1 What is the purpose of registering a model wit... [model-registry.html] \n", "2 What can you do with registered models and mod... [model-registry.html] \n", "\n", " retrieved_doc_ids \n", "0 [model-registry.html, introduction/index.html,... \n", "1 [model-registry.html, models.html, introductio... \n", "2 [model-registry.html, models.html, deployment/... " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[\"retrieved_doc_ids\"] = data[\"question\"].apply(retrieve_doc_ids)\n", "data.head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "5e5c4cd1-38c3-4709-8d41-6e319fb8a924", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "# Persist the static evaluation dataset to disk\n", "data.to_csv(EVALUATION_DATASET_PATH, index=False)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "deabd8f0-44cf-409f-a27b-e82dd4d99940", "showTitle": false, "title": "" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>question</th><th>source</th><th>retrieved_doc_ids</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>What is the purpose of the MLflow Model Registry?</td><td>[model-registry.html]</td><td>[model-registry.html, introduction/index.html,...</td></tr>\n",
"    <tr><th>1</th><td>What is the purpose of registering a model wit...</td><td>[model-registry.html]</td><td>[model-registry.html, models.html, introductio...</td></tr>\n",
"    <tr><th>2</th><td>What can you do with registered models and mod...</td><td>[model-registry.html]</td><td>[model-registry.html, models.html, deployment/...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " question source \\\n", "0 What is the purpose of the MLflow Model Registry? [model-registry.html] \n", "1 What is the purpose of registering a model wit... [model-registry.html] \n", "2 What can you do with registered models and mod... [model-registry.html] \n", "\n", " retrieved_doc_ids \n", "0 [model-registry.html, introduction/index.html,... \n", "1 [model-registry.html, models.html, introductio... \n", "2 [model-registry.html, models.html, deployment/... " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Load the static evaluation dataset from disk and deserialize the source and retrieved doc ids\n", "data = pd.read_csv(EVALUATION_DATASET_PATH)\n", "data[\"source\"] = data[\"source\"].apply(ast.literal_eval)\n", "data[\"retrieved_doc_ids\"] = data[\"retrieved_doc_ids\"].apply(ast.literal_eval)\n", "data.head(3)" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "c62b5306-9c0d-4ce8-8c4a-23cb1ecc7f66", "showTitle": false, "title": "" } }, "source": [ "## Step 3: Calling `mlflow.evaluate()`" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "bdeebdcc-b4e7-4f9d-8fdc-366f9c13ed20", "showTitle": false, "title": "" } }, "source": [ "### Metrics Definition\n", "\n", "There are three built-in metrics provided for the retriever model type. Click the metric name below to see the metrics definitions.\n", "\n", "1. [mlflow.metrics.precision_at_k(k)](https://mlflow.org/docs/latest/python_api/mlflow.metrics.html#mlflow.metrics.precision_at_k)\n", "1. [mlflow.metrics.recall_at_k(k)](https://mlflow.org/docs/latest/python_api/mlflow.metrics.html#mlflow.metrics.recall_at_k)\n", "1. 
[mlflow.metrics.ndcg_at_k(k)](https://mlflow.org/docs/latest/python_api/mlflow.metrics.html#mlflow.metrics.ndcg_at_k) \n", "\n", "All metrics compute a score between 0 and 1 for each row representing the corresponding metric of the retriever model at the given `k` value.\n", "\n", "The `k` parameter should be a positive integer representing the number of retrieved documents\n", "to evaluate for each row. `k` defaults to 3.\n", "\n", "When the model type is `\"retriever\"`, these metrics will be calculated automatically with the\n", "default `k` value of 3.\n" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "25a32237-b2ef-4f4a-9e5a-4537e7e43012", "showTitle": false, "title": "" } }, "source": [ "### Basic usage\n", "\n", "There are two supported ways to specify the retriever's output:\n", "\n", "* Case 1: Save the retriever's output to a static evaluation dataset\n", "* Case 2: Wrap the retriever in a function" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "0390728a-a6cf-4c84-867a-0c6832114471", "showTitle": false, "title": "" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2023/11/22 14:39:59 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. 
object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).\n", "2023/11/22 14:39:59 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.\n", "2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...\n", "2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: precision_at_3\n", "2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: recall_at_3\n", "2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: ndcg_at_3\n" ] } ], "source": [ "# Case 1: Evaluating a static evaluation dataset\n", "with mlflow.start_run() as run:\n", " evaluate_results = mlflow.evaluate(\n", " data=data,\n", " model_type=\"retriever\",\n", " targets=\"source\",\n", " predictions=\"retrieved_doc_ids\",\n", " evaluators=\"default\",\n", " )" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "70aa6719-f69d-4fda-8a67-ac4e0d8ea6d8", "showTitle": false, "title": "" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>question</th><th>source</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>What is the purpose of the MLflow Model Registry?</td><td>[model-registry.html]</td></tr>\n",
"    <tr><th>1</th><td>What is the purpose of registering a model wit...</td><td>[model-registry.html]</td></tr>\n",
"    <tr><th>2</th><td>What can you do with registered models and mod...</td><td>[model-registry.html]</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " question source\n", "0 What is the purpose of the MLflow Model Registry? [model-registry.html]\n", "1 What is the purpose of registering a model wit... [model-registry.html]\n", "2 What can you do with registered models and mod... [model-registry.html]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "question_source_df = data[[\"question\", \"source\"]]\n", "question_source_df.head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "00672280-3dfc-4c00-9ae2-bea50732ef8b", "showTitle": false, "title": "" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2023/11/22 14:09:12 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).\n", "2023/11/22 14:09:12 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.\n", "2023/11/22 14:09:12 INFO mlflow.models.evaluation.default_evaluator: Computing model predictions.\n", "2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...\n", "2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: precision_at_3\n", "2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: recall_at_3\n", "2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: ndcg_at_3\n" ] } ], "source": [ "# Case 2: Evaluating a function\n", "def retriever_model_function(question_df: pd.DataFrame) -> pd.Series:\n", " return question_df[\"question\"].apply(retrieve_doc_ids)\n", "\n", "\n", "with mlflow.start_run() as run:\n", " 
evaluate_results = mlflow.evaluate(\n", " model=retriever_model_function,\n", " data=question_source_df,\n", " model_type=\"retriever\",\n", " targets=\"source\",\n", " evaluators=\"default\",\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "cb24318b-6149-4703-ad06-731c8a75866f", "showTitle": false, "title": "" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{ 'ndcg_at_3/mean': 0.7530888125490431,\n", " 'ndcg_at_3/p90': 1.0,\n", " 'ndcg_at_3/variance': 0.1209151911325433,\n", " 'precision_at_3/mean': 0.26785714285714285,\n", " 'precision_at_3/p90': 0.3333333333333333,\n", " 'precision_at_3/variance': 0.017538265306122448,\n", " 'recall_at_3/mean': 0.8035714285714286,\n", " 'recall_at_3/p90': 1.0,\n", " 'recall_at_3/variance': 0.15784438775510204}\n" ] } ], "source": [ "pp = pprint.PrettyPrinter(indent=4)\n", "pp.pprint(evaluate_results.metrics)" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "7a9b83c5-1544-4b0e-81f6-7abc1fafa258", "showTitle": false, "title": "" } }, "source": [ "### Try different k values\n", "To use another `k` value, use the `evaluator_config` parameter\n", "in the `mlflow.evaluate()` API as follows: `evaluator_config={\"retriever_k\": <k_value>}`.\n", "\n", "\n", "```python\n", "# Case 1: Specifying the model type\n", "evaluate_results = mlflow.evaluate(\n", " data=data,\n", " model_type=\"retriever\",\n", " targets=\"ground_truth_context\",\n", " predictions=\"retrieved_context\",\n", " evaluators=\"default\",\n", " evaluator_config={\"retriever_k\": 5}\n", " )\n", "```\n", "\n", "Alternatively, you can directly specify the desired metrics\n", "in the `extra_metrics` parameter of the `mlflow.evaluate()` API without specifying a model\n", "type. 
In this case, the `k` value specified in the `evaluator_config` parameter will be\n", "ignored.\n", "\n", "\n", "```python\n", "# Case 2: Specifying the extra_metrics\n", "evaluate_results = mlflow.evaluate(\n", " data=data,\n", " targets=\"ground_truth_context\",\n", " predictions=\"retrieved_context\",\n", " extra_metrics=[\n", " mlflow.metrics.precision_at_k(4),\n", " mlflow.metrics.precision_at_k(5)\n", " ],\n", " )\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "4b7174aa-0aa2-497d-aaa5-842121fcf270", "showTitle": false, "title": "" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2023/11/22 14:40:22 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_1\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_2\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_3\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_1\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_2\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_3\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating 
metrics: ndcg_at_1\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_2\n", "2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_3\n" ] } ], "source": [ "with mlflow.start_run() as run:\n", " evaluate_results = mlflow.evaluate(\n", " data=data,\n", " targets=\"source\",\n", " predictions=\"retrieved_doc_ids\",\n", " evaluators=\"default\",\n", " extra_metrics=[\n", " mlflow.metrics.precision_at_k(1),\n", " mlflow.metrics.precision_at_k(2),\n", " mlflow.metrics.precision_at_k(3),\n", " mlflow.metrics.recall_at_k(1),\n", " mlflow.metrics.recall_at_k(2),\n", " mlflow.metrics.recall_at_k(3),\n", " mlflow.metrics.ndcg_at_k(1),\n", " mlflow.metrics.ndcg_at_k(2),\n", " mlflow.metrics.ndcg_at_k(3),\n", " ],\n", " )" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "d57c201b-3718-43af-b8c2-ef22bfa2c15b", "showTitle": false, "title": "" } }, "outputs": [ { "data": { "image/png": "", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "# Plotting each metric\n", "for metric_name in [\"precision\", \"recall\", \"ndcg\"]:\n", " y = [evaluate_results.metrics[f\"{metric_name}_at_{k}/mean\"] for k in range(1, 4)]\n", " plt.plot([1, 2, 3], y, label=f\"{metric_name}@k\")\n", "\n", "# Adding labels and title\n", "plt.xlabel(\"k\")\n", "plt.ylabel(\"Metric Value\")\n", "plt.title(\"Metrics Comparison at Different Ks\")\n", "# Setting x-axis ticks\n", "plt.xticks([1, 2, 3])\n", "plt.legend()\n", "\n", "# Display the plot\n", "plt.show()" ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "cac23d4b-bece-4274-836f-9ca2b7c3860d", "showTitle": false, "title": "" } }, "source": [ "### Corner case handling\n", "\n", "A few corner cases are handled specially for each built-in metric." ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "e05a4ede-db44-46d2-bce8-752b0ce5d807", "showTitle": false, "title": "" } }, "source": [ "#### Empty retrieved document IDs\n", "\n", "When no relevant docs are retrieved:\n", "\n", "- `mlflow.metrics.precision_at_k(k)` is defined as:\n", " * 0 if the ground-truth doc IDs is non-empty\n", " * 1 if the ground-truth doc IDs is also empty\n", "\n", "- `mlflow.metrics.ndcg_at_k(k)` is defined as:\n", " * 0 if the ground-truth doc IDs is non-empty\n", " * 1 if the ground-truth doc IDs is also empty" ] },
{ "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "931a32e7-29cb-4a22-b94e-ea2bf4f0b1a7", "showTitle": false, "title": "" } }, "source": [ "#### Empty ground-truth document IDs\n", "\n", "When no ground-truth document IDs are provided:\n", "\n", "- `mlflow.metrics.recall_at_k(k)` is defined as:\n", " * 0 if the retrieved doc IDs 
is non-empty\n", " * 1 if the retrieved doc IDs is also empty\n", "\n", "- `mlflow.metrics.ndcg_at_k(k)` is defined as:\n", " * 0 if the retrieved doc IDs is non-empty\n", " * 1 if the retrieved doc IDs is also empty" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "5a1453f6-a62d-43da-b230-955841c66651", "showTitle": false, "title": "" } }, "source": [ "#### Duplicate retreived document IDs\n", "\n", "It is a common case for the retriever in a RAG system to retrieve multiple chunks in the same document for a given query. In this case, `mlflow.metrics.ndcg_at_k(k)` is calculated as follows:\n", "\n", "If the duplicate doc IDs are in the ground truth,\n", " they will be treated as different docs. For example, if the ground truth doc IDs are\n", " [1, 2] and the retrieved doc IDs are [1, 1, 1, 3], the score will be equavalent to\n", " ground truth doc IDs [10, 11, 12, 2] and retrieved doc IDs [10, 11, 12, 3].\n", "\n", "If the duplicate doc IDs are not in the ground truth, the ndcg score is calculated as normal." ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "525ccc10-3a60-4dc9-804e-083cfa313349", "showTitle": false, "title": "" } }, "source": [ "## Step 4: Result Analysis and Visualization\n", "\n", "You can view the per-row scores in the logged \"eval_results_table.json\" in artifacts by either loading it to a pandas dataframe (shown below) or visiting the MLflow run comparison UI." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "32f3d5b3-245c-46b7-87ce-d85e261eac28", "showTitle": true, "title": "" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ee4bfb1998174c558e537ebb1dd737d9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading artifacts: 0%| | 0/1 [00:00" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ " question source \\\n", "0 What is the purpose of the MLflow Model Registry? [model-registry.html] \n", "1 What is the purpose of registering a model wit... [model-registry.html] \n", "2 What can you do with registered models and mod... [model-registry.html] \n", "3 How can you add, modify, update, or delete a m... [model-registry.html] \n", "4 How can you deploy and organize models in the ... [model-registry.html] \n", "\n", " retrieved_doc_ids precision_at_1/score \\\n", "0 [model-registry.html, introduction/index.html,... 1 \n", "1 [model-registry.html, models.html, introductio... 1 \n", "2 [model-registry.html, models.html, deployment/... 1 \n", "3 [model-registry.html, models.html, deployment/... 1 \n", "4 [model-registry.html, deployment/index.html, d... 1 \n", "\n", " precision_at_2/score precision_at_3/score recall_at_1/score \\\n", "0 0.5 0.333333 1 \n", "1 0.5 0.333333 1 \n", "2 0.5 0.333333 1 \n", "3 0.5 0.333333 1 \n", "4 0.5 0.333333 1 \n", "\n", " recall_at_2/score recall_at_3/score ndcg_at_1/score ndcg_at_2/score \\\n", "0 1 1 1 1.0 \n", "1 1 1 1 1.0 \n", "2 1 1 1 1.0 \n", "3 1 1 1 1.0 \n", "4 1 1 1 1.0 \n", "\n", " ndcg_at_3/score \n", "0 0.919721 \n", "1 1.000000 \n", "2 1.000000 \n", "3 1.000000 \n", "4 0.919721 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "eval_results_table = evaluate_results.tables[\"eval_results_table\"]\n", "eval_results_table.head(5)" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "cf18dd29-1017-4245-9f3b-923dbd46f742", "showTitle": false, "title": "" } }, "source": [ "With the evaluate results table, you can further visualize the well-answered questions and poorly-answered questions using topical analysis techniques."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "b1d9e40a-ccf6-4d6a-b24c-8cf41bbfa005", "showTitle": true, "title": "Utility functions" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "[nltk_data] Downloading package punkt to\n", "[nltk_data] /Users/liang.zhang/nltk_data...\n", "[nltk_data] Package punkt is already up-to-date!\n", "[nltk_data] Downloading package stopwords to\n", "[nltk_data] /Users/liang.zhang/nltk_data...\n", "[nltk_data] Package stopwords is already up-to-date!\n" ] } ], "source": [ "import nltk\n", "import pyLDAvis\n", "import pyLDAvis.gensim_models as gensimvis\n", "from gensim import corpora, models\n", "from nltk.corpus import stopwords\n", "from nltk.tokenize import word_tokenize\n", "\n", "# Initialize NLTK resources\n", "nltk.download(\"punkt\")\n", "nltk.download(\"stopwords\")\n", "\n", "\n", "def topical_analysis(questions: List[str]):\n", "    stop_words = set(stopwords.words(\"english\"))\n", "\n", "    # Tokenize and remove stop words\n", "    tokenized_data = []\n", "    for question in questions:\n", "        tokens = word_tokenize(question.lower())\n", "        filtered_tokens = [word for word in tokens if word not in stop_words and word.isalpha()]\n", "        tokenized_data.append(filtered_tokens)\n", "\n", "    # Create a dictionary and corpus\n", "    dictionary = corpora.Dictionary(tokenized_data)\n", "    corpus = [dictionary.doc2bow(text) for text in tokenized_data]\n", "\n", "    # Apply LDA model\n", "    lda_model = models.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=15)\n", "\n", "    # Get topic distribution for each question\n", "    topic_distribution = []\n", "    for i, ques in enumerate(questions):\n", "        bow = dictionary.doc2bow(tokenized_data[i])\n", "        topics = lda_model.get_document_topics(bow)\n", "        topic_distribution.append(topics)\n", "        print(f\"Question: {ques}\\nTopic: {topics}\")\n", "\n", "    # Print all topics\n", "    print(\"\\nTopics found are:\")\n", "    for idx, topic in lda_model.print_topics(-1):\n", "        print(f\"Topic: {idx} \\nWords: {topic}\\n\")\n", "    return lda_model, corpus, dictionary" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "e892d804-a4d8-468c-93e2-acc4a5fbcf2c", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "filtered_df = eval_results_table[eval_results_table[\"precision_at_1/score\"] == 1]\n", "hit_questions = filtered_df[\"question\"].tolist()\n", "filtered_df = eval_results_table[eval_results_table[\"precision_at_1/score\"] == 0]\n", "miss_questions = filtered_df[\"question\"].tolist()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "7c178b69-37d4-4a6b-9737-b93e7f3d75c5", "showTitle": false, "title": "" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Question: What is the purpose of the MLflow Model Registry?\n", "Topic: [(0, 0.0400703), (1, 0.040002838), (2, 0.040673085), (3, 0.04075462), (4, 0.8384991)]\n", "Question: What is the purpose of registering a model with the Model Registry?\n", "Topic: [(0, 0.0334267), (1, 0.033337697), (2, 0.033401005), (3, 0.033786207), (4, 0.8660484)]\n", "Question: What can you do with registered models and model versions?\n", "Topic: [(0, 0.04019648), (1, 0.04000775), (2, 0.040166058), (3, 0.8391777), (4, 0.040452003)]\n", "Question: How can you add, modify, update, or delete a model in the Model Registry?\n", "Topic: [(0, 0.025052568), (1, 0.025006149), (2, 0.025024023), (3, 0.025236268), (4, 0.899681)]\n", "Question: How can you deploy and organize models in the Model Registry?\n", "Topic: [(0, 0.033460867), (1, 0.033337582), (2, 0.033362914), (3, 0.8659808), (4, 
0.033857808)]\n", "Question: What method do you use to create a new registered model?\n", "Topic: [(0, 0.028867528), (1, 0.028582651), (2, 0.882546), (3, 0.030021703), (4, 0.029982116)]\n", "Question: How can you deploy and organize models in the Model Registry?\n", "Topic: [(0, 0.033460878), (1, 0.033337586), (2, 0.033362918), (3, 0.8659798), (4, 0.03385884)]\n", "Question: How can you fetch a list of registered models in the MLflow registry?\n", "Topic: [(0, 0.0286206), (1, 0.028577656), (2, 0.02894385), (3, 0.88495284), (4, 0.028905064)]\n", "Question: What is the default channel logged for models using MLflow v1.18 and above?\n", "Topic: [(0, 0.02862059), (1, 0.028577654), (2, 0.028883327), (3, 0.8851736), (4, 0.028744776)]\n", "Question: What information is stored in the conda.yaml file?\n", "Topic: [(0, 0.050020963), (1, 0.051287953), (2, 0.051250603), (3, 0.7968765), (4, 0.05056402)]\n", "Question: How can you save a model with a manually specified conda environment?\n", "Topic: [(0, 0.02862434), (1, 0.02858204), (2, 0.02886313), (3, 0.8851747), (4, 0.028755778)]\n", "Question: What are inference params and how are they used during model inference?\n", "Topic: [(0, 0.86457103), (1, 0.03353862), (2, 0.033417325), (3, 0.034004394), (4, 0.034468662)]\n", "Question: What is the purpose of model signatures in MLflow?\n", "Topic: [(0, 0.040070876), (1, 0.04000346), (2, 0.040688124), (3, 0.040469088), (4, 0.8387685)]\n", "Question: What is the API used to set signatures on models?\n", "Topic: [(0, 0.033873636), (1, 0.033508822), (2, 0.033337757), (3, 0.035357967), (4, 0.8639218)]\n", "Question: What components are used to generate the final time series?\n", "Topic: [(0, 0.028693806), (1, 0.8853218), (2, 0.028573763), (3, 0.02862714), (4, 0.0287835)]\n", "Question: What functionality does the configuration DataFrame submitted to the pyfunc flavor provide?\n", "Topic: [(0, 0.02519801), (1, 0.025009492), (2, 0.025004204), (3, 0.025004204), (4, 0.8997841)]\n", 
"Question: What is a common configuration for lowering the total memory pressure for pytorch models within transformers pipelines?\n", "Topic: [(0, 0.93316424), (1, 0.016669936), (2, 0.016668117), (3, 0.016788227), (4, 0.016709473)]\n", "Question: What does the save_model() function do?\n", "Topic: [(0, 0.10002145), (1, 0.59994656), (2, 0.10001026), (3, 0.10001026), (4, 0.10001151)]\n", "Question: What is an MLflow Project?\n", "Topic: [(0, 0.06667001), (1, 0.06667029), (2, 0.7321751), (3, 0.06711196), (4, 0.06737265)]\n", "Question: What are the entry points in a MLproject file and how can you specify parameters for them?\n", "Topic: [(0, 0.02857626), (1, 0.88541776), (2, 0.02868285), (3, 0.028626908), (4, 0.02869626)]\n", "Question: What are the project environments supported by MLflow?\n", "Topic: [(0, 0.040009078), (1, 0.040009864), (2, 0.839655), (3, 0.040126894), (4, 0.040199146)]\n", "Question: What is the purpose of specifying a Conda environment in an MLflow project?\n", "Topic: [(0, 0.028579442), (1, 0.028580135), (2, 0.8841217), (3, 0.028901232), (4, 0.029817443)]\n", "Question: What is the purpose of the MLproject file?\n", "Topic: [(0, 0.05001335), (1, 0.052611485), (2, 0.050071735), (3, 0.05043289), (4, 0.7968705)]\n", "Question: How can you pass runtime parameters to the entry point of an MLflow Project?\n", "Topic: [(0, 0.025007373), (1, 0.025498485), (2, 0.8993807), (3, 0.02504522), (4, 0.025068246)]\n", "Question: How does MLflow run a Project on Kubernetes?\n", "Topic: [(0, 0.04000677), (1, 0.040007353), (2, 0.83931196), (3, 0.04012452), (4, 0.04054937)]\n", "Question: What fields are replaced when MLflow creates a Kubernetes Job for an MLflow Project?\n", "Topic: [(0, 0.022228329), (1, 0.022228856), (2, 0.023192631), (3, 0.02235802), (4, 0.90999216)]\n", "Question: What is the syntax for searching runs using the MLflow UI and API?\n", "Topic: [(0, 0.025003674), (1, 0.02500399), (2, 0.02527212), (3, 0.89956146), (4, 0.025158761)]\n", "Question: 
What is the syntax for searching runs using the MLflow UI and API?\n", "Topic: [(0, 0.025003672), (1, 0.025003988), (2, 0.025272164), (3, 0.8995614), (4, 0.025158769)]\n", "Question: What are the key parts of a search expression in MLflow?\n", "Topic: [(0, 0.03334423), (1, 0.03334517), (2, 0.8662702), (3, 0.033611353), (4, 0.033429127)]\n", "Question: What are the key attributes for the model with the run_id 'a1b2c3d4' and run_name 'my-run'?\n", "Topic: [(0, 0.05017508), (1, 0.05001634), (2, 0.05058142), (3, 0.7985237), (4, 0.050703418)]\n", "Question: What information does each run record in MLflow Tracking?\n", "Topic: [(0, 0.03333968), (1, 0.033340227), (2, 0.86639804), (3, 0.03349555), (4, 0.033426523)]\n", "Question: What are the two components used by MLflow for storage?\n", "Topic: [(0, 0.0334928), (1, 0.033938777), (2, 0.033719826), (3, 0.03357158), (4, 0.86527705)]\n", "Question: What interfaces does the MLflow client use to record MLflow entities and artifacts when running MLflow on a local machine with a SQLAlchemy-compatible database?\n", "Topic: [(0, 0.014289577), (1, 0.014289909), (2, 0.94276434), (3, 0.014325481), (4, 0.014330726)]\n", "Question: What is the default backend store used by MLflow?\n", "Topic: [(0, 0.033753525), (1, 0.03379533), (2, 0.033777602), (3, 0.86454684), (4, 0.0341267)]\n", "Question: What information does autologging capture when launching short-lived MLflow runs?\n", "Topic: [(0, 0.028579954), (1, 0.02858069), (2, 0.8851724), (3, 0.029027484), (4, 0.028639426)]\n", "Question: What is the purpose of the --serve-artifacts flag?\n", "Topic: [(0, 0.06670548), (1, 0.066708855), (2, 0.067003354), (3, 0.3969311), (4, 0.40265122)]\n", "\n", "Topics found are:\n", "Topic: 0 \n", "Words: 0.059*\"inference\" + 0.032*\"models\" + 0.032*\"used\" + 0.032*\"configuration\" + 0.032*\"common\" + 0.032*\"transformers\" + 0.032*\"total\" + 0.032*\"within\" + 0.032*\"pytorch\" + 0.032*\"pipelines\"\n", "\n", "Topic: 1 \n", "Words: 0.036*\"file\" 
+ 0.035*\"mlproject\" + 0.035*\"used\" + 0.035*\"components\" + 0.035*\"entry\" + 0.035*\"parameters\" + 0.035*\"specify\" + 0.035*\"final\" + 0.035*\"points\" + 0.035*\"time\"\n", "\n", "Topic: 2 \n", "Words: 0.142*\"mlflow\" + 0.066*\"project\" + 0.028*\"information\" + 0.028*\"use\" + 0.028*\"record\" + 0.028*\"run\" + 0.015*\"key\" + 0.015*\"running\" + 0.015*\"artifacts\" + 0.015*\"client\"\n", "\n", "Topic: 3 \n", "Words: 0.066*\"models\" + 0.066*\"model\" + 0.066*\"mlflow\" + 0.041*\"using\" + 0.041*\"registry\" + 0.028*\"api\" + 0.028*\"registered\" + 0.028*\"runs\" + 0.028*\"syntax\" + 0.028*\"searching\"\n", "\n", "Topic: 4 \n", "Words: 0.089*\"model\" + 0.074*\"purpose\" + 0.074*\"mlflow\" + 0.046*\"registry\" + 0.031*\"used\" + 0.031*\"signatures\" + 0.017*\"kubernetes\" + 0.017*\"fields\" + 0.017*\"job\" + 0.017*\"replaced\"\n", "\n" ] } ], "source": [ "lda_model, corpus, dictionary = topical_analysis(hit_questions)\n", "vis_data = gensimvis.prepare(lda_model, corpus, dictionary)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "a0587a0f-b35d-488d-9054-55435a9585bf", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "# Uncomment the following line to render the interactive widget\n", "# pyLDAvis.display(vis_data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "1375250d-9818-4503-87ec-f14020d87c81", "showTitle": false, "title": "" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Question: What is the purpose of the mlflow.sklearn.log_model() method?\n", "Topic: [(0, 0.0669118), (1, 0.06701085), (2, 0.06667974), (3, 0.73235476), (4, 0.06704286)]\n", "Question: How can you fetch a specific model version?\n", "Topic: [(0, 0.83980393), (1, 0.040003464), (2, 0.04000601), 
(3, 0.040101767), (4, 0.040084846)]\n", "Question: How can you fetch the latest model version in a specific stage?\n", "Topic: [(0, 0.88561153), (1, 0.028575428), (2, 0.028578365), (3, 0.0286214), (4, 0.028613236)]\n", "Question: What can you do to promote MLflow Models across environments?\n", "Topic: [(0, 0.8661927), (1, 0.0333396), (2, 0.03362743), (3, 0.033428304), (4, 0.033411972)]\n", "Question: What is the name of the model and its version details?\n", "Topic: [(0, 0.83978903), (1, 0.04000637), (2, 0.04001106), (3, 0.040105395), (4, 0.040088095)]\n", "Question: What is the purpose of saving the model in pickled format?\n", "Topic: [(0, 0.033948876), (1, 0.03339717), (2, 0.033340737), (3, 0.86575514), (4, 0.033558063)]\n", "Question: What is an MLflow Model and what is its purpose?\n", "Topic: [(0, 0.7940762), (1, 0.05068333), (2, 0.050770763), (3, 0.053328265), (4, 0.05114142)]\n", "Question: What are the flavors defined in the MLmodel file for the mlflow.sklearn library?\n", "Topic: [(0, 0.86628276), (1, 0.033341788), (2, 0.03334801), (3, 0.03368498), (4, 0.033342462)]\n", "Question: What command can be used to package and deploy models to AWS SageMaker?\n", "Topic: [(0, 0.89991224), (1, 0.025005225), (2, 0.025009066), (3, 0.025006713), (4, 0.025066752)]\n", "Question: What is the purpose of the --build-image flag when running mlflow run?\n", "Topic: [(0, 0.033957016), (1, 0.033506736), (2, 0.034095332), (3, 0.034164555), (4, 0.86427635)]\n", "Question: What is the relative path to the python_env YAML file within the MLflow project's directory?\n", "Topic: [(0, 0.02243), (1, 0.02222536), (2, 0.022470985), (3, 0.9105873), (4, 0.02228631)]\n", "Question: What are the additional local volume mounted and environment variables in the docker container?\n", "Topic: [(0, 0.022225259), (1, 0.9110914), (2, 0.02222932), (3, 0.022227468), (4, 0.022226628)]\n", "Question: What are some examples of entity names that contain special characters?\n", "Topic: [(0, 
0.028575381), (1, 0.88568854), (2, 0.02858065), (3, 0.028578246), (4, 0.028577149)]\n", "Question: What type of constant does the RHS need to be if LHS is a metric?\n", "Topic: [(0, 0.028575381), (1, 0.8856886), (2, 0.028580645), (3, 0.028578239), (4, 0.028577147)]\n", "Question: How can you get all active runs from experiments IDs 3, 4, and 17 that used a CNN model with 10 layers and had a prediction accuracy of 94.5% or higher?\n", "Topic: [(0, 0.015563371), (1, 0.015387185), (2, 0.015389071), (3, 0.015427767), (4, 0.9382326)]\n", "Question: What is the purpose of the 'experimentIds' variable in the given paragraph?\n", "Topic: [(0, 0.040206533), (1, 0.8384999), (2, 0.040013183), (3, 0.040967643), (4, 0.040312726)]\n", "Question: What is the MLflow Tracking component used for?\n", "Topic: [(0, 0.8390845), (1, 0.04000697), (2, 0.040462855), (3, 0.04014182), (4, 0.040303845)]\n", "Question: How can you create an experiment in MLflow?\n", "Topic: [(0, 0.050333958), (1, 0.0500024), (2, 0.7993825), (3, 0.050153885), (4, 0.05012722)]\n", "Question: How can you create an experiment using MLflow?\n", "Topic: [(0, 0.04019285), (1, 0.04000254), (2, 0.8396381), (3, 0.040091105), (4, 0.04007539)]\n", "Question: What is the architecture depicted in this example scenario?\n", "Topic: [(0, 0.04000523), (1, 0.040007014), (2, 0.040012203), (3, 0.04000902), (4, 0.83996654)]\n", "\n", "Topics found are:\n", "Topic: 0 \n", "Words: 0.078*\"model\" + 0.059*\"mlflow\" + 0.059*\"version\" + 0.041*\"models\" + 0.041*\"fetch\" + 0.041*\"specific\" + 0.041*\"used\" + 0.022*\"command\" + 0.022*\"deploy\" + 0.022*\"sagemaker\"\n", "\n", "Topic: 1 \n", "Words: 0.030*\"local\" + 0.030*\"container\" + 0.030*\"variables\" + 0.030*\"docker\" + 0.030*\"mounted\" + 0.030*\"environment\" + 0.030*\"volume\" + 0.030*\"additional\" + 0.030*\"special\" + 0.030*\"names\"\n", "\n", "Topic: 2 \n", "Words: 0.096*\"experiment\" + 0.096*\"create\" + 0.096*\"mlflow\" + 0.051*\"using\" + 0.009*\"purpose\" + 
0.009*\"model\" + 0.009*\"method\" + 0.009*\"file\" + 0.009*\"version\" + 0.009*\"used\"\n", "\n", "Topic: 3 \n", "Words: 0.071*\"purpose\" + 0.039*\"file\" + 0.039*\"mlflow\" + 0.039*\"yaml\" + 0.039*\"directory\" + 0.039*\"relative\" + 0.039*\"within\" + 0.039*\"path\" + 0.039*\"project\" + 0.039*\"format\"\n", "\n", "Topic: 4 \n", "Words: 0.032*\"purpose\" + 0.032*\"used\" + 0.032*\"model\" + 0.032*\"prediction\" + 0.032*\"get\" + 0.032*\"accuracy\" + 0.032*\"active\" + 0.032*\"layers\" + 0.032*\"higher\" + 0.032*\"experiments\"\n", "\n" ] } ], "source": [ "lda_model, corpus, dictionary = topical_analysis(miss_questions)\n", "vis_data = gensimvis.prepare(lda_model, corpus, dictionary)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "724db985-5382-43a6-ada5-0ac1c2d49c18", "showTitle": false, "title": "" } }, "outputs": [], "source": [ "# Uncomment the following line to render the interactive widget\n", "# pyLDAvis.display(vis_data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "31945151-7cf9-4f25-af30-d9b9bd526e7b", "showTitle": false, "title": "" } }, "outputs": [], "source": [] } ], "metadata": { "application/vnd.databricks.v1+notebook": { "dashboards": [], "language": "python", "notebookMetadata": { "pythonIndentUnit": 4 }, "notebookName": "retriever-evaluation-tutorial", "widgets": {} }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.17" } }, "nbformat": 4, "nbformat_minor": 1 }