Plugin Evaluators

MLflow's evaluation framework is designed for extensibility, so specialized evaluation plugins can integrate with the core evaluation workflow. These plugins extend MLflow with domain-specific validation, vulnerability scanning, and specialized testing frameworks developed by the broader ML community.

Available Plugins

MLflow currently supports two evaluation plugins that bring specialized validation capabilities to your model evaluation workflows:

Giskard Plugin - Advanced Vulnerability Scanning

The Giskard plugin extends MLflow's validation capabilities to help you anticipate issues before they reach production. It scans models for hidden vulnerabilities that traditional metrics might miss.

Key Capabilities

Vulnerability Detection: Giskard scans models to identify critical issues including:

Analysis Features:

  • 🔍 Sample Exploration: Examine specific data samples that highlight discovered vulnerabilities
  • 📊 Quantified Metrics: Log vulnerabilities as well-defined, measurable metrics within MLflow
  • 🔄 Model Comparison: Compare vulnerability metrics across different model versions and architectures

Getting Started with Giskard

Explore these example implementations to see Giskard in action:

For comprehensive documentation and setup instructions, visit the Giskard-MLflow integration docs.

Trubrics Plugin - Flexible Validation Framework

The Trubrics plugin provides a flexible validation framework that extends MLflow's evaluation capabilities with custom validation logic and comprehensive result reporting.

Key Capabilities

Validation Features:

  • 📋 Out-of-the-box Validations: Large library of pre-built validation checks for common ML scenarios
  • 🔧 Custom Python Functions: Validate runs using any custom Python function or business logic
  • 📊 Comprehensive Reporting: View all validation results in structured JSON format for easy diagnosis

Workflow Integration:

  • ⚡ Flexible Validation Logic: Define validation criteria that match your specific use case requirements
  • 🔍 Detailed Diagnostics: Understand exactly why an MLflow run might have failed validation
  • 📈 Result Tracking: Maintain complete validation history alongside your model experiments
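
To make the idea of a custom validation function with a structured JSON report concrete, here is a minimal, library-independent sketch. It does not use the Trubrics API: the function names, the `accuracy` metric, the threshold, and the report layout are all hypothetical and only illustrate the pattern described above.

```python
import json


def validate_min_accuracy(metrics, threshold=0.8):
    """Hypothetical custom check: passes when accuracy meets the threshold."""
    observed = metrics.get("accuracy")
    passed = observed is not None and observed >= threshold
    return {
        "validation": "min_accuracy",
        "threshold": threshold,
        "observed": observed,
        "passed": passed,
    }


def build_report(metrics, checks):
    """Run every check against the metrics and collect a JSON-serializable report."""
    results = [check(metrics) for check in checks]
    return {"passed": all(r["passed"] for r in results), "results": results}


# Illustrative metrics, not real results.
report = build_report({"accuracy": 0.76}, [validate_min_accuracy])
print(json.dumps(report, indent=2))
```

Each entry in `results` records the threshold and the observed value, so a failed run can be diagnosed from the report alone.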

Getting Started with Trubrics

See the plugin in action with the official example notebook, which demonstrates common validation patterns and integration workflows.

For complete documentation and setup instructions, visit the Trubrics-MLflow integration docs.

Integration Benefits

Plugin evaluators seamlessly integrate with MLflow's existing evaluation framework, providing:

  • 🔄 Unified Workflow: Use plugins alongside standard MLflow evaluators in the same evaluation run
  • 📊 Consistent Reporting: Plugin results appear in MLflow's tracking interface with other evaluation metrics
  • 🏗️ Extensible Architecture: Easy integration path for custom evaluation tools and frameworks
  • 📈 Scalable Validation: Run plugin evaluations as part of automated model validation pipelines
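
MLflow's plugin documentation describes evaluator plugins as Python packages that register themselves under the `mlflow.model_evaluator` entry-point group. Assuming that group name, the following minimal sketch lists the evaluator plugins installed in the current environment using only the standard library. The helper name `installed_evaluator_plugins` is hypothetical and not part of MLflow.

```python
from importlib.metadata import entry_points

# Assumption: the entry-point group MLflow's plugin docs name for evaluators.
EVALUATOR_GROUP = "mlflow.model_evaluator"


def installed_evaluator_plugins(group=EVALUATOR_GROUP):
    """Return {evaluator_name: "module:object"} for plugins registered under group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    for name, target in sorted(installed_evaluator_plugins().items()):
        print(f"{name}: {target}")
```

With an evaluator plugin installed, the mapping would be expected to contain an entry for it. In an environment without any evaluator plugins it may be empty.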

Next Steps

Ready to enhance your model evaluation with specialized plugins?

  1. Choose Your Plugin: Select Giskard for vulnerability scanning or Trubrics for flexible validation
  2. Review Examples: Explore the provided example notebooks to understand integration patterns
  3. Install and Configure: Follow the plugin-specific documentation for setup instructions
  4. Integrate with MLflow: Add plugin evaluators to your existing mlflow.evaluate() workflows
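
Step 4 can be sketched as follows. This is a minimal illustration, not the plugins' documented setup. It assumes that the plugin registers under the evaluator name `giskard` (or `trubrics`), that MLflow and the chosen plugin are installed, and that the model is a classifier. The model URI, the `label` target column, and the helpers `build_evaluate_kwargs` and `run_plugin_evaluation` are hypothetical placeholders introduced here.

```python
def build_evaluate_kwargs(model_uri, eval_data, target_column, plugin_name,
                          plugin_config=None):
    """Assemble keyword arguments for mlflow.evaluate() with a plugin evaluator.

    The built-in "default" evaluator is listed next to the plugin so that
    both run in the same evaluation call.
    """
    kwargs = {
        "model": model_uri,
        "data": eval_data,
        "targets": target_column,
        "model_type": "classifier",  # assumption: a classification model
        "evaluators": ["default", plugin_name],
    }
    if plugin_config is not None:
        # With several evaluators, configuration is keyed by evaluator name.
        kwargs["evaluator_config"] = {plugin_name: plugin_config}
    return kwargs


def run_plugin_evaluation(**kwargs):
    """Call mlflow.evaluate(); requires MLflow and the plugin to be installed."""
    import mlflow  # imported lazily so the helper above works without MLflow

    with mlflow.start_run():
        return mlflow.evaluate(**kwargs)


# Placeholder values; replace with a real model URI and evaluation dataset.
example_kwargs = build_evaluate_kwargs(
    model_uri="models:/my-model/1",  # hypothetical registered model
    eval_data=None,                  # e.g. a pandas DataFrame with a "label" column
    target_column="label",
    plugin_name="giskard",           # assumed name; use "trubrics" for Trubrics
)
```

Calling `run_plugin_evaluation(**example_kwargs)` with real inputs would return MLflow's evaluation result, whose `metrics` attribute holds the logged values. Which metrics and artifacts a plugin adds, and which configuration keys it accepts, is described in that plugin's own documentation.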

These plugins demonstrate the extensibility of MLflow's evaluation framework and give you direct access to specialized validation capabilities developed by domain experts in the ML community.