opik
Repository: comet-ml/opik
Description: Debug, evaluate, and monitor LLM applications, RAG systems, and agentic workflows.
Key Features:
- Comprehensive Tracing: Log and visualize every step of your LLM pipeline.
- Automated Evaluations: Includes "LLM as a judge" metrics like Hallucination and Relevance.
- Production Monitoring: Dashboards for tracking performance and accuracy in real time.
- Dataset Management: Manage and version test datasets for systematic experimentation.
- Broad Integrations: Supports LangChain, LlamaIndex, OpenAI, Anthropic, and more.
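To make the tracing feature concrete, here is a minimal sketch of what step-level tracing records. It uses only the Python standard library and is not Opik's API (the Opik Python SDK provides its own `track` decorator for this purpose). The names `trace`, `SPANS`, `retrieve`, and `answer` are hypothetical, chosen for illustration.

```python
import functools
import time

# Hypothetical in-memory span store; a real tracing backend would persist these.
SPANS = []
_stack = []  # names of the traced calls currently in progress


def trace(fn):
    """Record inputs, output, duration, and parent step for each call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        parent = _stack[-1] if _stack else None
        _stack.append(fn.__name__)
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
        finally:
            _stack.pop()
        SPANS.append({
            "name": fn.__name__,
            "parent": parent,
            "input": {"args": args, "kwargs": kwargs},
            "output": result,
            "seconds": time.perf_counter() - start,
        })
        return result
    return wrapper


@trace
def retrieve(question):
    # Stand-in for a retrieval step in a RAG pipeline.
    return ["Paris is the capital of France."]


@trace
def answer(question):
    context = retrieve(question)
    # Stand-in for an LLM call.
    return f"Based on {len(context)} document(s): Paris"


answer("What is the capital of France?")
for span in SPANS:
    print(span["name"], "parent:", span["parent"])
```

Each nested call becomes a span that points at its parent, which is the structure a tracing UI needs in order to show a multi-step pipeline as a tree.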
Primary Use Cases:
- Debugging complex multi-step agentic workflows.
- Benchmarking LLM application accuracy before deployment.
- Monitoring production systems for regressions or hallucinations.
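The benchmarking use case can be sketched as a loop over a test dataset that scores each output with a metric. This is a standard-library illustration under stated assumptions, not Opik's evaluation API: `DATASET`, `app`, `exact_match`, and `evaluate` are hypothetical names, and a simple exact-match metric stands in for an "LLM as a judge" metric so that the example runs without a model.

```python
from statistics import mean

# Hypothetical test dataset: each item pairs an input with the expected answer.
DATASET = [
    {"input": "capital of France", "expected": "Paris"},
    {"input": "capital of Japan", "expected": "Tokyo"},
    {"input": "capital of Peru", "expected": "Lima"},
]


def app(question):
    # Stand-in for the LLM application under test (deliberately imperfect).
    answers = {"capital of France": "Paris", "capital of Japan": "Kyoto"}
    return answers.get(question, "unknown")


def exact_match(output, expected):
    # Deterministic stand-in for an LLM-judged metric: 1.0 on a match, else 0.0.
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def evaluate(dataset, task, metric):
    """Run the task on every item and return (mean score, per-item rows)."""
    rows = []
    for item in dataset:
        output = task(item["input"])
        score = metric(output, item["expected"])
        rows.append({**item, "output": output, "score": score})
    return mean(row["score"] for row in rows), rows


accuracy, rows = evaluate(DATASET, app, exact_match)
print(f"accuracy: {accuracy:.2f}")
for row in rows:
    print(row["input"], "->", row["output"], "score:", row["score"])
```

Keeping the per-item rows alongside the aggregate score makes it possible to see which inputs regressed between two versions of the application, not just that the average moved.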
Tags: #observability #evaluation #llmops #monitoring
Added: 2026-06-18
Source: GitHub
Notes / Why Notable
Opik (by Comet ML) supplies the LLMOps infrastructure (tracing, evaluation, and production monitoring) needed to move from "vibe coding" to systematic engineering of AI applications.