Update what-is-agent-observability-and-evaluation.mdx

Cut a second "metrics" 

"In this chapters example notebook, we'll show you how these metrics metrics looks in real examples but first, we'll learn how a typical evaluation workflow looks like."

to

"In this chapter's example notebook, we'll show you how these metrics look in real examples, but first, we'll learn how a typical evaluation workflow looks."
kvnbishop
2025-03-31 15:00:30 -07:00
committed by GitHub
parent 55a2387c6b
commit f8abe0b6df


@@ -52,7 +52,7 @@ Here are some of the most common metrics that observability tools monitor:
**Automated Evaluation Metrics:** You can also set up automated evals. For instance, you can use an LLM to score the output of the agent e.g. if it is helpful, accurate, or not. There are also several open source libraries that help you to score different aspects of the agent. E.g. [RAGAS](https://docs.ragas.io/) for RAG agents or [LLM Guard](https://llm-guard.com/) to detect harmful language or prompt injection.
-In practice, a combination of these metrics gives the best coverage of an AI agents health. In this chapters [example notebook](https://huggingface.co/learn/agents-course/en/bonus-unit2/monitoring-and-evaluating-agents-notebook), we'll show you how these metrics metrics looks in real examples but first, we'll learn how a typical evaluation workflow looks like.
+In practice, a combination of these metrics gives the best coverage of an AI agents health. In this chapters [example notebook](https://huggingface.co/learn/agents-course/en/bonus-unit2/monitoring-and-evaluating-agents-notebook), we'll show you how these metrics looks in real examples but first, we'll learn how a typical evaluation workflow looks like.
## 👍 Evaluating AI Agents
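
For context on the "Automated Evaluation Metrics" paragraph this hunk sits in, here is a minimal sketch of the LLM-as-judge idea it describes: a second model scores an agent's answer for helpfulness and accuracy. The judge model name, the prompt wording, and the `judge_answer` helper are illustrative assumptions, not part of the course file or this commit.

```python
# Minimal LLM-as-judge sketch. Assumes a Hugging Face token is configured and
# that the model below is available; swap in whichever judge model you prefer.
from huggingface_hub import InferenceClient

client = InferenceClient(model="meta-llama/Llama-3.3-70B-Instruct")  # hypothetical choice

JUDGE_PROMPT = """You are grading an AI agent's answer.
Question: {question}
Agent answer: {answer}
Reply with a single integer from 1 (unhelpful/inaccurate) to 5 (helpful/accurate)."""

def judge_answer(question: str, answer: str) -> int:
    """Ask a judge LLM to score one agent response for helpfulness/accuracy."""
    response = client.chat_completion(
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(question=question, answer=answer)}],
        max_tokens=5,
    )
    # The judge is instructed to reply with just a number; parse it defensively.
    text = response.choices[0].message.content.strip()
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits[0]) if digits else 0

score = judge_answer("What is the capital of France?", "The capital of France is Paris.")
print(f"Judge score: {score}/5")
```

In practice you would run a scorer like this over a batch of logged agent traces and track the average score over time, alongside the rule-based checks from libraries such as RAGAS or LLM Guard mentioned above.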