A Generative AI Engineer has just deployed an LLM application at a manufacturing company…

Question

A Generative AI Engineer has just deployed an LLM application at a manufacturing company that assists with answering customer service inquiries. They need to identify the key enterprise metrics for monitoring the application in production. Which of the following is NOT a metric they would implement for their customer service LLM application in production?

Accepted Answer

Correct answer: A. Massive Multi-task Language Understanding (MMLU) score. MMLU is an offline benchmark of a model's general knowledge and reasoning, so it suits evaluating a model in a research or model-selection context rather than monitoring a deployed application. The other options are practical metrics tied directly to customer service efficiency and response quality, which makes them appropriate for monitoring the application in production.

Databricks Certified Generative AI Engineer Associate — Question 48

Answer options

Explanation