Databricks Certified Generative AI Engineer Associate — Question 48
A Generative AI Engineer has just deployed an LLM application at a manufacturing company that assists with answering customer service inquiries. They need to identify the key enterprise metrics to monitor the application in production.
Which is NOT a metric they will implement for their customer service LLM application in production?
Answer options
- A. Massive Multitask Language Understanding (MMLU) score
- B. Number of customer inquiries processed per unit of time
- C. Factual accuracy of the response
- D. Time taken for LLM to generate a response
Correct answer: A
Explanation
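The correct answer is A. MMLU is an offline academic benchmark that scores a base model's general knowledge and reasoning across many subjects. It does not change with live traffic and says nothing about how the deployed application handles customer inquiries, so it is useful for comparing or selecting models rather than for monitoring an application in production. The other options are operational metrics that track the running application directly: throughput (B), response quality (C), and latency (D).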
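
To make the three production metrics concrete, here is a minimal Python sketch that computes them from request logs. It is an illustration only, not a Databricks API: the `InquiryRecord` type, its field names, and the `summarize` function are hypothetical, and the factual-accuracy label is assumed to come from some external source such as human review or an LLM judge.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class InquiryRecord:
    # Hypothetical log record; the field names are illustrative assumptions.
    latency_seconds: float                    # time the LLM took to generate the response
    factually_correct: Optional[bool] = None  # reviewer or judge label, if one exists


def summarize(records: list[InquiryRecord], window_seconds: float) -> dict[str, float]:
    """Compute throughput, latency, and factual accuracy for one time window."""
    if not records:
        raise ValueError("no records to summarize")
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")

    # Option B: number of inquiries processed per unit of time (per minute here).
    inquiries_per_minute = len(records) / (window_seconds / 60)

    # Option D: time taken for the LLM to generate a response.
    mean_latency = mean(r.latency_seconds for r in records)
    max_latency = max(r.latency_seconds for r in records)

    # Option C: factual accuracy, computed only over responses that were labeled.
    labels = [r.factually_correct for r in records if r.factually_correct is not None]
    accuracy = sum(labels) / len(labels) if labels else float("nan")

    return {
        "inquiries_per_minute": inquiries_per_minute,
        "mean_latency_seconds": mean_latency,
        "max_latency_seconds": max_latency,
        "factual_accuracy": accuracy,
    }


if __name__ == "__main__":
    sample = [
        InquiryRecord(latency_seconds=1.0, factually_correct=True),
        InquiryRecord(latency_seconds=2.0, factually_correct=False),
        InquiryRecord(latency_seconds=3.0),  # not yet reviewed
    ]
    print(summarize(sample, window_seconds=60))
```

Note that nothing in this sketch depends on a benchmark score such as MMLU: every value is derived from traffic the deployed application actually served, which is what distinguishes a production metric from an offline model evaluation.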