Databricks Certified Generative AI Engineer Associate — Question 23
A Generative Al Engineer is building a RAG application that answers questions about internal documents for the company SnoPen AI.
The source documents may contain a significant amount of irrelevant content, such as advertisements, sports news, or entertainment news, or content about other companies.
Which approach is advisable when building a RAG application to achieve this goal of filtering irrelevant information?
Answer options
- A. Keep all articles because the RAG application needs to understand non-company content to avoid answering questions about them.
- B. Include in the system prompt that any information it sees will be about SnoPenAI, even if no data filtering is performed.
- C. Include in the system prompt that the application is not supposed to answer any questions unrelated to SnoPen AI.
- D. Consolidate all SnoPen AI related documents into a single chunk in the vector database.
Correct answer: C
Explanation
The correct answer is C because clarifying in the system prompt that the application should only respond to queries related to SnoPen AI helps filter out irrelevant information effectively. Option A is incorrect as retaining all articles would not help in filtering out unrelated content. Option B does not address the need for filtering, and option D, while helpful in organization, does not directly aid in answering questions accurately or filtering out irrelevant content.