Databricks Certified Generative AI Engineer Associate — Question 62

When developing an LLM application, it’s crucial to ensure that the data used for training the model complies with licensing requirements to avoid legal risks.

Which action is most appropriate to avoid legal risks?

Answer options

Correct answer: A

Explanation

Option A is correct because data explicitly labeled with an open license comes with stated terms of use, so you can confirm that training is permitted and meet any conditions the license imposes (such as attribution). Option B is misleading because LLM outputs may still derive from copyrighted material and so carry legal risk. Option C is good practice, but it may not be feasible or necessary when open-licensed data is available. Option D is incorrect because publicly available data can still carry legal restrictions that must be followed.
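
As an illustration of the reasoning behind Option A, the following minimal sketch keeps only records whose license field is explicitly on an allowlist and drops anything unlabeled or unrecognized. The `Record` type, the `license` field, and the contents of `ALLOWED_LICENSES` are hypothetical names chosen for this example; which licenses are actually acceptable depends on your own legal review.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allowlist of license identifiers (assumption for this sketch).
# Which licenses qualify is a legal decision, not something this code decides.
ALLOWED_LICENSES = {"cc0-1.0", "cc-by-4.0", "mit", "apache-2.0"}


@dataclass
class Record:
    text: str
    license: Optional[str]  # None when the source states no license


def is_explicitly_open(record: Record) -> bool:
    """Return True only if the record carries a license on the allowlist."""
    if record.license is None:
        # Publicly available but unlabeled data is treated as not cleared.
        return False
    return record.license.strip().lower() in ALLOWED_LICENSES


def filter_by_license(records: List[Record]) -> List[Record]:
    """Keep only records with an explicit, allowlisted license."""
    return [r for r in records if is_explicitly_open(r)]


records = [
    Record("doc a", "CC-BY-4.0"),
    Record("doc b", None),           # no license stated
    Record("doc c", "proprietary"),  # labeled, but not open
    Record("doc d", "MIT"),
]
kept = filter_by_license(records)
print([r.text for r in kept])  # prints ['doc a', 'doc d']
```

Treating a missing license as "not cleared" reflects the point made about Option D: being publicly available does not by itself grant permission to use the data.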