PMI Certified in Artificial Intelligence (CPMAI) — Question 5
Your team has collected petabytes of data for your AI project. As the project lead, you understand this is too much data to use for this iteration of the project.
What is the best course of action to take with this data?
Answer options
- A. Data deduplication to reduce overall size and data complexity.
- B. Data integration focused on reducing the number of data sources.
- C. Data selection and attribute pruning to reduce overall size and data complexity.
- D. Careful algorithm selection that reduces the need for data.
Correct answer: C
Explanation
The correct answer is C. Data selection (choosing which records to keep) and attribute pruning (dropping columns that are not needed) reduce both the size and the complexity of the dataset, which makes it manageable for the current iteration. Option A removes only duplicate records, so it shrinks the data only as far as duplicates exist and leaves the number of attributes unchanged. Option B consolidates data sources, which does not by itself reduce volume and can add complexity when the sources are combined. Option D changes how much data the model needs, but the collected data stays just as large.
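
To make option C concrete, here is a minimal Python sketch of the two steps on a toy in-memory dataset. Everything in it is an illustrative assumption rather than part of the exam material: the function names `select_rows` and `prune_attributes`, the column names, the sample size, and the pruning thresholds. A real petabyte-scale project would run the same kind of logic in a distributed data platform, not on Python lists.

```python
import random


def select_rows(rows, predicate, sample_size, seed=0):
    """Data selection: keep rows relevant to this iteration, then sample them."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    relevant = [r for r in rows if predicate(r)]
    return rng.sample(relevant, min(sample_size, len(relevant)))


def prune_attributes(rows, max_missing_ratio=0.5):
    """Attribute pruning: drop columns that are mostly missing or constant."""
    if not rows:
        return []
    keep = []
    for col in rows[0]:
        values = [r.get(col) for r in rows]
        present = [v for v in values if v is not None]
        missing_ratio = 1 - len(present) / len(values)
        if missing_ratio > max_missing_ratio:
            continue  # too sparse to be useful (assumed threshold)
        if len(set(present)) <= 1:
            continue  # constant column carries no information
        keep.append(col)
    return [{c: r.get(c) for c in keep} for r in rows]


# Hypothetical toy data: 100 rows, 5 attributes.
rows = [
    {
        "region": "EU" if i % 2 == 0 else "US",
        "channel": ["web", "store", "phone"][i % 3],
        "amount": i * 10,
        "source": "crm",      # same value in every row
        "legacy_code": None,  # never populated
    }
    for i in range(100)
]

# Selection: only EU rows are in scope for this iteration; keep 10 of them.
subset = select_rows(rows, lambda r: r["region"] == "EU", sample_size=10)

# Pruning: remove attributes that are empty or constant within the subset.
reduced = prune_attributes(subset)

print(len(rows), "rows ->", len(reduced), "rows")
print(len(rows[0]), "attributes ->", len(reduced[0]), "attributes")
```

Selection cuts the row count and pruning cuts the column count, so the two together address both size and complexity. Note that `region` is dropped as well: once only EU rows are selected it becomes a constant column.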