CompTIA DataX (DY0-001) — Question 81
A data scientist needs to analyze a company's chemical businesses and is using the master database of the conglomerate company. Nothing in the data differentiates the data observations for the different businesses. Which of the following is the most efficient way to identify the chemical businesses' observations?
Answer options
- A. Ingest the data from all of the hard drives and perform exploratory data analysis to identify which business is responsible for chemical operations.
- B. Perform analysis on all of the data and create a summary report on the results relevant to chemical operations.
- C. Consult with the business team to identify which sites are responsible for chemical operations and ingest only the relevant data for analysis.
- D. Ingest data from the hard drive containing the most data and present sample results on the chemical operations.
Correct answer: C
Explanation
Option C is the most efficient approach: the business team already knows which sites run chemical operations, so consulting them identifies the relevant data directly and only that data needs to be ingested. Options A and B both require processing all of the data, and because nothing in the data distinguishes one business from another, exploratory analysis alone is unlikely to isolate the chemical observations. Option D ingests only the hard drive holding the most data, which may contain no chemical records at all and gives no assurance that the sample results reflect chemical operations.
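As a rough illustration of the approach in option C, the sketch below ingests only the sources for sites that the business team has named. It assumes the data is stored per site; the site names, the `SITE_SOURCES` mapping, the `CHEMICAL_SITES` list, and the `ingest_sites` helper are all hypothetical and do not come from the question.

```python
import csv
import io

# Hypothetical per-site exports; in practice these would be files or tables.
# Note that the rows themselves carry nothing that identifies the business.
SITE_SOURCES = {
    "site_a": "batch,yield\n1,0.91\n2,0.88\n",
    "site_b": "batch,yield\n1,0.75\n",
    "site_c": "batch,yield\n1,0.93\n",
}

# Hypothetical answer from the business team: sites running chemical operations.
CHEMICAL_SITES = ["site_a", "site_c"]


def ingest_sites(sources, sites):
    """Read only the named sites' data, tagging each row with its site."""
    rows = []
    for site in sites:
        reader = csv.DictReader(io.StringIO(sources[site]))
        for row in reader:
            row["site"] = site  # record provenance, since the data lacks it
            rows.append(row)
    return rows


chemical_rows = ingest_sites(SITE_SOURCES, CHEMICAL_SITES)
print(len(chemical_rows))  # 3
```

Here `site_b` is never read, which is the point of consulting the business team first: the filtering happens before ingestion rather than after.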