Google Cloud Professional Data Engineer — Question 79
Your company is loading comma-separated values (CSV) files into Google BigQuery. The data imports successfully; however, the imported data does not match the source file byte for byte. What is the most likely cause of this problem?
Answer options
- A. The CSV data loaded in BigQuery is not flagged as CSV.
- B. The CSV data has invalid rows that were skipped on import.
- C. The CSV data loaded in BigQuery is not using BigQuery's default encoding.
- D. The CSV data has not gone through an ETL phase before loading into BigQuery.
Correct answer: C
Explanation
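The correct answer is C. BigQuery's default encoding is UTF-8. If a CSV file uses another encoding, such as ISO-8859-1, and that encoding is not specified for the load job, BigQuery attempts to convert the data to UTF-8. The load can succeed, but characters outside the ASCII range may be stored as different bytes than in the source file, so the two no longer match byte for byte. Option A is incorrect because the load succeeded and the data was parsed as CSV, so the format flag is not the issue. Option B is incorrect because the question states that the data was fully imported, which rules out skipped rows. Option D is incorrect because a separate ETL phase is not required for the loaded data to match its source.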
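To see why encoding alone can break a byte-for-byte comparison, here is a minimal Python sketch. It does not call BigQuery; it only assumes that the stored data ends up as UTF-8 (BigQuery's default encoding) while the source file was written as ISO-8859-1. The sample row and the helper name `same_text_different_bytes` are hypothetical, chosen for illustration.

```python
# A hypothetical CSV snippet containing one non-ASCII character.
row = "id,name\n1,José\n"

# Bytes of the source file if it was written as ISO-8859-1 (Latin-1).
latin1_bytes = row.encode("iso-8859-1")

# Bytes of the same text once it is stored as UTF-8.
utf8_bytes = row.encode("utf-8")


def same_text_different_bytes(source: bytes, source_encoding: str) -> bool:
    """Return True if converting to UTF-8 keeps the text but changes the bytes."""
    text = source.decode(source_encoding)
    stored = text.encode("utf-8")
    return stored.decode("utf-8") == text and stored != source


# "é" is one byte (0xE9) in ISO-8859-1 but two bytes (0xC3 0xA9) in UTF-8.
print(len(latin1_bytes), len(utf8_bytes))  # 15 16
print(same_text_different_bytes(latin1_bytes, "iso-8859-1"))  # True
```

The text is identical in both cases; only the byte representation differs, which is the situation the question describes. In practice, the source encoding can be declared on the load job, for example with the `--encoding` flag of `bq load` or the `encoding` property of `LoadJobConfig` in the Python client, so that BigQuery interprets the file correctly.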