CompTIA DataX (DY0-001) — Question 56

A data scientist is standardizing a large data set that contains website addresses. A specific string inside some of the web addresses needs to be extracted. Which of the following is the best method for extracting the desired string from the text data?

Answer options

Correct answer: A

Explanation

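Regular expressions are designed for pattern matching and substring extraction, so they fit this task: one pattern can locate and capture the target string in every web address that contains it. Named-entity recognition identifies and classifies entities such as people, organizations, and locations in natural-language text; it is not meant for pulling an arbitrary substring out of a URL. A large language model could attempt the extraction, but it is typically slower, more costly, and less deterministic than a fixed pattern, which matters on a large data set. Plain 'Find and replace' matches literal text, so on its own it cannot handle strings whose surrounding content varies from one address to the next.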
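
As an illustration of the regular-expression approach, here is a minimal Python sketch. The question does not say which string must be extracted, so the `utm_campaign` query parameter, the sample URLs, and the helper name `extract_campaign` are all assumptions chosen for the example.

```python
import re
from typing import Optional

# Hypothetical target: the value of a "utm_campaign" query parameter.
# The capture group takes everything up to the next "&" or "#".
CAMPAIGN_RE = re.compile(r"[?&]utm_campaign=([^&#]+)")


def extract_campaign(url: str) -> Optional[str]:
    """Return the captured substring, or None if the URL has no match."""
    match = CAMPAIGN_RE.search(url)
    return match.group(1) if match else None


# Made-up sample addresses; only some contain the target string.
urls = [
    "https://example.com/page?utm_source=mail&utm_campaign=spring_sale",
    "https://example.com/about",
    "https://example.com/shop?utm_campaign=launch#top",
]

campaigns = [extract_campaign(u) for u in urls]
print(campaigns)
```

Addresses without the parameter yield `None` instead of raising an error, which makes it straightforward to apply the same function across a whole column of a large data set.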