Which stage of the indexing pipeline divides text into tokens?

Question

Accepted Answer

Correct answer: B. B. LEXER — The correct answer is B, LEXER, as it is specifically responsible for breaking down the input text into manageable tokens. The TOKENIZER is often confused with the LEXER, but it typically refers to the process rather than the component that handles tokenization. SECTIONER and FILTER do not perform tokenization; SECTIONER organizes content into sections, while FILTER processes tokens after they have already been created.

Oracle Cloud Infrastructure 2023 Architect Associate — Question 43

Answer options

Correct answer: B

Explanation