Tokenizing
• Split a text (a string) into its tokens
• Example — Input:
• „Natural language processing makes fun.“
• Result:
• „Natural“, „language“, „processing“, „makes“, „fun“, „.“
• Best practice is to drop punctuation and lowercase
the tokens (normalization of tokens).
• Normalized result:
• „natural“, „language“, „processing“, „makes“, „fun“
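The two steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any particular library: the function names `tokenize` and `normalize` and the regular expression are assumptions chosen for this example, and real tokenizers handle many more cases (abbreviations, hyphens, apostrophes).

```python
import re


def tokenize(text):
    # Assumed rule for this sketch: each run of word characters is one
    # token, and every other non-whitespace character (such as
    # punctuation) is a token of its own.
    return re.findall(r"\w+|[^\w\s]", text)


def normalize(tokens):
    # Lowercase every token and drop tokens that contain no letter or
    # digit, which removes pure punctuation tokens.
    return [t.lower() for t in tokens if any(ch.isalnum() for ch in t)]


text = "Natural language processing makes fun."
tokens = tokenize(text)
normalized = normalize(tokens)

print(tokens)      # ['Natural', 'language', 'processing', 'makes', 'fun', '.']
print(normalized)  # ['natural', 'language', 'processing', 'makes', 'fun']
```

Keeping tokenizing and normalizing as separate functions makes it possible to inspect the raw tokens first and to swap in a different normalization later.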
Flashcard info:
Author: CoboCards-User
Main topic: PTT
Topic: PTT
School / Univ.: Uni Koblenz
City: Koblenz
Published: 08.07.2016