Uses of Interface
org.languagetool.tokenizers.Tokenizer
Packages that use Tokenizer
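Package
org.languagetool
org.languagetool.language
org.languagetool.noop
org.languagetool.rules.ngrams
org.languagetool.tokenizers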
Uses of Tokenizer in org.languagetool
Methods in org.languagetool that return Tokenizer
Modifier and Type | Method | Description
Tokenizer | Language.getWordTokenizer() | Get this language's word tokenizer implementation.
Uses of Tokenizer in org.languagetool.language
Methods in org.languagetool.language that return Tokenizer
Uses of Tokenizer in org.languagetool.noop
Methods in org.languagetool.noop that return Tokenizer
Uses of Tokenizer in org.languagetool.rules.ngrams
Methods in org.languagetool.rules.ngrams that return Tokenizer
Modifier and Type | Method | Description
(package private) static Tokenizer | LanguageModelUtils.getGoogleStyleWordTokenizer(Language language) | Return a tokenizer that works more like Google does for its ngram index (which doesn't seem to be properly documented).
protected Tokenizer | NgramProbabilityRule.getGoogleStyleWordTokenizer() |
Methods in org.languagetool.rules.ngrams with parameters of type Tokenizer
Modifier and Type | Method | Description
(package private) static List<GoogleToken> | GoogleToken.getGoogleTokens(String sentence, boolean addStartToken, Tokenizer wordTokenizer) |
(package private) static List<GoogleToken> | GoogleToken.getGoogleTokens(AnalyzedSentence sentence, boolean addStartToken, Tokenizer wordTokenizer) |
Uses of Tokenizer in org.languagetool.tokenizers
Subinterfaces of Tokenizer in org.languagetool.tokenizers
Modifier and Type | Description
interface | Interface for components that take compound words and split them into their parts.
interface | Tokenizes text into sentences.
Classes in org.languagetool.tokenizers that implement Tokenizer
Modifier and Type | Description
class | A very simple sentence tokenizer that splits on [.!?…] followed by whitespace or an uppercase letter.
class | Class to tokenize sentences using rules from an SRX file.
class | Tokenizes a sentence into words.
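
The splitting rule described for the simple sentence tokenizer (one of [.!?…] followed by whitespace or an uppercase letter) can be illustrated with a short, self-contained sketch. This is not LanguageTool's implementation: the names TextTokenizer and PunctuationSentenceTokenizer are hypothetical, the interface is only assumed to resemble the text-in, list-of-tokens-out shape of Tokenizer, and keeping trailing whitespace attached to the preceding sentence is a choice made here for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleSentenceSplitSketch {

    /** Hypothetical stand-in for a tokenizer contract: text in, list of tokens out. */
    interface TextTokenizer {
        List<String> tokenize(String text);
    }

    /**
     * Splits after a run of . ! ? or the ellipsis character when the next
     * character is whitespace or an uppercase letter. Whitespace after the
     * terminator stays attached to the sentence before it, so concatenating
     * the tokens restores the original input.
     */
    static final class PunctuationSentenceTokenizer implements TextTokenizer {
        // Boundary: terminator run, then either whitespace (consumed) or a
        // lookahead for an uppercase letter (not consumed).
        private static final Pattern BOUNDARY =
            Pattern.compile("[.!?\u2026]+(?:\\s+|(?=\\p{Lu}))");

        @Override
        public List<String> tokenize(String text) {
            List<String> sentences = new ArrayList<>();
            Matcher m = BOUNDARY.matcher(text);
            int start = 0;
            while (m.find()) {
                sentences.add(text.substring(start, m.end()));
                start = m.end();
            }
            if (start < text.length()) {
                // Remainder that does not end in a recognized boundary.
                sentences.add(text.substring(start));
            }
            return sentences;
        }
    }

    public static void main(String[] args) {
        TextTokenizer tokenizer = new PunctuationSentenceTokenizer();
        for (String sentence : tokenizer.tokenize("It works. Does it?Yes! Done")) {
            System.out.println("[" + sentence + "]");
        }
    }
}
```

A rule this simple also splits after abbreviations such as "e.g." when they are followed by whitespace, which is one reason a rule-based approach such as the SRX-driven class listed above may be preferable for real text.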