Package org.languagetool.rules.patterns
Class Unifier
java.lang.Object
org.languagetool.rules.patterns.Unifier
Implements unification of features over tokens.
-
Field Summary
Fields
private boolean allFeatsIn
equivalenceFeatures - A Map that stores all possible equivalence types listed for features.
equivalencesMatched - Map of sets of matched equivalences in the unified sequence.
equivalencesToBeKept
private final Map<EquivalenceTypeLocator, PatternToken> equivalenceTypes - A Map for storing the equivalence types for features.
featuresFound
private boolean inUnification
private int readingsCounter
tmpFeaturesFound
private int tokCnt
private final List<AnalyzedTokenReadings> tokSequence
tokSequenceEquivalences - List of all equivalences matched per tokens in the sequence, kept exactly in sync with the list in tokSequence, so that a reading 2 of token 1 has its equivalence map addressable as tokSequenceEquivalences.get(1).get(2).
private boolean uniAllMatched
unificationFeats
private static final String UNIFY_IGNORE
private boolean uniMatched
-
Constructor Summary
Constructors
Unifier(Map<EquivalenceTypeLocator, PatternToken> equivalenceTypes, Map<String, List<String>> equivalenceFeatures) - Instantiates the unifier.
-
Method Summary
final void addNeutralElement(AnalyzedTokenReadings analyzedTokenReadings) - Used to add neutral elements (AnalyzedTokenReadings) to the unified sequence.
private void addTokenToSequence(List<AnalyzedTokenReadings> tokenSequence, AnalyzedToken token, int pos)
private boolean checkNext
final boolean getFinalUnificationValue(Map<String, List<String>> uFeatures) - Make sure that we really matched all the required features of the unification.
final @Nullable AnalyzedTokenReadings[] getFinalUnified() - Used for getting a unified sequence when the simple test method isUnified(AnalyzedToken, Map, boolean) was used.
final @Nullable AnalyzedTokenReadings[] getUnifiedTokens() - Gets a full sequence of filtered tokens.
protected final boolean isSatisfied(AnalyzedToken aToken, Map<String, List<String>> uFeatures) - Tests if a token has shared features with other tokens.
final boolean isUnified(AnalyzedToken matchToken, Map<String, List<String>> uFeatures, boolean lastReading)
final boolean isUnified(AnalyzedToken matchToken, Map<String, List<String>> uFeatures, boolean lastReading, boolean isMatched) - Tests if the token sequence is unified.
final void reset() - Resets after use of unification.
final void startNextToken() - Call after every complete token (AnalyzedTokenReadings) checked.
final void startUnify() - Starts testing only those equivalences that were previously matched.
-
Field Details
-
UNIFY_IGNORE
-
tokSequence
-
tokSequenceEquivalences
List of all equivalences matched per tokens in the sequence, kept exactly in sync with the list in tokSequence, so that a reading 2 of token 1 has its equivalence map addressable as tokSequenceEquivalences.get(1).get(2). -
equivalenceTypes
A Map for storing the equivalence types for features. Features are specified as Strings, and map into types defined as maps from Strings to Elements. -
equivalenceFeatures
A Map that stores all possible equivalence types listed for features. -
equivalencesMatched
Map of sets of matched equivalences in the unified sequence. -
allFeatsIn
private boolean allFeatsIn -
tokCnt
private int tokCnt -
readingsCounter
private int readingsCounter -
featuresFound
-
tmpFeaturesFound
-
equivalencesToBeKept
-
unificationFeats
-
inUnification
private boolean inUnification -
uniMatched
private boolean uniMatched -
uniAllMatched
private boolean uniAllMatched
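The indexing contract described for tokSequenceEquivalences (reading 2 of token 1 is reachable through get(1).get(2)) can be illustrated with two parallel nested lists. This is only a sketch: the class SyncedSequences, the method addReading, the use of plain strings for readings, and the nested collection types are all assumptions made for illustration and are not taken from the Unifier source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: plain strings replace token readings, and the nested
// collection types are an assumption, not the real field declarations.
class SyncedSequences {
    // token index -> readings of that token
    final List<List<String>> tokSequence = new ArrayList<>();
    // token index -> reading index -> (feature -> matched equivalence types)
    final List<List<Map<String, Set<String>>>> tokSequenceEquivalences = new ArrayList<>();

    // Adds a reading and its equivalence map in the same step, so both
    // structures always have identical shapes and indices never drift apart.
    void addReading(int tokenPos, String reading, Map<String, Set<String>> equivalences) {
        while (tokSequence.size() <= tokenPos) {
            tokSequence.add(new ArrayList<>());
            tokSequenceEquivalences.add(new ArrayList<>());
        }
        tokSequence.get(tokenPos).add(reading);
        tokSequenceEquivalences.get(tokenPos).add(equivalences);
    }

    public static void main(String[] args) {
        SyncedSequences s = new SyncedSequences();
        s.addReading(0, "t0-r0", Map.of());
        s.addReading(1, "t1-r0", Map.of());
        s.addReading(1, "t1-r1", Map.of());
        s.addReading(1, "t1-r2", Map.of("number", Set.of("plural")));
        // Reading 2 of token 1 and its equivalence map share the same pair of indices.
        System.out.println(s.tokSequence.get(1).get(2));
        System.out.println(s.tokSequenceEquivalences.get(1).get(2));
    }
}
```

Routing every insertion through a single method is one simple way to keep two parallel structures aligned; whether the real class does it this way is not stated here.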
-
-
Constructor Details
-
Unifier
public Unifier(Map<EquivalenceTypeLocator, PatternToken> equivalenceTypes, Map<String, List<String>> equivalenceFeatures) Instantiates the unifier.
-
-
Method Details
-
isSatisfied
Tests if a token has shared features with other tokens.
Parameters:
aToken - token to be tested
uFeatures - features to be tested
Returns:
true if the token shares this type of feature with other tokens
-
checkNext
-
startNextToken
public final void startNextToken()
Call after every complete token (AnalyzedTokenReadings) has been checked.
-
startUnify
public final void startUnify()
Starts testing only those equivalences that were previously matched.
-
getFinalUnificationValue
Make sure that we really matched all the required features of the unification.
Parameters:
uFeatures - Features to be checked
Returns:
True if the token sequence has been found.
Since:
2.5
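The idea of a final check that all required features were really matched can be sketched in isolation. The following is a simplified illustration, not the library's logic: the class FinalCheckSketch, the method allRequiredMatched, the shape of the matched map, and the rule that an empty type list means "any type of this feature" are all assumptions.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, written only to illustrate the idea of a final check.
class FinalCheckSketch {
    // uFeatures: feature -> equivalence types requested
    //            (assumption: an empty list accepts any type of that feature)
    // matched:   feature -> equivalence types that survived across the sequence
    static boolean allRequiredMatched(Map<String, List<String>> uFeatures,
                                      Map<String, Set<String>> matched) {
        for (Map.Entry<String, List<String>> entry : uFeatures.entrySet()) {
            Set<String> found = matched.getOrDefault(entry.getKey(), Set.of());
            if (found.isEmpty()) {
                return false; // the feature was requested but nothing matched
            }
            List<String> wanted = entry.getValue();
            if (!wanted.isEmpty() && Collections.disjoint(found, wanted)) {
                return false; // something matched, but none of the requested types
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, List<String>> required =
            Map.of("number", List.of(), "gender", List.of("feminine"));
        Map<String, Set<String>> matched =
            Map.of("number", Set.of("singular"), "gender", Set.of("feminine"));
        System.out.println(allRequiredMatched(required, matched)); // true
    }
}
```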
-
reset
public final void reset()
Resets after use of unification. Required.
-
getUnifiedTokens
Gets a full sequence of filtered tokens.
Returns:
Array of AnalyzedTokenReadings that match the equivalence relation defined for the features tested, or null
-
addTokenToSequence
private void addTokenToSequence(List<AnalyzedTokenReadings> tokenSequence, AnalyzedToken token, int pos) -
isUnified
public final boolean isUnified(AnalyzedToken matchToken, Map<String, List<String>> uFeatures, boolean lastReading, boolean isMatched)
Tests if the token sequence is unified.
Usage note: to test if the sequence of tokens is unified (i.e., shares a group of features, such as the same gender, number, grammatical case etc.), you need to test all tokens but the last one in the following way: call isUnified() for every reading of a token, and set lastReading to true for the last reading of that token. For the last token, check the truth value returned by this method. For the earlier tokens, the returned value may be discarded before the final check. See AbstractPatternRule for an example.
To make it work in XML rules, the Elements built based on <token>s inside the unify block have to be processed in a special way: namely the last Element has to be marked as the last one (by using PatternToken.setLastInUnification()).
Parameters:
matchToken - AnalyzedToken token to unify
lastReading - true when the matchToken is the last reading in the AnalyzedTokenReadings
isMatched - true if the reading matches the element in the pattern rule, otherwise the reading is not considered in the unification
Returns:
true if the tokens in the sequence are unified
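The calling pattern in the usage note (feed every reading, flag the last reading of each token, and trust only the value returned for the last token) can be illustrated with a toy model. This sketch does not use LanguageTool classes: ToyUnifier, its two-argument isUnified, the helper unifies, and the representation of a reading as a feature-to-value map are all hypothetical, and it tracks a single feature only.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: a reading is a feature -> value map; none of this is LanguageTool code.
class ToyUnifier {
    private final String feature;
    // Values shared by every finished token; null until the first token is closed.
    private Set<String> surviving;
    // Values seen so far among the readings of the token currently being fed.
    private final Set<String> current = new HashSet<>();

    ToyUnifier(String feature) {
        this.feature = feature;
    }

    // Feed one reading. The result is authoritative only when lastReading is true.
    boolean isUnified(Map<String, String> reading, boolean lastReading) {
        String value = reading.get(feature);
        if (value != null) {
            current.add(value);
        }
        if (!lastReading) {
            // Provisional answer; a later reading of the same token may change it.
            return surviving == null || !Collections.disjoint(surviving, current);
        }
        if (surviving == null) {
            surviving = new HashSet<>(current);
        } else {
            surviving.retainAll(current); // keep only values shared with this token
        }
        current.clear();
        return !surviving.isEmpty();
    }

    // Clears all state so the instance can be reused for another sequence.
    void reset() {
        surviving = null;
        current.clear();
    }

    // Drives the protocol over a whole sequence and returns the last token's result.
    static boolean unifies(String feature, List<List<Map<String, String>>> tokens) {
        ToyUnifier unifier = new ToyUnifier(feature);
        boolean result = false;
        for (List<Map<String, String>> readings : tokens) {
            for (int i = 0; i < readings.size(); i++) {
                result = unifier.isUnified(readings.get(i), i == readings.size() - 1);
            }
        }
        unifier.reset();
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> ambiguous =
            List.of(Map.of("gender", "masc"), Map.of("gender", "fem"));
        List<Map<String, String>> fem = List.of(Map.of("gender", "fem"));
        List<Map<String, String>> masc = List.of(Map.of("gender", "masc"));
        System.out.println(unifies("gender", List.of(ambiguous, fem)));       // true
        System.out.println(unifies("gender", List.of(ambiguous, fem, masc))); // false
    }
}
```

In the toy model an ambiguous token keeps several candidate values alive, and each later token narrows the set; the sequence counts as unified only if something survives after the last token.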
-
isUnified
public final boolean isUnified(AnalyzedToken matchToken, Map<String, List<String>> uFeatures, boolean lastReading) -
addNeutralElement
Used to add neutral elements (AnalyzedTokenReadings) to the unified sequence. Useful if the sequence contains punctuation or connectives, for example.
Parameters:
analyzedTokenReadings - A neutral element to be added.
Since:
2.5
-
getFinalUnified
Used for getting a unified sequence when the simple test method isUnified(AnalyzedToken, Map, boolean) was used.
Returns:
An array of AnalyzedTokenReadings, or null when not in unification
-