Logo Search packages:      
Sourcecode: icu version File versions  Download package

UBool TransliterationRule::masks ( const TransliterationRule r2  )  const [virtual]

Return true if this rule masks another rule. If r1 masks r2 then r1 matches any input string that r2 matches. If r1 masks r2 and r2 masks r1 then r1 == r2. Examples: "a>x" masks "ab>y". "a>x" masks "a[b]>y". "[c]a>x" masks "[dc]a>y".

Parameters:
r2 the given rule to be compared with.
Returns:
true if this rule masks 'r2'
Return true if this rule masks another rule. If r1 masks r2 then r1 matches any input string that r2 matches. If r1 masks r2 and r2 masks r1 then r1 == r2. Examples: "a>x" masks "ab>y". "a>x" masks "a[b]>y". "[c]a>x" masks "[dc]a>y".

Definition at line 251 of file rbt_rule.cpp.

References anteContextLength, UnicodeString::compare(), flags, keyLength, UnicodeString::length(), and pattern.

Referenced by TransliterationRuleSet::freeze().

                                                                    {
    /* Rule r1 masks rule r2 if the string formed of the
     * antecontext, key, and postcontext overlaps in the following
     * way:
     *
     * r1:      aakkkpppp
     * r2:     aaakkkkkpppp
     *            ^
     * 
     * The strings must be aligned at the first character of the
     * key.  The length of r1 to the left of the alignment point
     * must be <= the length of r2 to the left; ditto for the
     * right.  The characters of r1 must equal (or be a superset
     * of) the corresponding characters of r2.  The superset
     * operation should be performed to check for UnicodeSet
     * masking.
     *
     * Anchors:  Two patterns that differ only in anchors only
     * mask one another if they are exactly equal, and r2 has
     * all the anchors r1 has (optionally, plus some).  Here Y
     * means the row masks the column, N means it doesn't.
     *
     *         ab   ^ab    ab$  ^ab$
     *   ab    Y     Y     Y     Y
     *  ^ab    N     Y     N     Y
     *   ab$   N     N     Y     Y
     *  ^ab$   N     N     N     Y
     *
     * Post context: {a}b masks ab, but not vice versa, since {a}b
     * matches everything ab matches, and {a}b matches {|a|}b but ab
     * does not.  Pre context is different (a{b} does not align with
     * ab).
     */

    /* LIMITATION of the current mask algorithm: Some rule
     * maskings are currently not detected.  For example,
     * "{Lu}]a>x" masks "A]a>y".  This can be added later. TODO
     */

    int32_t len = pattern.length();
    int32_t left = anteContextLength;
    int32_t left2 = r2.anteContextLength;
    int32_t right = len - left;
    int32_t right2 = r2.pattern.length() - left2;
    int32_t cachedCompare = r2.pattern.compare(left2 - left, len, pattern);

    // TODO Clean this up -- some logic might be combinable with the
    // next statement.

    // Test for anchor masking
    if (left == left2 && right == right2 &&
        keyLength <= r2.keyLength &&
        0 == cachedCompare) {
        // The following boolean logic implements the table above
        return (flags == r2.flags) ||
            (!(flags & ANCHOR_START) && !(flags & ANCHOR_END)) ||
            ((r2.flags & ANCHOR_START) && (r2.flags & ANCHOR_END));
    }

    return left <= left2 &&
        (right < right2 ||
         (right == right2 && keyLength <= r2.keyLength)) &&
         (0 == cachedCompare);
}


Generated by  Doxygen 1.6.0   Back to index