BreakIterator Class Reference

#include <brkiter.h>

Derived classes: RuleBasedBreakIterator, DictionaryBasedBreakIterator

Detailed Description

The BreakIterator class implements methods for finding the location of boundaries in text. BreakIterator is an abstract base class. Instances of BreakIterator maintain a current position and scan over text returning the index of characters where boundaries occur.
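
The contract described above (a current position, stepping calls that return offsets, and a sentinel once the text is exhausted) can be sketched without ICU. The class below is a hypothetical stand-in and not part of ICU: `ToyBreakIterator`, `collectRanges`, and the precomputed boundary list are assumptions made for illustration. Only the member names and the `DONE` value of -1 are taken from this page.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a BreakIterator: it is handed a precomputed,
// sorted, non-empty list of boundary offsets instead of analysing text.
class ToyBreakIterator {
public:
    static constexpr int32_t DONE = -1;  // same sentinel value as BreakIterator::DONE

    explicit ToyBreakIterator(std::vector<int32_t> boundaries)
        : bounds_(std::move(boundaries)), index_(0) {}

    // Move to the first boundary and return its offset.
    int32_t first() { index_ = 0; return bounds_[index_]; }

    // Move to the last boundary and return its offset.
    int32_t last() { index_ = bounds_.size() - 1; return bounds_[index_]; }

    // Offset of the boundary the iterator currently rests on.
    int32_t current() const { return bounds_[index_]; }

    // Advance one boundary; return DONE (without moving) at the end.
    int32_t next() {
        if (index_ + 1 >= bounds_.size()) return DONE;
        return bounds_[++index_];
    }

    // Step back one boundary; return DONE (without moving) at the start.
    int32_t previous() {
        if (index_ == 0) return DONE;
        return bounds_[--index_];
    }

    // Move to the first boundary strictly after offset, or return DONE.
    int32_t following(int32_t offset) {
        auto it = std::upper_bound(bounds_.begin(), bounds_.end(), offset);
        if (it == bounds_.end()) return DONE;
        index_ = static_cast<std::size_t>(it - bounds_.begin());
        return *it;
    }

private:
    std::vector<int32_t> bounds_;
    std::size_t index_;
};

// Collect (start, end) pairs by walking forward until DONE is returned.
std::vector<std::pair<int32_t, int32_t>> collectRanges(ToyBreakIterator& it) {
    std::vector<std::pair<int32_t, int32_t>> ranges;
    int32_t start = it.first();
    for (int32_t end = it.next(); end != ToyBreakIterator::DONE;
         start = end, end = it.next()) {
        ranges.emplace_back(start, end);
    }
    return ranges;
}
```

With the hand-picked offsets {0, 13, 25}, `collectRanges` yields the pairs (0, 13) and (13, 25), and a further call to `next()` returns `DONE`.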

Line boundary analysis determines where a text string can be broken when line-wrapping. The mechanism correctly handles punctuation and hyphenated words.

Sentence boundary analysis allows selection with correct interpretation of periods within numbers and abbreviations, and trailing punctuation marks such as quotation marks and parentheses.

Word boundary analysis is used by search and replace functions, as well as within text editing applications that allow the user to select words with a double click. Word selection provides correct interpretation of punctuation marks within and following words. Characters that are not part of a word, such as symbols or punctuation marks, have word-breaks on both sides.
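
As a rough illustration of the last point, the sketch below places a break on both sides of every non-word character in ASCII text. It is not the rule set ICU applies: `toyWordBoundaries` is a hypothetical helper, and treating only letters and digits as word characters is an assumption that ignores cases such as apostrophes inside words.

```cpp
#include <cctype>
#include <cstdint>
#include <string>
#include <vector>

// Simplified word segmentation for ASCII text (an assumption, not ICU's
// rules): letters and digits group into words, and every other character
// gets a break on both sides.
std::vector<int32_t> toyWordBoundaries(const std::string& text) {
    std::vector<int32_t> bounds;
    bounds.push_back(0);  // the start of the text is always a boundary
    const int32_t n = static_cast<int32_t>(text.size());
    for (int32_t i = 1; i < n; ++i) {
        bool prevWord = std::isalnum(static_cast<unsigned char>(text[i - 1])) != 0;
        bool currWord = std::isalnum(static_cast<unsigned char>(text[i])) != 0;
        // Stay inside a segment only while both neighbours are word characters.
        if (!(prevWord && currWord)) bounds.push_back(i);
    }
    if (n > 0) bounds.push_back(n);  // the end of the text is a boundary too
    return bounds;
}
```

For the input "Aaa bbb." this returns the offsets 0, 3, 4, 7, 8, so the space and the period each form their own segment.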

Character boundary analysis allows users to interact with characters as they expect to, for example, when moving the cursor through a text string. It provides correct navigation through character strings, regardless of how each character is stored. For example, an accented character might be stored as a base character and a diacritical mark. What users consider to be a character can differ between languages.
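
The base-plus-diacritic case can be sketched in a few lines. The code below is a simplification and does not reproduce ICU's character rules: `isToyCombiningMark` and `toyCharacterBoundaries` are hypothetical names, only the block U+0300..U+036F is treated as combining, and offsets count code points in a UTF-32 string.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// True for code points in the Combining Diacritical Marks block
// (U+0300..U+036F). A full implementation would consult Unicode properties.
bool isToyCombiningMark(char32_t c) {
    return c >= 0x0300 && c <= 0x036F;
}

// Boundaries between user-perceived characters under the simplified rule
// "a combining mark never starts a new character".
std::vector<int32_t> toyCharacterBoundaries(const std::u32string& text) {
    std::vector<int32_t> bounds;
    bounds.push_back(0);  // the start of the text is always a boundary
    const int32_t n = static_cast<int32_t>(text.size());
    for (int32_t i = 1; i < n; ++i) {
        // A mark stays attached to the character before it.
        if (!isToyCombiningMark(text[i])) bounds.push_back(i);
    }
    if (n > 0) bounds.push_back(n);  // the end of the text is a boundary too
    return bounds;
}
```

For the three code points 'e', U+0301 (combining acute accent), 'a', the function returns 0, 2, 3: the accented letter is one step for the cursor even though it occupies two code points.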

This is the interface for all text boundaries.


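Helper function to output text

    void printTextRange( BreakIterator& iterator, int32_t start, int32_t end )
    {
        UnicodeString textBuffer, temp;
        // getText() returns a const reference, so work on a clone
        CharacterIterator *strIter = iterator.getText().clone();
        strIter->getText(temp);
        // extractBetween() returns void; fill textBuffer before printing it
        temp.extractBetween(start, end, textBuffer);
        cout << " " << start << " " << end << " |"
             << textBuffer
             << "|" << endl;
        delete strIter;
    }
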
Print each element in order:

    void printEachForward( BreakIterator& boundary)
    {
        int32_t start = boundary.first();
        for (int32_t end = boundary.next();
             end != BreakIterator::DONE;
             start = end, end = boundary.next())
        {
            printTextRange( boundary, start, end );
        }
    }

Print each element in reverse order:

    void printEachBackward( BreakIterator& boundary)
    {
        int32_t end = boundary.last();
        for (int32_t start = boundary.previous();
             start != BreakIterator::DONE;
             end = start, start = boundary.previous())
        {
            printTextRange( boundary, start, end );
        }
    }

Print first element

    void printFirst(BreakIterator& boundary)
    {
        int32_t start = boundary.first();
        int32_t end = boundary.next();
        printTextRange( boundary, start, end );
    }

Print last element

    void printLast(BreakIterator& boundary)
    {
        int32_t end = boundary.last();
        int32_t start = boundary.previous();
        printTextRange( boundary, start, end );
    }

Print the element at a specified position

    void printAt(BreakIterator &boundary, int32_t pos )
    {
        int32_t end = boundary.following(pos);
        int32_t start = boundary.previous();
        printTextRange( boundary, start, end );
    }

Creating and using text boundaries

    void BreakIterator_Example( void )
    {
        UErrorCode status = U_ZERO_ERROR;
        BreakIterator* boundary;
        UnicodeString stringToExamine("Aaa bbb ccc. Ddd eee fff.");
        cout << "Examining: " << stringToExamine << endl;

        //print each sentence in forward and reverse order
        boundary = BreakIterator::createSentenceInstance( Locale::getUS(), status );
        if (U_FAILURE(status)) return;
        boundary->setText(stringToExamine);
        cout << "----- forward: -----------" << endl;
        printEachForward(*boundary);
        cout << "----- backward: ----------" << endl;
        printEachBackward(*boundary);
        delete boundary;

        //print each word in order
        boundary = BreakIterator::createWordInstance( Locale::getDefault(), status );
        if (U_FAILURE(status)) return;
        boundary->setText(stringToExamine);
        cout << "----- forward: -----------" << endl;
        printEachForward(*boundary);
        //print first element
        cout << "----- first: -------------" << endl;
        printFirst(*boundary);
        //print last element
        cout << "----- last: --------------" << endl;
        printLast(*boundary);
        //print word at charpos 10
        cout << "----- at pos 10: ---------" << endl;
        printAt(*boundary, 10 );
        delete boundary;
    }

Definition at line 180 of file brkiter.h.

Public Member Functions

virtual void adoptText (CharacterIterator *it)=0
virtual BreakIterator * clone (void) const =0
virtual BreakIterator * createBufferClone (void *stackBuffer, int32_t &BufferSize, UErrorCode &status)=0
virtual int32_t current (void) const =0
virtual int32_t first (void)=0
virtual int32_t following (int32_t offset)=0
virtual UClassID getDynamicClassID (void) const =0
virtual const CharacterIterator & getText (void) const =0
virtual UBool isBoundary (int32_t offset)=0
UBool isBufferClone (void)
virtual int32_t last (void)=0
virtual int32_t next (int32_t n)=0
virtual int32_t next (void)=0
UBool operator!= (const BreakIterator &rhs) const
virtual UBool operator== (const BreakIterator &) const =0
virtual int32_t preceding (int32_t offset)=0
virtual int32_t previous (void)=0
virtual void setText (const UnicodeString &text)=0

Static Public Member Functions

static BreakIterator * createCharacterInstance (const Locale &where, UErrorCode &status)
static BreakIterator * createLineInstance (const Locale &where, UErrorCode &status)
static BreakIterator * createSentenceInstance (const Locale &where, UErrorCode &status)
static BreakIterator * createTitleInstance (const Locale &where, UErrorCode &status)
static BreakIterator * createWordInstance (const Locale &where, UErrorCode &status)
static const Locale * getAvailableLocales (int32_t &count)
static UnicodeString & getDisplayName (const Locale &objectLocale, UnicodeString &name)
static UnicodeString & getDisplayName (const Locale &objectLocale, const Locale &displayLocale, UnicodeString &name)

Static Public Attributes

static const int32_t DONE = (int32_t)-1

Protected Attributes

UBool fBufferClone

Private Member Functions

 BreakIterator (const BreakIterator &)
BreakIterator & operator= (const BreakIterator &)
