BreakIterator Class Reference

#include <brkiter.h>


Public Types

enum  { DONE = (int32_t)-1 }

Public Member Functions

virtual void adoptText (CharacterIterator *it)=0
virtual BreakIterator * clone (void) const =0
virtual BreakIterator * createBufferClone (void *stackBuffer, int32_t &BufferSize, UErrorCode &status)=0
virtual int32_t current (void) const =0
virtual int32_t first (void)=0
virtual int32_t following (int32_t offset)=0
virtual UClassID getDynamicClassID (void) const =0
Locale getLocale (ULocDataLocaleType type, UErrorCode &status) const
const char * getLocaleID (ULocDataLocaleType type, UErrorCode &status) const
virtual CharacterIterator & getText (void) const =0
virtual UText * getUText (UText *fillIn, UErrorCode &status) const =0
virtual UBool isBoundary (int32_t offset)=0
UBool isBufferClone (void)
virtual int32_t last (void)=0
virtual int32_t next (void)=0
virtual int32_t next (int32_t n)=0
UBool operator!= (const BreakIterator &rhs) const
virtual UBool operator== (const BreakIterator &) const =0
virtual int32_t preceding (int32_t offset)=0
virtual int32_t previous (void)=0
virtual void setText (const UnicodeString &text)=0
virtual void setText (UText *text, UErrorCode &status)=0
virtual ~BreakIterator ()

Static Public Member Functions

static BreakIterator *U_EXPORT2 createCharacterInstance (const Locale &where, UErrorCode &status)
static BreakIterator *U_EXPORT2 createLineInstance (const Locale &where, UErrorCode &status)
static BreakIterator *U_EXPORT2 createSentenceInstance (const Locale &where, UErrorCode &status)
static BreakIterator *U_EXPORT2 createTitleInstance (const Locale &where, UErrorCode &status)
static BreakIterator *U_EXPORT2 createWordInstance (const Locale &where, UErrorCode &status)
static const Locale *U_EXPORT2 getAvailableLocales (int32_t &count)
static StringEnumeration *U_EXPORT2 getAvailableLocales (void)
static UnicodeString &U_EXPORT2 getDisplayName (const Locale &objectLocale, const Locale &displayLocale, UnicodeString &name)
static UnicodeString &U_EXPORT2 getDisplayName (const Locale &objectLocale, UnicodeString &name)
static void U_EXPORT2 operator delete (void *p) U_NO_THROW
static void U_EXPORT2 operator delete (void *, void *) U_NO_THROW
static void U_EXPORT2 operator delete[] (void *p) U_NO_THROW
static void *U_EXPORT2 operator new (size_t size) U_NO_THROW
static void *U_EXPORT2 operator new (size_t, void *ptr) U_NO_THROW
static void *U_EXPORT2 operator new[] (size_t size) U_NO_THROW
static URegistryKey U_EXPORT2 registerInstance (BreakIterator *toAdopt, const Locale &locale, UBreakIteratorType kind, UErrorCode &status)
static UBool U_EXPORT2 unregister (URegistryKey key, UErrorCode &status)

Protected Member Functions

 BreakIterator ()
 BreakIterator (const BreakIterator &other)

Protected Attributes

UBool fBufferClone

Private Member Functions

BreakIterator & operator= (const BreakIterator &)

Static Private Member Functions

static BreakIterator * buildInstance (const Locale &loc, const char *type, int32_t kind, UErrorCode &status)
static BreakIterator * createInstance (const Locale &loc, int32_t kind, UErrorCode &status)
static BreakIterator * makeInstance (const Locale &loc, int32_t kind, UErrorCode &status)

Private Attributes

char actualLocale [ULOC_FULLNAME_CAPACITY]
char validLocale [ULOC_FULLNAME_CAPACITY]

Friends

class ICUBreakIteratorFactory
class ICUBreakIteratorService

Detailed Description

The BreakIterator class implements methods for finding the location of boundaries in text. BreakIterator is an abstract base class. Instances of BreakIterator maintain a current position and scan over text returning the index of characters where boundaries occur.

Line boundary analysis determines where a text string can be broken when line-wrapping. The mechanism correctly handles punctuation and hyphenated words.

Sentence boundary analysis allows selection with correct interpretation of periods within numbers and abbreviations, and trailing punctuation marks such as quotation marks and parentheses.

Word boundary analysis is used by search and replace functions, as well as within text editing applications that allow the user to select words with a double click. Word selection provides correct interpretation of punctuation marks within and following words. Characters that are not part of a word, such as symbols or punctuation marks, have word-breaks on both sides.

Character boundary analysis allows users to interact with characters as they expect to, for example, when moving the cursor through a text string. Character boundary analysis provides correct navigation through character strings, regardless of how each character is stored. For example, an accented character might be stored as a base character and a diacritical mark. What users consider to be a character can differ between languages.

The text boundary positions are found according to the rules described in Unicode Standard Annex #29, Text Boundaries, and Unicode Standard Annex #14, Line Breaking Properties. These are available at http://www.unicode.org/reports/tr29/ and http://www.unicode.org/reports/tr14/, respectively.

In addition to the C++ API defined in this header file, a plain C API with equivalent functionality is defined in the file ubrk.h.

Code snippets illustrating the use of the BreakIterator APIs are available in the ICU User Guide, http://icu-project.org/userguide/boundaryAnalysis.html, and in the sample program icu/source/samples/break/break.cpp.

Definition at line 100 of file brkiter.h.


Generated by Doxygen 1.6.0