ForwardCharacterIterator Class Reference

#include <chariter.h>

Inheritance diagram for ForwardCharacterIterator:

Derived classes: CharacterIterator, UCharCharacterIterator, StringCharacterIterator

Detailed Description

Abstract class that defines an API for forward-only iteration on text objects. This is a minimal interface for iteration without random access or backwards iteration. It is especially useful for wrapping streams with converters into an object for collation or normalization.

Characters can be accessed in two ways: as code units or as code points. Unicode code points are 21-bit integers and are the scalar values of Unicode characters. ICU uses the type UChar32 for them. Unicode code units are the storage units of a given Unicode/UCS Transformation Format (a character encoding scheme). With UTF-16, every code point is represented with either one code unit or a pair of code units ("surrogates"). String storage is typically based on code units, while properties of characters are typically determined using code point values. Some processes may be designed to work with sequences of code units, or it may be known that all characters that are important to an algorithm can be represented with single code units. Other processes will need to use the code point access functions.
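The relationship between code units and code points described above can be sketched in standard C++ without ICU. The helper name toCodePoints is hypothetical and is not part of the ICU API; it uses char16_t and char32_t as stand-ins for UChar and UChar32, and it passes unpaired surrogates through unchanged as a simplifying assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an ICU function): turn UTF-16 code units
// into code points by combining each lead/trail surrogate pair.
std::vector<char32_t> toCodePoints(const std::u16string &units) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < units.size(); ++i) {
        char32_t c = units[i];
        bool isLead = (c >= 0xD800 && c <= 0xDBFF);
        if (isLead && i + 1 < units.size()) {
            char32_t trail = units[i + 1];
            if (trail >= 0xDC00 && trail <= 0xDFFF) {
                // Two code units form one supplementary code point.
                c = 0x10000 + ((c - 0xD800) << 10) + (trail - 0xDC00);
                ++i;  // skip the trail surrogate that was just consumed
            }
        }
        // Single code units and unpaired surrogates are passed through as-is.
        out.push_back(c);
    }
    return out;
}
```

For instance, the three code units 0x0061, 0xD83D, 0xDE00 decode to the two code points U+0061 and U+1F600, which is why a loop over code units and a loop over code points can visit a different number of items for the same text.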

ForwardCharacterIterator provides nextPostInc() to access a code unit and advance an internal position into the text object, similar to the statement return text[position++].
It provides next32PostInc() to access a code point and advance the internal position past that code point.

next32PostInc() assumes that the current position is that of the beginning of a code point, i.e., of its first code unit. After next32PostInc(), this will be true again. In general, access to code units and code points in the same iteration loop should not be mixed. In UTF-16, if the current position is on a second code unit (Low Surrogate), then only that code unit is returned even by next32PostInc().
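To make these position rules concrete, here is a minimal stand-alone sketch of a forward iterator over UTF-16 text. The class SketchForwardIterator is hypothetical: it does not derive from ForwardCharacterIterator and is not how ICU implements its iterators. It only mirrors the documented behavior of hasNext(), nextPostInc(), and next32PostInc(), using char16_t and char32_t in place of UChar and UChar32.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a concrete iterator; NOT ICU code.
class SketchForwardIterator {
public:
    enum { DONE = 0xffff };  // same value as ForwardCharacterIterator::DONE

    explicit SketchForwardIterator(std::u16string text)
        : text_(std::move(text)), pos_(0) {}

    bool hasNext() const { return pos_ < text_.size(); }

    // Return one code unit and advance, like "return text[position++]".
    char16_t nextPostInc() {
        return hasNext() ? text_[pos_++] : static_cast<char16_t>(DONE);
    }

    // Return one code point and advance past all of its code units.
    char32_t next32PostInc() {
        if (!hasNext()) {
            return DONE;
        }
        char32_t c = text_[pos_++];
        if (c >= 0xD800 && c <= 0xDBFF && hasNext()) {
            char32_t trail = text_[pos_];
            if (trail >= 0xDC00 && trail <= 0xDFFF) {
                ++pos_;  // consume the trail surrogate as well
                c = 0x10000 + ((c - 0xD800) << 10) + (trail - 0xDC00);
            }
        }
        // If the position was on a trail surrogate, only that unit is returned.
        return c;
    }

private:
    std::u16string text_;
    std::size_t pos_;
};
```

With the text 0xD83D 0xDE00, calling nextPostInc() first leaves the position on the trail surrogate, so a following next32PostInc() returns 0xDE00 alone. This is the mixing hazard the paragraph above warns about.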

For iteration with either function, there are two ways to check for the end of the iteration. When there are no more characters in the text object, hasNext() returns FALSE, and nextPostInc() and next32PostInc() return DONE when one attempts to read beyond the end of the text object.

 void function1(ForwardCharacterIterator &it) {
     UChar32 c;
     while(it.hasNext()) {
         c=it.next32PostInc();
         // use c
     }
 }

 void function2(ForwardCharacterIterator &it) {
     UChar c;
     while((c=it.nextPostInc())!=ForwardCharacterIterator::DONE) {
         // use c
     }
 }
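One practical difference between the two end-of-iteration checks can be sketched as follows. Because DONE has the value 0xffff, a loop that compares against DONE cannot tell the end of the text from a stored code unit 0xFFFF (a Unicode noncharacter that should not occur in interchanged text, but can appear in arbitrary data). The struct UnitIterator and both counting functions are hypothetical illustrations written in standard C++, not ICU code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical minimal iterator over UTF-16 code units; NOT ICU code.
struct UnitIterator {
    enum { DONE = 0xffff };
    std::u16string text;
    std::size_t pos = 0;

    bool hasNext() const { return pos < text.size(); }
    char16_t nextPostInc() {
        return hasNext() ? text[pos++] : static_cast<char16_t>(DONE);
    }
};

// Count code units with the hasNext() style of loop.
std::size_t countWithHasNext(UnitIterator it) {
    std::size_t n = 0;
    while (it.hasNext()) {
        it.nextPostInc();
        ++n;
    }
    return n;
}

// Count code units with the DONE-sentinel style of loop.
std::size_t countWithSentinel(UnitIterator it) {
    std::size_t n = 0;
    while (it.nextPostInc() != UnitIterator::DONE) {
        ++n;
    }
    return n;
}
```

For ordinary text both functions return the same count. For text that contains the code unit 0xFFFF, the sentinel loop stops at that unit while the hasNext() loop continues to the real end, so the hasNext() form is the safer choice when the input is not known to be well-formed.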

Definition at line 84 of file chariter.h.

Public Types

enum  { DONE = 0xffff }

Public Member Functions

virtual UClassID getDynamicClassID (void) const =0
virtual int32_t hashCode (void) const =0
virtual UBool hasNext ()=0
virtual UChar32 next32PostInc (void)=0
virtual UChar nextPostInc (void)=0
UBool operator!= (const ForwardCharacterIterator &that) const
virtual UBool operator== (const ForwardCharacterIterator &that) const =0
virtual ~ForwardCharacterIterator ()

Protected Member Functions

 ForwardCharacterIterator (const ForwardCharacterIterator &)
ForwardCharacterIterator & operator= (const ForwardCharacterIterator &)

The documentation for this class was generated from the following file: