UnicodeString Class Reference

#include <unistr.h>

Inherits Replaceable.


Detailed Description

UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String class. It is a concrete implementation of the abstract class Replaceable (for transliteration).

In ICU, strings are stored and used as UTF-16. This means that a string internally consists of 16-bit Unicode code units.
UTF-16 is a variable-length encoding: A Unicode character may be stored with either one code unit — which is the most common case — or with a matched pair of special code units ("surrogates"). The data type for code units is UChar.
For single-character handling, a Unicode character code point is a scalar value in the range 0..0x10ffff. ICU uses the UChar32 type for code points.
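The mapping from a code point to one or two code units can be sketched in standard C++. This is an illustration only, not ICU code: `appendCodePoint` is a hypothetical helper, and `char16_t` and `char32_t` stand in for UChar and UChar32.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of ICU): append one code point to a
// UTF-16 string as either one code unit or a surrogate pair.
inline void appendCodePoint(std::u16string &s, char32_t cp) {
    assert(cp <= 0x10FFFF);              // code points range over 0..0x10ffff
    assert(cp < 0xD800 || cp > 0xDFFF);  // surrogate values are not scalar values
    if (cp < 0x10000) {
        // Most common case: a single 16-bit code unit.
        s.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary code point: split the 20-bit offset into a
        // lead surrogate (high 10 bits) and a trail surrogate (low 10 bits).
        cp -= 0x10000;
        s.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        s.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
}
```

For example, U+0061 is stored as the single code unit 0x0061, while U+1F600 is stored as the pair 0xD83D 0xDE00.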

Indexes and offsets into and lengths of strings always count code units, not code points. This is the same as with multi-byte char* strings in traditional string handling. Operations on partial strings typically do not test for code point boundaries. If necessary, the user needs to take care of such boundaries by testing for the code unit values or by using functions like UnicodeString::getChar32Start() and UnicodeString::getChar32Limit() (or, in C, the equivalent macros UTF_SET_CHAR_START() and UTF_SET_CHAR_LIMIT(), see utf.h).
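Testing code unit values for code point boundaries might look like the following standard C++ sketch. The helpers `isLead`, `isTrail`, `codePointStart`, and `countCodePoints` are hypothetical names chosen for this illustration. They only approximate what getChar32Start() and countChar32() are described as doing, and they are not ICU's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers (not ICU API): classify surrogate code units.
inline bool isLead(char16_t c)  { return (c & 0xFC00) == 0xD800; }
inline bool isTrail(char16_t c) { return (c & 0xFC00) == 0xDC00; }

// If offset points at the trail surrogate of a matched pair, move it back
// to the lead surrogate; otherwise return it unchanged.
inline std::size_t codePointStart(const std::u16string &s, std::size_t offset) {
    assert(offset < s.size());
    if (offset > 0 && isTrail(s[offset]) && isLead(s[offset - 1])) {
        return offset - 1;
    }
    return offset;
}

// Count code points: a matched surrogate pair counts once, every other
// code unit (including an unpaired surrogate) counts once.
inline std::size_t countCodePoints(const std::u16string &s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        ++n;
        if (isLead(s[i]) && i + 1 < s.size() && isTrail(s[i + 1])) {
            ++i;  // skip the trail surrogate of this pair
        }
    }
    return n;
}
```

With the string "a", U+1F600, "b", the length in code units is 4 while the number of code points is 3, and offset 2 (the trail surrogate) is adjusted back to offset 1.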

UnicodeString uses four storage models:

  1. Short strings are normally stored inside the UnicodeString object itself. The object has fields for the "bookkeeping" and a small UChar array. When the object is copied, then the internal characters are copied into the destination object.
  2. Longer strings are normally stored in allocated memory. The allocated UChar array is preceded by a reference counter. When the string object is copied, then the allocated buffer is shared by incrementing the reference counter.
  3. A UnicodeString can be constructed or setTo() such that it aliases a read-only buffer instead of copying the characters. In this case, the string object uses this aliased buffer for as long as it is not modified, and it will never attempt to modify or release the buffer. This has copy-on-write semantics: When the string object is modified, then the buffer contents are first copied into writable memory (inside the object for short strings, or an allocated buffer for longer strings). When a UnicodeString with a read-only alias is assigned to another UnicodeString, then both string objects will share the same read-only alias.
  4. A UnicodeString can be constructed or setTo() such that it aliases a writable buffer instead of copying the characters. The difference from the above is that the string object will write through to this aliased buffer for write operations. Only when the capacity of the buffer is not sufficient is a new buffer allocated and the contents copied. An efficient way to get the string contents into the original buffer is to use the extract(..., UChar *dst, ...) function: It will only copy the string contents if the dst buffer is different from the buffer of the string object itself. If a string grows beyond the capacity of the aliased buffer and then shrinks during a sequence of operations, then it will not use the original buffer any more, even though its contents may fit into it again. When a UnicodeString with a writable alias is assigned to another UnicodeString, then the contents are always copied. The destination string will not alias to the buffer that the source string aliases.
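The copy-on-write behavior of the read-only alias (model 3) can be sketched with a small standard C++ class. `AliasingString` and all of its members are hypothetical and exist only for this illustration. The sketch ignores the short-string and reference-counted models and does not reflect how UnicodeString is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch of a read-only alias with copy-on-write semantics.
class AliasingString {
public:
    // Alias an external buffer without copying it. The caller keeps
    // ownership; this object never modifies or releases the buffer.
    AliasingString(const char16_t *text, std::size_t length)
        : alias_(text), length_(length) {}

    bool isAlias() const { return alias_ != nullptr; }
    std::size_t length() const { return length_; }
    const char16_t *data() const { return alias_ ? alias_ : owned_.data(); }

    // The first write copies the aliased contents into owned, writable
    // storage; later writes go to that owned copy.
    void setCharAt(std::size_t offset, char16_t c) {
        assert(offset < length_);
        if (alias_ != nullptr) {
            owned_.assign(alias_, length_);
            alias_ = nullptr;
        }
        owned_[offset] = c;
    }

private:
    const char16_t *alias_;   // non-null while aliasing the external buffer
    std::size_t length_;
    std::u16string owned_;    // used only after the first modification
};
```

Copying an `AliasingString` that has not been modified copies only the pointer, so both objects share the same read-only buffer. Modifying one of them detaches it and leaves both the other object and the original buffer unchanged.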

See also:
Unicode

Definition at line 145 of file unistr.h.


Public Member Functions

UnicodeString & append (UChar32 srcChar)
UnicodeString & append (UChar srcChar)
UnicodeString & append (const UChar *srcChars, int32_t srcLength)
UnicodeString & append (const UChar *srcChars, int32_t srcStart, int32_t srcLength)
UnicodeString & append (const UnicodeString &srcText)
UnicodeString & append (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
int8_t caseCompare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength, uint32_t options) const
int8_t caseCompare (int32_t start, int32_t length, const UChar *srcChars, uint32_t options) const
int8_t caseCompare (const UChar *srcChars, int32_t srcLength, uint32_t options) const
int8_t caseCompare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, uint32_t options) const
int8_t caseCompare (int32_t start, int32_t length, const UnicodeString &srcText, uint32_t options) const
int8_t caseCompare (const UnicodeString &text, uint32_t options) const
int8_t caseCompareBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit, uint32_t options) const
UChar32 char32At (int32_t offset) const
UChar charAt (int32_t offset) const
int8_t compare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
int8_t compare (int32_t start, int32_t length, const UChar *srcChars) const
int8_t compare (const UChar *srcChars, int32_t srcLength) const
int8_t compare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
int8_t compare (int32_t start, int32_t length, const UnicodeString &srcText) const
int8_t compare (const UnicodeString &text) const
int8_t compareBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit) const
int8_t compareCodePointOrder (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
int8_t compareCodePointOrder (int32_t start, int32_t length, const UChar *srcChars) const
int8_t compareCodePointOrder (const UChar *srcChars, int32_t srcLength) const
int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString &srcText) const
int8_t compareCodePointOrder (const UnicodeString &text) const
int8_t compareCodePointOrderBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit) const
virtual void copy (int32_t start, int32_t limit, int32_t dest)
int32_t countChar32 (int32_t start=0, int32_t length=0x7fffffff) const
UBool empty (void) const
UBool endsWith (const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
UBool endsWith (const UChar *srcChars, int32_t srcLength) const
UBool endsWith (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
UBool endsWith (const UnicodeString &text) const
int32_t extract (char *dest, int32_t destCapacity, UConverter *cnv, UErrorCode &errorCode) const
int32_t extract (int32_t start, int32_t startLength, char *target, uint32_t targetLength, const char *codepage=0) const
int32_t extract (int32_t start, int32_t startLength, char *target, const char *codepage=0) const
void extract (int32_t start, int32_t length, UnicodeString &target) const
int32_t extract (UChar *dest, int32_t destCapacity, UErrorCode &errorCode) const
void extract (int32_t start, int32_t length, UChar *dst, int32_t dstStart=0) const
virtual void extractBetween (int32_t start, int32_t limit, UnicodeString &target) const
void extractBetween (int32_t start, int32_t limit, UChar *dst, int32_t dstStart=0) const
UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString &oldText, int32_t oldStart, int32_t oldLength, const UnicodeString &newText, int32_t newStart, int32_t newLength)
UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString &oldText, const UnicodeString &newText)
UnicodeString & findAndReplace (const UnicodeString &oldText, const UnicodeString &newText)
UnicodeString & foldCase (uint32_t options=0)
const UChar * getBuffer () const
UChar * getBuffer (int32_t minCapacity)
int32_t getCapacity (void) const
int32_t getChar32Limit (int32_t offset) const
int32_t getChar32Start (int32_t offset) const
int32_t getCharLimit (int32_t offset) const
int32_t getCharStart (int32_t offset) const
virtual void handleReplaceBetween (int32_t start, int32_t limit, const UnicodeString &text)
int32_t hashCode (void) const
int32_t indexOf (UChar32 c, int32_t start, int32_t length) const
int32_t indexOf (UChar c, int32_t start, int32_t length) const
int32_t indexOf (UChar32 c, int32_t start) const
int32_t indexOf (UChar c, int32_t start) const
int32_t indexOf (UChar32 c) const
int32_t indexOf (UChar c) const
int32_t indexOf (const UChar *srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
int32_t indexOf (const UChar *srcChars, int32_t srcLength, int32_t start, int32_t length) const
int32_t indexOf (const UChar *srcChars, int32_t srcLength, int32_t start) const
int32_t indexOf (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
int32_t indexOf (const UnicodeString &text, int32_t start, int32_t length) const
int32_t indexOf (const UnicodeString &text, int32_t start) const
int32_t indexOf (const UnicodeString &text) const
UnicodeString & insert (int32_t start, UChar32 srcChar)
UnicodeString & insert (int32_t start, UChar srcChar)
UnicodeString & insert (int32_t start, const UChar *srcChars, int32_t srcLength)
UnicodeString & insert (int32_t start, const UChar *srcChars, int32_t srcStart, int32_t srcLength)
UnicodeString & insert (int32_t start, const UnicodeString &srcText)
UnicodeString & insert (int32_t start, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
UBool isBogus (void) const
UBool isEmpty (void) const
int32_t lastIndexOf (UChar32 c, int32_t start, int32_t length) const
int32_t lastIndexOf (UChar c, int32_t start, int32_t length) const
int32_t lastIndexOf (UChar32 c, int32_t start) const
int32_t lastIndexOf (UChar c, int32_t start) const
int32_t lastIndexOf (UChar32 c) const
int32_t lastIndexOf (UChar c) const
int32_t lastIndexOf (const UChar *srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
int32_t lastIndexOf (const UChar *srcChars, int32_t srcLength, int32_t start, int32_t length) const
int32_t lastIndexOf (const UChar *srcChars, int32_t srcLength, int32_t start) const
int32_t lastIndexOf (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
int32_t lastIndexOf (const UnicodeString &text, int32_t start, int32_t length) const
int32_t lastIndexOf (const UnicodeString &text, int32_t start) const
int32_t lastIndexOf (const UnicodeString &text) const
int32_t length (void) const
int32_t moveIndex32 (int32_t index, int32_t delta) const
int32_t numDisplayCells (int32_t start=0, int32_t length=INT32_MAX, UBool asian=TRUE) const
UBool operator!= (const UnicodeString &text) const
UnicodeString & operator+= (const UnicodeString &srcText)
UnicodeString & operator+= (UChar32 ch)
UnicodeString & operator+= (UChar ch)
UBool operator< (const UnicodeString &text) const
UBool operator<= (const UnicodeString &text) const
UnicodeString & operator= (UChar32 ch)
UnicodeString & operator= (UChar ch)
UnicodeString & operator= (const UnicodeString &srcText)
UBool operator== (const UnicodeString &text) const
UBool operator> (const UnicodeString &text) const
UBool operator>= (const UnicodeString &text) const
UCharReference operator[] (int32_t pos)
UChar operator[] (int32_t offset) const
UBool padLeading (int32_t targetLength, UChar padChar=0x0020)
UBool padTrailing (int32_t targetLength, UChar padChar=0x0020)
void releaseBuffer (int32_t newLength=-1)
UnicodeString & remove (int32_t start, int32_t length=(int32_t) INT32_MAX)
UnicodeString & remove (void)
UnicodeString & removeBetween (int32_t start, int32_t limit=(int32_t) INT32_MAX)
UnicodeString & replace (int32_t start, int32_t length, UChar32 srcChar)
UnicodeString & replace (int32_t start, int32_t length, UChar srcChar)
UnicodeString & replace (int32_t start, int32_t length, const UChar *srcChars, int32_t srcLength)
UnicodeString & replace (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength)
UnicodeString & replace (int32_t start, int32_t length, const UnicodeString &srcText)
UnicodeString & replace (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit)
UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString &srcText)
UnicodeString & reverse (int32_t start, int32_t length)
UnicodeString & reverse (void)
UnicodeString & setCharAt (int32_t offset, UChar ch)
UnicodeString & setTo (UChar *buffer, int32_t buffLength, int32_t buffCapacity)
UnicodeString & setTo (UBool isTerminated, const UChar *text, int32_t textLength)
UnicodeString & setTo (UChar32 srcChar)
UnicodeString & setTo (UChar srcChar)
UnicodeString & setTo (const UChar *srcChars, int32_t srcLength)
UnicodeString & setTo (const UnicodeString &srcText)
UnicodeString & setTo (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
void setToBogus ()
UBool startsWith (const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
UBool startsWith (const UChar *srcChars, int32_t srcLength) const
UBool startsWith (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
UBool startsWith (const UnicodeString &text) const
UnicodeString & toLower (const Locale &locale)
UnicodeString & toLower (void)
UnicodeString & toTitle (BreakIterator *titleIter, const Locale &locale)
UnicodeString & toTitle (BreakIterator *titleIter)
UnicodeString & toUpper (const Locale &locale)
UnicodeString & toUpper (void)
UnicodeString & trim (void)
UBool truncate (int32_t targetLength)
UnicodeString unescape () const
UChar32 unescapeAt (int32_t &offset) const
 UnicodeString (const UnicodeString &that)
 UnicodeString (const char *src, int32_t srcLength, UConverter *cnv, UErrorCode &errorCode)
 UnicodeString (const char *codepageData, int32_t dataLength, const char *codepage=0)
 UnicodeString (const char *codepageData, const char *codepage=0)
 UnicodeString (UChar *buffer, int32_t buffLength, int32_t buffCapacity)
 UnicodeString (UBool isTerminated, const UChar *text, int32_t textLength)
 UnicodeString (const UChar *text, int32_t textLength)
 UnicodeString (const UChar *text)
 UnicodeString (UChar32 ch)
 UnicodeString (UChar ch)
 UnicodeString (int32_t capacity, UChar32 c, int32_t count)
 UnicodeString ()
 ~UnicodeString ()

Protected Member Functions

virtual UChar32 getChar32At (int32_t offset) const
virtual UChar getCharAt (int32_t offset) const
virtual int32_t getLength () const

Private Types

enum  {
  US_STACKBUF_SIZE = 3, kInvalidUChar = 0xffff, kGrowSize = 128, kInvalidHashCode = 0,
  kEmptyHashCode = 1, kIsBogus = 1, kUsingStackBuffer = 2, kRefCounted = 4,
  kBufferIsReadonly = 8, kOpenGetBuffer = 16, kShortString = kUsingStackBuffer, kLongString = kRefCounted,
  kReadonlyAlias = kBufferIsReadonly, kWritableAlias = 0
}

Private Member Functions

void addRef (void)
UBool allocate (int32_t capacity)
UnicodeString & caseMap (BreakIterator *titleIter, const Locale &locale, uint32_t options, int32_t toWhichCase)
UBool cloneArrayIfNeeded (int32_t newCapacity=-1, int32_t growCapacity=-1, UBool doCopyArray=TRUE, int32_t **pBufferToDelete=0, UBool forceClone=FALSE)
int8_t doCaseCompare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength, uint32_t options) const
int8_t doCaseCompare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, uint32_t options) const
UChar doCharAt (int32_t offset) const
void doCodepageCreate (const char *codepageData, int32_t dataLength, UConverter *converter, UErrorCode &status)
void doCodepageCreate (const char *codepageData, int32_t dataLength, const char *codepage)
int8_t doCompare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
int8_t doCompare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
int8_t doCompareCodePointOrder (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
int8_t doCompareCodePointOrder (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
int32_t doExtract (int32_t start, int32_t length, char *dest, int32_t destCapacity, UConverter *cnv, UErrorCode &errorCode) const
void doExtract (int32_t start, int32_t length, UnicodeString &target) const
void doExtract (int32_t start, int32_t length, UChar *dst, int32_t dstStart) const
int32_t doHashCode (void) const
int32_t doIndexOf (UChar32 c, int32_t start, int32_t length) const
int32_t doIndexOf (UChar c, int32_t start, int32_t length) const
int32_t doLastIndexOf (UChar32 c, int32_t start, int32_t length) const
int32_t doLastIndexOf (UChar c, int32_t start, int32_t length) const
UnicodeString & doReplace (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength)
UnicodeString & doReplace (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
UnicodeString & doReverse (int32_t start, int32_t length)
const UChar * getArrayStart (void) const
UChar * getArrayStart (void)
void pinIndices (int32_t &start, int32_t &length) const
int32_t refCount (void) const
void releaseArray (void)
int32_t removeRef (void)
int32_t setRefCount (int32_t count)

Private Attributes

UChar * fArray
int32_t fCapacity
uint16_t fFlags
int32_t fLength
UChar fStackBuffer [US_STACKBUF_SIZE]

Friends

class StringCharacterIterator
class StringThreadTest
class UnicodeConverter
