
MessageFormat Class Reference

#include <msgfmt.h>

Inherits Format, UObject, and UMemory.



Detailed Description

A MessageFormat produces concatenated messages in a language-neutral way. It should be used for all string concatenations that are visible to end users.

A MessageFormat contains an array of subformats arranged within a template string. Together, the subformats and template string determine how the MessageFormat will operate during formatting and parsing.

Typically, both the subformats and the template string are specified at once in a pattern. By using different patterns for different locales, messages may be localized.

During formatting, the MessageFormat takes an array of arguments and produces a user-readable string. Each argument is a Formattable object; they may be passed in in an array, or as a single Formattable object which itself contains an array. Each argument is matched up with its corresponding subformat, which then formats it into a string. The resultant strings are then assembled within the string template of the MessageFormat to produce the final output string.

Note: As of ICU 4.0, MessageFormat supports named arguments. If a named argument is used, all arguments must be named. Names start with a character in UCHAR_ID_START and continue with characters in UCHAR_ID_CONTINUE; in particular, they do not start with a digit. If named arguments are used, usesNamedArguments() will return true.

The other new methods supporting named arguments are:

     getFormatNames(UErrorCode& status)
     getFormat(const UnicodeString& formatName, UErrorCode& status)
     setFormat(const UnicodeString& formatName, const Format& format, UErrorCode& status)
     adoptFormat(const UnicodeString& formatName, Format* formatToAdopt, UErrorCode& status)
     format(const Formattable* arguments, const UnicodeString *argumentNames, int32_t cnt, UnicodeString& appendTo, FieldPosition& status, int32_t recursionProtection, UErrorCode& success)
     format(const UnicodeString* argumentNames, const Formattable* arguments, int32_t count, UnicodeString& appendTo, UErrorCode& status)

These methods are all compatible with patterns that do not use named arguments. In these cases the keys in the input or output are UnicodeStrings that name the argument indices, e.g. "0", "1", "2", and so on.

When named arguments are used, certain methods on MessageFormat that take or return arrays do not perform any action, since it is not possible to identify positions in an array using a name. If the method has a status/success parameter, its UErrorCode is set to U_ARGUMENT_TYPE_MISMATCH. These methods are:

     adoptFormats(Format** formatsToAdopt, int32_t count)
     setFormats(const Format** newFormats, int32_t count)
     adoptFormat(int32_t n, Format *newFormat)
     getFormats(int32_t& count)
     format(const Formattable* source, int32_t cnt, UnicodeString& appendTo, FieldPosition& ignore, UErrorCode& success)
     format(const UnicodeString& pattern, const Formattable* arguments, int32_t cnt, UnicodeString& appendTo, UErrorCode& success)
     format(const Formattable& source, UnicodeString& appendTo, FieldPosition& ignore, UErrorCode& success)
     format(const Formattable* arguments, int32_t cnt, UnicodeString& appendTo, FieldPosition& status, int32_t recursionProtection, UErrorCode& success)
     parse(const UnicodeString& source, ParsePosition& pos, int32_t& count)
     parse(const UnicodeString& source, int32_t& cnt, UErrorCode& status)

During parsing, an input string is matched against the string template of the MessageFormat to produce an array of Formattable objects. Plain text of the template string is matched directly against input text. At each position in the template string where a subformat is located, the subformat is called to parse the corresponding segment of input text to produce an output argument. In this way, an array of arguments is created which together constitute the parse result.

Parsing may fail or produce unexpected results in a number of circumstances.
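To make the matching step concrete, here is a minimal standard-library sketch of parsing against a template. It is not ICU's implementation: the function name parseAgainstTemplate is hypothetical, only plain {n} placeholders are handled (no element formats, no quoting), and every captured argument is kept as raw text instead of being handed to a subformat.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper, not part of ICU. Literal template text must match the
// input exactly; each {n} placeholder captures the input up to the literal
// text that follows the placeholder in the template.
bool parseAgainstTemplate(const std::string& pattern,
                          const std::string& input,
                          std::map<int, std::string>& args) {
    typedef std::string::size_type Pos;
    Pos p = 0;   // position in the template
    Pos in = 0;  // position in the input
    while (p < pattern.size()) {
        if (pattern[p] != '{') {
            // Plain text: must match the input character by character.
            if (in >= input.size() || input[in] != pattern[p]) return false;
            ++p;
            ++in;
            continue;
        }
        const Pos close = pattern.find('}', p);
        if (close == std::string::npos || close == p + 1) return false;
        int index = 0;
        for (Pos k = p + 1; k < close; ++k) {
            if (!std::isdigit(static_cast<unsigned char>(pattern[k]))) return false;
            index = index * 10 + (pattern[k] - '0');
        }
        // Literal text between this placeholder and the next one (or the end).
        const Pos nextOpen = pattern.find('{', close + 1);
        const std::string literal = pattern.substr(
            close + 1,
            nextOpen == std::string::npos ? std::string::npos : nextOpen - close - 1);
        Pos end;
        if (literal.empty()) {
            // Nothing to delimit the argument: take the rest of the input at
            // the end of the template, or nothing between adjacent placeholders.
            end = (nextOpen == std::string::npos) ? input.size() : in;
        } else {
            end = input.find(literal, in);
            if (end == std::string::npos) return false;
        }
        args[index] = input.substr(in, end - in);
        in = end;
        p = close + 1;
    }
    return in == input.size();
}
```

In this sketch an argument whose text happens to contain the literal text that follows its placeholder is cut at the first occurrence, which is one way a parse can succeed yet return something other than the values that were originally formatted.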

Here are some examples of usage:

Example 1:

 
     UErrorCode success = U_ZERO_ERROR;
     GregorianCalendar cal(success);
     Formattable arguments[] = {
         7L,
         Formattable(cal.getTime(success), Formattable::kIsDate), // getTime() returns a UDate
         "a disturbance in the Force"
     };

     UnicodeString result;
     MessageFormat::format(
          "At {1,time} on {1,date}, there was {2} on planet {0,number}.",
          arguments, 3, result, success );

     cout << "result: " << result << endl;
     //<output>: At 4:34:20 PM on 23-Mar-98, there was a disturbance
     //             in the Force on planet 7.
Typically, the message format will come from resources, and the arguments will be dynamically set at runtime.

Example 2:

  
     success = U_ZERO_ERROR;
     Formattable testArgs[] = {3L, "MyDisk"};

     MessageFormat form(
         "The disk \"{1}\" contains {0} file(s).", success );

     UnicodeString string;
     FieldPosition fpos = 0;
     cout << "format: " << form.format(testArgs, 2, string, fpos, success ) << endl;

     // output, with different testArgs:
     // output: The disk "MyDisk" contains 0 file(s).
     // output: The disk "MyDisk" contains 1 file(s).
     // output: The disk "MyDisk" contains 1,273 file(s).

The pattern is of the following form. Legend:

 
       {optional item}
       (group that may be repeated)*
Do not confuse optional items with quoted braces, such as this: "{". Quoted braces are literals.
  
       messageFormatPattern := string ( "{" messageFormatElement "}" string )*

       messageFormatElement := argumentIndex | argumentName { "," elementFormat }

       elementFormat := "time" { "," datetimeStyle }
                      | "date" { "," datetimeStyle }
                      | "number" { "," numberStyle }
                      | "choice" "," choiceStyle

       datetimeStyle := "short"
                      | "medium"
                      | "long"
                      | "full"
                      | dateFormatPattern

       numberStyle :=   "currency"
                      | "percent"
                      | "integer"
                      | numberFormatPattern

       choiceStyle :=   choiceFormatPattern
 
       pluralStyle := pluralFormatPattern
If there is no elementFormat, then the argument must be a string, which is substituted. If there is no datetimeStyle or numberStyle, then the default format is used (e.g. NumberFormat::createInstance(), DateFormat::createTimeInstance(DateFormat::kDefault, ...), or DateFormat::createDateInstance(DateFormat::kDefault, ...)). For a ChoiceFormat, the pattern must always be specified, since there is no default.

In strings, single quotes can be used to quote syntax characters. A literal single quote is represented by '', both within and outside of single-quoted segments. Inside a messageFormatElement, quotes are not removed. For example, {1,number,$'#',##} will produce a number format with the pound-sign quoted, with a result such as: "$#31,45".
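The quoting rule for plain text can be sketched in a few lines of standard C++. This is only an illustration of the rule as stated above, not ICU code: the name unquoteLiteralText is hypothetical, and the function applies to literal text only, not to the inside of a messageFormatElement (where quotes are kept).

```cpp
#include <string>

// Hypothetical helper, not part of ICU: resolves apostrophe quoting in the
// literal text of a pattern.
std::string unquoteLiteralText(const std::string& text) {
    std::string out;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] != '\'') {
            out += text[i];              // ordinary or quoted character
        } else if (i + 1 < text.size() && text[i + 1] == '\'') {
            out += '\'';                 // '' is a literal single quote
            ++i;
        }
        // A lone apostrophe only opens or closes a quoted segment, so it
        // contributes nothing to the output.
    }
    return out;
}
```

With this sketch, the text It''s '{'0'}' becomes It's {0}: the doubled apostrophe yields one apostrophe, and the quoted braces come out as literal braces.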

If a pattern is used, then unquoted braces in the pattern, if any, must match: that is, "ab {0} de" and "ab '}' de" are ok, but "ab {0'}' de" and "ab } de" are not.
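The brace-matching rule can be checked with a small standalone function. The sketch below is an assumption-laden illustration, not ICU's parser: unquotedBracesMatch is a hypothetical name, and the function looks only at braces and apostrophes without validating anything else in the pattern.

```cpp
#include <string>

// Hypothetical helper, not part of ICU: returns true when every unquoted '{'
// has a matching unquoted '}' and no unquoted '}' appears on its own.
bool unquotedBracesMatch(const std::string& pattern) {
    int depth = 0;
    bool inQuote = false;
    for (std::string::size_type i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\'') {
            if (i + 1 < pattern.size() && pattern[i + 1] == '\'') {
                ++i;                     // '' is a literal quote, not a delimiter
            } else {
                inQuote = !inQuote;      // enter or leave a quoted segment
            }
        } else if (!inQuote) {
            if (c == '{') {
                ++depth;
            } else if (c == '}') {
                if (depth == 0) return false;  // closing brace with no opener
                --depth;
            }
        }
    }
    return depth == 0;
}
```

Applied to the four patterns above, the function accepts "ab {0} de" and "ab '}' de" and rejects "ab {0'}' de" (the closing brace is quoted, so the opening brace is never closed) and "ab } de".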

Warning:
The rules for using quotes within message format patterns have unfortunately proven to be somewhat confusing. In particular, it isn't always obvious to localizers whether single quotes need to be doubled or not. Make sure to inform localizers about the rules, and tell them (for example, by using comments in resource bundle source files) which strings will be processed by MessageFormat. Note that localizers may need to use single quotes in translated strings where the original version doesn't have them.
Note also that the simplest way to avoid the problem is to use the real apostrophe (single quote) character U+2019 (’) for human-readable text, and to use the ASCII apostrophe (U+0027 ' ) only in program syntax, like quoting in MessageFormat. See the annotations for U+0027 Apostrophe in The Unicode Standard.

The argumentIndex is a non-negative integer, which corresponds to the index of the arguments presented in an array to be formatted. The first argument has argumentIndex 0.

It is acceptable to have unused arguments in the array. With missing arguments or arguments that are not of the right class for the specified format, a failing UErrorCode result is set.
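The following standard-library sketch illustrates how argumentIndex selects from the argument array, including the two cases just described: extra arguments are ignored, and a missing argument is reported as a failure. It is not ICU's formatter. The name substituteArguments is hypothetical, arguments are plain strings, a bool return stands in for UErrorCode, and element formats and quoting are not handled.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of ICU: replaces each {n} with args[n].
// Returns false for a malformed placeholder or a missing argument; result is
// only written on success.
bool substituteArguments(const std::string& pattern,
                         const std::vector<std::string>& args,
                         std::string& result) {
    std::string out;
    std::string::size_type i = 0;
    while (i < pattern.size()) {
        if (pattern[i] != '{') {
            out += pattern[i++];         // plain template text
            continue;
        }
        std::string::size_type j = i + 1;
        std::string::size_type index = 0;
        bool sawDigit = false;
        while (j < pattern.size() &&
               std::isdigit(static_cast<unsigned char>(pattern[j]))) {
            index = index * 10 + static_cast<std::string::size_type>(pattern[j] - '0');
            sawDigit = true;
            ++j;
        }
        if (!sawDigit || j >= pattern.size() || pattern[j] != '}') return false;
        if (index >= args.size()) return false;   // missing argument
        out += args[index];                       // argumentIndex 0 is args[0]
        i = j + 1;
    }
    result = out;
    return true;
}
```

Called with the pattern from Example 2 and the arguments "3", "MyDisk" and an unused third entry, the sketch produces The disk "MyDisk" contains 3 file(s). and ignores the extra entry; a pattern of "{2}" with only two arguments makes it return false.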

For more sophisticated patterns, you can use a ChoiceFormat to produce output such as the following:

 
     UErrorCode success = U_ZERO_ERROR;
     MessageFormat form("The disk \"{1}\" contains {0}.", success);
     double filelimits[] = {0,1,2};
     UnicodeString filepart[] = {"no files","one file","{0,number} files"};
     ChoiceFormat* fileform = new ChoiceFormat(filelimits, filepart, 3);
     // NOT zero, see below. adoptFormat() hands ownership of fileform to form.
     form.adoptFormat(1, fileform);

     Formattable testArgs[] = {1273L, "MyDisk"};

     UnicodeString string;
     FieldPosition fpos = 0;
     cout << form.format(testArgs, 2, string, fpos, success) << endl;

     // output, with different testArgs
     // output: The disk "MyDisk" contains no files.
     // output: The disk "MyDisk" contains one file.
     // output: The disk "MyDisk" contains 1,273 files.
You can either do this programmatically, as in the above example, or by using a pattern (see ChoiceFormat for more information) as in:
 
    form.applyPattern(
      "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.");

Note: As we see above, the string produced by a ChoiceFormat in MessageFormat is treated specially; occurrences of '{' are used to indicate subformats, and cause recursion. If you create both a MessageFormat and ChoiceFormat programmatically (instead of using the string patterns), then be careful not to produce a format that recurses on itself, which will cause an infinite loop.
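The limit-based selection used in the ChoiceFormat examples above can be sketched with the standard library alone. This is an illustration under assumptions, not ICU's ChoiceFormat: selectChoice is a hypothetical name, and the rule implemented is "use the last entry whose limit is not greater than the number, falling back to the first entry for numbers below every limit".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of ICU: picks the string whose limit range
// contains the number, given limits in ascending order.
std::string selectChoice(const std::vector<double>& limits,
                         const std::vector<std::string>& choices,
                         double number) {
    if (choices.empty()) return std::string();
    std::size_t chosen = 0;              // numbers below limits[0] use entry 0
    for (std::size_t i = 0; i < limits.size() && i < choices.size(); ++i) {
        if (number >= limits[i]) {
            chosen = i;                  // limits[i] <= number so far
        } else {
            break;                       // limits are ascending; stop here
        }
    }
    return choices[chosen];
}
```

With the limits 0, 1, 2 and the three strings from the example, the numbers 0, 1 and 1273 select "no files", "one file" and "{0,number} files" respectively. The selected string is returned as is; in MessageFormat it would then be formatted again because it contains '{'.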

Note: Subformats are numbered by their order in the pattern. This is not the same as the argumentIndex.

 
    For example: with "abc{2}def{3}ghi{0}...",

    format0 affects the first variable {2}
    format1 affects the second variable {3}
    format2 affects the third variable {0}
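The difference between subformat number and argumentIndex can be made explicit by listing, in pattern order, the argumentIndex that each subformat refers to. The sketch below is a simplification and not ICU code: argumentIndexPerSubformat is a hypothetical name, and the scan ignores quoting, named arguments and nested patterns.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of ICU: element k of the result is the
// argumentIndex used by subformat k (subformats are numbered by position).
std::vector<int> argumentIndexPerSubformat(const std::string& pattern) {
    std::vector<int> indices;
    for (std::string::size_type i = 0; i < pattern.size(); ++i) {
        if (pattern[i] != '{') continue;
        int value = 0;
        bool sawDigit = false;
        std::string::size_type j = i + 1;
        while (j < pattern.size() &&
               std::isdigit(static_cast<unsigned char>(pattern[j]))) {
            value = value * 10 + (pattern[j] - '0');
            sawDigit = true;
            ++j;
        }
        if (sawDigit) indices.push_back(value);
        i = j - 1;                       // resume scanning after the digits
    }
    return indices;
}
```

For "abc{2}def{3}ghi{0}..." the result is 2, 3, 0: subformat 0 formats argument 2, subformat 1 formats argument 3, and subformat 2 formats argument 0.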

User subclasses are not supported. While clients may write subclasses, such code will not necessarily work and will not be guaranteed to work stably from release to release.

Definition at line 306 of file msgfmt.h.


Public Types

enum  EFormatNumber { kMaxFormat = 10 }

Public Member Functions

virtual void adoptFormat (const UnicodeString &formatName, Format *formatToAdopt, UErrorCode &status)
virtual void adoptFormat (int32_t formatNumber, Format *formatToAdopt)
virtual void adoptFormats (Format **formatsToAdopt, int32_t count)
virtual void applyPattern (const UnicodeString &pattern, UParseError &parseError, UErrorCode &status)
virtual void applyPattern (const UnicodeString &pattern, UErrorCode &status)
virtual Format * clone (void) const
UnicodeString & format (const UnicodeString *argumentNames, const Formattable *arguments, int32_t count, UnicodeString &appendTo, UErrorCode &status) const
UnicodeString & format (const Formattable &obj, UnicodeString &appendTo, UErrorCode &status) const
virtual UnicodeString & format (const Formattable &obj, UnicodeString &appendTo, FieldPosition &pos, UErrorCode &status) const
UnicodeString & format (const Formattable *source, int32_t count, UnicodeString &appendTo, FieldPosition &ignore, UErrorCode &status) const
int32_t getArgTypeCount () const
virtual UClassID getDynamicClassID (void) const
virtual Format * getFormat (const UnicodeString &formatName, UErrorCode &status)
virtual StringEnumeration * getFormatNames (UErrorCode &status)
virtual const Format ** getFormats (int32_t &count) const
Locale getLocale (ULocDataLocaleType type, UErrorCode &status) const
virtual const Locale & getLocale (void) const
const char * getLocaleID (ULocDataLocaleType type, UErrorCode &status) const
 MessageFormat (const MessageFormat &)
 MessageFormat (const UnicodeString &pattern, const Locale &newLocale, UParseError &parseError, UErrorCode &status)
 MessageFormat (const UnicodeString &pattern, const Locale &newLocale, UErrorCode &status)
 MessageFormat (const UnicodeString &pattern, UErrorCode &status)
UBool operator!= (const Format &other) const
const MessageFormat & operator= (const MessageFormat &)
virtual UBool operator== (const Format &other) const
virtual Formattable * parse (const UnicodeString &source, int32_t &count, UErrorCode &status) const
virtual Formattable * parse (const UnicodeString &source, ParsePosition &pos, int32_t &count) const
void parseObject (const UnicodeString &source, Formattable &result, UErrorCode &status) const
virtual void parseObject (const UnicodeString &source, Formattable &result, ParsePosition &pos) const
virtual void setFormat (const UnicodeString &formatName, const Format &format, UErrorCode &status)
virtual void setFormat (int32_t formatNumber, const Format &format)
virtual void setFormats (const Format **newFormats, int32_t cnt)
virtual void setLocale (const Locale &theLocale)
virtual UnicodeString & toPattern (UnicodeString &appendTo) const
UBool usesNamedArguments () const
virtual ~MessageFormat ()

Static Public Member Functions

static UnicodeString autoQuoteApostrophe (const UnicodeString &pattern, UErrorCode &status)
static UnicodeString & format (const UnicodeString &pattern, const Formattable *arguments, int32_t count, UnicodeString &appendTo, UErrorCode &status)
static UClassID U_EXPORT2 getStaticClassID (void)
static void U_EXPORT2 operator delete (void *, void *) U_NO_THROW
static void U_EXPORT2 operator delete (void *p) U_NO_THROW
static void U_EXPORT2 operator delete[] (void *p) U_NO_THROW
static void *U_EXPORT2 operator new (size_t, void *ptr) U_NO_THROW
static void *U_EXPORT2 operator new (size_t size) U_NO_THROW
static void *U_EXPORT2 operator new[] (size_t size) U_NO_THROW

Protected Member Functions

void setLocaleIDs (const char *valid, const char *actual)

Static Protected Member Functions

static void syntaxError (const UnicodeString &pattern, int32_t pos, UParseError &parseError)

Private Member Functions

UBool allocateArgTypes (int32_t capacity)
UBool allocateSubformats (int32_t capacity)
NumberFormat * createIntegerFormat (const Locale &locale, UErrorCode &status) const
UnicodeString & format (const Formattable *arguments, const UnicodeString *argumentNames, int32_t cnt, UnicodeString &appendTo, FieldPosition &status, int32_t recursionProtection, UErrorCode &success) const
UnicodeString & format (const Formattable *arguments, int32_t cnt, UnicodeString &appendTo, FieldPosition &status, int32_t recursionProtection, UErrorCode &success) const
const Formattable::Type * getArgTypeList (int32_t &listCount) const
const DateFormat * getDefaultDateFormat (UErrorCode &) const
const NumberFormat * getDefaultNumberFormat (UErrorCode &) const
UBool isLegalArgName (const UnicodeString &argName) const
void makeFormat (int32_t offsetNumber, UnicodeString *segments, UParseError &parseError, UErrorCode &success)

Static Private Member Functions

static void copyAndFixQuotes (const UnicodeString &appendTo, int32_t start, int32_t end, UnicodeString &target)
static int32_t findKeyword (const UnicodeString &s, const UChar *const *list)

Private Attributes

int32_t argTypeCapacity
int32_t argTypeCount
Formattable::Type * argTypes
DateFormat * defaultDateFormat
NumberFormat * defaultNumberFormat
Locale fLocale
Format ** formatAliases
int32_t formatAliasesCapacity
UnicodeString fPattern
UProperty idContinue
UProperty idStart
UBool isArgNumeric
int32_t subformatCapacity
int32_t subformatCount
Subformat * subformats

Friends

class MessageFormatAdapter

Classes

class  Subformat
