Sometimes it is necessary to manipulate PO files in a way that is better performed automatically than by hand. GNU gettext includes a complete set of tools for this purpose.
When merging two packages into a single package, the resulting POT file will be the concatenation of the two packages' POT files. Thus the maintainer must concatenate the two existing package translations into a single translation catalog, for each language. This is best performed using ‘msgcat’. It is then the translators' duty to deal with any possible conflicts that arose during the merge.
When a translator takes over the translation job from another translator, but she uses a different character encoding in her locale, she will convert the catalog to her character encoding. This is best done through the ‘msgconv’ program.
When a maintainer takes a source file with tagged messages from another package, he should also take the existing translations for this source file (and not let the translators do the same job twice). One way to do this is through ‘msggrep’, another is to create a POT file for that source file and use ‘msgmerge’.
When a translator wants to adjust some translation catalog for a special dialect or orthography — for example, German as written in Switzerland versus German as written in Germany — she needs to apply some text processing to every message in the catalog. The tool for doing this is ‘msgfilter’.
Another use of msgfilter is to produce approximately the POT file for which a given PO file was made. This can be done through a filter command like ‘msgfilter sed -e d | sed -e '/^# /d'’. Note that the original POT file may have had different comments and different plural message counts; for that reason, it is better to use the original POT file if it is available.
When a translator wants to check her translations, for example according to orthography rules or using a non-interactive spell checker, she can do so using the ‘msgexec’ program.
When third party tools create PO or POT files, sometimes duplicates cannot be avoided. But the GNU gettext tools give an error when they encounter duplicate msgids in the same file and in the same domain.
To merge duplicates, the ‘msguniq’ program can be used.
‘msgcomm’ is a more general tool for keeping or throwing away duplicates, occurring in different files.
‘msgcmp’ can be used to check whether a translation catalog is completely translated.
‘msgattrib’ can be used to select and extract only the fuzzy or untranslated messages of a translation catalog.
‘msgen’ is useful as a first step for preparing English translation catalogs. It copies each message's msgid to its msgstr.
Finally, for those applications where all these various programs are not sufficient, a library ‘libgettextpo’ is provided that can be used to write other specialized programs that process PO files.
msgcat
Program msgcat [option] [inputfile]...
The msgcat program concatenates and merges the specified PO files. It finds messages which are common to two or more of the specified PO files. By using the --more-than option, greater commonality may be requested before messages are printed. Conversely, the --less-than option may be used to specify less commonality before messages are printed (for example, ‘--less-than=2’ will only print the unique messages). Translations, comments, extracted comments, and file positions will be cumulated, except that if --use-first is specified, they will be taken from the first PO file to define them.
To concatenate POT files, it is better to use xgettext rather than msgcat, because msgcat would choke on the undefined charsets in the specified POT files.
Input files.
Read the names of the input files from file instead of getting them from the command line.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If inputfile is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Print messages with fewer than number definitions (the --less-than option); defaults to infinite if not set.
Print messages with more than number definitions (the --more-than option); defaults to 0 if not set.
Shorthand for ‘--less-than=2’. Requests that only unique messages be printed.
Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
Specify encoding for output.
Use first available translation for each message. Don't merge several translations into one.
Specify the ‘Language’ field to be used in the header entry. See Filling in the Header Entry for the meaning of this field. Note: The ‘Language-Team’ and ‘Plural-Forms’ fields are left unchanged.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or ‘never’. If it is not given or ‘full’, it generates the lines with both file name and line number. If it is ‘file’, the line number part is omitted. If it is ‘never’, it completely suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
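As a minimal sketch of the package-merging scenario, the following commands combine two small French catalogs. The file names pkg1.fr.po, pkg2.fr.po and merged.fr.po, as well as their contents, are hypothetical and serve only as an illustration. The --use-first option is chosen here so that the translation from the first file wins when both files define the same message.

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Two hypothetical catalogs; both define "Open", with different translations.
cat > pkg1.fr.po <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouvrir"
EOF

cat > pkg2.fr.po <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouverture"

msgid "Close"
msgstr "Fermer"
EOF

# Merge both catalogs; for "Open", keep the translation from pkg1.fr.po.
msgcat --use-first -o merged.fr.po pkg1.fr.po pkg2.fr.po
cat merged.fr.po
```

Without --use-first, the differing translations of "Open" would be cumulated into one entry, which the translator then has to resolve by hand.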
msgconv
Program msgconv [option] [inputfile]
The msgconv program converts a translation catalog to a different character encoding.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Specify encoding for output.
The default encoding is the current locale's encoding.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or ‘never’. If it is not given or ‘full’, it generates the lines with both file name and line number. If it is ‘file’, the line number part is omitted. If it is ‘never’, it completely suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
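The following sketch shows one possible conversion, assuming a hypothetical German catalog de.po that is stored in UTF-8 and a translator who prefers ISO-8859-1. The file names and the sample message are made up for illustration; -t selects the target encoding and -o names the output file.

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# A hypothetical German catalog stored in UTF-8.
cat > de.po <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Close"
msgstr "Schließen"
EOF

# Convert the catalog to ISO-8859-1. msgconv also rewrites the charset
# that is recorded in the header entry.
msgconv -t ISO-8859-1 -o de.latin1.po de.po
grep 'charset=' de.latin1.po
```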
msggrep
Program msggrep [option] [inputfile]
The msggrep program extracts all messages of a translation catalog that match a given pattern or belong to some given source files.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
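[-N sourcefile]... [-M domainname]... [-J msgctxt-pattern] [-K msgid-pattern] [-T msgstr-pattern] [-C comment-pattern]
A message is selected if it comes from one of the specified source files, or if it belongs to one of the specified domains, or if -J is given and its context (msgctxt) matches msgctxt-pattern, or if -K is given and its msgid matches msgid-pattern, or if -T is given and its translation (msgstr) matches msgstr-pattern, or if -C is given and the translator's comment matches comment-pattern.
When more than one selection criterion is specified, the set of selected messages is the union of the selected messages of each criterion.
msgctxt-pattern or msgid-pattern or msgstr-pattern syntax:
[-E | -F] [-e pattern | -f file]...
Patterns are basic regular expressions by default, or extended regular expressions if -E is given, or fixed strings if -F is given.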
Select messages extracted from sourcefile. sourcefile can be either a literal file name or a wildcard pattern.
Select messages belonging to domain domainname.
Start of patterns for the msgctxt.
Start of patterns for the msgid.
Start of patterns for the msgstr.
Start of patterns for the translator's comment.
Start of patterns for the extracted comments.
Specify that pattern is an extended regular expression.
Specify that pattern is a set of newline-separated strings.
Use pattern as a regular expression.
Obtain pattern from file.
Ignore case distinctions.
Output only the messages that do not match any selection criterion, instead of the messages that match a selection criterion.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or ‘never’. If it is not given or ‘full’, it generates the lines with both file name and line number. If it is ‘file’, the line number part is omitted. If it is ‘never’, it completely suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
To extract the messages that come from the source files gnulib-lib/error.c and gnulib-lib/getopt.c:
msggrep -N gnulib-lib/error.c -N gnulib-lib/getopt.c input.po
To extract the messages that contain the string “Please specify” in the original string:
msggrep --msgid -F -e 'Please specify' input.po
To extract the messages that have a context specifier of either “Menu>File” or “Menu>Edit” or a submenu of them:
msggrep --msgctxt -E -e '^Menu>(File|Edit)' input.po
To extract the messages whose translation contains one of the strings in the file wordlist.txt:
msggrep --msgstr -F -f wordlist.txt input.po
msgfilter
Program msgfilter [option] filter [filter-option]
The msgfilter program applies a filter to all translations of a translation catalog.
During each filter invocation, the environment variable MSGFILTER_MSGID is bound to the message's msgid, and the environment variable MSGFILTER_LOCATION is bound to the location in the PO file of the message. If the message has a context, the environment variable MSGFILTER_MSGCTXT is bound to the message's msgctxt, otherwise it is unbound. If the message has a plural form, environment variable MSGFILTER_MSGID_PLURAL is bound to the message's msgid_plural and MSGFILTER_PLURAL_FORM is bound to the order number of the plural actually processed (starting with 0), otherwise both are unbound. If the message has a previous msgid (added by msgmerge), environment variable MSGFILTER_PREV_MSGCTXT is bound to the message's previous msgctxt, MSGFILTER_PREV_MSGID is bound to the previous msgid, and MSGFILTER_PREV_MSGID_PLURAL is bound to the previous msgid_plural.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
The filter can be any program that reads a translation from standard input and writes a modified translation to standard output. A frequently used filter is ‘sed’. A few particular built-in filters are also recognized.
Add newline at the end of each input line and also strip the ending newline from the output line.
Note: If the filter is not a built-in filter, you have to take care of encodings: It is your responsibility to ensure that the filter can cope with input encoded in the translation catalog's encoding. If the filter wants input in a particular encoding, you can in a first step convert the translation catalog to that encoding using the ‘msgconv’ program, before invoking ‘msgfilter’. If the filter wants input in the locale's encoding, but you want to avoid the locale's encoding, then you can first convert the translation catalog to UTF-8 using the ‘msgconv’ program and then make ‘msgfilter’ work in a UTF-8 locale, by using the LC_ALL environment variable.
Note: Most translations in a translation catalog don't end with a newline character. For this reason, unless the --newline option is used, it is important that the filter recognizes its last input line even if it ends without a newline, and that it doesn't add an undesired trailing newline at the end. The ‘sed’ program on some platforms is known to ignore the last line of input if it is not terminated with a newline. You can use GNU sed instead; it does not have this limitation.
Add script to the commands to be executed.
Add the contents of scriptfile to the commands to be executed.
Suppress automatic printing of pattern space.
The filter ‘recode-sr-latin’ is recognized as a built-in filter. The command ‘recode-sr-latin’ converts Serbian text, written in the Cyrillic script, to the Latin script. The command ‘msgfilter recode-sr-latin’ applies this conversion to the translations of a PO file. Thus, it can be used to convert an ‘sr.po’ file to an ‘sr@latin.po’ file.
The filter ‘quot’ is recognized as a built-in filter. The command ‘msgfilter quot’ converts any quotations surrounded by a pair of ‘"’, ‘'’, and ‘`’.
The filter ‘boldquot’ is recognized as a built-in filter. The command ‘msgfilter boldquot’ converts any quotations surrounded by a pair of ‘"’, ‘'’, and ‘`’, also adding the VT100 escape sequences to the text to decorate it as bold.
The use of built-in filters is not sensitive to the current locale's encoding. Moreover, when used with a built-in filter, ‘msgfilter’ can automatically convert the message catalog to the UTF-8 encoding when needed.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Keep the header entry, i.e. the message with ‘msgid ""’, unmodified, instead of filtering it. By default, the header entry is subject to filtering like any other message.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or ‘never’. If it is not given or ‘full’, it generates the lines with both file name and line number. If it is ‘file’, the line number part is omitted. If it is ‘never’, it completely suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
To convert German translations to Swiss orthography (in a UTF-8 locale):
msgconv -t UTF-8 de.po | msgfilter sed -e 's/ß/ss/g'
To convert Serbian translations in Cyrillic script to Latin script:
msgfilter recode-sr-latin < sr.po
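The environment variables that msgfilter binds during each filter invocation can be inspected with a small custom filter. The sketch below assumes a hypothetical catalog fr.po and a hypothetical filter script log-msgid.sh; the script only logs MSGFILTER_MSGID to standard error and passes the translation through unchanged. The script is run through sh so that it does not need to be executable, and --keep-header is used so that the header entry is not sent to the filter.

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# A hypothetical French catalog with two translated messages.
cat > fr.po <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouvrir"

msgid "Close"
msgstr "Fermer"
EOF

# Hypothetical filter script: log the msgid being processed to standard
# error, then copy the translation from standard input to standard output.
cat > log-msgid.sh <<'EOF'
printf 'filtering: %s\n' "$MSGFILTER_MSGID" >&2
cat
EOF

# Run the filter once per translation, leaving the header entry untouched.
msgfilter --keep-header -i fr.po -o out.po sh log-msgid.sh 2> filter.log
cat filter.log
```

Because the script ends with cat, the translations in out.po are the same as in fr.po; a real filter would replace cat with the desired text transformation.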
msguniq
Program msguniq [option] [inputfile]
The msguniq program unifies duplicate translations in a translation catalog. It finds duplicate translations of the same message ID. Such duplicates are invalid input for other programs like msgfmt, msgmerge or msgcat. By default, duplicates are merged together. When using the ‘--repeated’ option, only duplicates are output, and all other messages are discarded. Comments and extracted comments will be cumulated, except that if ‘--use-first’ is specified, they will be taken from the first translation. File positions will be cumulated. When using the ‘--unique’ option, duplicates are discarded.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Print only duplicates.
Print only unique messages, discard duplicates.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Specify encoding for output.
Use first available translation for each message. Don't merge several translations into one.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or ‘never’. If it is not given or ‘full’, it generates the lines with both file name and line number. If it is ‘file’, the line number part is omitted. If it is ‘never’, it completely suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
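A small sketch of the default merging behaviour follows, assuming a hypothetical file dup.po such as a third-party tool might produce, in which the same msgid appears twice with the same translation. The file names, source locations and message are made up for illustration.

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# A hypothetical catalog that defines "Save" twice.
cat > dup.po <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

#: a.c:10
msgid "Save"
msgstr "Enregistrer"

#: b.c:20
msgid "Save"
msgstr "Enregistrer"
EOF

# Merge the duplicates into one entry; the file positions are cumulated.
msguniq -o uniq.po dup.po
cat uniq.po
```

The resulting uniq.po contains a single "Save" entry that refers to both source locations, and can then be passed to tools such as msgfmt or msgmerge.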
msgcomm
Program msgcomm [option] [inputfile]...
The msgcomm program finds messages which are common to two or more of the specified PO files. By using the --more-than option, greater commonality may be requested before messages are printed. Conversely, the --less-than option may be used to specify less commonality before messages are printed (for example, ‘--less-than=2’ will only print the unique messages). Translations, comments and extracted comments will be preserved, but only from the first PO file to define them. File positions from all PO files will be cumulated.
Input files.
Read the names of the input files from file instead of getting them from the command line.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If inputfile is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Print messages with fewer than number definitions (the --less-than option); defaults to infinite if not set.
Print messages with more than number definitions (the --more-than option); defaults to 1 if not set.
Shorthand for ‘--less-than=2’. Requests that only unique messages be printed.
Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or ‘never’. If it is not given or ‘full’, it generates the lines with both file name and line number. If it is ‘file’, the line number part is omitted. If it is ‘never’, it completely suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Don't write header with ‘msgid ""’ entry.
Display this help and exit.
Output version information and exit.
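As an illustration of the default behaviour (messages defined in more than one file), the sketch below extracts the messages shared by two hypothetical template files, app1.pot and app2.pot. All file names and messages are made up.

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Two hypothetical templates; only "Open" occurs in both.
cat > app1.pot <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr ""

msgid "Save"
msgstr ""
EOF

cat > app2.pot <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr ""

msgid "Print"
msgstr ""
EOF

# Keep only the messages that are defined in both files.
msgcomm -o common.pot app1.pot app2.pot
cat common.pot
```

Running the same command with ‘--less-than=2’ would instead keep "Save" and "Print", the messages that occur in only one of the files.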
msgcmp
Program msgcmp [option] def.po ref.pot
The msgcmp program compares two Uniforum style .po files to check that both contain the same set of msgid strings. The def.po file is an existing PO file with the translations. The ref.pot file is the last created PO file, or a PO Template file (generally created by xgettext). This is useful for checking that you have translated each and every message in your program. Where an exact match cannot be found, fuzzy matching is used to produce better diagnostics.
def.po: Translations.
ref.pot: References to the sources.
Add directory to the list of directories. Source files are searched relative to this list of directories.
Apply ref.pot to each of the domains in def.po.
Do not use fuzzy matching when an exact match is not found. This may speed up the operation considerably.
Consider fuzzy messages in the def.po file like translated messages. Note that using this option is usually wrong, because fuzzy messages are exactly those which have not been validated by a human translator.
Consider untranslated messages in the def.po file like translated messages. Note that using this option is usually wrong.
Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
Display this help and exit.
Output version information and exit.
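One way to use msgcmp in a script is to test its exit status. The sketch below assumes a hypothetical template ref.pot with two messages and a hypothetical catalog fr.po that translates only one of them; the file names and messages are invented for illustration.

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical template with two messages.
cat > ref.pot <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr ""

msgid "Close"
msgstr ""
EOF

# Hypothetical catalog that lacks a translation for "Close".
cat > fr.po <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouvrir"
EOF

# msgcmp reports the messages of ref.pot that fr.po does not define
# and then exits with a non-zero status.
if msgcmp fr.po ref.pot; then
  result=complete
else
  result=incomplete
fi
echo "fr.po is $result"
```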
msgattrib
Program msgattrib [option] [inputfile]
The msgattrib program filters the messages of a translation catalog according to their attributes, and manipulates the attributes.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Keep translated messages, remove untranslated messages.
Keep untranslated messages, remove translated messages.
Remove ‘fuzzy’ marked messages.
Keep ‘fuzzy’ marked messages, remove all other messages.
Remove obsolete #~ messages.
Keep obsolete #~ messages, remove all other messages.
Attributes are modified after the message selection/removal has been performed. If the ‘--only-file’ or ‘--ignore-file’ option is specified, the attribute modification is applied only to those messages that are listed in the only-file and not listed in the ignore-file.
Set all messages ‘fuzzy’.
Set all messages non-‘fuzzy’.
Set all messages obsolete.
Set all messages non-obsolete.
When setting ‘fuzzy’ mark, keep “previous msgid” of translated messages.
Remove the “previous msgid” (‘#|’) comments from all messages.
When removing ‘fuzzy’ mark, also set msgstr empty.
Limit the attribute changes to entries that are listed in file. file should be a PO or POT file.
Limit the attribute changes to entries that are not listed in file. file should be a PO or POT file.
Synonym for ‘--only-fuzzy --clear-fuzzy’: It keeps only the fuzzy messages and removes their ‘fuzzy’ mark.
Synonym for ‘--only-obsolete --clear-obsolete’: It keeps only the obsolete messages and makes them non-obsolete.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or ‘never’. If it is not given or ‘full’, it generates the lines with both file name and line number. If it is ‘file’, the line number part is omitted. If it is ‘never’, it completely suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
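To illustrate the selection options, the following sketch splits the outstanding work of a hypothetical catalog fr.po into two files: one with the untranslated messages (--untranslated) and one with the fuzzy messages (--only-fuzzy). The file names and messages are invented for illustration.

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical catalog: one translated, one fuzzy, one untranslated message.
cat > fr.po <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouvrir"

#, fuzzy
msgid "Close"
msgstr "Fermez"

msgid "Quit"
msgstr ""
EOF

# Keep only the messages that have no translation yet.
msgattrib --untranslated -o todo-untranslated.po fr.po

# Keep only the messages that are marked fuzzy.
msgattrib --only-fuzzy -o todo-fuzzy.po fr.po

cat todo-untranslated.po todo-fuzzy.po
```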
msgen
Program msgen [option] inputfile
The msgen program creates an English translation catalog. The input file is the last created English PO file, or a PO Template file (generally created by xgettext). Untranslated entries are assigned a translation that is identical to the msgid.
Note: ‘msginit --no-translator --locale=en’ performs a very similar task. The main difference is that msginit cares specially about the header entry, whereas msgen doesn't.
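A minimal sketch of this behaviour, assuming a hypothetical template hello.pot with a single message (the file names and the message are made up for illustration):

```shell
# Work in a scratch directory so that no real catalog is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical template with one untranslated message.
cat > hello.pot <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Hello, world!"
msgstr ""
EOF

# Create an English catalog in which each msgstr is a copy of its msgid.
msgen -o en.po hello.pot
cat en.po
```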
Input PO or POT file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If inputfile is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Assume the input file is a Java ResourceBundle in Java .properties
syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in
.strings
syntax, not in PO file syntax.
Specify the ‘Language’ field to be used in the header entry. See Filling in the Header Entry for the meaning of this field. Note: The ‘Language-Team’ and ‘Plural-Forms’ fields are not set by this option.
Specify whether or when to use colors and other text attributes.
See The --color
option for details.
Specify the CSS style rule file to use for --color
.
See The --style
option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or
‘never’. If it is not given or ‘full’, it generates the
lines with both file name and line number. If it is ‘file’, the
line number part is omitted. If it is ‘never’, it completely
suppresses the lines (same as --no-location
).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties
syntax. Note
that this file format doesn't support plural forms and silently drops
obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings
syntax.
Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
Invoking the msgexec Program
msgexec [option] command [command-option]
The msgexec
program applies a command to all translations of a
translation catalog.
The command can be any program that reads a translation from standard input. It is invoked once for each translation. Its output becomes msgexec's output. msgexec's return code is the maximum return code across all invocations.
A special builtin command called ‘0’ outputs the translation, followed by a null byte. The output of ‘msgexec 0’ is suitable as input for ‘xargs -0’.
Add newline at the end of each input line.
During each command invocation, the environment variable
MSGEXEC_MSGID
is bound to the message's msgid, and the environment
variable MSGEXEC_LOCATION
is bound to the location in the PO file
of the message. If the message has a context, the environment variable
MSGEXEC_MSGCTXT
is bound to the message's msgctxt, otherwise it is
unbound. If the message has a plural form, environment variable
MSGEXEC_MSGID_PLURAL
is bound to the message's msgid_plural and
MSGEXEC_PLURAL_FORM
is bound to the order number of the plural
actually processed (starting with 0), otherwise both are unbound.
If the message has a previous msgid (added by msgmerge
),
environment variable MSGEXEC_PREV_MSGCTXT
is bound to the
message's previous msgctxt, MSGEXEC_PREV_MSGID
is bound to
the previous msgid, and MSGEXEC_PREV_MSGID_PLURAL
is bound to
the previous msgid_plural.
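As a minimal sketch of a command that msgexec could invoke, the following C function reads one translation from a stream and consults the environment variables described above. The function name check_trailing_blank and the check itself (a translation that ends in a space or tab although its msgid does not) are illustrative assumptions, not part of GNU gettext.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the last byte of S, or '\0' if S is NULL or empty.  */
static int
last_byte (const char *s)
{
  size_t len = (s != NULL ? strlen (s) : 0);
  return len > 0 ? (unsigned char) s[len - 1] : '\0';
}

/* Hypothetical check: reads one translation from IN and complains on REPORT
   if it ends in a space or tab although the msgid does not.
   Returns 0 if the translation passes, 1 otherwise.  */
int
check_trailing_blank (FILE *in, FILE *report)
{
  assert (in != NULL && report != NULL);

  /* Variables bound by msgexec for each invocation.  */
  const char *msgid = getenv ("MSGEXEC_MSGID");
  const char *location = getenv ("MSGEXEC_LOCATION");
  const char *msgctxt = getenv ("MSGEXEC_MSGCTXT"); /* NULL without context */

  /* Only the last byte of the translation matters for this check.  */
  int last = EOF;
  int c;
  while ((c = getc (in)) != EOF)
    last = c;

  int msgid_last = last_byte (msgid);
  int translation_blank = (last == ' ' || last == '\t');
  int msgid_blank = (msgid_last == ' ' || msgid_last == '\t');
  if (!translation_blank || msgid_blank)
    return 0;

  fprintf (report, "%s: translation of \"%s\"",
           location != NULL ? location : "(unknown location)",
           msgid != NULL ? msgid : "");
  if (msgctxt != NULL)
    fprintf (report, " (context \"%s\")", msgctxt);
  fputs (" ends in a blank\n", report);
  return 1;
}
```

A main function that simply returns check_trailing_blank (stdin, stderr) would turn this into a program that msgexec can run once per translation. Because msgexec's return code is the maximum across all invocations, a single failing translation makes the whole run return 1.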
Note: It is your responsibility to ensure that the command can cope with input encoded in the translation catalog's encoding. If the command wants input in a particular encoding, you can first convert the translation catalog to that encoding using the ‘msgconv’ program, before invoking ‘msgexec’. If the command wants input in the locale's encoding, but you want to avoid the locale's encoding, then you can first convert the translation catalog to UTF-8 using the ‘msgconv’ program and then make ‘msgexec’ work in a UTF-8 locale, by using the LC_ALL environment variable.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Assume the input file is a Java ResourceBundle in Java .properties
syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in
.strings
syntax, not in PO file syntax.
Display this help and exit.
Output version information and exit.
Translators are usually only interested in seeing the untranslated and fuzzy messages of a PO file. Also, when a message is set fuzzy because the msgid changed, they want to see the differences between the previous msgid and the current one (especially if the msgid is long and only a few words in it have changed). Finally, it is always helpful to highlight the different sections of a message in a PO file (comments, msgid, msgstr, etc.).
Such highlighting is possible through the options ‘--color’ and
‘--style’. They are supported by all the programs that produce
a PO file on standard output, such as msgcat
, msgmerge
,
and msgunfmt
.
The --color option
The ‘--color=when’ option specifies under which conditions colorized output should be generated. The when part can be one of the following:
always
yes
The output will be colorized.
never
no
The output will not be colorized.
auto
tty
The output will be colorized if the output device is a tty, i.e. when the output goes directly to a text screen or terminal emulator window.
html
The output will be colorized and be in HTML format.
test
This is a special value, understood only by the msgcat
program. It
is explained in the next section (The environment variable TERM
).
‘--color’ is equivalent to ‘--color=yes’. The default is ‘--color=auto’.
Thus, a command like ‘msgcat vi.po’ will produce colorized output when called by itself in a command window. Whereas in a pipe, such as ‘msgcat vi.po | less -R’, it will not produce colorized output. To get colorized output in this situation nevertheless, use the command ‘msgcat --color vi.po | less -R’.
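The rules above can be summarized as a small decision function. This is only a model of the documented behavior, not the code used by the gettext programs; the names resolve_color and enum color_output are hypothetical. The caller is assumed to pass NULL when the option is absent, "yes" for a bare ‘--color’, and the result of isatty for the output device.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of the possible outcomes.  */
enum color_output
{
  OUTPUT_PLAIN,    /* no colorization */
  OUTPUT_ESCAPES,  /* terminal escape sequences */
  OUTPUT_HTML,     /* HTML output */
  OUTPUT_TEST,     /* color map test, msgcat only */
  OUTPUT_INVALID   /* unrecognized value */
};

/* WHEN is the argument of --color, "yes" for a bare --color, or NULL when
   the option was not given.  OUTPUT_IS_TTY is 1 if the output device is a
   tty, 0 otherwise.  */
enum color_output
resolve_color (const char *when, int output_is_tty)
{
  assert (output_is_tty == 0 || output_is_tty == 1);

  if (when == NULL)
    when = "auto";  /* the documented default */

  if (strcmp (when, "always") == 0 || strcmp (when, "yes") == 0)
    return OUTPUT_ESCAPES;
  if (strcmp (when, "never") == 0 || strcmp (when, "no") == 0)
    return OUTPUT_PLAIN;
  if (strcmp (when, "auto") == 0 || strcmp (when, "tty") == 0)
    return output_is_tty ? OUTPUT_ESCAPES : OUTPUT_PLAIN;
  if (strcmp (when, "html") == 0)
    return OUTPUT_HTML;
  if (strcmp (when, "test") == 0)
    return OUTPUT_TEST;
  return OUTPUT_INVALID;
}
```

With this model, ‘msgcat vi.po | less -R’ corresponds to resolve_color (NULL, 0), which yields plain output, while ‘msgcat --color vi.po | less -R’ corresponds to resolve_color ("yes", 0), which yields escape sequences.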
The ‘--color=html’ option will produce output that can be viewed in a browser. This can be useful, for example, for Indic languages, because the rendering of Indic scripts in browsers is usually better than in terminal emulators.
Note that the output produced with the --color
option is not
a valid PO file in itself. It contains additional terminal-specific escape
sequences or HTML tags. A PO file reader will give a syntax error when
confronted with such content. Except for the ‘--color=html’ case,
you therefore normally don't need to save output produced with the
--color
option in a file.
The environment variable TERM
The environment variable TERM contains an identifier for the text window's capabilities. You can get a detailed list of these capabilities by using the ‘infocmp’ command, using ‘man 5 terminfo’ as a reference.
When producing text with embedded color directives, msgcat looks at the TERM variable. Text windows today typically support at least 8 colors. Often, however, the text window supports 16 or more colors, even though the TERM variable is set to an identifier denoting only 8 supported colors. It can be worth setting the TERM variable to a different value in these cases:
xterm
xterm
is in most cases built with support for 16 colors. It can also
be built with support for 88 or 256 colors (but not both). You can try to
set TERM
to either xterm-16color
, xterm-88color
, or
xterm-256color
.
rxvt
rxvt
is often built with support for 16 colors. You can try to set
TERM
to rxvt-16color
.
konsole
konsole
too is often built with support for 16 colors. You can try to
set TERM
to konsole-16color
or xterm-16color
.
After setting TERM
, you can verify it by invoking
‘msgcat --color=test’ and seeing whether the output looks like a
reasonable color map.
The --style option
The ‘--style=style_file’ option specifies the style file to use when colorizing. It has an effect only when the --color option is effective.
If the --style
option is not specified, the environment variable
PO_STYLE
is considered. It is meant to point to the user's
preferred style for PO files.
The default style file is ‘$prefix/share/gettext/styles/po-default.css’,
where $prefix
is the installation location.
A few style files are predefined:
This style imitates the look used by vim 7.
This style imitates the look used by GNU Emacs 21 and 22 in an X11 window.
This style imitates the look used by GNU Emacs 22 in a terminal of type ‘xterm’ (8 colors) or ‘xterm-16color’ (16 colors) or ‘xterm-256color’ (256 colors), respectively.
You can use these styles without specifying a directory. They are actually
located in ‘$prefix/share/gettext/styles/’, where $prefix
is the
installation location.
You can also design your own styles. This is described in the next section.
The same style file can be used for styling of a PO file, for terminal output and for HTML output. It is written in CSS (Cascading Style Sheet) syntax. See https://www.w3.org/TR/css2/cover.html for a formal definition of CSS. Many HTML authoring tutorials also contain explanations of CSS.
In the case of HTML output, the style file is embedded in the HTML output.
In the case of text output, the style file is interpreted by the msgcat program. This means, in particular, that when @import is used with relative file names, the file names are relative to the resource containing the @import directive in the case of HTML output, and relative to the current working directory of the msgcat process in the case of text output. (Actually, @imports are not yet supported in the text output case, due to a limitation in libcroco.)
CSS rules are built up from selectors and declarations. The declarations specify graphical properties; the selectors specify when they apply.
In PO files, the following simple selectors (based on "CSS classes", see the CSS2 spec, section 5.8.3) are supported.
.header
This matches the header entry of a PO file.
.translated
This matches a translated message.
.untranslated
This matches an untranslated message (i.e. a message with empty translation).
.fuzzy
This matches a fuzzy message (i.e. a message which has a translation that needs review by the translator).
.obsolete
This matches an obsolete message (i.e. a message that was translated but is not needed by the current POT file any more).
    white-space
    #  translator-comments
    #. extracted-comments
    #: reference…
    #, flag…
    #| msgid previous-untranslated-string
    msgid untranslated-string
    msgstr translated-string
.comment
This matches all comments (translator comments, extracted comments, source file reference comments, flag comments, previous message comments, as well as the entire obsolete messages).
.translator-comment
This matches the translator comments.
.extracted-comment
This matches the extracted comments, i.e. the comments placed by the programmer at the attention of the translator.
.reference-comment
This matches the source file reference comments (entire lines).
.reference
This matches the individual source file references inside the source file reference comment lines.
.flag-comment
This matches the flag comment lines (entire lines).
.flag
This matches the individual flags inside flag comment lines.
.fuzzy-flag
This matches the `fuzzy' flag inside flag comment lines.
.previous-comment
This matches the comments containing the previous untranslated string (entire lines).
.previous
This matches the previous untranslated string including the string delimiters,
the associated keywords (msgid
etc.) and the spaces between them.
.msgid
This matches the untranslated string including the string delimiters,
the associated keywords (msgid
etc.) and the spaces between them.
.msgstr
This matches the translated string including the string delimiters,
the associated keywords (msgstr
etc.) and the spaces between them.
.keyword
This matches the keywords (msgid
, msgstr
, etc.).
.string
This matches strings, including the string delimiters (double quotes).
.text
This matches the entire contents of a string (excluding the string delimiters, i.e. the double quotes).
.escape-sequence
This matches an escape sequence (starting with a backslash).
.format-directive
This matches a format string directive (starting with a ‘%’ sign in the
case of most programming languages, with a ‘{’ in the case of
java-format
and csharp-format
, with a ‘~’ in the case of
lisp-format
and scheme-format
, or with ‘$’ in the case of
sh-format
).
.invalid-format-directive
This matches an invalid format string directive.
.added
In an untranslated string, this matches a part of the string that was not present in the previous untranslated string. (Not yet implemented in this release.)
.changed
In an untranslated string or in a previous untranslated string, this matches a part of the string that is changed or replaced. (Not yet implemented in this release.)
.removed
In a previous untranslated string, this matches a part of the string that is not present in the current untranslated string. (Not yet implemented in this release.)
These selectors can be combined to hierarchical selectors. For example,
    .msgstr .invalid-format-directive { color: red; }
will highlight the invalid format directives in the translated strings.
In text mode, pseudo-classes (CSS2 spec, section 5.11) and pseudo-elements (CSS2 spec, section 5.12) are not supported.
The declarations in HTML mode are not limited; any graphical attribute supported by the browsers can be used.
The declarations in text mode are limited to the following properties. Other properties will be silently ignored.
color (CSS2 spec, section 14.1)
background-color (CSS2 spec, section 14.2.1)
These properties are supported. Colors will be adjusted to match the terminal's capabilities. Note that many terminals support only 8 colors.
font-weight (CSS2 spec, section 15.2.3)
This property is supported, but most terminals can only render two different weights: normal and bold. Values >= 600 are rendered as bold.
font-style (CSS2 spec, section 15.2.3)
This property is supported. The values italic and oblique are rendered the same way.
text-decoration (CSS2 spec, section 16.3.1)
This property is supported, limited to the values none and underline.
Customizing less for viewing PO files
The ‘less’ program is a popular text file browser for use in a text screen or terminal emulator. It also supports text with embedded escape sequences for colors and text decorations.
You can use less to view a PO file like this (assuming a UTF-8 environment):
    msgcat --to-code=UTF-8 --color xyz.po | less -R
You can simplify this to the command:
    less xyz.po
after these three preparations:
1. Add the options ‘-R’ and ‘-f’ to the LESS environment variable. In sh shells:
    $ LESS="$LESS -R -f"
    $ export LESS
2. Set the LESSOPEN and LESSCLOSE environment variables, as indicated in the manual page (‘man less’).
3. In the input preprocessor script that LESSOPEN refers to, recognize PO files and invoke msgcat on them, producing a temporary file. Like this:
    case "$1" in
      *.po)
        tmpfile=`mktemp "${TMPDIR-/tmp}/less.XXXXXX"`
        msgcat --to-code=UTF-8 --color "$1" > "$tmpfile"
        echo "$tmpfile"
        exit 0
        ;;
    esac
The “Pology” package is a Free Software package for manipulating PO files. It features, in particular:
Its home page is at http://pology.nedohodnik.net/.
For the tasks for which a combination of ‘msgattrib’, ‘msgcat’ etc. is not sufficient, a set of C functions is provided in a library, to make it possible to process PO files in your own programs. When you use this library, you don't need to write routines to parse the PO file; instead, you retrieve a pointer in memory to each of the messages contained in the PO file. Functions for writing those memory structures to a file after working with them are provided too.
The functions are declared in the header file ‘<gettext-po.h>’, and are defined in a library called ‘libgettextpo’.
The following example shows how these functions can be used. Error handling code is omitted, as its implementation is delegated to the user-provided functions.
    struct po_xerror_handler handler =
      {
        .xerror = …,
        .xerror2 = …
      };
    const char *filename = …;
    /* Read the file into memory.  */
    po_file_t file = po_file_read (filename, &handler);

    {
      const char * const *domains = po_file_domains (file);
      const char * const *domainp;

      /* Iterate the domains contained in the file.  */
      for (domainp = domains; *domainp; domainp++)
        {
          /* po_message_t is itself a pointer type.  */
          po_message_t message;
          const char *domain = *domainp;
          po_message_iterator_t iterator = po_message_iterator (file, domain);

          /* Iterate each message inside the domain.  */
          while ((message = po_next_message (iterator)) != NULL)
            {
              /* Read data from the message …  */
              const char *msgid = po_message_msgid (message);
              const char *msgstr = po_message_msgstr (message);

              …

              /* Modify its contents …  */
              if (perform_some_tests (msgid, msgstr))
                po_message_set_fuzzy (message, 1);

              …
            }
          /* Always release returned po_message_iterator_t.  */
          po_message_iterator_free (iterator);
        }

      /* Write back the result.  */
      po_file_t result = po_file_write (file, filename, &handler);
    }

    /* Always release the returned po_file_t.  */
    po_file_free (file);
Error management is performed through callbacks provided by the user of the library. They are provided through a parameter with the following type:
struct po_xerror_handler
A pointer to this structure has the type po_xerror_handler_t. The structure contains two fields, xerror and xerror2, with the following function signatures.
This function is called to signal a problem of the given severity.
It must not return if severity is
PO_SEVERITY_FATAL_ERROR
.
message_text is the problem description. When multiline_p is true, it can contain multiple lines of text, each terminated with a newline, otherwise a single line.
message and/or filename and lineno indicate where the problem occurred:
If filename is NULL, filename and lineno and column should be ignored.
If lineno is (size_t)(-1), lineno and column should be ignored.
If column is (size_t)(-1), it should be ignored.
This function is called to signal a problem of the given severity
that refers to two messages. It must not return if
severity is PO_SEVERITY_FATAL_ERROR
.
It is similar to two calls to xerror. If possible, an ellipsis can be appended to message_text1 and prepended to message_text2.
This is a pointer type that refers to the contents of a PO file, after it has been read into memory.
The po_file_create
function creates an empty PO file representation in
memory.
The po_file_read
function reads a PO file into memory. The file name
is given as argument. The return value is a handle to the PO file's contents,
valid until po_file_free
is called on it. In case of error, the
functions from handler are called to signal it.
This function is exported as ‘po_file_read_v3’ at ABI level, but is
defined as po_file_read
in C code after the inclusion of
‘<gettext-po.h>’.
The po_file_write function writes the contents of the memory structure file to the file named filename. The return value is file after a successful operation. In case of error, the functions from handler are called to signal it.
This function is exported as ‘po_file_write_v2’ at ABI level, but
is defined as po_file_write
in C code after the inclusion of
‘<gettext-po.h>’.
The po_file_free
function frees a PO file's contents from memory,
including all messages that are only implicitly accessible through iterators.
The po_file_domains
function returns the domains for which the given
PO file has messages. The return value is a NULL
terminated array
which is valid as long as the file handle is valid. For PO files which
contain no ‘domain’ directive, the return value contains only one domain,
namely the default domain "messages"
.
This is a pointer type that refers to an iterator that produces a sequence of messages.
The po_message_iterator function returns an iterator that will produce the messages of file that belong to the given domain. If domain is NULL, the default domain is used instead. To list the messages, use the function po_next_message repeatedly.
The po_message_iterator_free
function frees an iterator previously
allocated through the po_message_iterator
function.
The po_next_message
function returns the next message from
iterator and advances the iterator. It returns NULL
when the
iterator has reached the end of its message list.
This is a pointer type that refers to a message of a PO file, including its translation.
Returns a freshly constructed message. To finish initializing the
message, you must set the msgid
and msgstr
. It must be
inserted into a file to manage its memory, as there is no
po_message_free
available to the user of the library.
The following functions access details of a po_message_t
. Recall
that the results are valid as long as the file handle is valid.
The po_message_msgctxt
function returns the msgctxt
, the
context of message. Returns NULL
for a message not restricted
to a context.
The po_message_set_msgctxt
function changes the msgctxt
,
the context of the message, to the value provided through
msgctxt. The value NULL
removes the restriction.
The po_message_msgid
function returns the msgid
(untranslated
English string) of message. This is guaranteed to be non-NULL
.
The po_message_set_msgid
function changes the msgid
(untranslated English string) of message to the value provided through
msgid, a non-NULL
string.
The po_message_msgid_plural
function returns the msgid_plural
(untranslated English plural string) of message, a message with plurals,
or NULL
for a message without plural.
The po_message_set_msgid_plural
function changes the
msgid_plural
(untranslated English plural string) of a message to
the value provided through msgid_plural, or removes the plurals if
NULL
is provided as msgid_plural.
The po_message_msgstr
function returns the msgstr
(translation)
of message. For an untranslated message, the return value is an empty
string.
The po_message_set_msgstr
function changes the msgstr
(translation) of message to the value provided through msgstr, a
non-NULL
string.
The po_message_msgstr_plural
function returns the
msgstr[index]
of message, a message with plurals, or
NULL
when the index is out of range or for a message without
plural.
The po_message_set_msgstr_plural
function changes the
msgstr[index]
of message, a message with plurals, to
the value provided through msgstr_plural. message must be a
message with plurals.
Use NULL
as the value of msgstr_plural with
index pointing to the last element to reduce the number of plural
forms.
The po_message_comments
function returns the comments of message,
a multiline string, ending in a newline, or a non-NULL
empty string.
The po_message_set_comments
function changes the comments of
message to the value comments, a multiline string, ending in a
newline, or a non-NULL
empty string.
The po_message_extracted_comments
function returns the extracted
comments of message, a multiline string, ending in a newline, or a
non-NULL
empty string.
The po_message_set_extracted_comments
function changes the
comments of message to the value extracted_comments, a multiline
string, ending in a newline, or a non-NULL
empty string.
The po_message_prev_msgctxt function returns the previous msgctxt, the previous context of message. Returns NULL for a message that does not have a previous context.
The po_message_set_prev_msgctxt function changes the previous msgctxt, the previous context of the message, to the value provided through prev_msgctxt. The value NULL removes the stored previous msgctxt.
The po_message_prev_msgid
function returns the previous
msgid
(untranslated English string) of message, or
NULL
if there is no previous msgid
stored.
The po_message_set_prev_msgid function changes the previous msgid (untranslated English string) of message to the value provided through prev_msgid, or removes the stored previous msgid when it is NULL.
The po_message_prev_msgid_plural function returns the previous msgid_plural (untranslated English plural string) of message, a message with plurals, or NULL for a message without plural or without a stored previous msgid_plural.
The po_message_set_prev_msgid_plural
function changes the
previous msgid_plural
(untranslated English plural string) of a
message to the value provided through prev_msgid_plural, or
removes the stored previous msgid_plural
if NULL
is
provided as prev_msgid_plural.
The po_message_is_obsolete
function returns true when message
is marked as obsolete.
The po_message_set_obsolete
function changes the obsolete mark of
message.
The po_message_is_fuzzy
function returns true when message
is marked as fuzzy.
The po_message_set_fuzzy
function changes the fuzzy mark of
message.
The po_message_is_format
function returns true when the message
is marked as being a format string of format_type.
The po_message_set_format function changes the format mark of the message for the format_type provided.
The po_message_is_range function returns true when the message has a numeric range set, and stores the minimum and maximum value in the locations pointed to by minp and maxp respectively.
The po_message_set_range
function changes the numeric range of
the message. min and max must be non-negative, with
min < max. Use min and max with value -1
to remove the numeric range of message.
The following functions provide an interface to extract and manipulate
the header entry (see section Filling in the Header Entry) from a file loaded in memory.
The meta information must be written back into the domain message with
the empty string as msgid
.
Returns the header entry of a domain from file, a PO file loaded in
memory. The value NULL
provided as domain denotes the
default domain. Returns NULL
if there is no header entry.
Returns the value of field in the header entry. The return
value is either a freshly allocated string, to be freed by the caller,
or NULL
.
Returns a freshly allocated string which contains the entry from header with field set to value. The field is added if necessary.
This is a pointer type that refers to a string's position within a source file.
The following functions provide an interface to extract and manipulate these references.
Returns the file reference in position index from the message. If
index is out of range, returns NULL
.
Removes the file reference in position index from the message. It moves all references following index one position backwards.
Adds a reference to the string from file starting at
start_line, if it is not already present for the message. The
value (size_t)(-1)
for start_line denotes that the line
number is not available.
Returns a NULL
terminated array of the supported format types.
Returns the pretty name associated with format_type. For example, it returns “C#” when format_type is “csharp-format”. Returns NULL if format_type is not a supported format type.
Tests whether the entire file is valid, as msgfmt does. If it is invalid, passes the reasons to handler.
Tests message, to be inserted at iterator in a PO file in memory, as msgfmt does. If it is invalid, passes the reasons to handler. iterator is not modified by this call; it only specifies the file and the domain.
Tests whether the message translation from message is a valid format string if the message is marked as being a format string. If it is invalid, passes the reasons to handler.
This function is exported as ‘po_message_check_format_v2’ at ABI
level, but is defined as po_message_check_format
in C code after
the inclusion of ‘<gettext-po.h>’.
This document was generated by Bruno Haible on October, 9 2022 using texi2html 1.78a.