Multilingual lexicon for GeneWeb

From GeneWeb
Revision as of 17:05, 3 May 2020 by Henri83 (Talk | contribs) (Add shortcuts for sex, nb_families, nb_children, ... selection)

GeneWeb performs text translations according to rules contained in two files, lex-utf8.txt and start_utf8.txt, located in the folder gw/lang of your installation. The second file contains text specific to the welcome page.

These files are encoded in UTF-8, and it is important to preserve this encoding if you decide to edit their content.

GeneWeb versions 5 and earlier used the file lexicon.txt, encoded in Latin-1.

The full lexicon can be displayed in your browser with the URL http://localhost:2317/mabase?m=LEX; a button to do so may also be available, as in Roglo [1].

Contributions that expand GeneWeb's translation capabilities are welcome!

Language codes

The language of the GeneWeb user interface can be selected from the welcome page, or changed at any time by adding lang=xx to the URL, where xx is the two-letter code (ISO 639-1) of the language of your choice.

af=afrikaans
bg=bulgarian
br=breton
ca=catalan
co=corsican
cs=czech
da=danish
de=german
en=english
eo=esperanto
es=spanish
et=estonian
fi=finnish
fr=french
he=hebrew
is=icelandic
it=italian
lv=latvian
nl=dutch
no=norwegian
oc=occitan
pl=polish
pt=portuguese
pt-br=brazilian-portuguese
ro=romanian
ru=russian
sl=slovenian
sv=swedish
zh=chinese

Note that if the parameter lang= appears several times in the URL, only the first occurrence is taken into account.
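
The "first occurrence wins" rule can be illustrated with a short Python sketch. This is not GeneWeb's own code: the function name first_lang and the fallback value "en" are assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs


def first_lang(url, default="en"):
    """Return the first lang= value found in the URL query string.

    The default of "en" is an assumption for this sketch; the fallback
    actually used by GeneWeb is not described on this page.
    """
    query = urlsplit(url).query
    # parse_qs keeps repeated parameters in the order they appear.
    values = parse_qs(query).get("lang", [])
    return values[0] if values else default


print(first_lang("http://localhost:2317/mabase?lang=fr&lang=de"))
```

With the URL above, the sketch keeps fr and ignores the later de.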

Description of the lexicon

The lexicon is a text file that defines translations in several languages (ideally all those supported by GeneWeb, though not necessarily) for the terms that appear at the head of each translation item.

    item
l1: item translated in language 1
l2: item translated in language 2

    item 2
l1: item 2 translated in language 1
    ...

The translation of an item for each language is given on a single line, one line per language, introduced by the language's two-letter code.
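
As a rough illustration of this layout, the following Python sketch reads such a file into a dictionary. It assumes, based only on the sample above, that an item line is indented by four spaces and that each translation line has the form "code: text"; the function name parse_lexicon is hypothetical and is not part of GeneWeb.

```python
def parse_lexicon(text):
    """Parse lexicon text into {item: {language_code: translation}}.

    Assumptions (taken from the sample layout, not from GeneWeb's source):
    item lines start with four spaces, translation lines are "code: text".
    """
    lexicon = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate items
        if line.startswith("    "):
            current = line.strip()
            lexicon[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only: translations may contain colons.
            code, _, translation = line.partition(":")
            lexicon[current][code.strip()] = translation.lstrip()
    return lexicon


sample = "    son/daughter/child\nfr: fils/fille/enfant\n"
print(parse_lexicon(sample))
```

When reading the real files, open them with encoding="utf-8" so that the encoding mentioned at the top of this page is respected.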

Some items carry several alternative translations, separated by slashes:

    (french revolution month)
fr: vendémiaire/brumaire/frimaire/nivôse/pluviôse/ventôse/germinal/floréal/prairial/messidor/thermidor/fructidor/complémentaire

    son/daughter/child
fr: fils/fille/enfant

When an item has several alternatives, one of them is selected by a number placed after the closing bracket, as in:

He had one [son/daughter/child]0 and one [son/daughter/child]1

This number-based feature is extended to cover several specific cases related to sex and to the number of families and children:

  • [family/families]f or [family/families]nb_families will select according to the value of nb_families
  • [child/children]c or [child/children]nb_children will select according to the value of nb_children
  • [son/daughter/child]s will select according to the sex of the child
  • [man/woman/unknown]n or [man/woman/unknown]sex will select according to the sex of the person
  • [witness/witnesses]w will select according to the number of witnesses of an event.
  • [zero/one/two/three/four]x will select according to the value of count
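
The shortcuts listed above can be pictured with the following Python sketch. Several points are assumptions, not documented behaviour: a count of exactly one is taken to select the first alternative and any other count the second; sexes are taken to be ordered male, female, unknown (matching the order of [man/woman/unknown]); and the context keys nb_witnesses and child_sex are hypothetical names. An out-of-range index falls back to the first alternative, as described under "Using the lexicon".

```python
# Assumed order, matching [man/woman/unknown] and [son/daughter/child].
SEX_INDEX = {"male": 0, "female": 1, "unknown": 2}


def singular_or_plural(count):
    # Assumption: exactly one selects the first alternative, else the second.
    return 0 if count == 1 else 1


def selector_index(selector, context):
    """Map the text following the closing bracket to an alternative index."""
    if selector.isdigit():
        return int(selector)
    if selector in ("f", "nb_families"):
        return singular_or_plural(context["nb_families"])
    if selector in ("c", "nb_children"):
        return singular_or_plural(context["nb_children"])
    if selector == "w":
        return singular_or_plural(context["nb_witnesses"])  # hypothetical key
    if selector == "s":
        return SEX_INDEX[context["child_sex"]]  # hypothetical key
    if selector in ("n", "sex"):
        return SEX_INDEX[context["sex"]]
    if selector == "x":
        return context["count"]
    raise ValueError(f"unknown selector: {selector!r}")


def pick(alternatives, selector, context):
    """Select one of the slash-separated alternatives."""
    parts = alternatives.split("/")
    index = selector_index(selector, context)
    # Out-of-range indexes fall back to the first alternative.
    return parts[index] if 0 <= index < len(parts) else parts[0]


context = {"nb_families": 2, "nb_children": 1, "sex": "male", "count": 3}
print(pick("family/families", "f", context))
print(pick("zero/one/two/three/four", "x", context))
```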


The items may contain variables as in:

    %d years ago
fr: il y a %d ans

The strings to be substituted are separated from the main string by a sequence of three colons, a single colon separating substitution strings from each other.

[living between %s and %s:::xxx:yyy] 

will be transformed into

living between xxx and yyy
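
The substitution step alone can be sketched in Python as follows. The function name expand is hypothetical, and the sketch deliberately skips the translation of the main string, which GeneWeb would also perform; it only shows how the part after the three colons fills the %s placeholders in order.

```python
import re


def expand(bracketed):
    """Fill %s placeholders with the strings given after ':::'.

    Only the substitution is shown; looking up the main string in the
    lexicon is left out of this sketch.
    """
    content = bracketed.strip()
    if content.startswith("[") and content.endswith("]"):
        content = content[1:-1]
    template, separator, arguments = content.partition(":::")
    values = iter(arguments.split(":")) if separator else iter(())
    # Each %s consumes the next value; leftovers stay as literal %s.
    return re.sub(r"%s", lambda match: next(values, match.group(0)), template)


print(expand("[living between %s and %s:::xxx:yyy]"))
```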

GeneWeb is also capable of handling declensions as in:

    parents
eo: gepatroj:a:+n
pl: rodzice:a:-ów

Several translation items may appear between brackets, as in [$add::parents]. In this case, the terms add and parents each have their own entry in the lexicon.

Some languages require adaptation when a word begins with a vowel. For instance, in French one says "au baptême" but "à l'inhumation". To cater for these cases, groups of letters placed between square brackets and separated by a vertical bar are selected depending on whether the following word begins with a vowel.

    to %1
en: to %1
fr: [au |à l']%1

    %1 of %2
en: %1 of %2
fr: %1 d[e |’]%2
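
A minimal Python sketch of this rule follows. It assumes that the first alternative is used before a consonant and the second before a vowel, as the two French entries suggest; the vowel list and the function name apply_elision are choices made for the example, and cases such as a silent "h" are not handled.

```python
import re

# Assumed vowel set for the sketch; GeneWeb's actual list is not given here.
VOWELS = "aeiouyàâäéèêëîïôöùûü"


def apply_elision(template, *args):
    """Substitute %1, %2, ... then resolve each [before consonant|before vowel]."""
    text = template
    for position, value in enumerate(args, start=1):
        text = text.replace(f"%{position}", value)

    def choose(match):
        before_consonant, before_vowel = match.group(1), match.group(2)
        # Look at the character that follows the closing bracket.
        following = text[match.end():match.end() + 1].lower()
        if following and following in VOWELS:
            return before_vowel
        return before_consonant

    return re.sub(r"\[([^|\]]*)\|([^\]]*)\]", choose, text)


print(apply_elision("[au |à l']%1", "baptême"))
print(apply_elision("[au |à l']%1", "inhumation"))
```

The two calls reproduce the "au baptême" and "à l'inhumation" forms quoted above.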

Note: previous versions of GeneWeb handled a single-character substitution, fr: %1 d[e’]%2, with the space required after "de" added by the system.

Translations may contain basic HTML code for text formatting. This may be necessary for high-quality typography of ordinal numbers (nième/1er/2e/3e/4e/…):

    nth
fr: n<sup>ième</sup>/1<sup>er</sup>/2<sup>e</sup>/3<sup>e</sup>/4<sup>e</sup>/…

Using the lexicon

To ask for a translation, it is sufficient to include the item between brackets: [item]. It will be translated into the current language. If the translation is not available, the bracketed form will appear instead.
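
The lookup and its fallback can be sketched in Python as below. The dictionary holds two entries copied from this page; the names LEXICON and translate are hypothetical, and selectors, capitalization and substitutions are left out.

```python
import re

# Two entries taken from the examples on this page.
LEXICON = {
    "%d years ago": {"fr": "il y a %d ans"},
    "son/daughter/child": {"fr": "fils/fille/enfant"},
}


def translate(text, lang, lexicon=LEXICON):
    """Replace each [item] by its translation, or keep it bracketed."""

    def lookup(match):
        item = match.group(1)
        # Missing item or missing language: the bracketed form is kept.
        return lexicon.get(item, {}).get(lang, match.group(0))

    return re.sub(r"\[([^\[\]]+)\]", lookup, text)


print(translate("[%d years ago]", "fr"))
print(translate("[%d years ago]", "de"))
```

The second call has no German entry in this small dictionary, so the bracketed form comes back unchanged.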

It is possible to capitalize the first letter (GeneWeb will determine whether this is appropriate according to the position in the sentence) by putting a * after the opening bracket. For instance, in French, [*3rd cousins] translates into "Cousins issus d’issus de germains".

To obtain the nth element of a list, add a number after the closing bracket, starting at 0. For instance, the French republican months (0 to 12) are displayed (with lang=fr) as follows:

  • [(french revolution month)]2 translates into frimaire.
  • [*(french revolution month)]12 translates into Complémentaire.
  • [(french revolution month)]13 and [(french revolution month)]0 both translate into vendémiaire: if the index exceeds the limit, the first element is selected.
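
These three cases can be reproduced with a short Python sketch. The function name nth is hypothetical, and the capitalize flag always capitalizes, whereas GeneWeb decides according to the position in the sentence.

```python
MONTHS_FR = (
    "vendémiaire/brumaire/frimaire/nivôse/pluviôse/ventôse/germinal/"
    "floréal/prairial/messidor/thermidor/fructidor/complémentaire"
)


def nth(alternatives, index, capitalize=False):
    """Return the alternative at index, or the first one if out of range."""
    parts = alternatives.split("/")
    choice = parts[index] if 0 <= index < len(parts) else parts[0]
    if capitalize:
        # Simplification: always capitalize when asked to.
        choice = choice[:1].upper() + choice[1:]
    return choice


print(nth(MONTHS_FR, 2))
print(nth(MONTHS_FR, 12, capitalize=True))
print(nth(MONTHS_FR, 13))
```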
