Difference between revisions of "lexicon"

From GeneWeb
(Description of the lexicon)
m
 
(6 intermediate revisions by 2 users not shown)
Line 7: Line 7:
 
GeneWeb versions 5 and before were using the file {{git|hd/lang|lexicon.txt}} encoded in [https://fr.wikipedia.org/wiki/ISO_8859-1 Latin-1].
 
GeneWeb versions 5 and before were using the file {{git|hd/lang|lexicon.txt}} encoded in [https://fr.wikipedia.org/wiki/ISO_8859-1 Latin-1].
  
The full lexicon can be displayed from the welcome page of your browser with the URL {{c|1=http://localhost:2317/mabase?m=LEX}}, or a button to do so may be available as in Roglo [http://roglo.eu/roglo_f?lang=fr;m=LEX].
+
Goodwill to expand GeneWeb translations capabilities is welcome!! See the [[contribute]] page.
 
+
Goodwill to expand GeneWeb translations capabilities is welcome!!
+
  
 
== Language codes ==
 
== Language codes ==
Line 49: Line 47:
  
 
== Description of the lexicon ==
 
== Description of the lexicon ==
 +
=== Direct translation ===
 
The lexicon is a text file which defined translation in several languages (hopefully all the ones supported by GeneWeb, but not necessarily) for terms that appear at the head of each translation item.
 
The lexicon is a text file which defined translation in several languages (hopefully all the ones supported by GeneWeb, but not necessarily) for terms that appear at the head of each translation item.
  
Line 57: Line 56:
  
 
     item 2
 
     item 2
l1: item 2 ranslated in language 1
+
l1: item 2 translated in language 1
 
     ...
 
     ...
 
</pre>
 
</pre>
Line 72: Line 71:
 
fr: fils/fille/enfant
 
fr: fils/fille/enfant
 
</pre>
 
</pre>
In the case of multiple entries, the selection is performed according to a number attached to the brackets as in:
+
 
 +
=== Multiple translations ===
 +
 
 +
Several translation items may appear between brackets, separated by two colons as in {{c|[$add::parents]}}. In this case, the terms {{c|add}} and {{c|parents}} have each their own entry in the lexicon.
 +
 
 +
Some languages (german for instance) invert the verb and the object. In this case, adding "+before" at the end of the definition of the first string will trigger this inversion:
 +
 
 
<pre>
 
<pre>
He had one [son/daughter/child]0 and one [son/daughter/child]1
+
    add
 +
de: hinzufügen +before
 +
fr: ajouter
 
</pre>
 
</pre>
This number based feature is extended to cover several specific cases relative to the sex and number of families and children:
 
*{{c|[family/families]f}} or {{c|[family/families]nb_families}} will select according to the value of {{c|nb_families}}
 
*{{c|[child/children]c}} or {{c|[child/children]nb_children}} will select according to the value of {{c|nb_children}}
 
*{{c|[son/daughter/child]s}} will select according to the sex of the child
 
*{{c|[man/woman/unknown]n}} or {{c|[man/woman/unknown]sex}} will select according to the sex of the person
 
*{{c|[witness/witnesses]w}} will select according to the numbre of witnesses of an event.
 
*{{c|[zero/one/two/three/four]x}} will select according to the value of {{c|count}}
 
  
Note: the one character variant is currently the only one available (WIP).
+
{{c|[*add::parents]}} will translate in french as {{c|Ajouter parents}} and in german as
 +
{{c|Eltern hinzufügen}}.
 +
 
 +
=== Translation with substitution ===
  
 
The items may contain variables as in:  
 
The items may contain variables as in:  
Line 97: Line 100:
 
  living between xxx and yyy
 
  living between xxx and yyy
  
 +
 +
=== Declension ===
 
GeneWeb is also capable of handling [[Declension/fr|declensions]] as in:
 
GeneWeb is also capable of handling [[Declension/fr|declensions]] as in:
 
<pre>
 
<pre>
Line 104: Line 109:
 
</pre>
 
</pre>
  
Several translation items may appear between brackets as in {{c|[$add::parents]}}. In this case, the terms {{c|add}} and {{c|parents}} have each their own entry in the lexicon.
 
  
 +
=== Vowels and apostrophe ===
 
Some languages requires some adaptation when a word begins by a vowel. For instance, one says in french: "au baptème" but "à l'inhumation. To cater for these cases, groups of letters between square brackets, separated by a vertical bar are selected depending on the presence of a vowel at the beginning of the following word.
 
Some languages requires some adaptation when a word begins by a vowel. For instance, one says in french: "au baptème" but "à l'inhumation. To cater for these cases, groups of letters between square brackets, separated by a vertical bar are selected depending on the presence of a vowel at the beginning of the following word.
  
Line 118: Line 123:
 
</pre>
 
</pre>
  
Note : previous versions of GeneWeb handle a single character substitution: {{c|fr: %1 d[e’]%2}}, the space necessary after the "de" being added by the system.
+
Note : previous versions of GeneWeb handle a single character substitution: {{c|fr: %1 d[e’]%2}}, the space necessary after the "de" being added by the system. Implementation of the better solution described above is underway.
  
 +
=== Html ===
 
Translations may contain basic HTML code for text formatting. This may be necessary for high quality typography of cardinal numbering:
 
Translations may contain basic HTML code for text formatting. This may be necessary for high quality typography of cardinal numbering:
 
(n<sup>ième</sup>/1<sup>er</sup>/2<sup>e</sup>/3<sup>e</sup>/4<sup>e</sup>/…):
 
(n<sup>ième</sup>/1<sup>er</sup>/2<sup>e</sup>/3<sup>e</sup>/4<sup>e</sup>/…):
Line 136: Line 142:
 
* {{c|[*(french revolution month)]12}} translate into {{c|Complémentaire}}.
 
* {{c|[*(french revolution month)]12}} translate into {{c|Complémentaire}}.
 
* {{c|[(french revolution month)]13}} and {{c|[(french revolution month)]0}} translate into {{c|vendémiaire}}: If the index exceeds the limit, the first element is selected.
 
* {{c|[(french revolution month)]13}} and {{c|[(french revolution month)]0}} translate into {{c|vendémiaire}}: If the index exceeds the limit, the first element is selected.
 +
 +
This number based feature is extended to cover several specific cases relative to the sex and number of families and children:
 +
*{{c|[family/families]f}} or {{c|[family/families]nb_families}} will select according to the value of {{c|nb_families}}
 +
*{{c|[child/children]c}} or {{c|[child/children]nb_children}} will select according to the value of {{c|nb_children}}
 +
*{{c|[son/daughter/child]s}} will select according to the sex of the child
 +
*{{c|[man/woman/unknown]n}} or {{c|[man/woman/unknown]sex}} will select according to the sex of the person
 +
*{{c|[witness/witnesses]w}} will select according to the numbre of witnesses of an event.
 +
*{{c|[zero/one/two/three/four]x}} will select according to the value of {{c|count}}
 +
 +
Note: the one character variant is currently the only one available (WIP).
 +
 +
== Extension of lexicon ==
 +
 +
When developing his own templates, a user may want to extend the available lexicon. Such an extension is provided through a file following the same syntax as above, and is supplied to GeneWeb with the parameter  {{c|-add_lexicon}} at the launch of [[gwd|gwd]]. The supplemental lexicon must be placed in one of the {{c|lang}} folder.
 +
-add_lexicon supplementary_lexicon.txt
 +
 +
  
 
{{manual}}
 
{{manual}}
  
 
[[Category:Manual]]
 
[[Category:Manual]]

Latest revision as of 10:18, 9 November 2020

GeneWeb performs text translations according to rules contained in two files, lex-utf8.txt and start_utf8.txt, in the folder gw/lang of your installation. The second file contains text specific to the welcome page.

These files are encoded in UTF-8, and it is important to preserve this encoding if you decide to edit their content.

GeneWeb versions 5 and earlier used the file lexicon.txt, encoded in Latin-1.

Help with expanding GeneWeb's translation capabilities is welcome! See the contribute page.

Language codes

The language of the GeneWeb user interface can be selected from the welcome page, or changed at any time by adding lang=xx to the URL, where xx is the two-letter code (ISO 639-1) of the language of your choice.

af=afrikaans
bg=bulgarian
br=breton
ca=catalan
co=corsican
cs=czech
da=danish
de=german
en=english
eo=esperanto
es=spanish
et=estonian
fi=finnish
fr=french
he=hebrew
is=icelandic
it=italian
lv=latvian
nl=dutch
no=norwegian
oc=occitan
pl=polish
pt=portuguese
pt-br=brazilian-portuguese
ro=romanian
ru=russian
sl=slovenian
sv=swedish
zh=chinese

Note that if the lang= parameter appears several times in the URL, only the first one is taken into account.
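
The "first lang= wins" rule can be pictured with a minimal Python sketch. The function name first_lang is hypothetical, and accepting both ";" and "&" as field separators is an assumption based on the URLs shown on this page; this is not GeneWeb's actual request parser.

```python
import re

def first_lang(url, default=None):
    """Return the value of the first lang= field in the URL's query string."""
    _, _, query = url.partition("?")
    # Assumption: fields may be separated by ';' or '&'.
    for field in re.split(r"[;&]", query):
        key, _, value = field.partition("=")
        if key == "lang":
            return value  # later lang= fields are ignored
    return default

print(first_lang("http://localhost:2317/mabase?lang=de;m=LEX;lang=fr"))
```

With this sketch, a URL carrying both lang=de and lang=fr resolves to de, because scanning stops at the first match.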

Description of the lexicon

Direct translation

The lexicon is a text file that defines translations in several languages (ideally all those supported by GeneWeb, though not necessarily) for terms that appear at the head of each translation item.

    item
l1: item translated in language 1
l2: item translated in language 2

    item 2
l1: item 2 translated in language 1
    ...

Each translation of an item occupies a single line, one line per language, introduced by the two-letter code of that language.
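
To make the layout concrete, here is a minimal Python sketch of a reader for this format. It assumes that an indented line starts a new item and that every other non-blank line has the form "code: translation"; the name parse_lexicon and the in-memory dictionary shape are illustrative choices, not GeneWeb's own implementation.

```python
def parse_lexicon(text):
    """Parse lexicon text into {item: {language_code: translation}}."""
    entries = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # blank lines only separate items
        if raw[0].isspace():
            # Indented line: the head of a new translation item.
            current = raw.strip()
            entries[current] = {}
        elif current is not None and ":" in raw:
            # "xx: translation" line; split on the first colon only,
            # so colons inside the translation are preserved.
            code, translation = raw.split(":", 1)
            entries[current][code.strip()] = translation.strip()
    return entries

SAMPLE = """\
    %d years ago
fr: il y a %d ans

    son/daughter/child
fr: fils/fille/enfant
"""

lexicon = parse_lexicon(SAMPLE)
print(lexicon["son/daughter/child"]["fr"])
```

Splitting on the first colon only keeps entries such as declension rules, which contain further colons, intact in the stored translation.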

Some items offer several alternative translations, separated by slashes:

    (french revolution month)
fr: vendémiaire/brumaire/frimaire/nivôse/pluviôse/ventôse/germinal/floréal/prairial/messidor/thermidor/fructidor/complémentaire

    son/daughter/child
fr: fils/fille/enfant

Multiple translations

Several translation items may appear between brackets, separated by two colons as in [$add::parents]. In this case, the terms add and parents each have their own entry in the lexicon.

Some languages (German for instance) invert the verb and the object. In this case, adding "+before" at the end of the definition of the first string will trigger this inversion:

    add
de: hinzufügen +before
fr: ajouter

[*add::parents] translates into French as Ajouter parents and into German as Eltern hinzufügen.
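
The effect of "+before" can be sketched in Python as follows. The helper names combine and capitalize_first are hypothetical, and the translations of parents ("parents" in French, "Eltern" in German) are taken from the results quoted above; the sketch only illustrates the ordering rule, not GeneWeb's real code.

```python
BEFORE = "+before"

def combine(first, second):
    """Join two translated strings; a trailing '+before' on the first
    means the second string is placed before it."""
    first = first.strip()
    if first.endswith(BEFORE):
        verb = first[: -len(BEFORE)].strip()
        return f"{second} {verb}"
    return f"{first} {second}"

def capitalize_first(text):
    """Upper-case only the first character (the '*' prefix)."""
    return text[:1].upper() + text[1:]

print(capitalize_first(combine("ajouter", "parents")))
print(capitalize_first(combine("hinzufügen +before", "Eltern")))
```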

Translation with substitution

The items may contain variables as in:

    %d years ago
fr: il y a %d ans

The strings to be substituted are separated from the main string by a sequence of three colons, a single colon separating substitution strings from each other.

[living between %s and %s:::xxx:yyy] 

will be transformed into

living between xxx and yyy
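
A minimal Python sketch of this expansion is given below. It assumes the template has already been translated and only handles %s placeholders; the function name substitute is illustrative.

```python
def substitute(expression):
    """Expand 'template:::a:b' by filling each %s with the next value."""
    template, separator, arguments = expression.partition(":::")
    if not separator:
        return template  # nothing to substitute
    values = arguments.split(":")
    pieces = template.split("%s")
    result = pieces[0]
    for position, piece in enumerate(pieces[1:]):
        # Leave the placeholder untouched if no value was supplied for it.
        filler = values[position] if position < len(values) else "%s"
        result += filler + piece
    return result

print(substitute("living between %s and %s:::xxx:yyy"))
```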


Declension

GeneWeb is also capable of handling declensions as in:

    parents
eo: gepatroj:a:+n
pl: rodzice:a:-ów


Vowels and apostrophe

Some languages require an adaptation when a word begins with a vowel. For instance, in French one says "au baptême" but "à l'inhumation". To cater for these cases, groups of letters between square brackets, separated by a vertical bar, are selected depending on the presence of a vowel at the beginning of the following word.

    to %1
en: to %1
fr: [au |à l']%1

    %1 of %2
en: %1 of %2
fr: %1 d[e |’]%2

Note: previous versions of GeneWeb handled a single-character substitution: fr: %1 d[e’]%2, the space necessary after the "de" being added by the system. Implementation of the better solution described above is underway.
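
The bracketed form with a vertical bar can be illustrated with a short Python sketch. The set of characters treated as vowels and the name resolve_vowel_alternatives are assumptions made for the example; GeneWeb's own rule may differ (for instance in how it treats "h" or accented letters).

```python
import re

# Assumption: which letters count as vowels for this illustration.
VOWELS = frozenset("aeiouyàâéèêëîïôöùûü")

# Matches "[first|second]" where neither part contains brackets or bars.
ALTERNATIVE = re.compile(r"\[([^\[\]|]*)\|([^\[\]|]*)\]")

def resolve_vowel_alternatives(text):
    """Pick the first group before a consonant, the second before a vowel."""
    def pick(match):
        following = text[match.end():match.end() + 1].lower()
        return match.group(2) if following in VOWELS else match.group(1)
    return ALTERNATIVE.sub(pick, text)

print(resolve_vowel_alternatives("[au |à l']%1".replace("%1", "baptême")))
print(resolve_vowel_alternatives("[au |à l']%1".replace("%1", "inhumation")))
```

The variables are replaced first so that the character following the closing bracket is the first letter of the actual word.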

HTML

Translations may contain basic HTML code for text formatting. This may be necessary for high-quality typography of ordinal numbers (nième/1er/2e/3e/4e/…):

    nth
fr: n<sup>ième</sup>/1<sup>er</sup>/2<sup>e</sup>/3<sup>e</sup>/4<sup>e</sup>/…

Using the lexicon

To ask for a translation, it is sufficient to include the item between brackets: [item]. It will be translated into the current language. If the translation is not available, the bracketed form will appear instead.

It is possible to capitalize the first letter (GeneWeb will determine whether this is appropriate according to the position in the sentence) by putting a * after the opening bracket. For instance, in French, [*3rd cousins] will translate into Cousins issus d’issus de germains.
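
The lookup, the fallback to the bracketed form and the * prefix can be sketched in Python as follows. The dictionary layout, the lower-case French entry and the function name translate are assumptions for the example, and the sketch always capitalizes when * is present, whereas GeneWeb also takes the position in the sentence into account.

```python
def translate(lexicon, lang, item):
    """Translate the content of [item]; fall back to the bracketed form."""
    capitalize = item.startswith("*")
    key = item[1:] if capitalize else item
    text = lexicon.get(key, {}).get(lang)
    if text is None:
        return f"[{item}]"  # translation not available
    return text[:1].upper() + text[1:] if capitalize else text

# Hypothetical in-memory lexicon: {item: {language_code: translation}}.
LEXICON = {"3rd cousins": {"fr": "cousins issus d’issus de germains"}}

print(translate(LEXICON, "fr", "*3rd cousins"))
print(translate(LEXICON, "de", "3rd cousins"))
```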

To obtain the nth element of a list, add a number after the closing bracket, starting at "0". For instance, the French republican months (0 to 12) will display (with lang=fr) in the following fashion:

  • [(french revolution month)]2 translates into frimaire.
  • [*(french revolution month)]12 translates into Complémentaire.
  • [(french revolution month)]13 and [(french revolution month)]0 both translate into vendémiaire: if the index exceeds the limit, the first element is selected.
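
The indexing rule illustrated by these three cases can be written as a small Python sketch. The function name select is illustrative, and treating negative indexes like out-of-range ones is an assumption not stated on this page.

```python
MONTHS_FR = (
    "vendémiaire/brumaire/frimaire/nivôse/pluviôse/ventôse/germinal/"
    "floréal/prairial/messidor/thermidor/fructidor/complémentaire"
)

def select(translation, index):
    """Return the index-th alternative (counting from 0), or the first
    alternative when the index is out of range."""
    options = translation.split("/")
    if not 0 <= index < len(options):
        index = 0
    return options[index]

print(select(MONTHS_FR, 2))
print(select(MONTHS_FR, 13))
```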

This number based feature is extended to cover several specific cases relative to the sex and number of families and children:

  • [family/families]f or [family/families]nb_families will select according to the value of nb_families
  • [child/children]c or [child/children]nb_children will select according to the value of nb_children
  • [son/daughter/child]s will select according to the sex of the child
  • [man/woman/unknown]n or [man/woman/unknown]sex will select according to the sex of the person
  • [witness/witnesses]w will select according to the number of witnesses of an event.
  • [zero/one/two/three/four]x will select according to the value of count

Note: the one-character variant is currently the only one available (WIP).

Extension of lexicon

When developing their own templates, users may want to extend the available lexicon. Such an extension is provided through a file following the same syntax as above, and is supplied to GeneWeb with the parameter -add_lexicon at the launch of gwd. The supplemental lexicon must be placed in one of the lang folders.

-add_lexicon supplementary_lexicon.txt


