Module:si-headword/documentation
This module is used for Sinhalese headword-line templates. It currently implements {{si-noun}}, {{si-proper noun}}, {{si-verb}}, {{si-adj}}, {{si-adv}}, {{si-intj}} and {{si-con}} (for conjunctions). See the documentation of those templates for more information. Other Sinhalese headword templates are in the process of being converted to use this module.
The module is always invoked the same way, by passing a single parameter to the "show" function. This parameter is the plural name of the part of speech. The examples below are for nouns and for adjective forms, respectively:
{{#invoke:si-headword|show|nouns}}
{{#invoke:si-headword|show|adjective forms}}
The template will, by default, accept the following parameters (specific parts of speech may accept or require others):
|head=, |head2=, |head3=, ...
- Override the headword display; used to add links to individual words in a multiword term.
|id=
- Sense ID for linking to this headword. See {{senseid}} for more information.
|nolink=1 or |nolinkhead=1
- Don't link individual words in the headword of a multiword term. Useful for foreign or otherwise unanalyzable terms like a posteriori and yabba dabba doo where the expression functions as a whole in Sinhalese but the individual parts are not Sinhalese words.
|splithyph=1
- Indicate that automatic splitting and linking of words should split on hyphens in multiword expressions with spaces in them, even if the hyphenated component would normally be linked as-is or with hyphens converted to spaces. See #Autosplitting below.
|nosplithyph=1
- Indicate that automatic splitting and linking of words should not split on hyphens in multiword expressions with spaces in them, even if this would normally happen. See #Autosplitting below.
|hyphspace=1
- Indicate that hyphenated components should be linked as a whole using the space-separated equivalent, even if this would not normally happen (i.e. because the space-separated equivalent is not defined as a Sinhalese term). See #Autosplitting below.
|nosuffix=1
- Prevent terms beginning with a hyphen from being interpreted as suffixes. See #Suffix handling below.
|nomultiwordcat=1
- Prevent multiword terms (those with spaces or with hyphens in the middle) from being added to Category:Sinhalese multiword terms.
|pagename=
- Override the page name used to compute default values of various sorts. Useful when testing, for documentation pages, etc.
|sort=
- Sort key. Rarely needs to be specified, as it is normally automatically generated.
Autosplitting
All templates using this module use an intelligent autosplitting algorithm to link portions of multipart and multiword expressions, as follows:
- If there are spaces in the term but no apostrophes or hyphens, the module will automatically split and link distinct space-separated words, similarly to {{head}}; hence, absent without leave will be linked as [[absent]] [[without]] [[leave]].
- If there are spaces and apostrophes but no hyphens, the module will likewise split and link distinct space-separated words, but may also split up words with apostrophes in them. Specifically:
  - If a word ends in 's, the part before the 's will be linked as a word, and the 's will be linked separately to -'s, on the assumption that the 's is functioning as a possessive. For example, Abel's impossibility theorem will be linked as [[Abel]][[-'s|'s]] [[impossibility]] [[theorem]]. (An exception is made for one's, someone's, he's, she's and it's, which are linked as-is without splitting.)
  - If a word ends in ', the apostrophe will be linked to -' (on the assumption that the ' is functioning as a plural possessive, similarly to above), and the part before will be separately linked. If the part before ends in an s, the module converts it to its singular equivalent and checks whether that form exists and has a definition as a Sinhalese term. If so, the term is linked to the singular form; otherwise, it is linked to the plural form. (Converting to the singular means that -ies becomes -y; -es is dropped after sh, ch and x; and otherwise s is dropped.) For example, flies' graveyard will be linked as [[fly|flies]][[-'|']] [[graveyard]] because fly exists as a Sinhalese term, but Achilles' heel will be linked as [[Achilles]][[-'|']] [[heel]] because Achille does not exist as a Sinhalese term.
  - All other terms containing apostrophes are linked unsplit.
- If there are hyphens in the term but no spaces or apostrophes, the hyphenated components will be linked individually. For example, beggar-thy-neighbor will be linked as [[beggar]]-[[thy]]-[[neighbor]].
  - An exception to this occurs with certain recognized prefixes, which are linked with the hyphen included in the prefix. For example, Afro-American is linked as [[Afro-]][[American]] and co-occurrence is linked as [[co-]][[occurrence]], because Afro- and co- are in the list of recognized prefixes. (For the full list, see #Special prefix handling below.)
- If there are hyphens and apostrophes but no spaces, the effect is similar to the situation with spaces and apostrophes. For example, beggar's-lice is linked as [[beggar]][[-'s|'s]]-[[lice]].
- If there are both hyphens and spaces, the space-separated components that do not have hyphens will be linked separately, as above. Any hyphen-separated components may be linked in one of three ways:
  - If |hyphspace=1 is specified or the hyphen-separated component exists as a Sinhalese term when the hyphens are converted to spaces, it will be linked to that term. For example, closed-circuit television will be linked as [[closed circuit|closed-circuit]] [[television]] because closed circuit exists as a Sinhalese term. (In this case, closed-circuit also exists but is approximately a soft redirect to closed circuit, as is often the case with such attributive compounds. This is why the space-separated variant is preferred.)
  - If |nosplithyph=1 is specified or the hyphen-separated component exists as a Sinhalese term in its unmodified form but not when the hyphens are converted to spaces, it will be linked as an unmodified whole. For example, coin-operated laundry will be linked as [[coin-operated]] [[laundry]] because coin-operated exists as a Sinhalese term but coin operated does not. (An example that requires |nosplithyph=1 is close-up lens, where the default algorithm would incorrectly link the first component to close up. Here, close up [a verb] and close-up [an adjective] both exist but refer to different things.)
  - If |splithyph=1 is specified or the hyphen-separated component does not exist as a Sinhalese term (either unmodified or when the hyphens are converted to spaces), each hyphenated component is linked separately. Examples where this happens are adult-onset diabetes (linked as [[adult]]-[[onset]] [[diabetes]]) and Bombieri-Friedlander-Iwaniec theorem (linked as [[Bombieri]]-[[Friedlander]]-[[Iwaniec]] [[theorem]]). Note that when separately linking hyphenated components, prefixes are recognized and handled specially, as documented below.
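The apostrophe rules and the three-way choice for hyphenated components can be sketched in Python as follows. This is only an illustrative model and not the module's actual Lua code: the function names, the `exists` callback (standing in for the check that a page has a Sinhalese definition) and the precedence among the three flags are all assumptions, and prefix handling is left out.

```python
from typing import Callable

# Words ending in 's that are linked whole instead of being split (from the list above).
UNSPLIT_POSSESSIVES = {"one's", "someone's", "he's", "she's", "it's"}


def singularize(plural: str) -> str:
    """Convert a word ending in s to its singular, following the rules described above."""
    if plural.endswith("ies"):
        return plural[:-3] + "y"
    if plural.endswith(("shes", "ches", "xes")):
        return plural[:-2]
    return plural[:-1]


def link_apostrophe_word(word: str, exists: Callable[[str], bool]) -> str:
    """Link a single word containing an apostrophe (hypothetical helper)."""
    if word in UNSPLIT_POSSESSIVES:
        return f"[[{word}]]"
    if word.endswith("'s"):
        return f"[[{word[:-2]}]][[-'s|'s]]"
    if word.endswith("'"):
        base = word[:-1]
        if base.endswith("s"):
            singular = singularize(base)
            if exists(singular):
                return f"[[{singular}|{base}]][[-'|']]"
        return f"[[{base}]][[-'|']]"
    # All other words containing apostrophes are linked unsplit.
    return f"[[{word}]]"


def link_hyphenated(component: str, exists: Callable[[str], bool], *,
                    hyphspace: bool = False, nosplithyph: bool = False,
                    splithyph: bool = False) -> str:
    """Link one hyphenated component of a term that also contains spaces.

    Assumption: an explicit flag always wins over the existence checks.
    """
    spaced = component.replace("-", " ")
    no_flag = not (hyphspace or nosplithyph or splithyph)
    if hyphspace or (no_flag and exists(spaced)):
        return f"[[{spaced}|{component}]]"
    if nosplithyph or (no_flag and exists(component)):
        return f"[[{component}]]"
    return "-".join(f"[[{part}]]" for part in component.split("-"))


# A toy stand-in for the dictionary lookup, using the examples from the text.
known = {"fly", "closed circuit", "closed-circuit", "coin-operated", "close up", "close-up"}
exists = known.__contains__

print(link_apostrophe_word("flies'", exists))     # [[fly|flies]][[-'|']]
print(link_hyphenated("closed-circuit", exists))  # [[closed circuit|closed-circuit]]
print(link_hyphenated("close-up", exists, nosplithyph=True))  # [[close-up]]
```

Passing the lookup as a callback keeps the sketch independent of how page existence is actually tested.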
Special prefix handling
As described above, when splitting hyphenated components, if a component is not the last component and looks like one of the following prefixes, the following hyphen will be included inside of the link.
acro
acousto
Afro
agro
anarcho
angio
Anglo
ante
anti
arch
auto
bi
bio
cis
co
cryo
crypto
de
demi
eco
electro
Euro
ex
Greco
hemi
hydro
hyper
hypo
infra
Indo
inter
intra
Judeo
macro
meta
micro
mini
multi
neo
neuro
non
para
peri
post
pre
pro
proto
pseudo
re
semi
sub
super
trans
un
vice
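A minimal Python sketch of this rule, for a term that contains hyphens but no spaces or apostrophes, might look like the following. The function name is hypothetical, and exact, case-sensitive matching against the list is an assumption; the real module may decide what "looks like" a prefix differently.

```python
# The recognized prefixes listed above.
RECOGNIZED_PREFIXES = set("""
acro acousto Afro agro anarcho angio Anglo ante anti arch auto bi bio cis co
cryo crypto de demi eco electro Euro ex Greco hemi hydro hyper hypo infra Indo
inter intra Judeo macro meta micro mini multi neo neuro non para peri post pre
pro proto pseudo re semi sub super trans un vice
""".split())


def link_hyphen_parts(term: str) -> str:
    """Link each hyphen-separated part, keeping the hyphen inside prefix links."""
    parts = term.split("-")
    linked = []
    for i, part in enumerate(parts):
        is_last = i == len(parts) - 1
        if is_last:
            linked.append(f"[[{part}]]")
        elif part in RECOGNIZED_PREFIXES:
            # Non-final recognized prefix: the hyphen goes inside the link.
            linked.append(f"[[{part}-]]")
        else:
            linked.append(f"[[{part}]]-")
    return "".join(linked)


print(link_hyphen_parts("Afro-American"))        # [[Afro-]][[American]]
print(link_hyphen_parts("beggar-thy-neighbor"))  # [[beggar]]-[[thy]]-[[neighbor]]
```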
Suffix handling
If the term begins with a hyphen (-), it is assumed to be a suffix rather than a base form, and is categorized into Category:Sinhalese suffixes and Category:Sinhalese POS-forming suffixes rather than Category:Sinhalese POSs (e.g. Category:Sinhalese noun-forming suffixes rather than Category:Sinhalese nouns). This can be overridden using |nosuffix=1. (An example where this is necessary is -ussification, which refers to a linguistic process of blending words with the suffix -ussy but is not itself a suffix.)
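The categorization choice can be modelled roughly as below. This is a sketch under assumptions: the function name is hypothetical, the part of speech is taken in its plural form (as passed to the "show" function), and the singular is obtained simply by dropping a final s, which may not match what the module does for every part of speech.

```python
from typing import List


def headword_categories(pagename: str, pos_plural: str, nosuffix: bool = False) -> List[str]:
    """Return category names (without the Category: prefix) for a headword."""
    if pagename.startswith("-") and not nosuffix:
        # Assumption: the singular POS name is the plural minus a trailing s.
        pos_singular = pos_plural[:-1] if pos_plural.endswith("s") else pos_plural
        return ["Sinhalese suffixes", f"Sinhalese {pos_singular}-forming suffixes"]
    return [f"Sinhalese {pos_plural}"]


print(headword_categories("-ussy", "nouns"))
# ['Sinhalese suffixes', 'Sinhalese noun-forming suffixes']
print(headword_categories("-ussification", "nouns", nosuffix=True))
# ['Sinhalese nouns']
```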
Link modifications
The default behavior described above under #Autosplitting is sufficient in most circumstances, but some multiword terms need special linking behavior to handle things like inflected terms (e.g. those ending in -ing or -s), capitalized terms, multiword subexpressions, etc. One way to handle that is to use |head= and spell out the entire headword, appropriately linked, effectively ignoring the default linking behavior. But this can be awkward for long multiword terms. For cases like this, a shortcut syntax is provided to apply link modifications on top of the autolinked term. To enable this, put a tilde (~) at the beginning of the value specified to |head=, followed by the changes to individual words.
For example, for the term acute necrotising ulcerative gingivitis, we would like to link necrotising to necrotise. This can be done as follows:
{{si-noun|head=~necrotising:necrotise}}
or more compactly as
{{si-noun|head=~necrotis[ing:e]}}
This is equivalent to writing {{si-noun|head=[[acute]] [[necrotise|necrotising]] [[ulcerative]] [[gingivitis]]}}, but shorter. In general, syntax of the form prefix[from:to] is equivalent to writing prefixfrom:prefixto, and says to replace prefixfrom with prefixto in the default output produced by the #Autosplitting mechanism described above.
The same syntax works on the beginning of a word, which is especially useful when linking to the lowercase equivalent of a capitalized term. For example, for admiral of the Swiss Navy, use the following to link Navy to navy:
{{si-noun|head=~[N:n]avy}}
This is equivalent to writing {{si-noun|head=[[admiral]] [[of]] [[the]] [[Swiss]] [[navy|Navy]]}}, but shorter.
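The bracket shorthand can be illustrated with a small Python sketch that expands it into the long from:to form. The function name and the regular expression are assumptions made for illustration only; the sketch handles a single bracket group per modification and says nothing about how the module itself parses the value.

```python
import re


def expand_shorthand(mod: str) -> str:
    """Expand before[from:to]after into beforefromafter:beforetoafter.

    Modifications without a bracket group are returned unchanged.
    """
    match = re.fullmatch(r"(.*?)\[([^\[\]:]*):([^\[\]]*)\](.*)", mod)
    if not match:
        return mod
    before, old, new, after = match.groups()
    return f"{before}{old}{after}:{before}{new}{after}"


print(expand_shorthand("necrotis[ing:e]"))  # necrotising:necrotise
print(expand_shorthand("[N:n]avy"))         # Navy:navy
```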
Modifications need to match full words, but can be applied to multiple words. A ~ on the right-hand side is a shortcut that stands for the left-hand side, which is especially useful when multiple words are given on the left-hand side, and causes the words to be linked together. For example, for acute respiratory distress syndrome, to link respiratory distress as a single entity, use the following:
{{si-noun|head=~respiratory distress:~}}
which is equivalent to {{si-noun|head=[[acute]] [[respiratory distress]] [[syndrome]]}}. The right-hand side need not consist solely of a tilde, but can contain other surrounding text. For example, for Charlie Brown Christmas tree, use the following to link to the Wikipedia entry for Charlie Brown:
{{si-noun|head=~Charlie Brown:w:~}}
This is equivalent to writing {{si-noun|head=[[w:Charlie Brown|Charlie Brown]] [[Christmas]] [[tree]]}}.
Multiple modifications can be specified, separated by a semicolon (optionally with surrounding spaces). For example, for Admiral of the Fleet, use:
{{si-noun|head=~[A:a]dmiral; [F:f]leet}}
This is equivalent to writing {{si-noun|head=[[admiral|Admiral]] [[of]] [[the]] [[fleet|Fleet]]}}.
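Putting the pieces together, the effect of a |head= value beginning with ~ can be approximated by the Python sketch below. It is a simplified model, not the module's implementation: it assumes a term made only of space-separated words (so the default output links each word to itself), the function names are hypothetical, and a modification without a colon is simply skipped.

```python
import re


def expand_shorthand(mod: str) -> str:
    """Expand before[from:to]after into beforefromafter:beforetoafter."""
    match = re.fullmatch(r"(.*?)\[([^\[\]:]*):([^\[\]]*)\](.*)", mod)
    if not match:
        return mod
    before, old, new, after = match.groups()
    return f"{before}{old}{after}:{before}{new}{after}"


def apply_head_modifications(pagename: str, head: str) -> str:
    """Apply ~-style link modifications on top of the default word-by-word links."""
    if not head.startswith("~"):
        return head  # an ordinary |head= value is used as given
    # Each item is (display text, link target); by default every word links to itself.
    items = [(word, word) for word in pagename.split(" ")]
    for raw in head[1:].split(";"):
        mod = expand_shorthand(raw.strip())
        if ":" not in mod:
            continue  # assumption: malformed modifications are ignored
        lhs, rhs = mod.split(":", 1)
        target = rhs.replace("~", lhs)  # ~ on the right stands for the left-hand side
        lhs_words = lhs.split(" ")
        n = len(lhs_words)
        for i in range(len(items) - n + 1):
            if [display for display, _ in items[i:i + n]] == lhs_words:
                items[i:i + n] = [(lhs, target)]  # full-word match: merge into one link
                break
    return " ".join(
        f"[[{display}]]" if display == target else f"[[{target}|{display}]]"
        for display, target in items
    )


print(apply_head_modifications("Admiral of the Fleet", "~[A:a]dmiral; [F:f]leet"))
# [[admiral|Admiral]] [[of]] [[the]] [[fleet|Fleet]]
```

The shorthand helper is repeated here only so that the block runs on its own.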