Template:ca-IPA


IPA(key): (Central, Balearic) [kəˈi.pə]


This template automatically converts Catalan words into their IPA pronunciation. The template uses Module:ca-IPA as a back-end.

The template works for both single words and multiword expressions, and supports Balearic, Central Catalan and Valencian.

Parameters

|1=
Gives the phonemic respelling of the word whose pronunciation should be generated. If not provided, the word is taken from the name of the current page, so you can in some cases leave this empty (particularly if there is a written accent in the word or if the word ends in certain recognized endings). Multiple comma-separated respellings can be given (without a space after the comma; otherwise, the comma will be treated as part of the respelling).
|cen=
Gives the phonemic respelling for Central Catalan only.
|bal=
Gives the phonemic respelling for Balearic only.
|val=
Gives the phonemic respelling for Valencian only.
|east=
Gives the phonemic respelling for Central Catalan and Balearic only.
|pagename=
Override the page name used to generate the respelling when the respelling is omitted or specified partially (e.g. using substitution notation, as described below). This is useful in testing and documentation pages.

You can mix a respelling in |1= with respellings using named parameters, and the latter will override the former.

A value of + explicitly requests the default (i.e. the same as the page name), and a value of ? indicates that the pronunciation is unknown. For example, for liquidàmbar (sweetgum), the invocation {{ca-IPA|[arr]|bal=?}} uses the respelling liquidàmbarr for every dialect except Balearic, where the pronunciation displays as unknown (because the primary source for Balearic pronunciation, the DCVB [Diccionari català-valencià-balear, i.e. "Dictionary of Catalan, Valencian and Balearic"], doesn't give the pronunciation of this word, and it isn't clear whether the final r is pronounced).
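The way the parameters combine can be sketched in Python. This is only an illustration, not the code of Module:ca-IPA: the function name and dictionary layout are hypothetical, and the precedence between |east= and a dialect-specific parameter is an assumption (the text above only says that named parameters override |1=).

```python
DIALECTS = ("cen", "bal", "val")


def resolve_respellings(pagename, arg1=None, cen=None, bal=None, val=None, east=None):
    """Return {dialect: respelling}; None stands for 'pronunciation unknown'."""
    named = {"cen": cen, "bal": bal, "val": val}
    resolved = {}
    for dialect in DIALECTS:
        value = named[dialect]
        # Assumption: a dialect-specific parameter beats |east=, which beats |1=.
        if value is None and dialect in ("cen", "bal"):
            value = east
        if value is None:
            value = arg1
        # A missing value or "+" both fall back to the page name.
        if value is None or value == "+":
            value = pagename
        # "?" marks the pronunciation as unknown.
        resolved[dialect] = None if value == "?" else value
    return resolved
```

For instance, `resolve_respellings("liquidàmbar", "liquidàmbarr", bal="?")` keeps liquidàmbarr for Central Catalan and Valencian and marks Balearic as unknown.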

Substitution notation

In place of a fully respelled term, you can use a substitution to replace just part of the term and leave the rest as-is. For example, for the page subalternar, in place of a respelling sub.alternar, you could write [ba:b.a], or just [b.a] (see below). Enough context needs to be provided to make the "from" part of the replacement unique; otherwise, an error will be thrown.

Because of the commonness of certain substitutions such as [ar:a(r)], a shorter form known as a single-part substitution is allowed. If just the "to" part of the replacement is specified between brackets, the "from" part will be generated as follows:

  • Convert (r) and (rr) to r.
  • Convert rr and ss to r and s, respectively (they also match against themselves in the original spelling).
  • Convert ks and gz to x.
  • Convert accented vowels to their equivalent unaccented vowels (they also match against themselves in the original spelling). This includes acute and grave accents, diaereses, overdots and underlines.
  • Remove _, . and - (- also matches against itself in the original spelling).
  • Convert the tie symbol to a space (it also matches against a hyphen in the original spelling).
  • Convert bbl and ggl to bl and gl, respectively.
  • Convert ʃ to x (it also matches against capitalized X).

For example, a spec like [irr] when applied to a word like papir (papyrus) is short for [ir:irr], which in turn is short for a full respelling papirr. Similarly, the spec [eï] when applied to a word like reivindicar (to reclaim) is short for [ei:eï], which in turn is short for a full respelling reïvindicar; and [ks] when applied to a word like boxejador (boxer) is short for [x:ks], which in turn is short for a full respelling boksejador. Note that "partially converted" matches also work in single-part substitutions; e.g. the spec àrr will match against any of ar, arr, àr and àrr in the original spelling (although an error is thrown for substitutions that have no effect, as when àrr is specified with an original spelling that already contains àrr).
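The conversion rules listed above can be approximated as follows. This is a minimal sketch under stated assumptions, not the module's implementation: the function name is hypothetical, the tie symbol is assumed to be the undertie (U+203F), the "underline" is assumed to be one of two combining marks, and the "also matches against itself" alternatives are not modelled (only the fully converted "from" string is produced).

```python
import unicodedata

# Combining marks removed from vowels: grave, acute, diaeresis, dot above,
# plus two underline-like marks (macron below, low line; both are assumptions).
STRIP_MARKS = {"\u0300", "\u0301", "\u0308", "\u0307", "\u0331", "\u0332"}
TIE = "\u203f"  # assumed tie symbol


def derive_from_part(to_part):
    """Derive the 'from' part of a single-part substitution from its 'to' part."""
    s = to_part
    s = s.replace("(rr)", "r").replace("(r)", "r")
    s = s.replace("bbl", "bl").replace("ggl", "gl")
    s = s.replace("rr", "r").replace("ss", "s")
    s = s.replace("ks", "x").replace("gz", "x").replace("ʃ", "x")
    s = s.replace(TIE, " ")
    for ch in "_.-":
        s = s.replace(ch, "")
    # Decompose, drop only the listed marks (so the cedilla of ç survives), recompose.
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if c not in STRIP_MARKS)
    return unicodedata.normalize("NFC", stripped)
```

With this sketch, `derive_from_part("irr")` gives `ir`, `derive_from_part("eï")` gives `ei`, and `derive_from_part("ks")` gives `x`, matching the three examples in the paragraph above.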

You can include more than one comma-separated substitution between brackets. E.g. for semicircular (semicircular), you could use {{ca-IPA|[sèm,à(rr)]}} to specify the respelling sèmicirculà(rr).

You can include a mid-vowel hint (see #Mid vowels below) as one of the substitutions between brackets. For example, for vexatori (vexatious), you could use {{ca-IPA|[ks,ò]}} to specify the respelling veksatòri. These hints work correctly when secondary stresses are also specified; e.g. for pseudohalogen (pseudohalogen), you can use {{ca-IPA|[sèu,ò]}} and it will work correctly to generate the respelling psèudohalògen (essentially, the mid-vowel hint is processed first regardless of its position in the list of comma-separated substitutions).

Note that an error will be thrown if a substitution can match in multiple places. For example, for particular (particular), the spec [a(rr)] will throw an error, because the string ar occurs twice. In such a case, specify more context, e.g. use [la(rr)] instead. If you intended to match both places, you need to specify two substitutions with differing context, or spell out the respelling in full.
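The uniqueness requirement can be illustrated with a short Python sketch. It handles only explicit two-part [from:to] substitutions, applies them one after another to the already-modified word (an assumption about ordering), and uses hypothetical function names; the real module's error messages and matching details may differ.

```python
def find_all(word, needle):
    """Return the start index of every (possibly overlapping) occurrence."""
    positions = []
    start = word.find(needle)
    while start != -1:
        positions.append(start)
        start = word.find(needle, start + 1)
    return positions


def apply_substitution(word, from_part, to_part):
    """Replace the single occurrence of from_part; raise if absent or ambiguous."""
    if not from_part:
        raise ValueError("empty 'from' part")
    positions = find_all(word, from_part)
    if not positions:
        raise ValueError(f"'{from_part}' does not occur in '{word}'")
    if len(positions) > 1:
        raise ValueError(f"'{from_part}' matches {len(positions)} places in '{word}'")
    if from_part == to_part:
        raise ValueError("substitution has no effect")
    pos = positions[0]
    return word[:pos] + to_part + word[pos + len(from_part):]


def apply_spec(word, spec):
    """Apply a bracketed spec such as '[ba:b.a]' or '[sem:sèm,ar:à(rr)]'."""
    if not (spec.startswith("[") and spec.endswith("]")):
        raise ValueError("spec must be enclosed in brackets")
    for item in spec[1:-1].split(","):
        from_part, sep, to_part = item.partition(":")
        if not sep:
            raise ValueError("single-part substitutions are not handled in this sketch")
        word = apply_substitution(word, from_part, to_part)
    return word
```

Here `apply_spec("particular", "[ar:a(rr)]")` raises an error because ar occurs twice, while `apply_spec("particular", "[lar:la(rr)]")` succeeds.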

Mid vowels

The pronunciation of mid vowels (written e and o) is variable, and only sometimes indicated by written accents. To simplify the frequent case of a word whose respelling is the same as the spelling other than having an accent on the stressed vowel, shortcuts known as mid-vowel hints are provided. Specifically, specify ONLY the stressed accented vowel, according to the following chart:

Letter | Meaning | Typical origin
é | /e/ in all dialects | Inherited Latin words with stressed short ĕ; certain borrowings; names of letters
è | /ɛ/ in all dialects | Many recent borrowings from other Romance languages or Latin, and some inherited words, especially before l or before r + consonant
ê | /ɛ/ in Central Catalan, /ə/ in Balearic, /e/ in Valencian | Inherited Latin words with stressed long ē or short ĭ
ë | /ɛ/ in Central Catalan and Balearic, /e/ in Valencian | Some recent borrowings
ó | /o/ in all dialects | Inherited Latin words with stressed long ō or short ŭ; certain borrowings
ò | /ɔ/ in all dialects | Inherited Latin words with stressed short ŏ; many recent borrowings from other Romance languages or Latin
ô | /ɔ/ in Central Catalan and Balearic, /o/ in Valencian | Some words borrowed independently in Valencian and Eastern Catalan?

Thus, for sec (dry), use {{ca-IPA|ê}}.
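The chart above can be read as a lookup from hint letter and dialect to stressed vowel. The following Python sketch simply encodes the chart; the dictionary and function names are hypothetical and do not come from the module.

```python
# Stressed vowel produced by each mid-vowel hint, per dialect (from the chart above).
MID_VOWEL_HINTS = {
    "é": {"cen": "e", "bal": "e", "val": "e"},
    "è": {"cen": "ɛ", "bal": "ɛ", "val": "ɛ"},
    "ê": {"cen": "ɛ", "bal": "ə", "val": "e"},
    "ë": {"cen": "ɛ", "bal": "ɛ", "val": "e"},
    "ó": {"cen": "o", "bal": "o", "val": "o"},
    "ò": {"cen": "ɔ", "bal": "ɔ", "val": "ɔ"},
    "ô": {"cen": "ɔ", "bal": "ɔ", "val": "o"},
}


def stressed_vowel(hint, dialect):
    """Return the stressed vowel for a hint letter in 'cen', 'bal' or 'val'."""
    return MID_VOWEL_HINTS[hint][dialect]
```

For sec with the hint ê, this gives /ɛ/ in Central Catalan, /ə/ in Balearic and /e/ in Valencian.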

In some cases, defaults are provided for mid vowels when not specifically notated. (There used to be many more such defaults, but most of them were not always correct and have been removed.) In the following situations, a default is provided:

Condition | Default vowel provided
words in -ent and -ents | é
words in -er and -ers | é
words in -or and -ors | ó
words in -è and -ès | ê
words in -esa, -esos and -eses | ê
Note in particular that terms ending in written -è and -ès are overridden to use ê. This applies even when a manual respelling is given. This is done because ê is virtually always correct (it changes the pronunciation only for Valencian and Balearic, not for Central Catalan). For words where this transformation needs to be overridden, place an underscore (_) directly after the stressed vowel. For example, for a collibè (piggyback), write {{ca-IPA|a còllibè_}}, since the word has secondary stress on the o (or {{ca-IPA|[cò,bè_]}} for short, using substitution notation).
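The ending-based defaults can be sketched as a simple suffix lookup. This is a rough illustration that looks only at the spelling of the ending (it does not check stress or the underscore override), and the names are hypothetical.

```python
# (endings, default mid-vowel hint), following the table of defaults above.
DEFAULT_ENDINGS = [
    (("ent", "ents"), "é"),
    (("er", "ers"), "é"),
    (("or", "ors"), "ó"),
    (("è", "ès"), "ê"),
    (("esa", "esos", "eses"), "ê"),
]


def default_mid_vowel(word):
    """Return the default hint for the word's ending, or None if there is none."""
    for endings, vowel in DEFAULT_ENDINGS:
        if word.endswith(endings):  # str.endswith accepts a tuple of suffixes
            return vowel
    return None
```

For example, `default_mid_vowel("competent")` returns `é`, while `default_mid_vowel("sec")` returns `None`, so sec needs an explicit hint.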

Final -r

The pronunciation of final written -r is complex and varies per dialect; it is lost in some words in Central Catalan and Balearic (more so in Balearic than in Central Catalan), but almost always retained in Valencian. Defaults are provided in some situations; in other cases, you must specify the pronunciation of final written -r using respelling, or an error will be thrown. The following conventions are used for respelling r, to indicate the pronunciation in the different dialects:

Spelling | Valencian | Central Catalan | Balearic | Typical uses
rr | pronounced | pronounced | pronounced | Recent loanwords ending in -r; most monosyllables in -r
(rr) | pronounced | pronounced | silent | Adjectives in -ar
(r) | pronounced | silent | silent | Infinitives; agent nouns in -dor/-tor/-sor; nouns in -ar referring to places; nouns and adjectives ending in -er
Thus, for example, amor (love) should be respelled as amo(rr), as the final -r is pronounced in Central Catalan but not Balearic, while vampir (vampire) should be respelled as vampirr, as the final -r is pronounced everywhere.
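As a rough sketch, the three notations can be modelled as a lookup of which dialects pronounce the final -r. The rr and (rr) entries for Central Catalan and Balearic follow the amor and vampir examples above; the Valencian column and the whole (r) row are assumptions inferred from the statement that -r is almost always retained in Valencian. The names are hypothetical.

```python
# True means the final -r is pronounced in that dialect.
FINAL_R = {
    "rr": {"val": True, "cen": True, "bal": True},
    "(rr)": {"val": True, "cen": True, "bal": False},
    "(r)": {"val": True, "cen": False, "bal": False},  # inferred, not stated outright
}


def final_r_pronounced(respelling, dialect):
    """Report whether the respelling's final -r is pronounced in the given dialect."""
    for notation in ("(rr)", "(r)", "rr"):
        if respelling.endswith(notation):
            return FINAL_R[notation][dialect]
    raise ValueError("respelling does not end in rr, (rr) or (r)")
```

So `final_r_pronounced("amo(rr)", "cen")` is true and `final_r_pronounced("amo(rr)", "bal")` is false, while vampirr is pronounced with -r in every dialect.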

Multisyllabic terms ending in stressed -ar, -er(s), -ir, and -dor(s)/-tor(s)/-sor(s)/-çor(s) default to (r), which is usually correct for infinitives in -ar and -ir, agent nouns in -dor/-tor/-sor/-çor, and nouns and adjectives in -er. Beware that the defaults aren't always correct, and some terms need respelling.

Monosyllables and nouns ending in unstressed final -r must be respelled in all circumstances, as their pronunciation in Central Catalan and especially Balearic is too unpredictable to have a default provided.

Adverbs in -ment and part of speech hints

In Catalan, adverbs ending in -ment are constructed on the feminine singular form of adjectives and have secondary stress on this adjective form. This stress is marked with an accent when the stress on the adjective form doesn't follow the default stress rules; for example, patriòticament (patriotically) requires an accent on the o because patriòtica by itself requires an accent. (In this case, this does *NOT* indicate a primary stress.) The module has special handling for such adverbs. The mid-vowel hints described above under #Mid vowels apply to the adjective form preceding -ment, not to -ment itself (which is always pronounced /men(t)/ with close-mid /e/). In some cases, this mid-vowel hint or equivalent respelling is required, and an error will be thrown if omitted. Specifically, if the default stress rules apply and the stressed vowel is e or o, a mid-vowel hint or equivalent respelling must be provided except if a default would be supplied if the adjective form were to stand alone. For example, terms in -ent have a default close-mid /e/ vowel supplied; this means that adverbs such as competentment (competently) don't need any respelling or mid-vowel hint, similar to how competent (competent) needs no such hints. But a term like bojament (crazily) needs to have the o marked for quality or an error will be thrown; this could be done, for example, using a mid-vowel hint: {{ca-IPA|ò}}.

Not all terms in -ment are adverbs. There are in fact a large number of nouns in -ment, such as abandonament (abandonment), and a few adjectives such as vehement (vehement). Such terms need a part of speech hint so they aren't treated as adverbs. The part of speech hint consists of a part of speech abbreviation followed by a slash. For example, to mark abandonament as a noun, use {{ca-IPA|n/}} or {{ca-IPA|n/+}}. To supply a respelling along with the part of speech hint, place it after the slash. For example, restabliment (reestablishment) should use {{ca-IPA|n/[bbl]}}; this is equivalent to {{ca-IPA|[bbl]}} (which says to pronounce the written -bl- with geminate /b/) with an attached part of speech hint indicating that the term is a noun. Another example is sobreescalfament (overheating), which has secondary stress on sobre-; to indicate that, use a respelling like {{ca-IPA|n/[sób,mén]}}, placing accents explicitly on the -o- of sobre- and the -e- of -ment (which is not treated specially since this is a noun, not an adverb).

The following part of speech abbreviations are recognized:

Abbreviation | Expansion
n | noun
noun | noun
v | verb
vb | verb
verb | verb
a | adjective
adj | adjective
adjective | adjective
av | adverb
adv | adverb
adverb | adverb
o | other
other | other

It is recommended to use the short forms for conciseness, but as shown, you can spell out the part of speech in full if desired.
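Splitting a part of speech hint from the respelling can be sketched as below. The function name is hypothetical, and treating an empty respelling after the slash as + follows the statement above that {{ca-IPA|n/}} and {{ca-IPA|n/+}} are equivalent.

```python
POS_ABBREVIATIONS = {
    "n": "noun", "noun": "noun",
    "v": "verb", "vb": "verb", "verb": "verb",
    "a": "adjective", "adj": "adjective", "adjective": "adjective",
    "av": "adverb", "adv": "adverb", "adverb": "adverb",
    "o": "other", "other": "other",
}


def split_pos_hint(arg):
    """Return (part_of_speech or None, respelling) for a template argument."""
    prefix, slash, rest = arg.partition("/")
    # Only treat the prefix as a hint if it is a recognized abbreviation.
    if slash and prefix in POS_ABBREVIATIONS:
        return POS_ABBREVIATIONS[prefix], rest or "+"
    return None, arg
```

For example, `split_pos_hint("n/[bbl]")` yields `("noun", "[bbl]")`, and an argument without a recognized hint, such as `[bbl]`, is returned unchanged with no part of speech.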

Inline modifiers

You can specify inline modifiers after respellings, using standard inline-modifier notation (see {{col}}, {{alt}}, etc.). For example, for car (expensive), the specification {{ca-IPA|carr|bal=ca<q:Mallorca, Menorca>,carr<q:Ibiza>}} specifies two pronunciations ca and carr for Balearic, the former with the qualifier Mallorca, Menorca and the latter with the qualifier Ibiza.

Note that the template is not confused by the comma in the qualifier, because it occurs inside of angle brackets. The following inline modifiers are recognized:

  • q: qualifier, e.g. rare; this appears *BEFORE* the term, parenthesized and italicized
  • qq: qualifier, e.g. rare; this appears *AFTER* the term, parenthesized and italicized
  • a: accent qualifier, e.g. Ibiza; this appears *BEFORE* the term, parenthesized and italicized
  • aa: accent qualifier, e.g. Ibiza; this appears *AFTER* the term, parenthesized and italicized
  • ref: reference; if specified, it displays as a footnote number, and the footnote itself will display in a ==References== section; use <references /> to request display of the footnotes
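The comma handling described above (commas inside angle brackets do not separate respellings) can be sketched as follows. This is an illustrative parser with hypothetical names, not the shared inline-modifier code used on Wiktionary.

```python
import re

# Recognized modifier prefixes, per the list above.
MODIFIER_RE = re.compile(r"<(q|qq|a|aa|ref):([^<>]*)>")


def split_top_level_commas(value):
    """Split on commas that are not inside <...>."""
    parts, current, depth = [], [], 0
    for ch in value:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_respellings(value):
    """Return a list of (respelling, {modifier: text}) pairs."""
    result = []
    for part in split_top_level_commas(value):
        modifiers = {m.group(1): m.group(2) for m in MODIFIER_RE.finditer(part)}
        result.append((MODIFIER_RE.sub("", part), modifiers))
    return result
```

Applied to `ca<q:Mallorca, Menorca>,carr<q:Ibiza>`, the sketch returns two entries, ca with the qualifier "Mallorca, Menorca" and carr with the qualifier "Ibiza".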

Special symbols

  • Use a dot (.) to force a syllable break at the specified point. This is useful, for example, in words beginning with the prefix sub- followed by a vowel, l or r, as in subaltern (subordinate), where the default syllabification would be su.bal.tern, producing an incorrect pronunciation of the b as [β] instead of [p].
  • Use an underscore (_) to disable various sorts of context-dependent interpretations of letters. For example, writing b_l in a word like bíblia (bible) prevents gemination of the bl (see #Other hints below), and writing _bl would prevent both gemination and lenition to [β]. Another useful case is after è in final -è or -ès, which prevents transformation of the è into ê (see #Mid vowels above).
  • Use a tie symbol (‿) to join words where liaison occurs, as in Sant Antoni (St. Anthony), which should be respelled Sànt‿Antòni, causing the t to be pronounced as [t] instead of being made silent in Central Catalan.

Other hints

The letter <x>

The pronunciation of the letter x is ambiguous. The default is as follows:

  1. /ks/ in syllable codas directly following a vowel (e.g. extracció (extraction) and annex (annex));
  2. /ɡz/ in words beginning (h)ex- or inex- followed by a vowel or h (e.g. exèrcit (army), exhalar (to exhale), hexadecimal (hexadecimal), inexacte (inexact));
  3. /t͡ʃ/ in Valencian when word-initial or directly after a consonant other than /j/;
  4. /ʃ/ elsewhere.

If necessary, use respellings with ks, gz or ʃ. Thus, for fixar (to fasten, to decide), use {{ca-IPA|fiksar}} (or the shortcut {{ca-IPA|[ks]}}, using substitution notation, as described above), and for ídix (Yiddish), use {{ca-IPA|ídiʃ}} (or just {{ca-IPA|[ʃ]}}).
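The four default rules for x can be approximated at the level of letters. This is a simplified sketch with hypothetical names: it works on spelling rather than on syllabified phonemes, treats an i after a vowel as the glide /j/ (an assumption), ignores other glides, and checks the (h)ex-/inex- rule first so that exhalar is not caught by the coda rule.

```python
import re

VOWELS = "aeiouàèéíïòóúü"
GZ, KS, TSH, SH = "ɡz", "ks", "t͡ʃ", "ʃ"
# Words beginning (h)ex- or inex- followed by a vowel or h.
GZ_PREFIX = re.compile("^(h?ex|inex)(?=[" + VOWELS + "h])")


def is_vowel(ch):
    return ch != "" and ch in VOWELS


def default_x(word, index, dialect):
    """Default value of the x at word[index]; dialect is 'cen', 'bal' or 'val'."""
    word = word.lower()
    if word[index] != "x":
        raise ValueError("no x at that index")
    prev = word[index - 1] if index > 0 else ""
    nxt = word[index + 1] if index + 1 < len(word) else ""
    # Assumption: i directly after a vowel is the glide /j/, not a vowel.
    prev_is_glide = prev == "i" and index > 1 and is_vowel(word[index - 2])
    prev_is_vowel = is_vowel(prev) and not prev_is_glide
    match = GZ_PREFIX.match(word)
    if match and index == match.end() - 1:
        return GZ
    if prev_is_vowel and not is_vowel(nxt):
        return KS  # coda directly after a vowel
    if dialect == "val" and not prev_is_vowel and not prev_is_glide:
        return TSH  # word-initial or after a consonant other than /j/
    return SH
```

Under these assumptions, the x of fixar defaults to /ʃ/ and the x of ídix defaults to /ks/, which is why both need the respellings shown above.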

The spellings <bl> and <gl>

In Central Catalan, the spellings <bl> and <gl> between vowels sometimes indicate [βl]/[ɣl] and sometimes indicate [b.bl]/[ɡ.ɡl]. The default pronunciation is as follows:

  1. Geminate [b.bl]/[ɡ.ɡl] directly after a stressed vowel;
  2. single [βl]/[ɣl] after other vowels.

In general, words that are derived from or related to a term that contains <bl> or <gl> after a stressed vowel maintain the geminate pronunciation even if the preceding vowel is unstressed, as in reglament (rule) and població (population). These words need respelling with bbl or ggl (the substitution notation {{ca-IPA|[bbl]}} or {{ca-IPA|[ggl]}} is usually sufficient). This respelling does not affect the Valencian pronunciation, which maintains ungeminated pronunciations in these words. Contrariwise, a few words with <bl>/<gl> after a stressed vowel do not have geminate pronunciations (e.g. bíblia (bible)). To force a non-geminated pronunciation, use the respelling b_l or g_l (the substitution notation {{ca-IPA|[b_l]}} or {{ca-IPA|[g_l]}} is usually sufficient). This is a particular use of the underscore, which in general prevents special interpretation of the letters it stands between.
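The interaction between the default and the bbl/ggl and b_l/g_l respellings can be sketched for Central Catalan as follows. The function takes the stress of the preceding vowel as an argument rather than computing it, and all names are hypothetical.

```python
GEMINATE = {"b": "b.bl", "g": "ɡ.ɡl"}
SINGLE = {"b": "βl", "g": "ɣl"}


def central_cluster(respelled, after_stressed_vowel):
    """Central Catalan value of intervocalic bl, gl, bbl, ggl, b_l or g_l."""
    letter = respelled[0]
    if letter not in GEMINATE:
        raise ValueError("expected a cluster starting with b or g")
    if respelled in ("bbl", "ggl"):
        return GEMINATE[letter]  # explicit geminate respelling
    if respelled in ("b_l", "g_l"):
        return SINGLE[letter]  # underscore blocks gemination
    if respelled in ("bl", "gl"):
        # Default: geminate only directly after a stressed vowel.
        return GEMINATE[letter] if after_stressed_vowel else SINGLE[letter]
    raise ValueError("unrecognized cluster")
```

For població, the plain spelling bl after an unstressed vowel would default to the single pronunciation, so the respelling bbl is what produces the geminate.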

Unstressed words

The module has a list of all the normally unstressed words in Catalan (e.g. de, es, ni) and will not attempt to add a stress to them or complain about non-disambiguated mid vowels in such words. In addition, prefixes are normally treated as unstressed unless an accent is specifically given (in which case it will render as secondary stress). To force an interpretation of such words as stressed, simply respell with an explicit accent on the vowel as appropriate. Contrarily, to force interpretation of a word as unstressed (e.g. a contraction such as d'el, which the module doesn't know about, or a suffix that does not bear stress, such as -fob), put a dot over any of the vowels in the word: ȧ ė i̇ ȯ u̇ ẏ Ȧ Ė İ Ȯ U̇ Ẏ.
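Detecting these explicit markings in a respelling can be sketched with Unicode decomposition. This is an illustration with hypothetical names; treating acute, grave and circumflex as the stress-bearing accents is an assumption, and the module's list of normally unstressed words is not modelled.

```python
import unicodedata

DOT_ABOVE = "\u0307"
STRESS_MARKS = {"\u0300", "\u0301", "\u0302"}  # grave, acute, circumflex (assumed)


def explicit_stress(word):
    """Return 'unstressed', 'stressed', or None when nothing is marked."""
    marks = set(unicodedata.normalize("NFD", word))
    if DOT_ABOVE in marks:
        return "unstressed"  # a dot over any vowel forces an unstressed reading
    if marks & STRESS_MARKS:
        return "stressed"
    return None
```

A word with a dotted vowel, such as the suffix -fob written with ȯ, is classified as unstressed, while an unmarked word returns `None` and is left to the module's defaults.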

Secondary stress

Normally, all stressed vowels in a word other than the last one are converted to secondary stress. In addition, stressed vowels in prefixes are converted to secondary stress, and the portion of a word before adverbial -ment gets default secondary stress (see #Adverbs in -ment and part of speech hints above). These rules nearly always suffice to handle secondary stress, but in rare cases there are secondary stresses after the primary stress, which the defaults don't properly handle. In such a case, place a combining underscore underneath the vowel needing secondary stress: à̱ è̱ é̱ í̱ ò̱ ó̱ ú̱. An example of such a case is Valencian dèneu (nineteen), where the second e bears secondary stress with the pronunciation /ɛ/. For this example, write {{ca-IPA|val=dènè̱u}}.

Sources of pronunciation

There are four primary sources of pronunciation available on the Internet:

  1. The GDLC (Gran Diccionari de la llengua catalana, i.e. "Large Dictionary of the Catalan Language"), for Central Catalan. The pronunciations can be found in the bilingual Catalan-French section; see for example the section on the word subaltern (https://www.diccionari.cat/catala-frances/subaltern).
  2. The DNV (Diccionari normatiu valencià, i.e. "Normative Valencian Dictionary"), for Valencian. See for example the section on the word subaltern (https://www.avl.gva.es/lexicval/xhtml/dnv.xhtml?paraula=subaltern).
  3. The DCVB (Diccionari català-valencià-balear, i.e. "Catalan-Valencian-Balearic Dictionary"), for Balearic (as well as for Central Catalan and Valencian, but see below). See for example the section on the word subaltern (https://dcvb.iec.cat/results.asp?word=subaltern).
  4. esAdir for proper nouns and foreign words. See for example the section on the word Angola (https://esadir.cat/Toponims/Toponims_del_mon/Africa/Angola) and the section on the word kosher (https://esadir.cat/entrades/fitxa/node/kosher).

Note the following provisos:

  1. cawikt (Catalan Wiktionary) should not be trusted. To the extent it is correct, its pronunciations tend to be based on the DCVB, with all the resultant issues of this source, but it contains many mistakes beyond this.
  2. The DCVB should be trusted only for Balearic pronunciations (for Central Catalan and Valencian, use it only as a last resort). The DCVB is based on fieldwork from the early 20th century; hence a lot of the pronunciations are outdated. It should not be used authoritatively, especially for Central Catalan and Valencian. Pronunciation information is also somewhat spotty, especially for less-common terms (e.g. the above entry on subaltern (subordinate) does not include pronunciation). Note that the abbreviation or. (oriental) specifically means Central Catalan, not Balearic. For Balearic, look for Bal., Mall. (Mallorca), Men. (Menorca), Eiv. (Ibiza) or a specific city in the Balearic Islands, especially Palma (the capital of Mallorca) or Ciutadella (one of the major cities on Menorca); pronunciation for Maó (the other primary city on Menorca) is less useful because its dialect lacks /ə/.
  3. The DNV's coverage of secondary stress is inaccurate; use the GDLC for this information.
  4. The GDLC contains occasional typos. If you see a pronunciation that doesn't make sense (e.g. [b] when [β] would be expected, or mid-vowel qualities in rare words that contradict what would be expected from the ending), it may be a typo.
  5. esAdir's pronunciations are sometimes untrustworthy, especially of terms of Spanish origin, where the supposedly "Catalan" pronunciation given is actually a Spanish pronunciation. These can often be identified by the lack of expected vowel reduction in unstressed syllables and the presence of only high-mid vowels for stressed e and o.
  6. The DNV includes the pronunciation of the root vowel in root-stressed forms of verbs, especially those in -ar. The GDLC unfortunately does not include this information (nor does the DCVB, except in rare cases); however, it can often be inferred from the pronunciation of related nouns (e.g. subaltern in the case of the verb subalternar (to subordinate)). In addition, sometimes the Valencian pronunciation is a good indicator of the Central Catalan pronunciation; see below under #Dialect variation for more information.

Dialect variation

The template supports the pronunciation of three main dialects: Central Catalan (the pronunciation of the majority of Catalan-speaking areas, including Barcelona); Valencian (the pronunciation of Valencia, to the southwest of the main Catalan-speaking area); and Balearic (the pronunciation of the Balearic Islands, to the east of the main Catalan-speaking area). These dialects differ in various ways, most notably the pronunciation of vowels. For example, Central Catalan and (to a lesser extent) Balearic have reduction of unstressed vowels, while Valencian generally does not (and is more conservative in other ways as well). Stressed vowels may differ in complex ways between dialects, but the following generalizations can be noted:

  1. If the stressed vowel in Valencian is an open-mid vowel /ɛ/ or /ɔ/, the corresponding stressed vowel in Central Catalan is nearly always the same; conversely, if the stressed vowel in Central Catalan is a close-mid vowel /e/ or /o/, the corresponding stressed vowel in Valencian is nearly always the same. This is especially helpful in working out the root vowel pronunciation of verbs in -ar, because this information is not directly available on the Internet for Central Catalan (see #Sources of pronunciation above).
  2. If the stressed vowel in Central Catalan is /e/, /ɔ/ or /o/, the corresponding stressed vowel in Balearic is nearly always the same (especially in less-common terms). If the stressed vowel in Central Catalan is /ɛ/, however, the stressed vowel in Balearic may be either /ɛ/ or /ə/. Usually, the schwa /ə/ occurs in inherited terms and the mid vowel /ɛ/ occurs in borrowed terms, but there are several exceptions to this generalization, particularly for inherited terms (borrowed terms rarely have /ə/).
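These generalizations can be turned into a small helper that lists the plausible stressed vowels in one dialect given another. This is only a sketch with hypothetical names; the results are candidates to check against the sources above, since the text says "nearly always" rather than "always".

```python
def central_from_valencian(val_vowel):
    """Candidate Central Catalan stressed vowels given the Valencian one."""
    if val_vowel in ("ɛ", "ɔ"):
        return {val_vowel}  # open-mid in Valencian: nearly always the same
    if val_vowel == "e":
        return {"e", "ɛ"}
    if val_vowel == "o":
        return {"o", "ɔ"}
    return {val_vowel}


def valencian_from_central(cen_vowel):
    """Candidate Valencian stressed vowels given the Central Catalan one."""
    if cen_vowel in ("e", "o"):
        return {cen_vowel}  # close-mid in Central: nearly always the same
    if cen_vowel == "ɛ":
        return {"ɛ", "e"}
    if cen_vowel == "ɔ":
        return {"ɔ", "o"}
    return {cen_vowel}


def balearic_from_central(cen_vowel):
    """Candidate Balearic stressed vowels given the Central Catalan one."""
    if cen_vowel == "ɛ":
        return {"ɛ", "ə"}  # schwa mostly in inherited terms
    return {cen_vowel}
```

For example, a Valencian /ɔ/ points to Central /ɔ/, whereas a Central /ɛ/ leaves both /ɛ/ and /ə/ open for Balearic.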


Common mid-vowel suffixes

The following may be of aid in producing the correct mid-vowel specification for terms ending in common suffixes:

Suffix | Meaning | Examples | Vowel
-è | chemicals, minerals and other scientific terms in English -ene | benzè (benzene), piroxè (pyroxene), diplotè (diplotene), duodè (duodenum), eocè (Eocene), epicè (epicene) | ê (handled automatically)
-è | ordinals, corresponding to English -th | catorzè (fourteenth) | ê (handled automatically)
-è | demonyms | xilè (Chilean) | ê (handled automatically)
-enc | relational adjectives | abrilenc (of April) | ê
-enc | demonyms | atenenc (Athenian) | ê
-eny | demonyms | caribeny (Caribbean) | ê
-esc | relational adjectives; terms corresponding to English -ish and -esque | detectivesc (of detectives), quixotesc (Quixotic) | ê
-et, -eta | diminutives | aneguet (duckling) | ê
-oma | diseases, corresponding to English -oma; other scientific terms in English -ome | carcinoma (carcinoma), genoma (genome) | ó