This is the complete documentation for Nesca version 1.0.7
Nesca, a "sound change applier", applies transformation rules to words to change them. It can be used for historical or fictional sound changes, to spell words differently, or to convert words to other alphabets. Nesca is an easy to use but powerful tool for conlangers and linguists.
Nesca is one of the applications belonging to The Conlangers Suite. Nesca can also be installed for use in your own projects or as a command-line interface.
- Apply button to see Nesca apply your sound changes to your words. Yes, there are two Apply buttons
- Input words textbox is where you list all the words you want sound changes applied to
- Help button shows this document
- Output words textbox is where your changed words will appear
- Output words to your clipboard
- Output words to your system
- Word-list mode will produce a list of changed words
- Old-to-new mode will produce a list of changed words in the format old word -> new word
- Debug mode will show, line by line, each step in changing each word
- Input divider sets the delimiter, or in other words, what the content is between each input word. It is a newline by default. Use \n for newline
- Output divider sets the delimiter, or in other words, what the content is between each output word. It is a newline by default. Use \n for newline
- Sort words sorts the output words in alphabetical order, or the order defined in the alphabet: directive
- Editor wrap lines will make the definition-build editor jump to the next line if the line escapes the width of the editor
- Show keyboard will reveal a 'keyboard', a character selector, below the options. Clicking on a character will insert that character into the editor
- Themes dropdown to change the colour theme of the editor
- Save button to download your sound changes as a file called 'Nesca.txt', or whatever you named your file in the File name: field. The file is always a ".txt" type
- Load button to load a file on your system into the file editor
- Examples dropdown to load an example into the definition-build editor

A definition-build is composed of two top-level concepts: 'directives' and 'decorators'.
Directives are laid out like blocks and define the functions of Nesca. The primary directive is the stage directive, which modifies each word with transforms. The other directives define concepts that are used by this primary directive.
Directives are written with their name on a newline, followed by a colon : on the same line, then followed by a newline. The payload after declaring a directive is interpreted according to the directive's semantics. A directive ends when a new directive begins, or when there are no more lines in the definition-build. For example:
```
stage:
```

Decorators change a property of a directive to modify the directive's behaviour.
Decorators start on a new line above the directive they are modifying with an at sign @, followed by the directive, a ., the property, optional whitespace, =, optional whitespace, and then the new value of the property. Or just the property if it's a boolean flag. For example:
```
@stage.name = "Latin-to-Portuguese"
stage:
```

To disable any directive, use the disabled flag decorator. This has the same effect as commenting out all the lines inside the directive:
```
@stage.disabled
stage:
```

If a line contains a semicolon ; everything after it on that line is ignored and not interpreted as Nesca syntax -- unless ; is escaped. You can use this to leave notes about what something does or why you made certain decisions.
Graphemes are the indivisible meaningful characters that make up a word in Nesca. Phonemes can be thought of as graphemes. Using the English words sky and shy to illustrate this: sky is made up of the graphemes s + k + y, while shy is made up of sh + y.
A single-length character following the escape character \ loses any meaning it might have had in the program; this includes backslashes themselves. This way, anything -- including capital letters that have already been defined as categories, and brackets (but not whitespace) -- can be a grapheme.
These are the characters you must escape if you want to use them in the stage directive:
| Characters | Meaning |
|---|---|
| ; | Comment |
| \ | Escapes the character after it |
| <routine, = and > | A routine is placed after the equals sign |
| < and a space | Begins a cluster-field |
| &[ and ] | Named escape |
| >>, ->, =>, ⇒ or → | Indicates change |
| , | Separates choices |
| { and } | Alternator-set |
| ( and ) | Optionalator-set |
| C, D, K, ... | Any one-length capital letter can refer to a category |
| [ and ] | Feature matrix |
| ^ | Insertion when in TARGET, deletion when in REPLACEMENT |
| 0 | Rejects a word |
| ! or // | An exception follows this character |
| _ | A reference to the target |
| # | Word boundary |
| $ | Syllable boundary |
| + | Quantifier, matches 1 or more of the previous grapheme |
| ?[ and ] | Bounded quantifier |
| : | Ditto-mark, duplicates the previous grapheme |
| * | Wildcard, matches exactly 1 of any grapheme |
| %[ and ] | Anythings-mark, matches 1 or more wildcards |
| %[, \| and ] | Anythings-mark with degrees of 'laziness' and 'cowardliness' |
| &T | Target-mark |
| &M | Metathesis-mark |
| &E | Empty-mark |
| &= | Begins reference-capture of a sequence of graphemes |
| = and a positive digit | Reference-capture |
| A positive digit | Reference-mark |
| ~ | Based-mark |
Named escapes, enclosed in &[ and ], allow spaces and combining diacritics to be used without needing to insert those characters directly.
The supported characters are:
| Escape Name | Unicode Character |
|---|---|
| &[Space] | |
| &[Tab] | |
| &[Newline] | |
| &[Acute] | ◌́ |
| &[DoubleAcute] | ◌̋ |
| &[Grave] | ◌̀ |
| &[DoubleGrave] | ◌̏ |
| &[Circumflex] | ◌̂ |
| &[Caron] | ◌̌ |
| &[Breve] | ◌̆ |
| &[BreveBelow] | ◌̮ |
| &[InvertedBreve] | ◌̑ |
| &[InvertedBreveBelow] | ◌̯ |
| &[TildeAbove] | ◌̃ |
| &[TildeBelow] | ◌̰ |
| &[Macron] | ◌̄ |
| &[MacronBelow] | ◌̠ |
| &[MacronBelowStandalone] | ◌˗ |
| &[Dot] | ◌̇ |
| &[DotBelow] | ◌̣ |
| &[Diaeresis] | ◌̈ |
| &[DiaeresisBelow] | ◌̤ |
| &[Ring] | ◌̊ |
| &[RingBelow] | ◌̥ |
| &[Horn] | ◌̛ |
| &[Hook] | ◌̉ |
| &[CommaAbove] | ◌̓ |
| &[CommaBelow] | ◌̦ |
| &[Cedilla] | ◌̧ |
| &[Ogonek] | ◌̨ |
| &[VerticalLineBelow] | ◌̩ |
| &[VerticalLineAbove] | ◌̍ |
| &[DoubleVerticalLineBelow] | ◌͈ |
| &[PlusSignBelow] | ◌̟ |
| &[PlusSignStandalone] | ◌˖ |
| &[uptackBelow] | ◌̝ |
| &[UpTackStandalone] | ◌˔ |
| &[LeftTackBelow] | ◌̘ |
| &[rightTackBelow] | ◌̙ |
| &[DownTackBelow] | ◌̞ |
| &[DownTackStandalone] | ◌˕ |
| &[BridgeBelow] | ◌̪ |
| &[BridgeAbove] | ◌͆ |
| &[InvertedBridgeBelow] | ◌̺ |
| &[SquareBelow] | ◌̻ |
| &[SeagullBelow] | ◌̼ |
| &[LeftBracketBelow] | ◌͉ |
If you are using this, you should be very interested in the Compose routine.
Categories are declared inside the categories directive, one per line. A category is a set of graphemes with a key. The key is a single-length capital letter. For example:
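A sketch of such a declaration; the consonant inventory is the one used later in this section, while the vowel inventory here is illustrative:

```
categories:
C = t, n, k, m, ch, l, ꞌ, s, r, d, h, w, b, y, p, g
V = a, i, e, o
```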
This creates two groups of graphemes. C is the group of all consonants and V is the group of all vowels.
These graphemes are separated by commas; an alternative is to use spaces: C = t n k m ch l ꞌ s r d h w b y p g.
Need more than 26 categories? Nesca supports the following additional characters as the key of a category or unit: Á Ć É Ǵ Í Ḱ Ĺ Ḿ Ń Ó Ṕ Ŕ Ś Ú Ẃ Ý Ź À È Ì Ǹ Ò Ù Ẁ Ỳ Ǎ Č Ď Ě Ǧ Ȟ Ǐ Ǩ Ľ Ň Ǒ Ř Š Ť Ǔ Ž Ä Ë Ḧ Ï Ö Ü Ẅ Ẍ Ÿ Γ Δ Θ Λ Ξ Π Σ Φ Ψ Ω
You can use categories inside categories, as long as the referenced category has previously been defined. For example:
```
categories:
L = aa, ii, ee, oo
V = a, i, e, o, L
```

For each input word, the schema directive makes it possible to split the word into fields such as "word" and "class", along with any other fields in the input word, such as "meaning" or "gloss".
For example, let's say we have input words in this format, and this is one of the words:
To indicate where a field is, the field is put between less-than and greater-than signs. Any other characters are parsed as delimiting characters. word is a required field. For example:
```
schema:
input = <word> [<meaning>] "<class>"
output = <word> "<class>"
```
This means the input has the following fields: word, meaning and class, split by the delimiters [, ] " and ".
The word will then be recomposed according to the format you choose in output. You can leave out any fields you do not want.
The alphabet directive provides a custom alphabetisation order for words when the Sort words checkbox is selected.
```
alphabet:
a, b, c, e, f, h, i, k, l, m, n, o, p, p', r, s, t, t', y
```

This would order generated words like so: cat, chat, cumin, frog, tray, t'a, yanny
Sometimes you will want characters, such as syllable dividers, to be invisible to alphabetisation order. You can do this by listing these characters in the invisible directive.
```
invisible:
., '
```

This will make the generated words za'ta, 'ba.ta, 'za.ta be reordered into: 'ba.ta, za'ta, 'za.ta
The graphemes directive dictates which (multi)graphs, including character + combining diacritics, are to be treated as grapheme units when using transformations.
```
graphemes:
a, b, c, ch, e, f, h, i, k, l, m, n, o, p, p', r, s, t, t', y
```

In the above example, we defined ch as a grapheme. This would stop a rule such as c -> g from changing the word chat into ghat, but it would still change cobra into gobra.
Which graphemes are 'associatemes' of their 'bases' is also declared in the graphemes directive. Read more about this in the associatemes section of this documentation.
Once words are generated, you might want to modify them to prevent certain sequences, outright reject certain words, or simulate historical sound changes. This is the purpose of transforms, which are all declared in the stage directive:
```
stage:
; Your transforms go here
```

The default transform is a rule. These should be familiar to anyone who knows a little about phonological rules. The two other types of transforms are cluster-fields and routines.
A rule can be summarised by its fields: CHANGE / CONDITION ! EXCEPTION. The characters / and ! that precede each field (except the CHANGE) signal the start of that field. For example, including a ! signals that the rule contains an exception, and all text following it until the next field marker is interpreted as such.
Every rule begins on a new line and must contain a CHANGE. The CONDITION or EXCEPTION fields are optional.
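As a hypothetical sketch of the full shape (the graphemes are chosen purely for illustration):

```
a -> e / _t ! s_
; CHANGE is a -> e, the CONDITION requires a following t,
; and the EXCEPTION blocks the change after s
```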
If you want to capture graphemes that are normally syntax characters in transforms, you will need to escape them.
When this document uses examples to explain transformations, the last comment shows an example word transforming. For example, ; amda ==> ampa means the rule will transform the word amda into ampa.
Using a decorator on a stage, you can name that stage. In debug mode, the name of the stage will be printed out when each word is being processed on that stage.
```
@stage.name = My Transform Stage
stage:
; Your transforms go here
```

The format of a rule's CHANGE can be expressed as TARGET -> REPLACEMENT.
- TARGET specifies which part of the word is being changed
- -> can be swapped with either >>, =>, ⇒ or → if you prefer
- REPLACEMENT is what TARGET is changing into, or in other words, replacing

Let's look at a simple unconditional rule:
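Consistent with the explanation that follows, such a rule can be sketched as:

```
o -> x
```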
In this rule, we see every instance of o become x.
Concurrent change is achieved by listing multiple graphemes in TARGET separated by commas, and listing the same amount of replacement graphemes in REPLACEMENT separated by commas. Changes in a concurrent change execute at the same time:
Notice that the above example is different to the example below:
Where each change is on its own line, o merges with a, then a becomes o.
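The difference can be sketched hypothetically (graphemes chosen to match the descriptions above):

```
o, a -> a, o
; concurrent: o and a swap places
```

versus, on separate lines:

```
o -> a
a -> o
; sequential: o first merges with a, then every a becomes o
```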
Instead of listing each REPLACEMENT in a concurrent change, we can instead list just one that all the TARGETs will merge into:
This is equivalent to:
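A hypothetical sketch of the shorthand and its expansion (graphemes chosen for illustration):

```
o, u -> a
; is equivalent to:
o, u -> a, a
```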
To remove, or in other words, reject a word, you use a zero 0 in REPLACEMENT:
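A sketch consistent with the description below:

```
a, bi -> 0
```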
In the above example, any word that contains a or bi will be rejected.
Insertion requires a condition to be present, and for a caret ^ to be present in TARGET, representing nothing.
Deletion happens when ^ is present in REPLACEMENT:
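Hypothetical sketches of both (graphemes chosen for illustration):

```
^ -> e / t_t    ; insertion: an e appears between two t's
h -> ^          ; deletion: every h is removed
```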
Conditions follow the change and are placed after a forward slash. When a transform has a condition, the target must meet the environment described in the condition to execute.
The format of a condition is / BEFORE_AFTER
- / begins a condition
- BEFORE is anything in the word before the target
- _ is a reference to the target in a condition
- AFTER is anything in the word after the target

For example:
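A hypothetical sketch (graphemes chosen for illustration):

```
n -> m / _p
; n becomes m only when p follows: anpa ==> ampa
```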
Multiple conditions for a single rule can be made by separating each condition with additional forward slashes. The change will happen if it meets either, or both of the conditions:
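A hypothetical sketch (graphemes chosen for illustration):

```
a -> e / p_ / _t
; applies after p, or before t, or both
```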
Hash # matches to word boundaries. Either the beginning of the word if it is in BEFORE, or the end of the word if it is in AFTER
Dollar-sign $ matches the character ., or any of the syllable-divider graphemes stated in the syllable-boundaries directive; failing those, it tries to match a word boundary: either the beginning of the word if it is in BEFORE, or the end of the word if it is in AFTER.
The syllable-boundaries directive lets you define which graphemes are to be treated as a syllable-boundary:
```
syllable-boundaries:
., '
stage:
o -> x / p_p$
```

Exceptions are placed following an exclamation mark ! and go after the condition, if there is one. Exceptions function exactly like the opposite of the condition -- when a rule has an exception, the target must meet the environment described in the exception to prevent execution:
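A sketch consistent with the explanation that follows (graphemes chosen for illustration):

```
a -> o / _a ! _a#
```

The condition requires a following a, and the exception blocks the change when that following a ends the word.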
In the above example, the transformation will not execute if aa is at the end of the word.
If there are multiple exceptions, the transform must meet all of the exceptions for it not to execute.
An alternative to using an exclamation mark is to use two forward slashes //.
These are sets just like the sets in word-creation, but they cannot be nested.
Enclosed in curly braces { and }, only one item in an alternator-set will be part of each sequence. For example:
The above example is equivalent to:
These can also be used in exceptions and conditions:
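A hypothetical sketch of an alternator-set in a condition (graphemes chosen for illustration):

```
a -> e / _{p, t, k}
; a becomes e before any of p, t or k
```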
Items in an optionalator-set, enclosed in ( and ), can be captured whether or not they appear as part of a grapheme or as part of a sequence of graphemes:
An optionalator-set can also attach to an alternator-set:
An optionalator-set cannot be used on its own; it must be connected to other content.
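A hypothetical sketch (graphemes chosen for illustration):

```
ta(n) -> su
; both ta and tan become su
```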
You can reference categories in transforms. The category will behave in the same way as an alternator set:
; xapay ==> apa

If the category is inside a set, it MUST be listed as an item on its own:
; xvayazv ==> aya

This is to say {Bz}v -> ^ is invalid.
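The valid form, sketched with a hypothetical category B, lists the category as its own item inside the set:

```
{B, z}v -> ^
```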
Let's say you had the grapheme, or rather, phoneme /i/ and wanted to capture it by its distinctive vowel features, +high, -round and +front, and turn it into a phoneme marked with +high, -round and +back features, perhaps /ɯ/. The features directive and feature matrices let you do this. The features can be described as binary and 'fully-specified'.
The key of all features must consist of lowercase letters a to z, uppercase letters A to Z, ., - or +.
A feature prepended with a plus sign + is a 'pro-feature'. For example +voice. We can define a set of graphemes that are marked by this feature by using this pro-feature. For example:
A feature prepended with a minus sign - is an 'anti-feature'. For example -voice. We can define a set of graphemes that are marked by a lack of this feature by using this anti-feature. For example:
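A hypothetical sketch of both kinds of definition (graphemes chosen for illustration):

```
features:
+voice = b, d, g, z
-voice = p, t, k, s
```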
A feature prepended with a greater-than sign > is a 'para-feature'. A para-feature is simply a pro-feature whose anti-feature is automatically made up of the graphemes in the graphemes: directive that are not marked by the para-feature:
Is equivalent to the below example:
'Where does this leave graphemes that are not marked by either the pro-feature or the anti-feature of a feature?', you might ask. Such graphemes are unmarked by that feature.
Features can be referenced inside features. For example:
Use a caret in front of a grapheme to ensure that that grapheme is not part of the pro/anti/para-feature. In the example above, the pro-feature +non-yod is composed of the graphemes a and o -- the grapheme i is not part of this pro-feature. Due to the recursive nature of nested features, this removed grapheme will be removed... aggressively. For example, if +non-yod were referenced in a different feature, that feature would also never have i as a grapheme.
To capture graphemes that are marked by features in a transform, the features must be listed in a 'feature-matrix' surrounded by [ and ]. The graphemes in a word must be marked by each pro-/anti-feature in the feature-matrix to be captured. For example if a feature-matrix [+high, +back] captures the graphemes: u, ɯ, another feature-matrix [+high, +back, -round] would capture ɯ only.
The very simple example below is written to change all voiceless graphemes that have a voiced counterpart into their voiced counterparts:
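A sketch of such a rule:

```
[-voice] -> [+voice]
```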
In this rule, in REPLACEMENT, [+voice] has a symmetrical one-to-one change of graphemes from the graphemes in [-voice] in TARGET, leading to a concurrent change. Let's quickly imagine a scenario where the only [+voice] grapheme was b. The result will be a merging of all -voice graphemes into b: tamepfa ==> bamebba.
It should be noted that feature-matrices in TARGET have no carryover to feature-matrices in REPLACEMENT. For example, with a bogus rule such as o -> [+high], the program will not try to transform o into its [+high] counterpart; it will try to replace o with some grapheme marked as [+high], and will probably fail unless only one grapheme is marked as [+high].
If the category is inside a set, it MUST be listed as an item on its own:
```
{[+example], z}v -> ^
; xvayazv ==> aya
```

This is to say {[+example]z}v -> ^ is invalid.
Feature-fields allow graphemes to be easily marked by multiple features in table format.
The graphemes being marked by the features are listed on the first row. The features are listed in the first column.
For example:
```
feature-field:
m n p b t d k g s h l j
voice + + - + - + - + - - + +
plosive - - + + + + + + - - - -
nasal + + - - - - - - - - - -
fricative - - - - - - - - + + - -
approx - - - - - - - - - - + +
labial + - + + - - - - - - - -
alveolar - + - - + + - - + - + -
palatal - - - - - - - - - - - +
velar - - - - - - + + - - - -
glottal - - - - - - - - - + - -
```

```
feature-field:
a e i o
high - - + -
mid - + - +
low + - - -
front - + + -
back + - - +
round - - - +
```

- + means to mark the grapheme by that feature's pro-feature
- - means to mark the grapheme by that feature's anti-feature
- . means to leave the grapheme unmarked by that feature

Here are some matrices of these features and which graphemes they would capture:
- [+plosive] captures the graphemes b, d, g, p, t, k
- [+voice, +plosive] captures the graphemes b, d, g
- [+voice, +labial, +plosive] captures the grapheme b

Wildcards and the like in this section are special tokens that can represent arbitrary amounts of arbitrary graphemes. This is especially useful when you don't know precisely how many graphemes, or of what kind, there will be between two target graphemes in a word.
Quantifier, using +, will match the grapheme to its left once or as many times as possible. Quantifier cannot be used in REPLACEMENT:
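A rule consistent with the example result below (each run of a's collapses into one o):

```
a+ -> o
```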
; raraaaaa ==> roro

The bounded quantifier matches the thing to its left as many times as the digit(s) enclosed in ?[ and ].
; ororrro ==> ororrrx

The digits in the quantifier can also be a range:
; tootooooo ==> txtxo

At the beginning of the list, , represents all the possible numbers lower than the number to its right, not including zero.
; tootooooo ==> txtx

And finally, at the end of the list, , represents all possible numbers larger than the number to its left.
; toootooooo ==> toootx

A bounded quantifier can be used in REPLACEMENT as long as there is a definite maximum quantity. Or in other words, you cannot produce an infinite amount of something!
Ditto-mark using colon :, will duplicate the grapheme, or grapheme from a set or category, to the left of it. In other words, you can capture an item only when it is doubled using the ditto-mark:
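A rule consistent with the example result below (only a doubled a is captured):

```
a: -> o
```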
; aaata => oata

A ditto-mark can be used in REPLACEMENT:
; tat => taat

Wildcard, using asterisk *, will match once to any grapheme. Wildcard does not match word boundaries. Wildcard cannot be used in REPLACEMENT:
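A rule consistent with the example result below:

```
* -> x / _*
```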
; Any grapheme becomes <x> when any grapheme follows it
; aomp ==> xxxp

Wildcard can be placed by itself inside an optionalator (*), thereby allowing it to match nothing as well.
The anythings-mark uses percent sign % and a pair of square brackets [ and ]. It will match any grapheme as many (but not zero) times as possible. For example:
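A rule consistent with the example result below:

```
b%[] -> x
```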
; abitto => ax

As we can see, the rule matched b and greedily matched every and any grapheme after it.
The example below uses an anythings-mark in the condition:
By listing graphemes and grapheme sequences inside the square brackets, we can alter the "greedy" behaviour of an anythings-mark with degrees of "laziness" and 'cowardliness'.
Consuming negative lookahead, AKA "laziness":
Sometimes it is necessary for the anythings-mark to consume graphemes we are monitoring for, and then stop consuming:
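A rule consistent with the example result below (the spread stops after consuming the first t):

```
b%[t] -> x
```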
; babitto => xto

As we can see, the rule matched b followed by anything else until it reached the first t, consumed that, then stopped matching. This behaviour in Regular Expression terminology is called "lazy".
As already stated, the items to check for greediness can be a sequence of graphemes:
; batitro => xo

Sets, categories and features can also be used when monitoring for laziness and cowardliness:
Negative lookahead, AKA 'cowardliness':
Sometimes it is necessary for graphemes to block the spread without having them be consumed, which I have dubbed 'cowardliness'. To do this, put a pipe | after the lazy items and list the cowardly items. For example, we might want the graphemes k or g to prevent the rightward spread of nasal vowels to non-nasal vowels:
A cluster-field is a way to target sequences of graphemes and change them. Cluster-fields are laid out as tables, and start with < followed by a space. The first part of a sequence is in the first column, and the second part is in the first row. The cluster-field ends with a > on its own line. For example:
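A hypothetical sketch matching the description below (clusters chosen for illustration):

```
< p t
n mp +
m + nt
>
```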
np becomes mp and mt becomes nt.

- + means to not change the target cluster at all
- 0 to reject the word if it contains that sequence
- ^ to delete the target sequence

This is the advanced section. It presents solutions to edge-cases and novel systems to achieve the desired forms of words.
The routine transform provides useful functions that you can call at any point in the transform block. You call a routine on a new line with <routine, optional space, =, optional space, the routine's name, and a closing >.
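For example, following that syntax, calling the compose routine looks like this:

```
<routine = compose>
```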
The routines are:
- decompose will break down all characters in a word into their "Unicode Normalization, Canonical Decomposition" form. For example, ñ as a singular Unicode entity, \u00F1, will be broken down into a sequence of two characters, \u006E + \u0303
- compose does the opposite of decompose. It converts all characters in a word to the "Unicode Normalization, Canonical Decomposition followed by Canonical Composition" form. For example, ñ as two characters, \u006E + \u0303, will be transformed into one character, \u00F1
- capitalise will convert the first character of a word to uppercase
- decapitalise will convert the first character of a word to lowercase
- to-uppercase will convert all characters of a word to uppercase
- to-lowercase will convert all characters of a word to lowercase
- reverse will reverse the order of graphemes in a word
- xsampa-to-ipa will convert characters of a word written in X-SAMPA into IPA. ipa-to-xsampa will convert them back
| X-SAMPA | IPA |
|---|---|
| b_< | ɓ |
| d_< | ɗ |
| d` | ɖ |
| g_< | ɠ |
| h\ | ɦ |
| j\ | ʝ |
| l\ | ɺ |
| l` | ɭ |
| n` | ɳ |
| p\ | ɸ |
| r\ | ɹ |
| r\` | ɻ |
| r` | ɽ |
| s\ | ɕ |
| s` | ʂ |
| t` | ʈ |
| x\ | ɧ |
| z\ | ʑ |
| z` | ʐ |
| A | ɑ |
| B | β |
| B\ | ʙ |
| C | ç |
| D | ð |
| E | ɛ |
| F | ɱ |
| G | ɣ |
| G\ | ɢ |
| G\_< | ʛ |
| H | ɥ |
| H\ | ʜ |
| I | ɪ |
| J | ɲ |
| J\ | ɟ |
| J\_< | ʄ |
| K | ɬ |
| K\ | ɮ |
| L | ʎ |
| L\ | ʟ |
| M | ɯ |
| M\ | ɰ |
| N | ŋ |
| N\ | ɴ |
| O | ɔ |
| O\ | ʘ |
| v\ | ʋ |
| P | ʋ |
| Q | ɒ |
| R | ʁ |
| R\ | ʀ |
| S | ʃ |
| T | θ |
| U | ʊ |
| V | ʌ |
| W | ʍ |
| X | χ |
| X\ | ħ |
| Y | ʏ |
| Z | ʒ |
| " | ˈ◌ |
| % | ˌ◌ |
| : | ◌ː |
| :\ | ◌ˑ |
| @ | ə |
| @\ | ɘ |
| @` | ɚ |
| { | æ |
| } | ʉ |
| 1 | ɨ |
| 2 | ø |
| 3 | ɜ |
| 3\ | ɞ |
| 4 | ɾ |
| 5 | ɫ |
| 6 | ɐ |
| 7 | ɤ |
| 8 | ɵ |
| 9 | œ |
| & | ɶ |
| ? | ʔ |
| ?\ | ʕ |
| <\ | ʢ |
| >\ | ʡ |
| ^ | ꜛ |
| ! | ꜜ |
| !\ | ǃ |
| \| | \| |
| \|\ | ǀ |
| \|\| | ‖ |
| \|\\\|\ | ǁ |
| =\ | ǂ |
| -\ | ‿ |
latin-to-hangul converts, or rather, transliterates characters written in an arbitrary romanisation into Hangul Jamo blocks. hangul-to-latin converts them back.
| A romanisation | Initial | Final |
|---|---|---|
| k | ㄱ | ㄱ |
| gk | ㄲ | ㄲ |
| n | ㄴ | ㄴ |
| t | ㄷ | ㄷ |
| dt | ㄸ | |
| r | ㄹ | ㄹ |
| m | ㅁ | ㅁ |
| p | ㅂ | ㅂ |
| bp | ㅃ | |
| s | ㅅ | ㅅ |
| z | ㅆ | ㅆ |
| c | ㅈ | ㅈ |
| j | ㅉ | |
| ch | ㅊ | ㅊ |
| kh | ㅋ | ㅋ |
| th | ㅌ | ㅌ |
| ph | ㅍ | ㅍ |
| x | ㅎ | ㅎ |
| gn | | ㅇ |
| A romanisation | Hangul |
|---|---|
| a | ㅏ |
| ẹ | ㅐ |
| ọ | ㅓ |
| e | ㅔ |
| o | ㅗ |
| u | ㅜ |
| ụ | ㅡ |
| i | ㅣ |
| wa | ㅘ |
| wẹ | ㅙ |
| wọ | ㅝ |
| we | ㅞ |
| wi | ㅚ |
| uí | ㅟ |
| ụí | ㅢ |
| ya | ㅑ |
| yẹ | ㅒ |
| yọ | ㅕ |
| ye | ㅖ |
| yo | ㅛ |
| yu | ㅠ |
When there is no initial to be found, the jamo will be given the initial ieung ㅇ. Forming an initial for the next jamo is preferred over creating a final for the current jamo.
latin-to-greek converts, or rather, transliterates characters written in an arbitrary romanisation into Greek letters. greek-to-latin converts them back.
| Latin | Greek |
|---|---|
| a | α |
| á | ά |
| à | ὰ |
| e | ε |
| é | έ |
| è | ὲ |
| ẹ | η |
| ẹ́ | ή |
| ẹ̀ | ὴ |
| i | ι |
| í | ί |
| ì | ὶ |
| o | ο |
| ó | ό |
| ò | ὸ |
| ọ | ω |
| ọ́ | ώ |
| ọ̀ | ὼ |
| u | υ |
| ú | ύ |
| ù | ὺ |
| b | β |
| d | δ |
| f | φ |
| g | γ |
| k | κ |
| l | λ |
| m | μ |
| n | ν |
| p | π |
| r | ρ |
| s | σ |
| t | τ |
| x | χ |
| z | ζ |
| q | ξ |
| þ | θ |
| ṕ | ψ |
| c | ϛ |
| č | ͷ |
| h | ͱ |
| j | ϳ |
| š | ϸ |
| w | ϝ |
A target-mark is a reference to the captured TARGET graphemes. It uses an ampersand and a capital T: &T. It cannot be used in TARGET.
Here are some examples where target-mark is employed:
Full reduplication:
"Haplology":
Reject a word when a word-initial consonant is identical to the next consonant:
Simple metathesis uses an ampersand and a capital M, &M, in REPLACEMENT. This will swap the first and last grapheme of the captured TARGET graphemes:
Since the metathesis reference swaps the first and last grapheme, we can effectively simulate long-distance metathesis using an anythings-mark:
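A hypothetical sketch (assuming a category C of consonants has been defined):

```
C%[]C -> &M
```

This captures a consonant, any graphemes between, and another consonant, then swaps the two consonants.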
An empty-mark, using &E, inserts an 'empty' grapheme into the captured TARGET graphemes. It is only allowed in TARGET.
One use for it is a trick to make one-place long-distance metathesis work, for example:
Sometimes graphemes must be copied or asserted to be a certain grapheme between other graphemes. This is the purpose of 'reference'. Reference is fairly straightforward, but there is a lot of jargon and different behaviour between fields to explain.
A grapheme (or graphemes) is bound to a reference using a 'reference-capture' to the right of some grapheme. A reference-capture looks like = followed by a single-digit positive number. This number is called the 'reference-key' of the reference. The grapheme (or graphemes) bound to the reference is called the 'reference-value'.
The key behaviours of reference-capture are:
The captured grapheme can then be reproduced elsewhere in the rule with a 'reference-mark', even before the reference-capture. The reference-mark invokes the reference-key of a reference.
The key behaviours of reference-mark are:
- … the TARGET of a rule
- a -> e / 1x=1_ is invalid, and so is a -> e / 1_x=1. Reference is not recursive in conditions and exceptions.

Here are some examples:
In the rule above, we are binding the [+vowel] feature-matrix to the reference 1, by appending =1 to it. Whatever this grapheme from [+vowel] is when the condition is met is captured as the value of 1. The value of reference 1 is then matched in AFTER by invoking its reference-mark.
In the rule above, we are binding the [+vowel] feature-matrix to the reference 1, by appending =1 to it. Whatever this grapheme from [+vowel] is when the condition is met is the value of 1. Then the value of 1 is inserted into REPLACEMENT by invoking its reference-mark.
Now that 'reference-capture' and 'reference-mark' have been (hopefully) introduced and explained adequately, let's explain how to capture and reference a sequence of graphemes.
To start capturing a sequence, you use a 'start-reference-capture', &=, before the graphemes to be captured. Then, at the end of the graphemes to be captured, a 'reference-capture' is used to bind those graphemes to a reference:
If your language encodes tone, stress, breathy voice, or other phonological features directly on vowels, you'll often need to target a particular grapheme across its variants.
One method is to target each variant manually:
This workaround uses alternators, but lacks semantic clarity and scalability, and is outright tedious.
To solve this, there are 'associatemes': aligned variant graphemes associated with their base grapheme and with the other associated graphemes. Other SCAs might use the terms "floating diacritics" or "autosegmentals". Associatemes allow you to target all forms of a grapheme with a single token. To set up associatemes, they must be stated in the graphemes directive, with the base associateme set of an entry inside curly braces, and each variant set in curly braces with a < to the left of the base set, or of another variant set, like so:
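A sketch of such a declaration (acute-accented vowels declared as variants of their plain bases):

```
graphemes:
{a, i, o}<{á, í, ó}
```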
The behaviour of associatemes are:
- {a,b,c}<{x,y,z} is valid
- {a,i,o}<{á,í,ó}, {a,b,c}<{x,y,z} is valid

In a rule, you then put a tilde after the grapheme to mark it as a base associateme. This is called a 'based-mark'. For example:
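A hypothetical sketch (assuming a declaration such as {a,i,o}<{á,í,ó} in the graphemes directive):

```
a~ -> e
```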
This transform targets all variants of a and carries over that association to e.
This program can change the case of letters or the whole word with routines or with paragraph mode. However your language may not have the expected correspondences between lowercase and uppercase. Some examples:
Turkish: The lowercase i becomes uppercase İ, and the uppercase I becomes lowercase ı.
Some Polynesian languages: the vowel after an "okina" is capitalised instead of the okina itself.
Some styles in a few European languages: Sometimes both letters of a digraph will be capitalised.
To accommodate these special cases, you can define a letter case field directive. For example:
Blocks modify the behaviour of transforms that are inside them with 'condition and event' logic.
They begin with <@ at the beginning of a line, and end with a > at the beginning of a line.
A chance block gives the transformations inside it a chance of occurring or not. Chance blocks cannot be nested. This is useful for sporadic sound change.
The above example's transforms have a 60% chance of occurring on each word.