Lexiguru documentation

Contents

  1. About Lexiguru
  2. Interface
    1. Options
    2. File save / load
  3. Using comments
  4. Word creation
    1. Classes
      1. Weights
      2. Class-drop-off
      3. Classes in classes and "pick-one"
    2. Macros
      1. Pick-one
      2. Optional
      3. The random-weight: directive
      4. Inter-pick-one
    3. Words
      1. Word-drop-off
    4. The graphs: directive
    5. Escape characters for word creation

1 About Lexiguru

This is the complete documentation for Lexiguru version b2.0.1.

Lexiguru is an online application that randomly generates words from a given definition of phonemes, frequencies and word patterns. Applications like Lexiguru are called "word generators" or "vocabulary generators".

You can use it to make words for a constructed language, to get an original nickname or password, or just for fun.

2 Interface

2.1 Options

2.2 File save / load

3 Using comments

If a line contains a ;, everything after it on that line is ignored and not interpreted as Lexiguru syntax, unless the ; is escaped. You can use this to leave notes about what something does or why you made certain decisions.
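The comment rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described here, not Lexiguru's own code: the function name strip_comment is hypothetical, and it assumes that "escaped" means enclosed in double quotes, as described in the section on escape characters.

```python
def strip_comment(line: str) -> str:
    """Return the part of a definition line that precedes an unescaped ';'."""
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            # Assumption: double quotes open and close an escaped region.
            in_quotes = not in_quotes
        elif ch == ";" and not in_quotes:
            return line[:i].rstrip()
    return line.rstrip()


print(strip_comment("V = a i e u o ; five vowels"))
print(strip_comment('C = t n ";" k ; the quoted semicolon is kept'))
```

The first call keeps only the class definition; the second keeps the quoted ; and drops the trailing note.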

4 Word creation

4.1 Classes

Classes are groups of phonemes with single-character names. For example:

C = t n k m ch l ʔ s r d h w b y p g
F = n l ʔ t k r p
V = a i e u o

This creates three groupings. C is the group of all consonants, V is the group of all vowels, and F is a group of some of the consonants.

By default, the phonemes' frequencies decrease as they go to the right, according to the Gusein-Zade distribution. In the above example, when Lexiguru needs to choose a V, it will choose a most often at 43%, i second at 26%, e third at 17%, u fourth at 10%, and o fifth at 4%.
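The percentages quoted for V can be reproduced with the usual form of the Gusein-Zade distribution, in which the item at rank r out of n gets a weight proportional to ln((n + 1) / r). The Python sketch below assumes Lexiguru normalises these weights so that they sum to 1; the function name gusein_zade is made up for this illustration.

```python
import math


def gusein_zade(n: int) -> list[float]:
    """Normalised Gusein-Zade shares for ranks 1..n."""
    raw = [math.log((n + 1) / rank) for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


vowels = ["a", "i", "e", "u", "o"]
for phoneme, share in zip(vowels, gusein_zade(len(vowels))):
    print(f"{phoneme}: {share:.0%}")
# a: 43%
# i: 26%
# e: 17%
# u: 10%
# o: 4%
```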

4.1.2 Class-drop-off

The class-drop-off: directive changes how the phonemes' frequencies fall off from left to right. The options are zipfian, gusein-zade, and flat. As already stated, the default is gusein-zade.

class-drop-off: flat
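To get a feel for the other two options, here is a small Python sketch. It rests on assumptions: that flat gives every phoneme the same share, and that zipfian gives rank r a weight proportional to 1/r. The exact formula Lexiguru uses is not stated in this document, and drop_off is a hypothetical name.

```python
def drop_off(kind: str, n: int) -> list[float]:
    """Normalised shares for n items under an assumed drop-off model."""
    if kind == "flat":
        raw = [1.0] * n
    elif kind == "zipfian":
        # Assumption: classic Zipf weights with exponent 1.
        raw = [1.0 / rank for rank in range(1, n + 1)]
    else:
        raise ValueError(f"unsupported drop-off in this sketch: {kind}")
    total = sum(raw)
    return [w / total for w in raw]


for kind in ("flat", "zipfian"):
    shares = ", ".join(f"{s:.1%}" for s in drop_off(kind, 5))
    print(f"{kind}: {shares}")
```

Under these assumptions, flat gives each of five phonemes 20%, while zipfian still favours the leftmost phoneme.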

4.1.1 Weights

If you want to set your own frequencies for a class, you can use a colon (:) to specify the weight for each phoneme, like so:

V = a:5 e:4 i:3 o:2 u:1

V has approximately the following probabilities: a: 33%, e: 27%, i: 20%, o: 13%, u: 7%.
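The probabilities above are each weight divided by the sum of all weights (5 + 4 + 3 + 2 + 1 = 15). A minimal Python sketch of that arithmetic follows; the parser is hypothetical and assumes every item carries an explicit weight and that no : is escaped.

```python
def weighted_class(definition: str) -> dict[str, float]:
    """Turn 'V = a:5 e:4 ...' into a phoneme -> probability mapping."""
    _name, _, body = definition.partition("=")
    weights = {}
    for item in body.split():
        phoneme, _, weight = item.partition(":")
        weights[phoneme] = float(weight)
    total = sum(weights.values())
    return {phoneme: w / total for phoneme, w in weights.items()}


for phoneme, p in weighted_class("V = a:5 e:4 i:3 o:2 u:1").items():
    print(f"{phoneme}: {p:.0%}")
# a: 33%
# e: 27%
# i: 20%
# o: 13%
# u: 7%
```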

4.1.3 "Pick-one" and classes in classes

Project creep, do later.

A pick-one set in square brackets inside a class is treated as a single unit in terms of frequency:

V = a i e o [aa ii ee oo]

The bracketed long vowels share a single slot in V, the fifth, so V is a long vowel only as often as that one slot is chosen. Using a class inside a class has the same result:

L = aa ii ee oo
V = a i e o L
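The "single unit" behaviour can be sketched in Python as follows. The sketch assumes that the default Gusein-Zade drop-off applies both to the outer class and to the nested one, which this document does not confirm; expand and gusein_zade are hypothetical helper names.

```python
import math


def gusein_zade(n: int) -> list[float]:
    """Normalised Gusein-Zade shares for ranks 1..n (assumed default drop-off)."""
    raw = [math.log((n + 1) / rank) for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def expand(members: list[str], classes: dict[str, list[str]]) -> dict[str, float]:
    """Probability of each final phoneme when a nested class counts as one slot."""
    probs: dict[str, float] = {}
    for member, share in zip(members, gusein_zade(len(members))):
        if member in classes:
            # The nested class owns one slot; its members split that slot's share.
            for inner, inner_share in expand(classes[member], classes).items():
                probs[inner] = probs.get(inner, 0.0) + share * inner_share
        else:
            probs[member] = probs.get(member, 0.0) + share
    return probs


classes = {"L": ["aa", "ii", "ee", "oo"]}
v = expand(["a", "i", "e", "o", "L"], classes)
long_share = sum(v[p] for p in classes["L"])
for phoneme, p in v.items():
    print(f"{phoneme}: {p:.1%}")
print(f"all long vowels together: {long_share:.1%}")
```

Whatever drop-off is in effect, the long vowels together receive exactly the share of the one slot that L occupies.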

4.2 Macros

Macros are a system designed to provide abbreviations for syllable shapes. They are defined similarly to phoneme classes, except that a macro's name begins with $. For example:

$S = CVD?
words: V?$S$S V?$S V?$S$S$S

4.2.1 Pick-one

Square brackets ([ ]) mark a pick-one set. The set is treated as if it were a class or macro, and one of its members is picked:

C = t
A = a u
V = e y 
$S = C[A V]

This produces ta, tu, te, or ty.
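The four possible syllables can be enumerated with a short Python sketch. It only lists what $S = C[A V] can produce and says nothing about how often each outcome appears; the helper name pick_one_outcomes is hypothetical.

```python
from itertools import product

classes = {"C": ["t"], "A": ["a", "u"], "V": ["e", "y"]}


def pick_one_outcomes(options: list[str]) -> list[str]:
    """All phonemes reachable from a pick-one set such as [A V]."""
    outcomes = []
    for option in options:
        # A class name expands to its members; anything else stands for itself.
        outcomes.extend(classes.get(option, [option]))
    return outcomes


# $S = C[A V]
syllables = ["".join(parts) for parts in product(classes["C"], pick_one_outcomes(["A", "V"]))]
print(syllables)  # ['ta', 'tu', 'te', 'ty']
```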

4.2.2 Optional

Optionals use round brackets, ( ). They work the same way as "pick-one"; the only difference is that what is inside them may or may not appear in the word. How often it appears is set by the optionals-rate: directive. The default is a probability of 10%.

$S = CV(F)
words: V?$S$S V?$S V?$S$S$S

4.2.3 optionals-weight

The optionals-rate: directive specifies how often optional phonemes or classes are selected. The number is a percentage and, as previously stated, the default is 10%. For example:

optionals-rate: 20
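As a rough model of what the rate means, the Python sketch below keeps an optional element in the given percentage of cases. It is a simulation under the assumption that each optional is decided independently; realise_optional is a hypothetical name, not part of Lexiguru.

```python
import random


def realise_optional(content: str, rate_percent: float, rng: random.Random) -> str:
    """Return content with probability rate_percent / 100, else an empty string."""
    return content if rng.random() * 100 < rate_percent else ""


rng = random.Random(0)  # fixed seed so repeated runs agree
trials = 10_000
kept = sum(1 for _ in range(trials) if realise_optional("n", 20, rng))
print(f"optional kept in {kept / trials:.1%} of {trials} trials")
```

With a rate of 20, the printed share should land close to 20%.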

4.2.4 Inter-pick-one

“Inter-pick-one”, using curly braces ({ }), works the same as pick-one. The only difference is that only one "Inter-pick-one" set will be chosen for that macro.

Inter-pick-one is a feature designed to generate words with stress or pitch accent systems.

C = t
V = a
$X = ({'}CV){'}CV
words: $X

This produces any of the following words: 'ta, ta'ta, 'tata. Notice here that ta is not possible.
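The three outcomes can be derived mechanically: first decide which optional syllables appear, then realise exactly one of the remaining inter-pick-one sets. The Python sketch below models only that rule for macros shaped like $X; the function stress_patterns and its arguments are hypothetical.

```python
from itertools import product


def stress_patterns(optional: list[bool], mark: str = "'", syllable: str = "ta") -> list[str]:
    """optional[i] is True when syllable i sits inside an optional group."""
    words = []
    optional_idx = [i for i, opt in enumerate(optional) if opt]
    # Try every combination of keeping or dropping the optional syllables.
    for keep in product([True, False], repeat=len(optional_idx)):
        present = [
            i for i, opt in enumerate(optional)
            if not opt or keep[optional_idx.index(i)]
        ]
        # Exactly one inter-pick-one set among the present syllables is realised.
        for stressed in present:
            words.append("".join((mark if i == stressed else "") + syllable for i in present))
    return words


# $X = ({'}CV){'}CV  ->  first syllable optional, second obligatory
print(stress_patterns([True, False]))  # ["'tata", "ta'ta", "'ta"]
```

Because one set is always realised, an unmarked ta never appears in the result.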

There are a few restrictions and peculiarities to it. Most notably, Inter-pick-ones may not be nested inside each other. Let's look at another example.

class-drop-off: flat
$Z = {a b}{x}
words: $Z

The above example is rather silly, as there is nothing between the "Inter-pick-one" sets, defeating its whole purpose. However, it is useful here for showing that it produces equivalent results to the example below, which uses "pick-ones" instead.

class-drop-off: flat
$Z = [[a b][x]]
words: $Z

In both of the above examples, we have a 25% chance of producing a, a 25% chance of b, and a 50% chance of x.
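The 25% / 25% / 50% split follows from two flat choices in a row: first one of the two sets, then one member of the chosen set. A small Python check with exact fractions is shown below; flat_pick is a hypothetical helper.

```python
from fractions import Fraction


def flat_pick(sets: list[list[str]]) -> dict[str, Fraction]:
    """Pick a set with equal chance, then a member of that set with equal chance."""
    probs: dict[str, Fraction] = {}
    set_share = Fraction(1, len(sets))
    for members in sets:
        for member in members:
            probs[member] = probs.get(member, Fraction(0)) + set_share / len(members)
    return probs


result = flat_pick([["a", "b"], ["x"]])
for outcome, p in result.items():
    print(f"{outcome}: {p} = {float(p):.0%}")
# a: 1/4 = 25%
# b: 1/4 = 25%
# x: 1/2 = 50%
```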

See the "Romance-like" example for a language that uses "Inter-pick-one" for its stress system, and "BTX" for a language that uses it for a complex pitch accent system.

4.3 Words

The most common way to make a word is to use the words: directive. Words are weighted similarly to how phonemes are weighted in classes. A word can consist of individual phonemes, phoneme classes, or a mixture of both.

Phonemes or classes that are optional can be indicated by a ?. For example, words: CVD? is similar to words: CV CVD, although the weights are quite different.

If you choose from the same class twice in a row, you may put a ! after the second one to indicate that they must not be the same phoneme. For example, CC may generate tt, but CC! never will.
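One way to picture the effect of ! is to draw the second phoneme from the class with the first phoneme removed. The Python sketch below does that with flat choices. Both the flat choice and the "remove and redraw" strategy are assumptions made for illustration, and pick_pair is a hypothetical name; the class needs at least two members for the distinct case to work.

```python
import random


def pick_pair(members: list[str], rng: random.Random, distinct: bool) -> str:
    """Draw two phonemes from one class; with distinct=True they never match."""
    first = rng.choice(members)
    # For 'CC!', exclude the first phoneme from the second draw.
    pool = [m for m in members if m != first] if distinct else members
    return first + rng.choice(pool)


rng = random.Random(0)
consonants = ["t", "n", "k", "m"]
print([pick_pair(consonants, rng, distinct=False) for _ in range(5)])  # like CC
print([pick_pair(consonants, rng, distinct=True) for _ in range(5)])   # like CC!
```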

By default, words are selected using the Zipf distribution.

4.3.1 Word-drop-off

This directive modifies how quickly the words' frequencies decrease as they go to the right, unless the words have explicit weights. The options are zipfian, gusein-zade, and flat. The default is zipfian.

4.4 The graphs: directive

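The graphs: directive serves two purposes in your phonology definition file: it sets the alphabetical order used when sorting words, and it declares which multi-character sequences count as single graphemes.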

Alphabetisation

The graphs: directive gives Lexiguru a custom sort order for words when the sort words checkbox is selected.

Sometimes you may only want to tell Lexiguru which graphemes are multigraphs, without changing the alphabetisation.

graphs: a b c c(h) d e f g h i j k l m n o p p' r s t t' u v y 
        cat chit-chat cumin frog
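Sorting by a custom grapheme order can be sketched in Python as follows. This is an illustration under stated assumptions rather than Lexiguru's algorithm: words are split greedily into the longest matching grapheme, characters missing from the list sort last, and the grapheme list and sample words are made up for the example.

```python
def tokenize(word: str, graphs: list[str]) -> list[str]:
    """Split a word into graphemes, preferring the longest match at each position."""
    by_length = sorted(graphs, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(word):
        for g in by_length:
            if word.startswith(g, i):
                tokens.append(g)
                i += len(g)
                break
        else:
            tokens.append(word[i])  # unknown character: keep it as its own token
            i += 1
    return tokens


def sort_words(words: list[str], graphs: list[str]) -> list[str]:
    """Sort words by the position of each grapheme in the custom order."""
    rank = {g: n for n, g in enumerate(graphs)}
    unknown = len(graphs)  # assumption: unlisted characters sort after all listed ones
    return sorted(words, key=lambda w: [rank.get(t, unknown) for t in tokenize(w, graphs)])


graphs = "a b d e f g h i j k l m n o p p' r s t t' u v y".split()
print(sort_words(["t'ap", "p'at", "tup", "pup"], graphs))
# ['pup', "p'at", 'tup', "t'ap"]
```

A plain character-by-character sort would put p'at before pup; treating p' as one grapheme that follows p reverses that.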

Defining multigraphs and others

The directive tells Lexiguru which multigraphs, including a character followed by combining diacritics, are to be treated as single phonemes.

The directive tells Lexiguru which multigraphs to treat as a single unit when using filters.

graphs: a b ch d e f g h i j k l m n o p p' r s t t' u v y 
        words: CVD?

The directive tells Lexiguru which character + combining diacritic sequences to treat as a single grapheme.

V = a a̋ e i o u
        words: CVD?

The directive tells Lexiguru which character + combining diacritic sequences to treat as alternatives of another grapheme.

graphs: a <[á à ǎ â] b d e <[é è ě ê] f g h i <[í ì ǐ î] k l m n o <[ó ò ǒ ô] p r s t u <[ú ù ǔ û] w y
        words: CVD?

The graphs: directive has the following aliases: alphabet, letters, graphemes, multigraphs, digraphs.

4.5 Word creation escape characters

Characters enclosed in a pair of double quotes lose any special meaning they might have had in the generator, including double quotes themselves. This way anything can be generated, including capital letters that have already been defined as classes, brackets, and even spaces.

These are the characters you must escape if you want to use them in classes, macros, or words:

Characters   Meaning
;            Creates a comment
C =          Creates a class
Space        Separates choices
$            Defines a macro
:            Weight
[ ]          Pick-one
( )          Optionals
{ }          Inter-pick-one
"            Escapes characters enclosed in them