
7. Contracted Braille

BRLTTY can display in-line contracted braille. It does this if:

This feature isn't available if the --disable-contracted-braille build option was specified.

The following contraction tables are provided:

big5

Chinese.

compress

Remove excessive white-space.

en-us-g2

Grade 2 American English.

fr-abrege

Contracted Unified French.

fr-integral

Uncontracted Unified French.

7.1 File Format

Blank lines are ignored. Any leading and trailing white-space (any number of blanks and/or tabs) is ignored. Lines which begin with a number sign (#) are ignored, i.e. they're comments.

# This is a comment.

All other lines define table entries.

The general form of a table entry is an opcode followed by its operands. The opcode and its operands are separated from one another by white-space. Each opcode has a specific number of operands, and any text following its last operand is treated as a comment.

opcode operand ... comment
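The line-handling rules above can be sketched as follows. This is only an illustration, not BRLTTY's own parser: the function name parse_entry and the OPERAND_COUNTS table are hypothetical, and the table lists just a few of the opcodes described later in this section.

```python
# Partial, illustrative operand counts taken from the opcode descriptions
# in this section; a real parser would need every opcode.
OPERAND_COUNTS = {
    "include": 1, "locale": 1, "capsign": 1, "literal": 1,
    "contraction": 1, "always": 2, "word": 2, "replace": 2,
}

def parse_entry(line):
    """Return (opcode, operands, comment), or None for blank and comment lines."""
    text = line.strip()                  # leading/trailing white-space is ignored
    if not text or text.startswith("#"):
        return None                      # blank line or comment
    fields = text.split()                # fields are separated by white-space
    opcode = fields[0]
    count = OPERAND_COUNTS[opcode]       # KeyError for opcodes missing from the sketch
    operands = fields[1:1 + count]
    if len(operands) < count:
        raise ValueError(f"{opcode} needs {count} operand(s)")
    # Anything after the last operand is a comment (its spacing is normalized here).
    comment = " ".join(fields[1 + count:])
    return opcode, operands, comment
```

For instance, the line "always and 12346 the word and" would yield the opcode always, the operands and and 12346, and the trailing comment "the word and".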

The following operand types are supported:

string

A sequence of characters other than white-space (which terminates the string) and backslashes (\). The following special representations are also supported:

\\

backslash

\f

form feed

\n

new line

\oooo

3-digit octal value (the letter o followed by three octal digits)

\r

carriage return

\s

blank (space)

\t

horizontal tab

\v

vertical tab

\xxx

2-digit hexadecimal value (the letter x followed by two hexadecimal digits)

number

An integer. It may be specified in any of the following ways:

decimal

A sequence of decimal digits (0-9), with the first digit being nonzero (1-9).

hexadecimal

A sequence of hexadecimal digits (0-9, and either a-f or A-F), preceded by either 0x or 0X.

octal

A sequence of octal digits (0-7), with the first digit being zero (0).

dots

A braille symbol. The braille dots must be specified via their standard numbers (see section Standard Braille Dot Numbering Convention), and, for a multi-cell symbol, the cell specifications must be separated from one another by a dash (-). For example, the contraction for the English word lord (the letter l prefixed with dot 5) would be specified as 5-123. A space may be specified via the special dot number 0.
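The three operand types might be decoded roughly as shown below. This is a sketch under stated assumptions rather than BRLTTY's implementation: the function names are hypothetical, the octal and hexadecimal escapes are read as \o plus three digits and \x plus two digits, dot numbers are assumed to run from 1 to 8 without repeats, and 0 is assumed to stand alone in its cell.

```python
import re

SIMPLE_ESCAPES = {"\\": "\\", "f": "\f", "n": "\n", "r": "\r",
                  "s": " ", "t": "\t", "v": "\v"}

def decode_string(operand):
    """Expand the backslash escapes of a string operand."""
    out, i = [], 0
    while i < len(operand):
        if operand[i] != "\\":
            out.append(operand[i])
            i += 1
            continue
        kind = operand[i + 1:i + 2]
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind == "o" and re.fullmatch(r"[0-7]{3}", operand[i + 2:i + 5]):
            out.append(chr(int(operand[i + 2:i + 5], 8)))
            i += 5
        elif kind == "x" and re.fullmatch(r"[0-9A-Fa-f]{2}", operand[i + 2:i + 4]):
            out.append(chr(int(operand[i + 2:i + 4], 16)))
            i += 4
        else:
            raise ValueError(f"bad escape at offset {i}")
    return "".join(out)

def parse_number(operand):
    """Accept decimal, hexadecimal (0x/0X prefix) and octal (leading 0) integers."""
    if re.fullmatch(r"0[xX][0-9A-Fa-f]+", operand):
        return int(operand[2:], 16)
    if re.fullmatch(r"0[0-7]*", operand):
        return int(operand, 8)
    if re.fullmatch(r"[1-9][0-9]*", operand):
        return int(operand, 10)
    raise ValueError(f"bad number: {operand}")

def parse_dots(operand):
    """Return one frozenset of dot numbers per cell; an empty set is a space."""
    cells = []
    for spec in operand.split("-"):      # cells are separated by dashes
        if spec == "0":
            cells.append(frozenset())    # special dot number 0 means a space
        elif re.fullmatch(r"[1-8]+", spec) and len(set(spec)) == len(spec):
            cells.append(frozenset(int(d) for d in spec))
        else:
            raise ValueError(f"bad cell: {spec}")
    return cells
```

With these helpers, parse_dots("5-123") gives two cells (dot 5, then dots 1, 2 and 3), which matches the lord example above.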

7.2 Opcodes

An opcode is a keyword which tells the translator how to interpret the operands. The opcodes are grouped here by function.

Table Administration

These opcodes make it easier to write contraction tables. They have no direct effect on the character translation.

include path

Include the contents of another file. Nesting can be to any depth. Relative paths are anchored at the directory of the including file.

locale locale

Define the locale for character interpretation (lowercase, uppercase, numeric, etc.). The locale may be specified as:

language[_country][.charset][@modifier]

The language component is required and should be a two-letter ISO-639 language code. The country component is optional and should be a two-letter ISO-3166 country code. The charset component is optional and should be a character set name, e.g. ISO-8859-1.

C

7-bit ASCII.

-

No locale.

The last locale specification applies to the entire table. If this opcode isn't used then the C locale is assumed.
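As an illustration of the locale syntax, the sketch below splits a specification into its components and picks the locale that applies to a table. The names LOCALE_PATTERN, parse_locale and table_locale are hypothetical, and the pattern assumes a lowercase language code and an uppercase country code, which the text above doesn't require.

```python
import re

# language[_country][.charset][@modifier]
LOCALE_PATTERN = re.compile(
    r"(?P<language>[a-z]{2})"
    r"(?:_(?P<country>[A-Z]{2}))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)

def parse_locale(spec):
    """Split a locale operand into its components."""
    if spec == "C":
        return {"special": "7-bit ASCII"}
    if spec == "-":
        return {"special": "no locale"}
    match = LOCALE_PATTERN.fullmatch(spec)
    if match is None:
        raise ValueError(f"bad locale: {spec}")
    return match.groupdict()             # absent components are None

def table_locale(entries):
    """Return the locale in effect for a table given (opcode, operand) pairs."""
    locale = "C"                         # assumed when the opcode isn't used
    for opcode, operand in entries:
        if opcode == "locale":
            locale = operand             # the last specification wins
    return locale
```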

Special Symbol Definition

These opcodes define special symbols which must be inserted into the braille text in order to clarify it.

capsign dots

The symbol which capitalizes a single letter.

begcaps dots

The symbol which begins a block of capital letters within a word.

endcaps dots

The symbol which ends a block of capital letters within a word.

letsign dots

The symbol which marks a letter which isn't part of a word.

numsign dots

The symbol which marks the beginning of a number.

Character Translation

These opcodes define the braille representations for character sequences. Each of them defines an entry within the contraction table. These entries may be defined in any order except, as noted below, when they define alternate representations for the same character sequence.

Each of these opcodes has a characters operand (which must be specified as a string), and a built-in condition governing its eligibility for use. The text is processed strictly from left to right, character by character, with the most eligible entry for each position being used. If there's more than one eligible entry for a given position, then the one with the longest character string is used. If there's more than one eligible entry for the same character string, then the one defined nearest to the beginning of the table is used (this is the only order dependency).
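The selection rule just described (longest character string first, then earliest definition) can be sketched like this. The entry layout (a dict with a characters key) and the is_eligible callback are assumptions made for the example; they stand in for whatever the translator uses internally.

```python
def select_entry(entries, text, position, is_eligible):
    """Pick the entry to apply at text[position], or None if nothing matches.

    entries must be in table order; is_eligible(entry, text, position)
    evaluates the opcode's built-in condition.
    """
    best = None
    for entry in entries:
        chars = entry["characters"]
        if not text.startswith(chars, position):
            continue
        if not is_eligible(entry, text, position):
            continue
        # Strictly longer wins; on a tie the earlier entry is kept.
        if best is None or len(chars) > len(best["characters"]):
            best = entry
    return best
```

Because the comparison is strict, two entries for the same character string are resolved in favour of the one defined nearer the beginning of the table, which is the only order dependency mentioned above.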

Many of these opcodes have a dots operand which defines the braille representation for its characters operand. It may also be specified as an equals sign (=), in which case it means one of two things. If the entry is for a single character, then it means that the currently selected computer braille representation (see the -t command line option and the text-table configuration file directive) for that character is to be used. If it's for a multi-character sequence, then the default representation for each character (see always) within the sequence is to be used.
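The two meanings of the equals sign can be illustrated with a small helper. Everything here is hypothetical: text_table stands for the currently selected computer braille table and always_defaults for the single-character defaults set by always, both modelled as plain dicts from a character to a dots string.

```python
def resolve_dots(characters, dots, text_table, always_defaults):
    """Return the dots string to use for an entry, expanding '='."""
    if dots != "=":
        return dots                          # explicit representation
    if len(characters) == 1:
        # Single character: use the computer braille representation.
        return text_table[characters]
    # Multi-character sequence: join each character's default representation.
    return "-".join(always_defaults[ch] for ch in characters)
```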

Some special terms are used within the descriptions of these opcodes.

word

A maximal sequence of one or more consecutive letters.

Now, finally, here are the opcode descriptions themselves:

literal characters

Translate the entire white-space-bounded character sequence which contains the characters into computer braille (see the -t command line option and the text-table configuration file directive).

replace characters characters

Replace the first set of characters, no matter where they appear, with the second. The replaced characters aren't reprocessed.

always characters dots

Translate the characters no matter where they appear. If there's only one character, then, in addition, define the default representation for that character.

repeated characters dots

Translate the characters no matter where they appear. Ignore any consecutive repetitions of the same sequence.

largesign characters dots

Translate the characters no matter where they appear. Remove white-space between consecutive words matched by this opcode.

lastlargesign characters dots

Translate the characters no matter where they appear. Remove preceding white-space if the previous word was matched by the largesign opcode.

word characters dots

Translate the characters if they're a word.

joinword characters dots

Translate the characters if they're a word. Remove the following white-space if the first character after it is a letter.

lowword characters dots

Translate the characters if they're a white-space-bounded word.

contraction characters

Prefix the characters with a letter sign (see letsign) if they're a word.

sufword characters dots

Translate the characters if they're either a word or at the beginning of a word.

prfword characters dots

Translate the characters if they're either a word or at the end of a word.

begword characters dots

Translate the characters if they're at the beginning of a word.

begmidword characters dots

Translate the characters if they're either at the beginning or in the middle of a word.

midword characters dots

Translate the characters if they're in the middle of a word.

midendword characters dots

Translate the characters if they're either in the middle or at the end of a word.

endword characters dots

Translate the characters if they're at the end of a word.

prepunc characters dots

Translate the characters if they're part of punctuation at the beginning of a word.

postpunc characters dots

Translate the characters if they're part of punctuation at the end of a word.

begnum characters dots

Translate the characters if they're at the beginning of a number.

midnum characters dots

Translate the characters if they're in the middle of a number.

endnum characters dots

Translate the characters if they're at the end of a number.
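The word-position conditions of the opcodes from word through endword can be compared side by side in a short sketch. It rests on two assumptions that the descriptions imply but don't state outright: "beginning", "middle" and "end" exclude the case where the characters are the whole word, and a letter is approximated by Python's str.isalpha rather than by the table's locale. The names word_bounds, CONDITIONS and eligible are hypothetical.

```python
def word_bounds(text, start, end):
    """Return (at_start, at_end) for text[start:end], or None if it isn't all letters."""
    if start >= end or not text[start:end].isalpha():
        return None
    at_start = start == 0 or not text[start - 1].isalpha()
    at_end = end == len(text) or not text[end].isalpha()
    return at_start, at_end

# b: the match starts at the start of the word; e: it ends at the end of the word.
CONDITIONS = {
    "word":       lambda b, e: b and e,
    "sufword":    lambda b, e: b,              # word, or beginning of a word
    "prfword":    lambda b, e: e,              # word, or end of a word
    "begword":    lambda b, e: b and not e,
    "begmidword": lambda b, e: not e,          # beginning or middle
    "midword":    lambda b, e: not b and not e,
    "midendword": lambda b, e: not b,          # middle or end
    "endword":    lambda b, e: e and not b,
}

def eligible(opcode, text, start, end):
    """Check an opcode's word-position condition for text[start:end]."""
    bounds = word_bounds(text, start, end)
    return bounds is not None and CONDITIONS[opcode](*bounds)
```

Under these assumptions, the sequence the in "the other" satisfies word and sufword at offset 0 but only midword, begmidword and midendword inside other.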

Character Classes

These opcodes define and use character classes. A character class associates a set of characters with a name. The name then refers to any character within the class. A character may belong to more than one class.

The following character classes are automatically predefined based on the selected locale:

digit

Numeric characters.

letter

Both uppercase and lowercase alphabetic characters. Some locales have additional letters which are neither uppercase nor lowercase.

lowercase

Lowercase alphabetic characters.

punctuation

Printable characters which are neither white-space nor alphanumeric.

space

White-space characters. In the default locale these are: space, horizontal tab, vertical tab, carriage return, new line, form feed.

uppercase

Uppercase alphabetic characters.

The opcodes which define and use character classes are:

class name characters

Define a new character class. The characters operand must be specified as a string. A character class may not be used until it's been defined.

after class opcode ...

The specified opcode is further constrained in that the matched character sequence must be immediately preceded by a character belonging to the specified class. If this opcode is used more than once on the same line then the union of the characters in all the classes is used.

before class opcode ...

The specified opcode is further constrained in that the matched character sequence must be immediately followed by a character belonging to the specified class. If this opcode is used more than once on the same line then the union of the characters in all the classes is used.
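The after and before constraints, including the union taken when one of them is given more than once, might be checked as in the following sketch. The function name constraint_holds, the classes dict and the vowel class used in the usage note are illustrative assumptions.

```python
def constraint_holds(text, start, end, classes, after=(), before=()):
    """Check the after/before class constraints for the match text[start:end].

    classes maps a class name to its set of characters; after and before
    list the class names given on the table line.
    """
    if after:
        allowed = set().union(*(classes[name] for name in after))
        # The match must be immediately preceded by a character in the union.
        if start == 0 or text[start - 1] not in allowed:
            return False
    if before:
        allowed = set().union(*(classes[name] for name in before))
        # The match must be immediately followed by a character in the union.
        if end >= len(text) or text[end] not in allowed:
            return False
    return True
```

For example, with a user-defined class created by a line such as "class vowel aeiou", an entry prefixed with "after digit after vowel" would accept a match that follows either a digit or a vowel.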

