ispell - Interactive spelling checking
ispell [-d file | -p file | -w chars | -Wn | -t | -n | -x | -b
| -S | -B | -C | -P | -m | -L context | -M | -N
| -T type | -V] file .....
ispell [-d file | -p file | -w chars | -Wn | -t | -n | -T type] -l
ispell [-d file | -p file | -f file | -Wn | -t | -n
| -s | -B | -C | -P | -m | -T type] {-a | -A}
ispell [-d file] [-w chars | -Wn] -c
ispell [-d file] [-w chars] -e[1-4]
ispell [-d file] [-w chars] -D
ispell -v
The ispell(1) utility is fashioned after the spell program from ITS (called ispell(1) on Twenex systems.) The most common usage is "ispell filename". In this case, ispell(1) will display each word that does not appear in the dictionary at the top of the screen and allow you to change it. If there are "near misses" in the dictionary (words that differ by only a single letter, a missing or extra letter, a pair of transposed letters, or a missing space or hyphen), they are also displayed on following lines. In addition to "near misses," ispell can display other guesses at ways to make the word from a known root, with each guess preceded by question marks. Finally, the line containing the word and the previous line are printed at the bottom of the screen. If your terminal can display in reverse video, the word itself is highlighted. You have the option of replacing the word completely, or choosing one of the suggested words. Commands are single characters as follows (case is ignored):
"Normal" mode, as well as the -l, -a, and -A options (discussed later in this topic) also accept the following "common" flags on the command line: -B, -b, -C, -d, -m, -n, -p, -S, -T, -t, -W, -w, and -x. These options are described in more detail after the option list.
The ispell(1) command takes the following options; when there are contradictory options, the last one on the command line takes precedence.
[prefix+]root[-prefix][-suffix][+suffix]
For example, "refries" would display as "re+fry-y+ies".
This option can miss short misspellings. If you use this option frequently, it is recommended that you check the spelling yourself on a final pass without this option before you publish your document to protect yourself from possible errors.
The -n and -t options select whether ispell(1) runs in nroff/troff (-n) or TeX/LaTeX (-t) input mode. The default is controlled by the DEFTEXFLAG installation option.
In TeX/LaTeX mode, whenever a backslash (\) is found, ispell(1) will skip to the next white space or TeX/LaTeX delimiter. Certain commands contain arguments that should not be checked, such as labels and reference keys like those found in the \cite command, because they contain arbitrary, non-word arguments. Spell checking is also suppressed when in math mode. For example, given:
\chapter {This is a Ckapter} \cite{SCH86}
ispell(1) will find "Ckapter" but not "SCH". The -t option does not recognize the TeX comment character "%", so comments are also spell-checked. It also assumes correct LaTeX syntax. Arguments to infrequently used commands and some optional arguments are sometimes checked unnecessarily. The bibliography will not be checked if ispell(1) was compiled with IGNOREBIB defined. Otherwise, the bibliography will be checked, but the reference key will not.
TeX/LaTeX mode is also automatically selected if an input file has the extension ".tex", unless overridden by the -n switch.
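The mode selection described above can be sketched as a small function. This is an illustration only, not the ispell(1) implementation: the function name input_mode and the default_tex parameter (standing in for the compile-time DEFTEXFLAG setting) are hypothetical.

```python
import posixpath


def input_mode(filename, flags, default_tex=False):
    """Return "tex" or "nroff" for one input file.

    flags is the list of command-line options in the order given.
    default_tex is a hypothetical stand-in for DEFTEXFLAG.
    """
    # An explicit -t or -n wins; if both appear, the last one counts.
    for flag in reversed(flags):
        if flag == "-t":
            return "tex"
        if flag == "-n":
            return "nroff"
    # With no explicit flag, a ".tex" extension selects TeX/LaTeX mode.
    if posixpath.splitext(filename)[1] == ".tex":
        return "tex"
    # Otherwise fall back to the installation default.
    return "tex" if default_tex else "nroff"
```

With these assumptions, input_mode("paper.tex", ["-n"]) yields "nroff", because -n overrides the extension.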
References for the tib(1) bibliography system, that is, text between a "[." or "<." and ".]" or ".>", will always be ignored in TeX/LaTeX mode.
The -p option is used to specify an alternate personal dictionary file. A personal dictionary file is simply a sorted list of words, one word to a line. If the file name does not begin with "/", the value of HOME is prefixed. Also, the shell variable WORDLIST can be set, which renames the personal dictionary in the same manner. The command line overrides any WORDLIST setting. If neither the -p switch nor the WORDLIST environment variable is given, ispell(1) will search for a personal dictionary in both the current directory and $HOME, creating one in $HOME if none is found. The preferred name is constructed by appending ".ispell_" to the base name of the hash file. For example, if you use the English dictionary, your personal dictionary would be named ".ispell_english". However, if the file ".ispell_words" exists, it will be used as the personal dictionary regardless of the language hash file chosen. This feature is included primarily for backwards compatibility.
If the -p option is not specified, ispell(1) will look for personal dictionaries in both the current directory and the home directory. If dictionaries exist in both places, they will be merged. If any words are added to the personal dictionary, they will be written to the current directory if a dictionary already existed in that place. Otherwise, they will be written to the dictionary in the home directory.
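The naming rules for the personal dictionary can be sketched as follows. This is a simplified illustration: the function name and parameters are hypothetical, it assumes the hash file is named like "english.hash" and that "base name" means the file name without directory or extension, and it does not model the search of the current directory, the merging of dictionaries, or the ".ispell_words" fallback.

```python
import posixpath


def personal_dictionary_path(hash_file, home, p_option=None, wordlist=None):
    """Return the personal dictionary path implied by the naming rules.

    p_option is the argument of -p (if given); wordlist is the value of
    the WORDLIST variable (if set). The command line overrides WORDLIST.
    """
    name = p_option if p_option is not None else wordlist
    if name is None:
        # Default name: ".ispell_" plus the base name of the hash file.
        base = posixpath.splitext(posixpath.basename(hash_file))[0]
        name = ".ispell_" + base
    # Names that do not begin with "/" are taken relative to HOME.
    if name.startswith("/"):
        return name
    return home.rstrip("/") + "/" + name
```

For a hash file called english.hash and a home directory of /home/ann, the sketch gives /home/ann/.ispell_english, matching the ".ispell_english" example above.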
The -w option can be used to specify characters other than alphabetics that can also appear in words. For instance, -w "&" will allow "AT&T" to be picked up. Underscores are useful in many technical documents. There is an admittedly crude provision in this option for eight-bit international characters. Non-printing characters can be specified in the usual way by inserting a backslash (\) followed by the octal character code; for example, "\014" for a form feed. Alternatively, if "n" appears in the character string, the (up to) three characters following are a DECIMAL code 0-255, for the character. For example, to include bells and form feeds in your words, you would use:
n007n012
Numeric digits other than the three following "n" are simply numeric characters. The use of "n" does not conflict with anything because actual alphabetics have no meaning: alphabetics are already accepted. The ispell(1) utility is typically used with input from a file, so preserving parity for possible eight-bit characters in the input text is acceptable. If you specify the -l option and type text from the terminal, however, problems can arise if your stty settings preserve parity.
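The -w string syntax described above (a backslash followed by an octal code, or "n" followed by a decimal code) can be sketched as a small decoder. This is an illustration under stated assumptions, not the parser ispell(1) uses: the function name is hypothetical, and the handling of an escape with no digits after it, or of a code above 255, is a guess (both are simply skipped here).

```python
def parse_wchars(spec):
    """Expand a -w style string into the set of extra word characters."""
    chars = set()
    i = 0
    while i < len(spec):
        c = spec[i]
        i += 1
        if c in ("\\", "n"):
            # Backslash introduces octal digits, "n" introduces decimal digits.
            base, digits = (8, "01234567") if c == "\\" else (10, "0123456789")
            j = i
            while j < len(spec) and j - i < 3 and spec[j] in digits:
                j += 1
            if j > i:
                code = int(spec[i:j], base)
                if code <= 255:  # assumption: out-of-range codes are skipped
                    chars.add(chr(code))
                i = j
            # Assumption: an escape character with no digits is ignored.
        else:
            # Any other character, including a stray digit, is taken literally.
            chars.add(c)
    return chars
```

Under these assumptions, "n007n012" decodes to the bell and form-feed characters, and "\014" decodes to the form feed alone.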
The -a and -A options are intended to be used from other programs through a pipe. This mode is also suitable for interactive use when you want to figure out the spelling of a single word.
In this mode, ispell(1) prints a one-line version-identification message, and then begins reading lines of input. For each input line, a single line is written to the standard output for each word checked for spelling on the line. If the word was found in the main dictionary or your personal dictionary, the line contains only an asterisk (*). If the word was found through affix removal, the line contains a plus sign (+), a space, and the root word. If the word was found through compound formation (concatenation of two words, controlled by the -C option), the line contains only a hyphen (-).
If the word is not in the dictionary, but there are near misses, the line contains an ampersand (&), a space, the misspelled word, a space, the number of near misses, the number of characters between the beginning of the line and the beginning of the misspelled word, a colon (:), another space, and a list of the near misses separated by commas and spaces. Following the near misses (and identified only by the count of near misses), if the word could be formed by adding (illegal) affixes to a known root, is a list of suggested derivations, again separated by commas and spaces. If there are no near misses, the line format is the same, except that the ampersand (&) is replaced by a question mark (?) (and the near-miss count is always zero). The suggested derivations following the near misses are in the form:
[prefix+] root [-prefix] [-suffix] [+suffix]
(for example, "re+fry-y+ies" to get "refries") where each optional prefix and suffix is a string. Also, each near miss or guess is capitalized the same as the input word unless such capitalization is illegal. In the latter case, each near miss is capitalized correctly according to the dictionary.
If the word does not appear in the dictionary, and there are neither near misses nor guesses, the line contains a number-sign character (#), a space, the misspelled word, a space, and the character offset from the beginning of the line. Each sentence of text input is terminated with an additional blank line, indicating that ispell(1) has completed processing the input line.
These output lines can be summarized as follows:
OK | * |
Root | + <root> |
Compound | - |
Miss | & <original> <count> <offset>: <miss>, <miss>, ..., <guess>, ... |
Guess | ? <original> 0 <offset>: <guess>, <guess>, ... |
None | # <original> <offset> |
For example, a dummy dictionary containing the words "fray," "Frey," "fry," and "refried" might produce the following response to the command:
$ echo 'frqy refries' | ispell -a -m -d ./test.hash
(#) International Ispell Version 3.0.05 (beta), 08/10/91
& frqy 3 0: fray, Frey, fry
& refries 1 5: refried, re+fry-y+ies
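A driving program has to split these result lines back into their fields. The following is a minimal sketch of such a parser, based only on the line formats summarized above; the function name and the dictionary keys are hypothetical, and lines it does not recognize (such as the version message or the blank terminator) are returned as None.

```python
def parse_result(line):
    """Parse one result line from -a/-A mode into a dictionary."""
    line = line.rstrip("\n")
    if line == "*":
        return {"kind": "ok"}
    if line == "-":
        return {"kind": "compound"}
    if line.startswith("+ "):
        return {"kind": "root", "root": line[2:]}
    if line.startswith("# "):
        _, word, offset = line.split(" ")
        return {"kind": "none", "word": word, "offset": int(offset)}
    if line[:2] in ("& ", "? "):
        # Header fields come before the colon, suggestions after it.
        head, _, tail = line.partition(": ")
        _, word, count, offset = head.split(" ")
        items = tail.split(", ") if tail else []
        n = int(count)
        return {
            "kind": "miss" if line[0] == "&" else "guess",
            "word": word,
            "offset": int(offset),
            "misses": items[:n],   # the first <count> entries are near misses
            "guesses": items[n:],  # anything after them is a suggested derivation
        }
    return None
```

Applied to the line "& refries 1 5: refried, re+fry-y+ies", the sketch reports one near miss ("refried") and one guess ("re+fry-y+ies") at offset 5.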
When in the -a and -A modes, ispell(1) will also accept lines of single words prefixed with any of '*', '&', '@', '+', '-', '~', '#', '!', '%', or '^'. A line starting with '*' tells ispell(1) to insert the word into the user's dictionary (similar to the I command). A line starting with '&' tells ispell(1) to insert an all-lowercase version of the word into the user's dictionary (similar to the U command). A line starting with '@' causes ispell(1) to accept this word in the future (similar to the A command).
A line starting with '+', followed immediately by tex or nroff, will cause ispell(1) to parse future input according to the syntax of that formatter. A line consisting solely of a '+' will place ispell(1) in TeX/LaTeX mode (similar to the -t option) and '-' returns ispell(1) to nroff/troff mode (but these commands are obsolete). However, string character type is not changed; the '~' command must be used to do this. A line starting with '~' causes ispell(1) to set internal parameters (in particular, the default string character type) based on the file name given in the rest of the line. (A file suffix is sufficient, but the period must be included. Instead of a file name or suffix, a unique name, as listed in the language affix file, can be specified.) However, the formatter parsing is not changed; the '+' command must be used to change the formatter.
A line prefixed with '#' will cause the personal dictionary to be saved. A line prefixed with '!' will turn on terse mode (see below), and a line prefixed with '%' will return ispell(1) to normal (non-terse) mode. Any input following the prefix characters '+', '-', '#', '!', or '%' is ignored, as is any input following the file name on a '~' line. To allow spell-checking of lines beginning with these characters, a line starting with '^' has that character removed before it is passed to the spell-checking code.
It is recommended that programmatic interfaces prefix every data line with an up arrow (^) to protect against future changes in ispell(1).
To summarize these:
* | Add to personal dictionary |
& | Add all-lowercase version to personal dictionary |
@ | Accept word, but leave out of dictionary |
# | Save current personal dictionary |
~ | Set parameters based on file name |
+ | Enter TeX mode |
- | Exit TeX mode |
! | Enter terse mode |
% | Exit terse mode |
^ | Spell-check rest of line |
In terse mode, ispell(1) will not print lines beginning with '*', '+', or '-', all of which indicate correct words. This significantly improves running speed when the driving program is going to ignore correct words anyway.
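A program that drives ispell(1) through a pipe can follow the recommendation above by prefixing every data line with '^' and, if it only cares about misspellings, switching on terse mode first. The sketch below only builds the text to send; the function name and parameters are hypothetical, and starting the ispell(1) process and reading its replies are left out.

```python
def build_input(text_lines, terse=True, save=False):
    """Build the text to write to an ispell -a process.

    text_lines are the lines to be spell-checked, without newlines.
    """
    out = []
    if terse:
        out.append("!")        # suppress '*', '+', and '-' result lines
    for line in text_lines:
        out.append("^" + line)  # '^' is stripped before checking
    if save:
        out.append("#")        # ask for the personal dictionary to be saved
    return "\n".join(out) + "\n"
```

Because of the '^' prefix, a data line such as "*star" is spell-checked rather than being read as an "add to personal dictionary" command.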
The -s option is only valid in conjunction with the -a or -A options, and only on BSD-derived systems. If specified, ispell(1) will stop itself with a SIGTSTP signal after each line of input. It will not read more input until it receives a SIGCONT signal. This can be useful for handshaking with certain text editors.
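The -f option is only valid in conjunction with the -a or -A options. If -f is specified, ispell(1) writes its results to the given file rather than to the standard output.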
The -c, -e[1-4] and -D options of ispell(1) are intended primarily for use by the munchlist(1) shell script. The -c switch causes a list of words to be read from the standard input. For each word, a list of possible root words and affixes will be written to the standard output. Some of the root words will be illegal and must be filtered from the output by other means; the munchlist(1) script does this. For example:
$ echo BOTHER | ispell -c
BOTHER BOTHE/R BOTH/R
The -e switch is the reverse of -c; it expands affix flags to produce a list of words, as in the command:
$ echo BOTH/R | ispell -e
BOTH BOTHER
An optional expansion level can also be specified. A level of 1 is the same as -e alone. A level of 2 causes the original root/affix combination to be prepended to the line:
BOTH/R BOTH BOTHER
A level of 3 causes multiple lines to be output, one for each generated word, with the original root/affix combination followed by the word it creates:
BOTH/R BOTH
BOTH/R BOTHER
A level of 4 causes a floating-point number to be appended to each of the level-3 lines, giving the ratio of the total length of all generated words (including the root) to the length of the root. For BOTH/R, that is (4 + 6) / 4 = 2.5:
BOTH/R BOTH 2.500000
BOTH/R BOTHER 2.500000
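The four expansion levels differ only in how the same expanded words are laid out. The sketch below reproduces the layouts shown above from an entry and an already-expanded word list; it does not apply affix rules itself, the function name is hypothetical, and the ratio is computed as total generated length divided by root length, which is an inference from the 2.500000 figure in the example.

```python
def format_expansion(entry, words, level=1):
    """Lay out expanded words the way the -e levels 1 to 4 are described.

    entry is the root/flags string (for example "BOTH/R"); words is the
    list of words it expands to, root first.
    """
    root = entry.split("/")[0]
    if level == 1:
        return [" ".join(words)]
    if level == 2:
        # Original root/affix combination prepended to the single line.
        return [entry + " " + " ".join(words)]
    # Levels 3 and 4: one line per generated word.
    lines = [entry + " " + w for w in words]
    if level == 4:
        ratio = sum(len(w) for w in words) / len(root)
        lines = ["%s %f" % (line, ratio) for line in lines]
    return lines
```

For the entry "BOTH/R" with words BOTH and BOTHER, level 4 gives the two lines ending in 2.500000, since (4 + 6) / 4 = 2.5.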
The -D flag causes the affix tables from the dictionary file to be dumped to standard output.
Unless your system administrator has suppressed the feature to save space, ispell(1) is aware of the correct capitalizations of words in the dictionary and in your personal dictionary. As well as recognizing words that must be capitalized (such as proper names) and words that must be all uppercase (such as acronyms), it can also handle words with unusual capitalization (such as "ITCorp" or "TeX"). If a word is capitalized incorrectly, the list of possibilities will include all acceptable capitalizations. (More than one capitalization might be acceptable; for example, my dictionary lists both "ITCorp" and "ITcorp.")
Although this feature is usually quite predictable, there is something of which you should be aware. If you use "I" to add a word to your dictionary that is at the beginning of a sentence (such as the first word of this paragraph if "although" were not in the dictionary), it will be marked as "capitalization required". A subsequent usage of this word without capitalization (that is, the quoted word in the previous sentence) will be considered a misspelling by ispell(1), and it will suggest the capitalized version. You must then compare the actual spellings by eye, and then type "I" to add the uncapitalized variant to your personal dictionary. You can prevent this problem, however, by using "U" to add the original word, rather than "I".
The rules for capitalization are as follows:
The version of ispell(1) supplied with Interix does not include the scripts and tools for building dictionary files.
The original reference page lists these as bugs:
It takes several to many seconds for ispell(1) to read in the hash table, depending on size.
When all options are enabled, ispell(1) might take several seconds to generate all the guesses at corrections for a misspelled word; on slower computers this time is long enough to be annoying.
The hash table is stored as a quarter-megabyte (or larger) array, so a PDP-11 or 286 version does not seem likely.
The ispell(1) utility should understand more troff(1) syntax, and deal more intelligently with contractions.
Although small personal dictionaries are sorted before they are written out, the order of capitalizations of the same word is somewhat random.
When the -x flag is specified, ispell(1) will unlink any existing .bak file.
There are too many flags, and many of them have nonmnemonic names.
Pace Willisson (pace@mit-vax), 1983, based on the PDP-10 assembly version. That version was written by R. E. Gorin in 1971, and later revised by W. E. Matson (1974) and W. B. Ackerman (1978).
Collected, revised, and enhanced for the Usenet by Walt Buehring, 1987.
Table-driven multi-lingual version by Geoff Kuenning, 1987-88.
Large dictionaries provided by Bob Devine (vianet!devine).
A complete list of contributors is too large to list here, but is distributed with the ispell sources in the file "Contributors."
The version of ispell described in this topic is International Ispell Version 3.1.00, 10/08/93.
spell(1)
egrep(1)
join(1)
sort(1)