Public API
CLI utilities
- hunspellcheck.hunspellchecker_argument_parser(parser, version=False, version_prog=None, version_number=None, hunspell_version=True, ispell_version=True, version_template='{% if version_number %}{{version_prog}} {{version_number}}{% endif %}{% if version_number and (hunspell_version or ispell_version) %} - {% endif %}{% if hunspell_version %}Hunspell {{hunspell_version}}{% endif %}{% if hunspell_version and ispell_version %} - {% endif %}{% if ispell_version %}Ispell {{ispell_version}}{% endif %}', version_template_context={}, version_name_or_flags=['--version'], version_kwargs={}, files=True, files_kwargs={}, languages=True, languages_name_or_flags=['-l', '--language'], languages_kwargs={}, negotiate_languages=True, personal_dicts=True, personal_dicts_name_or_flags=['-p', '--personal-dict'], personal_dicts_kwargs={}, encoding=True, encoding_name_or_flags=['-i', '--input-encoding'], encoding_kwargs={}, digits_are_words=True, digits_are_words_name_or_flags=['--digits-are-words'], digits_are_words_kwargs={}, words_not_contain_digits=True, words_not_contain_digits_name_or_flags=['--words-not-contain-digits'], words_not_contain_digits_kwargs={}, words_not_startswith_dash=True, words_not_startswith_dash_name_or_flags=['--words-not-startswith-dash'], words_not_startswith_dash_kwargs={}, words_not_endswith_dash=True, words_not_endswith_dash_name_or_flags=['--words-not-endswith-dash'], words_not_endswith_dash_kwargs={}, words_not_contain_dash=True, words_not_contain_dash_name_or_flags=['--words-not-contain-dash'], words_not_contain_dash_kwargs={}, words_not_contain_two_upper=True, words_not_contain_two_upper_name_or_flags=['--words-not-contain-two-upper'], words_not_contain_two_upper_kwargs={}, no_include_filename=True, no_include_filename_name_or_flags=['--no-include-filename'], no_include_filename_kwargs={}, no_include_line_number=True, no_include_line_number_name_or_flags=['--no-include-line-number'], no_include_line_number_kwargs={}, no_include_word=True, 
no_include_word_name_or_flags=['--no-include-word'], no_include_word_kwargs={}, no_include_word_line_index=True, no_include_word_line_index_name_or_flags=['--no-include-word-line-index'], no_include_word_line_index_kwargs={}, include_line=True, include_line_name_or_flags=['--include-line'], include_line_kwargs={}, include_text=True, include_text_name_or_flags=['--include-text'], include_text_kwargs={}, include_error_number=True, include_error_number_name_or_flags=['--include-error-number'], include_error_number_kwargs={}, include_near_misses=True, include_near_misses_name_or_flags=['--include-near-misses'], include_near_misses_kwargs={})
Extends an argparse.ArgumentParser instance by adding common spellchecking parameters. By default, it adds the following parameters:
- A positional argument, stored as the property files inside the options namespace, which takes one or more globs as input.
- A required argument -l/--language, which can be passed multiple times and takes language dictionary names or file paths. It checks whether each passed language is recognized by Hunspell (or, for a dictionary file, whether it exists) and, if not, prints a list of all available dictionaries.
- An optional argument -p/--personal-dict, which can be passed multiple times and takes a path to a file of words to exclude from being reported as positives.
- An optional argument -i/--input-encoding, which defines the encoding of the input content.
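As a rough illustration of the kind of parser these defaults describe, the following sketch builds a comparable interface with plain argparse. It is not hunspellcheck's implementation: the program name, the destination names personal_dicts and encoding, the use of action="append" and the help strings are all assumptions made for the example.

```python
import argparse

# Hand-written stand-in for the default arguments described above.
# Destination names, actions and help texts are assumptions for illustration.
parser = argparse.ArgumentParser(prog="myspellchecker")
parser.add_argument("files", nargs="*", help="files or globs to check")
parser.add_argument(
    "-l", "--language", dest="languages", action="append", required=True,
    help="dictionary name or path; may be passed multiple times",
)
parser.add_argument(
    "-p", "--personal-dict", dest="personal_dicts", action="append",
    default=[], help="file with words to exclude; may be passed multiple times",
)
parser.add_argument(
    "-i", "--input-encoding", dest="encoding", default=None,
    help="encoding of the input content",
)

opts = parser.parse_args(
    ["-l", "es_ES", "-l", "en_US", "-p", "words.txt", "docs/*.md"]
)
print(opts.languages)       # ['es_ES', 'en_US']
print(opts.personal_dicts)  # ['words.txt']
print(opts.files)           # ['docs/*.md']
```

The point of hunspellchecker_argument_parser is to avoid writing this boilerplate by hand and to add the dictionary validation on top of it.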
- Parameters
version (bool) – Include a convenient --version option that prints the version of the program and, optionally, the installed versions of Hunspell and Ispell. See the version_prog, version_number, hunspell_version and ispell_version parameters below.
version_prog (str) – Name of the program shown along with the version. If not provided, it is taken from the parser.prog property.
version_number (str) – Version of the program. See the version_template argument below for details about the formatting.
hunspell_version (bool) – Include the version of Hunspell in the output shown when passing --version.
ispell_version (bool) – Include the version of Ispell in the output shown when passing --version.
version_template (str) – Template for version rendering, passed to a jinja2.Template object that is used to render the version string. By default, if version_number is provided and hunspell_version and ispell_version are True, it renders a string like "<version_prog> <X.Y.Z> - Hunspell <X.Y.Z> - Ispell <X.Y.Z>". By default, the data for template rendering is composed of the following fields: version_prog, version_number, hunspell_version and ispell_version. To pass other fields, include them in the version_template_context argument.
version_template_context (dict) – Additional data to use when rendering the version string.
version_name_or_flags (list, str) – Flag names used to construct the --version argument with the method argparse.ArgumentParser.add_argument().
version_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when constructing the --version option.
files (bool) – Include the files positional argument in the argument parser.
files_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when constructing the files positional argument.
languages (bool) – Include the -l/--language option in the argument parser.
languages_name_or_flags (list, str) – Flag names used to construct the -l/--language option with the method argparse.ArgumentParser.add_argument().
languages_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when constructing the -l/--language option.
negotiate_languages (bool) – Enables language negotiation. If this is enabled and the CLI consumer passes a locale code instead of a full language name (for example es instead of es_ES), hunspellcheck converts es to an available territorialized language dictionary name using the function babel.core.Locale.negotiate(). If it is disabled, a language dictionary passed as a locale code like es is considered invalid.
personal_dicts (bool) – Include the -p/--personal-dict option in the argument parser.
personal_dicts_name_or_flags (list, str) – Flag names used to construct the -p/--personal-dict option with the method argparse.ArgumentParser.add_argument().
personal_dicts_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when constructing the -p/--personal-dict option.
encoding (bool) – Include the -i/--input-encoding Hunspell option in the argument parser.
encoding_name_or_flags (list, str) – Flag names used to construct the -i/--input-encoding option with the method argparse.ArgumentParser.add_argument().
encoding_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the -i/--input-encoding option.
digits_are_words (bool) – Include the --digits-are-words option, which defines whether a value made up entirely of digits is considered a word to be spellchecked.
digits_are_words_name_or_flags (list, str) – Flag names used to construct the --digits-are-words option with the method argparse.ArgumentParser.add_argument().
digits_are_words_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --digits-are-words option.
words_not_contain_digits (bool) – Include the --words-not-contain-digits option. When it is passed in the CLI, words that contain digits are ignored when checking for misspellings.
words_not_contain_digits_name_or_flags (list) – Flag names used to construct the --words-not-contain-digits option with the method argparse.ArgumentParser.add_argument().
words_not_contain_digits_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-contain-digits option.
words_not_startswith_dash (bool) – Include the --words-not-startswith-dash option. When it is passed in the CLI, words starting with the "-" character are ignored when checking for misspellings.
words_not_startswith_dash_name_or_flags (list) – Flag names used to construct the --words-not-startswith-dash option with the method argparse.ArgumentParser.add_argument().
words_not_startswith_dash_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-startswith-dash option.
words_not_endswith_dash (bool) – Include the --words-not-endswith-dash option. When it is passed in the CLI, words ending with the "-" character are ignored when checking for misspellings.
words_not_endswith_dash_name_or_flags (list) – Flag names used to construct the --words-not-endswith-dash option with the method argparse.ArgumentParser.add_argument().
words_not_endswith_dash_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-endswith-dash option.
words_not_contain_dash (bool) – Include the --words-not-contain-dash option. When it is passed in the CLI, words containing the "-" character are ignored when checking for misspellings.
words_not_contain_dash_name_or_flags (list) – Flag names used to construct the --words-not-contain-dash option with the method argparse.ArgumentParser.add_argument().
words_not_contain_dash_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-contain-dash option.
words_not_contain_two_upper (bool) – Include the --words-not-contain-two-upper option. When it is passed in the CLI, words containing two or more uppercase letters are ignored when checking for misspellings.
words_not_contain_two_upper_name_or_flags (list) – Flag names used to construct the --words-not-contain-two-upper option with the method argparse.ArgumentParser.add_argument().
words_not_contain_two_upper_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-contain-two-upper option.
no_include_filename (bool) – Include the --no-include-filename option. When it is passed in the CLI, the paths of the files in which misspellings are found are not shown in the output.
no_include_filename_name_or_flags (list) – Flag names used to construct the --no-include-filename option with the method argparse.ArgumentParser.add_argument().
no_include_filename_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-filename option.
no_include_line_number (bool) – Include the --no-include-line-number option. When it is passed in the CLI, the numbers of the lines in which misspellings are found are not shown in the output.
no_include_line_number_name_or_flags (list) – Flag names used to construct the --no-include-line-number option with the method argparse.ArgumentParser.add_argument().
no_include_line_number_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-line-number option.
no_include_word (bool) – Include the --no-include-word option. When it is passed in the CLI, the misspelled words are not shown in the output.
no_include_word_name_or_flags (list) – Flag names used to construct the --no-include-word option with the method argparse.ArgumentParser.add_argument().
no_include_word_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-word option.
no_include_word_line_index (bool) – Include the --no-include-word-line-index option. When it is passed in the CLI, the indexes of the misspelled words inside their lines are not shown in the output.
no_include_word_line_index_name_or_flags (list) – Flag names used to construct the --no-include-word-line-index option with the method argparse.ArgumentParser.add_argument().
no_include_word_line_index_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-word-line-index option.
include_line (bool) – Include the --include-line option. When it is passed in the CLI, the lines in which misspelled words are found are shown in the output.
include_line_name_or_flags (list) – Flag names used to construct the --include-line option with the method argparse.ArgumentParser.add_argument().
include_line_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-line option.
include_text (bool) – Include the --include-text option. When it is passed in the CLI, the full text in which the misspelled words reside is shown in the output.
include_text_name_or_flags (list) – Flag names used to construct the --include-text option with the method argparse.ArgumentParser.add_argument().
include_text_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-text option.
include_error_number (bool) – Include the --include-error-number option. When it is passed in the CLI, the number of each error is shown in the output.
include_error_number_name_or_flags (list) – Flag names used to construct the --include-error-number option with the method argparse.ArgumentParser.add_argument().
include_error_number_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-error-number option.
include_near_misses (bool) – Include the --include-near-misses option. When it is passed in the CLI, some Hunspell suggestions are shown for each misspelled word in the report.
include_near_misses_name_or_flags (list) – Flag names used to construct the --include-near-misses option with the method argparse.ArgumentParser.add_argument().
include_near_misses_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-near-misses option.
Examples
>>> import argparse
>>> from hunspellcheck import hunspellchecker_argument_parser
>>>
>>> parser = argparse.ArgumentParser()
>>> hunspellchecker_argument_parser(
...     parser,
...     version=True,
...     version_number="1.0.0",
... )
>>> opts = parser.parse_args(["--language", "es"])
>>> print(opts)
Namespace(languages=["es_ES"])
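To make the branching of the default version_template easier to follow, here is a plain-Python sketch that produces the same strings without Jinja2. The helper render_version is hypothetical (it is not part of hunspellcheck), and the version numbers are placeholder values.

```python
def render_version(version_prog, version_number=None,
                   hunspell_version=None, ispell_version=None):
    """Mirror the branching of the default version_template in plain Python."""
    parts = []
    if version_number:
        # The program name is only shown together with a version number.
        parts.append(f"{version_prog} {version_number}")
    if hunspell_version:
        parts.append(f"Hunspell {hunspell_version}")
    if ispell_version:
        parts.append(f"Ispell {ispell_version}")
    # The template places " - " between every pair of rendered sections.
    return " - ".join(parts)


# Placeholder version numbers, chosen only for the example.
print(render_version("myprog", "1.0.0", "1.7.0", "3.2.06"))
# myprog 1.0.0 - Hunspell 1.7.0 - Ispell 3.2.06
print(render_version("myprog", hunspell_version="1.7.0"))
# Hunspell 1.7.0
```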
Spellchecker interface
- class hunspellcheck.HunspellChecker(filenames_contents, languages, personal_dicts=None, looks_like_a_word=<function looks_like_a_word>, encoding=None)
Main spellchecking interface of hunspellcheck.
- Parameters
filenames_contents (dict) – Dictionary mapping filenames to the content of those files.
languages (list, str) – Languages against which the contents will be checked.
personal_dicts (str, list) – Files used as dictionaries of custom words to exclude from being reported as positives. Can be globs or file paths, as a string or a list of strings.
looks_like_a_word (types.FunctionType) – Function that filters which values are checked. It takes a possible word string and returns whether the value should be considered a word to be checked for misspellings. By default, a basic validator built by hunspellcheck.word.looks_like_a_word_creator() with all its default arguments is used.
encoding (str) – Input encoding. If not defined, it is autodetected by Hunspell.
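HunspellChecker takes file contents rather than paths, so the caller has to read the files first. The following is a minimal sketch of one way to build that mapping with the standard library. read_files is a hypothetical helper, not part of hunspellcheck; the dictionary it returns has the shape expected by filenames_contents.

```python
import glob
import os
import tempfile


def read_files(patterns, encoding="utf-8"):
    """Expand each glob and map every matched filename to its content."""
    contents = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding=encoding) as f:
                contents[path] = f.read()
    return contents


# Demo with a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "notes.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("Some text to check\n")
    filenames_contents = read_files([os.path.join(tmpdir, "*.txt")])

print(list(filenames_contents.values()))  # ['Some text to check\n']
```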
- check(include_filename=True, include_line_number=True, include_word=True, include_word_line_index=True, include_line=False, include_text=False, include_error_number=False, include_near_misses=False)
Spellchecking function.
Yields the data of each misspelled word found in the contents, as a generator. The data generated for each word depends on the optional include_<field> arguments passed to this function, where field is the name of the field inside the yielded dictionary.
- Parameters
include_filename (bool) – Include the filename where the misspelled word was found in the yielded error data.
include_line_number (bool) – Include the line number where the misspelled word was found in the content.
include_word (bool) – Include the misspelled word in the yielded error data.
include_word_line_index (bool) – Include the index of the character at which the misspelled word starts in its line (starting at index 0).
include_line (bool) – Include the entire line in which the misspelled word resides.
include_text (bool) – Include the full text of the content in which the misspelled word resides.
include_error_number (bool) – Include the number of the error in the yielded data. This is useful to avoid having to define a counter.
include_near_misses (bool) – Include a list with the near misses for the misspelled word.
- Yields
dict – Dictionary with all the included data for each misspelled word.
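The relationship between the include_<field> flags and the keys of each yielded dictionary can be sketched as follows. This only illustrates the field selection, using the defaults from the signature above. select_fields is a hypothetical helper and the record values are made up; no spellchecking is performed.

```python
# Defaults taken from the check() signature: which fields are on by default.
DEFAULT_FLAGS = {
    "filename": True, "line_number": True, "word": True,
    "word_line_index": True, "line": False, "text": False,
    "error_number": False, "near_misses": False,
}


def select_fields(full_record, **include_flags):
    """Keep only the fields whose include_<field> flag is enabled."""
    flags = dict(DEFAULT_FLAGS)
    for name, enabled in include_flags.items():
        # "include_line" controls the "line" key, and so on.
        flags[name[len("include_"):]] = enabled
    return {field: full_record[field]
            for field, enabled in flags.items() if enabled}


# Hypothetical data for one misspelled word.
full_record = {
    "filename": "README.md",
    "line_number": 3,
    "word": "wrod",
    "word_line_index": 7,
    "line": "This a wrod.",
    "text": "Title\n\nThis a wrod.\n",
    "error_number": 1,
    "near_misses": ["word", "rod"],
}

print(select_fields(full_record))
# {'filename': 'README.md', 'line_number': 3, 'word': 'wrod', 'word_line_index': 7}
print(select_fields(full_record, include_filename=False, include_line=True))
# {'line_number': 3, 'word': 'wrod', 'word_line_index': 7, 'line': 'This a wrod.'}
```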
- hunspellcheck.render_hunspell_word_error(data, fields=['filename', 'word', 'line_number', 'word_line_index'], sep=':')
Renders a misspelled word data dictionary.
This function provides a convenient way to render each misspelled word data dictionary as a string, which can be useful for printing in spell checker command line interfaces.
- Parameters
data (dict) – Misspelled word data, as yielded by the method hunspellcheck.HunspellChecker.check().
fields (list) – List of fields to include in the response.
sep (str) – Separator string between each field value.
- Returns
Misspelled word data as a string.
- Return type
str
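A plausible reading of this function is that it joins the selected field values with the separator. The sketch below shows that behaviour under that assumption. render_error is a hypothetical stand-in, and how the real function treats missing fields is not covered here.

```python
def render_error(data, fields=("filename", "word", "line_number", "word_line_index"),
                 sep=":"):
    """Join the selected field values of one error record into a string."""
    return sep.join(str(data[field]) for field in fields)


# Hypothetical record, shaped like the dictionaries yielded by check().
record = {
    "filename": "README.md",
    "word": "wrod",
    "line_number": 3,
    "word_line_index": 7,
}

print(render_error(record))                                 # README.md:wrod:3:7
print(render_error(record, fields=["word", "filename"], sep=" in "))  # wrod in README.md
```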
- hunspellcheck.word.looks_like_a_word_creator(digits_are_words=False, words_can_contain_digits=True, words_can_startswith_dash=True, words_can_endswith_dash=True, words_can_contain_dash=True, words_can_contain_two_upper=True)
Dynamically generates the function looks_like_a_word, used to filter out the values that must not be checked for misspellings.
- Parameters
digits_are_words (bool) – If False, values in which all characters are digits will not be considered words.
words_can_contain_digits (bool) – If False, values with at least one digit character will not be considered words.
words_can_startswith_dash (bool) – If False, values starting with the - character will not be considered words.
words_can_endswith_dash (bool) – If False, values ending with the - character will not be considered words.
words_can_contain_dash (bool) – If False, values containing the - character will not be considered words.
words_can_contain_two_upper (bool) – If False, values that contain at least two uppercase letters, like CPython, will not be considered words and will not be checked for possible misspellings.
- Returns
Function that takes a possible word as a parameter and returns whether that value is considered a word. This function can be passed to hunspellcheck.spellchecker.HunspellChecker.
- Return type
function
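The documented rules can be expressed as a small closure. The sketch below re-implements them from the parameter descriptions above, with the same defaults. make_word_filter is a hypothetical name, and the library's actual implementation may differ in details such as how empty strings are handled.

```python
def make_word_filter(digits_are_words=False, words_can_contain_digits=True,
                     words_can_startswith_dash=True, words_can_endswith_dash=True,
                     words_can_contain_dash=True, words_can_contain_two_upper=True):
    """Build a predicate that applies the documented rules to a value."""

    def looks_like_a_word(value):
        if not digits_are_words and value.isdigit():
            return False
        if not words_can_contain_digits and any(ch.isdigit() for ch in value):
            return False
        if not words_can_startswith_dash and value.startswith("-"):
            return False
        if not words_can_endswith_dash and value.endswith("-"):
            return False
        if not words_can_contain_dash and "-" in value:
            return False
        if not words_can_contain_two_upper and sum(ch.isupper() for ch in value) >= 2:
            return False
        return True

    return looks_like_a_word


default_filter = make_word_filter()
print(default_filter("2021"))     # False: all digits, digits_are_words is False
print(default_filter("CPython"))  # True: two uppercase letters are allowed by default

strict_filter = make_word_filter(words_can_contain_two_upper=False,
                                 words_can_contain_dash=False)
print(strict_filter("CPython"))     # False
print(strict_filter("well-known"))  # False
print(strict_filter("Python"))      # True
```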
Hunspell utilities
- hunspellcheck.get_hunspell_version(hunspell=True, ispell=True)
Returns the version number of Hunspell and the version of Ispell that the installed Hunspell program is using.
- hunspellcheck.is_valid_dictionary_language(dictionary_name, negotiate_languages=False)
Checks whether a dictionary name is a valid dictionary installed for your Hunspell version.
- Parameters
- Returns
Has 3 values:
The first value is a boolean that indicates whether the language is valid.
The second value is the dictionary language name, which may differ from the input if language negotiation is enabled.
The third value is a list with all available dictionaries.
- Return type
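The three returned values can be pictured with the following sketch. It is a simplified stand-in rather than the library's logic: check_dictionary is a hypothetical helper, the list of available dictionaries is passed in instead of being discovered from Hunspell, and the negotiation is reduced to a prefix match instead of a real locale negotiation.

```python
def check_dictionary(name, available, negotiate_languages=False):
    """Return (is_valid, resolved_name, available), like the documented triple."""
    if name in available:
        return True, name, available
    if negotiate_languages:
        # Simplified stand-in for locale negotiation: pick the first
        # territorialized dictionary whose language part matches.
        for candidate in available:
            if candidate.split("_")[0] == name:
                return True, candidate, available
    return False, name, available


available = ["en_US", "es_ES"]  # hypothetical installed dictionaries
print(check_dictionary("es", available, negotiate_languages=True))
# (True, 'es_ES', ['en_US', 'es_ES'])
print(check_dictionary("es", available))
# (False, 'es', ['en_US', 'es_ES'])
```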
- hunspellcheck.is_valid_dictionary_language_or_filename(value, negotiate_languages=False)
Returns whether a value is a valid dictionary language name or the path of an existing file.
- hunspellcheck.assert_is_valid_dictionary_language_or_filename(value, negotiate_languages=False)
Asserts that a value is a valid dictionary language name or the path of an existing file. If it is not, raises a hunspellcheck.InvalidLanguageDictionaryError.
- Parameters
value (str, list) – Dictionary language/s or filepath/s.
negotiate_languages (bool) – Enable language negotiation from locale name to territory.
- hunspellcheck.gen_available_dictionaries(full_paths=False)
Generates the available dictionaries contained in the search paths configured by Hunspell.
These dictionaries can be used when calling Hunspell without specifying the full path to their location in the system; only their name is needed.
- hunspellcheck.gen_available_dictionaries_with_langcodes(sort=True, full_paths=False)
Generates all available installed dictionaries along with their locale names (without territories).
For example, if es_ES is installed, es will also be included in the response.
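The expansion from dictionary names to locale codes can be sketched like this. with_langcodes is a hypothetical helper that works on a plain list of names; the real generator reads the installed dictionaries from Hunspell, and splitting at the underscore is an assumption about how the locale name is derived.

```python
def with_langcodes(dictionaries, sort=True):
    """Yield each dictionary name plus its territory-less locale code."""
    seen = []
    for name in dictionaries:
        # "es_ES" contributes both "es_ES" and "es".
        for candidate in (name, name.split("_")[0]):
            if candidate not in seen:
                seen.append(candidate)
    yield from (sorted(seen) if sort else seen)


print(list(with_langcodes(["es_ES", "en_US"])))
# ['en', 'en_US', 'es', 'es_ES']
print(list(with_langcodes(["es_ES", "en_US"], sort=False)))
# ['es_ES', 'es', 'en_US', 'en']
```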
- hunspellcheck.list_available_dictionaries(full_paths=False)
Convenient wrapper around the generator hunspellcheck.gen_available_dictionaries() which returns the dictionary names in a list.
- hunspellcheck.print_available_dictionaries(sort=True, stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, full_paths=False)
Prints the available Hunspell dictionaries into a stream.
By default, they are printed to the standard output of the system (STDOUT).
- Parameters