Public API
CLI utilities
- hunspellcheck.hunspellchecker_argument_parser(parser, version=False, version_prog=None, version_number=None, hunspell_version=True, ispell_version=True, version_template='{% if version_number %}{{version_prog}} {{version_number}}{% endif %}{% if version_number and (hunspell_version or ispell_version) %} - {% endif %}{% if hunspell_version %}Hunspell {{hunspell_version}}{% endif %}{% if hunspell_version and ispell_version %} - {% endif %}{% if ispell_version %}Ispell {{ispell_version}}{% endif %}', version_template_context={}, version_name_or_flags=['--version'], version_kwargs={}, files=True, files_kwargs={}, languages=True, languages_name_or_flags=['-l', '--language'], languages_kwargs={}, negotiate_languages=True, personal_dicts=True, personal_dicts_name_or_flags=['-p', '--personal-dict'], personal_dicts_kwargs={}, encoding=True, encoding_name_or_flags=['-i', '--input-encoding'], encoding_kwargs={}, digits_are_words=True, digits_are_words_name_or_flags=['--digits-are-words'], digits_are_words_kwargs={}, words_not_contain_digits=True, words_not_contain_digits_name_or_flags=['--words-not-contain-digits'], words_not_contain_digits_kwargs={}, words_not_startswith_dash=True, words_not_startswith_dash_name_or_flags=['--words-not-startswith-dash'], words_not_startswith_dash_kwargs={}, words_not_endswith_dash=True, words_not_endswith_dash_name_or_flags=['--words-not-endswith-dash'], words_not_endswith_dash_kwargs={}, words_not_contain_dash=True, words_not_contain_dash_name_or_flags=['--words-not-contain-dash'], words_not_contain_dash_kwargs={}, words_not_contain_two_upper=True, words_not_contain_two_upper_name_or_flags=['--words-not-contain-two-upper'], words_not_contain_two_upper_kwargs={}, no_include_filename=True, no_include_filename_name_or_flags=['--no-include-filename'], no_include_filename_kwargs={}, no_include_line_number=True, no_include_line_number_name_or_flags=['--no-include-line-number'], no_include_line_number_kwargs={}, no_include_word=True, 
no_include_word_name_or_flags=['--no-include-word'], no_include_word_kwargs={}, no_include_word_line_index=True, no_include_word_line_index_name_or_flags=['--no-include-word-line-index'], no_include_word_line_index_kwargs={}, include_line=True, include_line_name_or_flags=['--include-line'], include_line_kwargs={}, include_text=True, include_text_name_or_flags=['--include-text'], include_text_kwargs={}, include_error_number=True, include_error_number_name_or_flags=['--include-error-number'], include_error_number_kwargs={}, include_near_misses=True, include_near_misses_name_or_flags=['--include-near-misses'], include_near_misses_kwargs={})
Extends an argparse.ArgumentParser instance, adding common spellchecking parameters.
By default it adds the following parameters:
- A positional argument, stored as the files property of the options namespace, which accepts multiple globs as input.
- A required argument -l/--language, which can be passed multiple times and takes language dictionary names or file paths. It checks whether each passed language is recognized by Hunspell (or, for a dictionary file, whether it exists) and, if not, prints a list of all available dictionaries.
- An optional argument -p/--personal-dict, which can be passed multiple times and takes the path to a file of words to exclude from being reported as positives.
- An optional argument -i/--input-encoding, which defines the encoding of the input content.
- Parameters
version (bool) – Include a convenient --version option that prints the version of the program and, optionally, the installed versions of Hunspell and Ispell. See the version_prog, version_number, hunspell_version and ispell_version parameters below.
version_prog (str) – Name of the program shown alongside the version. If not provided, it is taken from the parser.prog property.
version_number (str) – Version of the program. See the version_template argument below for details about the formatting.
hunspell_version (bool) – Include the version of Hunspell in the output shown when --version is passed.
ispell_version (bool) – Include the version of Ispell in the output shown when --version is passed.
version_template (str) – Template for version rendering, passed to a jinja2.Template object that is used to render the version string. By default, if version_number is provided and hunspell_version and ispell_version are True, it renders a string like "<version_prog> <X.Y.Z> - Hunspell <X.Y.Z> - Ispell <X.Y.Z>". By default, the data for template rendering is composed of the following fields: version_prog, version_number, hunspell_version and ispell_version. To pass other fields, include them in the version_template_context argument.
version_template_context (dict) – Additional data to use when rendering the version string.
version_name_or_flags (list, str) – Name or flags used to build the --version option with the method argparse.ArgumentParser.add_argument().
version_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --version option.
files (bool) – Include the files positional argument in the argument parser.
files_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the files positional argument.
languages (bool) – Include the -l/--language option in the argument parser.
languages_name_or_flags (list, str) – Name or flags used to build the -l/--language option with the method argparse.ArgumentParser.add_argument().
languages_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the -l/--language option.
negotiate_languages (bool) – Enable language negotiation. If enabled and the CLI consumer passes a locale code instead of a full language name (for example es instead of es_ES), hunspellcheck converts es to an available territorialized language dictionary name using the function babel.core.Locale.negotiate(). If disabled, a language dictionary passed as a locale code like es is considered invalid.
personal_dicts (bool) – Include the -p/--personal-dict option in the argument parser.
personal_dicts_name_or_flags (list, str) – Name or flags used to build the -p/--personal-dict option with the method argparse.ArgumentParser.add_argument().
personal_dicts_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the -p/--personal-dict option.
encoding (bool) – Include the -i/--input-encoding Hunspell option in the argument parser.
encoding_name_or_flags (list, str) – Name or flags used to build the -i/--input-encoding option with the method argparse.ArgumentParser.add_argument().
encoding_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the -i/--input-encoding option.
digits_are_words (bool) – Include the --digits-are-words option, which defines whether a value made up only of digits is considered a word to be spellchecked.
digits_are_words_name_or_flags (list, str) – Name or flags used to build the --digits-are-words option with the method argparse.ArgumentParser.add_argument().
digits_are_words_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --digits-are-words option.
words_not_contain_digits (bool) – Include the --words-not-contain-digits option. When it is passed in the CLI, words that contain digits are ignored when checking for misspelling errors.
words_not_contain_digits_name_or_flags (list) – Name or flags used to build the --words-not-contain-digits option with the method argparse.ArgumentParser.add_argument().
words_not_contain_digits_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-contain-digits option.
words_not_startswith_dash (bool) – Include the --words-not-startswith-dash option. When it is passed in the CLI, words starting with the "-" character are ignored when checking for misspelling errors.
words_not_startswith_dash_name_or_flags (list) – Name or flags used to build the --words-not-startswith-dash option with the method argparse.ArgumentParser.add_argument().
words_not_startswith_dash_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-startswith-dash option.
words_not_endswith_dash (bool) – Include the --words-not-endswith-dash option. When it is passed in the CLI, words ending with the "-" character are ignored when checking for misspelling errors.
words_not_endswith_dash_name_or_flags (list) – Name or flags used to build the --words-not-endswith-dash option with the method argparse.ArgumentParser.add_argument().
words_not_endswith_dash_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-endswith-dash option.
words_not_contain_dash (bool) – Include the --words-not-contain-dash option. When it is passed in the CLI, words containing the "-" character are ignored when checking for misspelling errors.
words_not_contain_dash_name_or_flags (list) – Name or flags used to build the --words-not-contain-dash option with the method argparse.ArgumentParser.add_argument().
words_not_contain_dash_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-contain-dash option.
words_not_contain_two_upper (bool) – Include the --words-not-contain-two-upper option. When it is passed in the CLI, words containing two or more uppercase letters are ignored when checking for misspelling errors.
words_not_contain_two_upper_name_or_flags (list) – Name or flags used to build the --words-not-contain-two-upper option with the method argparse.ArgumentParser.add_argument().
words_not_contain_two_upper_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --words-not-contain-two-upper option.
no_include_filename (bool) – Include the --no-include-filename option. When it is passed in the CLI, the paths of the files in which misspelling errors are found are not shown in the output.
no_include_filename_name_or_flags (list) – Name or flags used to build the --no-include-filename option with the method argparse.ArgumentParser.add_argument().
no_include_filename_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-filename option.
no_include_line_number (bool) – Include the --no-include-line-number option. When it is passed in the CLI, the line numbers where misspelling errors are found are not shown in the output.
no_include_line_number_name_or_flags (list) – Name or flags used to build the --no-include-line-number option with the method argparse.ArgumentParser.add_argument().
no_include_line_number_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-line-number option.
no_include_word (bool) – Include the --no-include-word option. When it is passed in the CLI, the misspelled words themselves are not shown in the output.
no_include_word_name_or_flags (list) – Name or flags used to build the --no-include-word option with the method argparse.ArgumentParser.add_argument().
no_include_word_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-word option.
no_include_word_line_index (bool) – Include the --no-include-word-line-index option. When it is passed in the CLI, the index of each misspelled word inside its line is not shown in the output.
no_include_word_line_index_name_or_flags (list) – Name or flags used to build the --no-include-word-line-index option with the method argparse.ArgumentParser.add_argument().
no_include_word_line_index_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --no-include-word-line-index option.
include_line (bool) – Include the --include-line option. When it is passed in the CLI, the line in which each misspelled word is found is shown in the output.
include_line_name_or_flags (list) – Name or flags used to build the --include-line option with the method argparse.ArgumentParser.add_argument().
include_line_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-line option.
include_text (bool) – Include the --include-text option. When it is passed in the CLI, the text in which each misspelled word resides is shown in the output.
include_text_name_or_flags (list) – Name or flags used to build the --include-text option with the method argparse.ArgumentParser.add_argument().
include_text_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-text option.
include_error_number (bool) – Include the --include-error-number option. When it is passed in the CLI, the number of each error is shown in the output.
include_error_number_name_or_flags (list) – Name or flags used to build the --include-error-number option with the method argparse.ArgumentParser.add_argument().
include_error_number_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-error-number option.
include_near_misses (bool) – Include the --include-near-misses option. When it is passed in the CLI, some Hunspell suggestions are shown for each misspelled word in the report.
include_near_misses_name_or_flags (list) – Name or flags used to build the --include-near-misses option with the method argparse.ArgumentParser.add_argument().
include_near_misses_kwargs (dict) – Optional kwargs that override the default kwargs passed to argparse.ArgumentParser.add_argument() when building the --include-near-misses option.
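The default version_template is dense when read as a single-line Jinja2 expression. The following is a minimal sketch in plain Python of the string that template is documented to produce. The helper name render_version_sketch and the version values are hypothetical, chosen only for illustration; hunspellcheck itself renders the template through jinja2.Template.

```python
def render_version_sketch(version_prog, version_number=None,
                          hunspell_version=None, ispell_version=None):
    """Mirror the conditionals of the default version template (illustrative only)."""
    out = ""
    if version_number:
        out += f"{version_prog} {version_number}"
    # Separator between the program version and the Hunspell/Ispell versions.
    if version_number and (hunspell_version or ispell_version):
        out += " - "
    if hunspell_version:
        out += f"Hunspell {hunspell_version}"
    # Separator between the Hunspell and Ispell versions.
    if hunspell_version and ispell_version:
        out += " - "
    if ispell_version:
        out += f"Ispell {ispell_version}"
    return out


# Placeholder version numbers, not real output of any installation.
print(render_version_sketch("myprog", "1.0.0", "1.7.0", "3.2.06"))
# myprog 1.0.0 - Hunspell 1.7.0 - Ispell 3.2.06
```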
Examples
>>> import argparse
>>> from hunspellcheck import hunspellchecker_argument_parser
>>>
>>> parser = argparse.ArgumentParser()
>>> hunspellchecker_argument_parser(
...     parser,
...     version=True,
...     version_number="1.0.0",
... )
>>> opts = parser.parse_args(["--language", "es"])
>>> print(opts)
Namespace(languages=["es_ES"])
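To make the shape of the default arguments more concrete, here is a rough sketch that builds comparable options by hand with plain argparse. It is not how hunspellcheck constructs its parser: the destination names personal_dicts and encoding, the program name and the sample arguments are assumptions made for this illustration, and no dictionary validation or language negotiation is performed.

```python
import argparse

parser = argparse.ArgumentParser(prog="myspellchecker")
# Positional argument collecting zero or more file globs.
parser.add_argument("files", nargs="*")
# Required option that can be repeated; each use appends to a list.
parser.add_argument("-l", "--language", dest="languages",
                    action="append", required=True)
# Optional, repeatable personal dictionary paths (destination name assumed).
parser.add_argument("-p", "--personal-dict", dest="personal_dicts",
                    action="append", default=[])
# Optional input encoding (destination name assumed).
parser.add_argument("-i", "--input-encoding", dest="encoding", default=None)

opts = parser.parse_args(["-l", "es_ES", "-l", "en_US", "docs/*.md"])
print(opts.languages)  # ['es_ES', 'en_US']
print(opts.files)      # ['docs/*.md']
```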
Spellchecker interface
- class hunspellcheck.HunspellChecker(filenames_contents, languages, personal_dicts=None, looks_like_a_word=<function looks_like_a_word>, encoding=None)
Main spellchecking interface of hunspellcheck.
- Parameters
filenames_contents (dict) – Dictionary mapping filenames to the contents of those files.
languages (list, str) – Languages against which the contents will be checked.
personal_dicts (str, list) – Personal dictionary files containing custom words to exclude from being reported as positives. Can be globs or file paths, given as a string or a list of strings.
looks_like_a_word (types.FunctionType) – Function that filters which values are checked. It takes a possible word string and returns whether the value should be considered a word to check for misspelling errors. By default, a basic validator built by calling hunspellcheck.word.looks_like_a_word_creator() with its default arguments is used.
encoding (str) – Input encoding. If not defined, it will be autodetected by Hunspell.
- check(include_filename=True, include_line_number=True, include_word=True, include_word_line_index=True, include_line=False, include_text=False, include_error_number=False, include_near_misses=False)
Spellchecking function.
Yields the data of each misspelled word found in the contents. The data generated for each word depends on the optional include_<field> arguments passed to this function, where field is the name of the field inside the yielded dictionary.
- Parameters
include_filename (bool) – Include the filename where the misspelled word was found in the yielded error data.
include_line_number (bool) – Include the line number where the misspelled word was found in the content in the yielded error data.
include_word (bool) – Include the misspelled word in the yielded error data.
include_word_line_index (bool) – Include the index of the character at which the misspelled word starts in its line (starting at index 0).
include_line (bool) – Include the entire line of the content in which the misspelled word resides.
include_text (bool) – Include the full text of the content in which the misspelled word resides.
include_error_number (bool) – Include the number of the error in the yielded data. This can be useful to avoid having to define a counter.
include_near_misses (bool) – Include a list with the near misses for the misspelled word.
- Yields
dict – Dictionary with all the included data for each misspelled word.
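The relationship between the include_<field> arguments and the yielded dictionaries can be sketched with a small stand-alone generator. This is only an illustration under stated assumptions: check_sketch is a hypothetical function, a set of known words stands in for Hunspell, words are split on whitespace, and line and error numbers are assumed to start at 1. Only a subset of the documented fields is shown.

```python
def check_sketch(filenames_contents, known_words,
                 include_filename=True, include_line_number=True,
                 include_word=True, include_word_line_index=True,
                 include_line=False, include_error_number=False):
    """Yield one dict per unknown word; `known_words` stands in for Hunspell."""
    error_number = 0
    for filename, text in filenames_contents.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            position = 0
            for word in line.split():
                # Locate the word in its line, searching after the previous word.
                index = line.index(word, position)
                position = index + len(word)
                if word.lower() in known_words:
                    continue
                error_number += 1
                data = {}
                if include_filename:
                    data["filename"] = filename
                if include_line_number:
                    data["line_number"] = line_number
                if include_word:
                    data["word"] = word
                if include_word_line_index:
                    data["word_line_index"] = index
                if include_line:
                    data["line"] = line
                if include_error_number:
                    data["error_number"] = error_number
                yield data


contents = {"foo.txt": "hello wrold\ngood nihgt"}
errors = list(check_sketch(contents, {"hello", "good"}))
print(errors[0])
# {'filename': 'foo.txt', 'line_number': 1, 'word': 'wrold', 'word_line_index': 6}
```

Disabling a flag simply removes the matching key, so check_sketch(contents, {"hello", "good"}, include_filename=False) yields dictionaries without the filename field.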
- hunspellcheck.render_hunspell_word_error(data, fields=['filename', 'word', 'line_number', 'word_line_index'], sep=':')
Renders a misspelled word data dictionary.
This function provides a convenient way to render each misspelled word data dictionary as a string, which can be useful for printing in spell checker command line interfaces.
- Parameters
data (dict) – Misspelled word data, as yielded by the method hunspellcheck.HunspellChecker.check().
fields (list) – List of fields to include in the response.
sep (str) – Separator string placed between field values.
- Returns
Misspelled word data as a string.
- Return type
str
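As a rough illustration of what such rendering could look like, the sketch below joins the selected fields with the separator. render_error_sketch is a hypothetical stand-in, and the assumption that values are emitted in the order given by fields is mine, not something the reference above states.

```python
def render_error_sketch(data,
                        fields=("filename", "word", "line_number", "word_line_index"),
                        sep=":"):
    """Join the selected fields of an error dictionary into one string."""
    return sep.join(str(data[field]) for field in fields)


error = {"filename": "foo.txt", "line_number": 1,
         "word": "wrold", "word_line_index": 6}
print(render_error_sketch(error))  # foo.txt:wrold:1:6
print(render_error_sketch(error, fields=("word",)))  # wrold
```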
- hunspellcheck.word.looks_like_a_word_creator(digits_are_words=False, words_can_contain_digits=True, words_can_startswith_dash=True, words_can_endswith_dash=True, words_can_contain_dash=True, words_can_contain_two_upper=True)
Dynamically generates the looks_like_a_word function used to filter out the values that must not be checked for misspelling errors.
- Parameters
digits_are_words (bool) – If False, values made up only of digits will not be considered words.
words_can_contain_digits (bool) – If False, values with at least one digit character will not be considered words.
words_can_startswith_dash (bool) – If False, values starting with the - character will not be considered words.
words_can_endswith_dash (bool) – If False, values ending with the - character will not be considered words.
words_can_contain_dash (bool) – If False, values containing the - character will not be considered words.
words_can_contain_two_upper (bool) – If False, values that contain at least two uppercase letters, like CPython, will not be considered words and will not be checked for possible misspellings.
- Returns
Function that takes a possible word as a parameter and returns whether that value is considered a word. This function can be passed to hunspellcheck.spellchecker.HunspellChecker.
- Return type
function
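The documented flags map naturally onto a closure factory. The following is a minimal sketch of that pattern, written from the parameter descriptions above rather than from the library's source; the name looks_like_a_word_creator_sketch is hypothetical.

```python
def looks_like_a_word_creator_sketch(
    digits_are_words=False,
    words_can_contain_digits=True,
    words_can_startswith_dash=True,
    words_can_endswith_dash=True,
    words_can_contain_dash=True,
    words_can_contain_two_upper=True,
):
    """Build a validator that applies only the restrictions that are enabled."""

    def looks_like_a_word(value):
        if not digits_are_words and value.isdigit():
            return False
        if not words_can_contain_digits and any(c.isdigit() for c in value):
            return False
        if not words_can_startswith_dash and value.startswith("-"):
            return False
        if not words_can_endswith_dash and value.endswith("-"):
            return False
        if not words_can_contain_dash and "-" in value:
            return False
        # Count uppercase letters; two or more disqualifies the value.
        if not words_can_contain_two_upper and sum(c.isupper() for c in value) >= 2:
            return False
        return True

    return looks_like_a_word


default_validator = looks_like_a_word_creator_sketch()
strict_validator = looks_like_a_word_creator_sketch(words_can_contain_two_upper=False)
print(default_validator("1234"))    # False
print(default_validator("CPython")) # True
print(strict_validator("CPython"))  # False
```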
Hunspell utilities
- hunspellcheck.get_hunspell_version(hunspell=True, ispell=True)
Returns the version number of Hunspell and the version of Ispell that the installed Hunspell program is using.
- hunspellcheck.is_valid_dictionary_language(dictionary_name, negotiate_languages=False)
Checks whether a dictionary name is a valid dictionary installed for your Hunspell version.
- Parameters
- Returns
Three values:
- The first value is a boolean that indicates whether the language is valid.
- The second value is the dictionary language name, which may differ from the input if language negotiation is enabled.
- The third value is a list with all available dictionaries.
- Return type
- hunspellcheck.is_valid_dictionary_language_or_filename(value, negotiate_languages=False)
Returns whether a value is a valid dictionary language name or the path of an existing file.
- hunspellcheck.assert_is_valid_dictionary_language_or_filename(value, negotiate_languages=False)
Asserts that a value is a valid dictionary language name or the path of an existing file. If it is not, raises a hunspellcheck.InvalidLanguageDictionaryError.
- Parameters
value (str, list) – Dictionary language(s) or file path(s).
negotiate_languages (bool) – Enable language negotiation from locale name to territory.
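The validation flow described by these functions can be sketched as follows. Everything here is a simplified stand-in: the function and exception names are hypothetical, the list of available dictionaries is passed in explicitly instead of being discovered from Hunspell, and negotiation is approximated by matching the locale prefix, whereas hunspellcheck is documented to rely on babel.core.Locale.negotiate().

```python
import os


class InvalidLanguageSketchError(ValueError):
    """Hypothetical stand-in for hunspellcheck.InvalidLanguageDictionaryError."""


def is_valid_language_sketch(name, available, negotiate_languages=False):
    """Return (is_valid, resolved_name, available_dictionaries)."""
    if name in available:
        return (True, name, available)
    if negotiate_languages:
        # Simplified negotiation: pick the first dictionary sharing the locale code.
        for candidate in available:
            if candidate.split("_")[0] == name:
                return (True, candidate, available)
    return (False, name, available)


def assert_valid_language_or_filename_sketch(value, available,
                                             negotiate_languages=False):
    """Raise if any value is neither a known dictionary nor an existing file."""
    values = [value] if isinstance(value, str) else value
    for item in values:
        valid, _, _ = is_valid_language_sketch(item, available, negotiate_languages)
        if not valid and not os.path.isfile(item):
            raise InvalidLanguageSketchError(item)


available = ["en_US", "es_ES"]
print(is_valid_language_sketch("es", available, negotiate_languages=True))
# (True, 'es_ES', ['en_US', 'es_ES'])
print(is_valid_language_sketch("es", available))
# (False, 'es', ['en_US', 'es_ES'])
```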
- hunspellcheck.gen_available_dictionaries(full_paths=False)
Generates the available dictionaries contained in the search paths configured by Hunspell.
These dictionaries can be used when calling Hunspell without specifying the full path to their location in the system; only their name is needed.
- hunspellcheck.gen_available_dictionaries_with_langcodes(sort=True, full_paths=False)
Generates all available installed dictionaries along with their locale names (without territories).
For example, if es_ES is installed, es will also be included in the response.
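The es_ES example can be illustrated with a short generator. This is a sketch under assumptions, not the library's code: gen_with_langcodes_sketch is a hypothetical name, dictionary names are supplied by the caller, and the locale code is taken to be the part before the first underscore.

```python
def gen_with_langcodes_sketch(dictionaries, sort=True):
    """Yield each dictionary name plus its territory-less locale code."""
    names = set()
    for name in dictionaries:
        names.add(name)
        # "es_ES" contributes "es"; a name without "_" contributes itself.
        names.add(name.split("_")[0])
    yield from (sorted(names) if sort else names)


print(list(gen_with_langcodes_sketch(["es_ES", "en_US", "en_GB"])))
# ['en', 'en_GB', 'en_US', 'es', 'es_ES']
```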
- hunspellcheck.list_available_dictionaries(full_paths=False)
Convenient wrapper around the generator hunspellcheck.gen_available_dictionaries() which returns the dictionary names in a list.
- hunspellcheck.print_available_dictionaries(sort=True, stream=sys.stdout, full_paths=False)
Prints the available Hunspell dictionaries to a stream.
By default they are printed to the standard output of the system (STDOUT).
- Parameters