nltk.model package
Submodules
nltk.model.api module
- class nltk.model.api.ModelI
Bases: builtins.object
A processing interface for assigning a probability to the next word.
- choose_random_word(context)
Randomly select a word that is likely to appear in this context.
- entropy(text)
Evaluate the total entropy of a message with respect to the model. This is the negated sum of the log probabilities of the words in the message.
nltk.model.ngram module
- class nltk.model.ngram.NgramModel(n, train, pad_left=True, pad_right=False, estimator=None, *estimator_args, **estimator_kwargs)
Bases: nltk.model.api.ModelI
A processing interface for assigning a probability to the next word.
- choose_random_word(context)
Randomly select a word that is likely to appear in this context.
Parameters: context (list(str)) – the context the word is in
- entropy(text)
Calculate the approximate cross-entropy of the n-gram model for a given evaluation text. This is the average negative log probability (base 2) of each word in the text.
Parameters: text (list(str)) – words to use for evaluation
- generate(num_words, context=())
Generate random text based on the language model.
Parameters: - num_words (int) – number of words to generate
- context (list(str)) – initial words in generated string
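The generation loop described above can be illustrated with a minimal, self-contained sketch. This is not the NgramModel implementation: it assumes a plain bigram count table, and the helper names train_bigram_counts, choose_random_word, and generate below are hypothetical stand-ins written for this example. It also assumes that an unseen context falls back to a uniformly chosen known word, which is a simplification.

```python
import random
from collections import Counter, defaultdict


def train_bigram_counts(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        counts[prev][word] += 1
    return counts


def choose_random_word(counts, context, rng):
    """Sample a next word in proportion to its count after the last context word."""
    followers = counts.get(context[-1]) if context else None
    if not followers:
        # Assumption for this sketch: an empty or unseen context falls back
        # to a uniform choice among words that have at least one follower.
        return rng.choice(sorted(counts))
    words = sorted(followers)
    weights = [followers[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(counts, num_words, context=(), rng=None):
    """Return the initial context followed by num_words sampled words."""
    rng = rng or random.Random()
    text = list(context)
    for _ in range(num_words):
        text.append(choose_random_word(counts, text[-1:], rng))
    return text


tokens = "the cat sat on the mat".split()
counts = train_bigram_counts(tokens)
print(generate(counts, 5, context=("the",), rng=random.Random(0)))
```

Passing an explicit random.Random instance keeps the sketch reproducible; a real model of order n would condition on the last n - 1 words rather than only the last one.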
- logprob(word, context)
Evaluate the (negative) log probability of this word in this context.
Parameters: - word (str) – the word to get the probability of
- context (list(str)) – the context the word is in
- perplexity(text)
Calculate the perplexity of the given text. This is 2 ** cross-entropy for the text, where the cross-entropy is the value returned by entropy(text).
Parameters: text (list(str)) – words to calculate perplexity of
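The relationship between logprob, entropy, and perplexity can be sketched independently of NgramModel. The functions below are hypothetical helpers for illustration only: they take any callable prob(word, context), assume base-2 logarithms, and assume no left or right padding, which may differ from what the class does internally.

```python
import math


def logprob(prob, word, context):
    """Negative base-2 log probability of word in context."""
    return -math.log2(prob(word, context))


def entropy(prob, text, n=2):
    """Average negative log probability per word (approximate cross-entropy)."""
    if not text:
        raise ValueError("text must contain at least one word")
    total = 0.0
    for i, word in enumerate(text):
        # Use up to n - 1 preceding words as context (no padding in this sketch).
        context = tuple(text[max(0, i - n + 1):i])
        total += logprob(prob, word, context)
    return total / len(text)


def perplexity(prob, text, n=2):
    """Perplexity is 2 raised to the cross-entropy."""
    return 2.0 ** entropy(prob, text, n)


def uniform_prob(word, context):
    """Toy model: every word has probability 1/4 regardless of context."""
    return 0.25


sample = ["a", "b", "c", "d"]
print(entropy(uniform_prob, sample))     # 2.0 bits per word
print(perplexity(uniform_prob, sample))  # 4.0
```

A uniform model over four words gives 2 bits per word and a perplexity of 4, matching the intuition that perplexity is the effective number of equally likely choices per word.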
- prob(word, context)
Evaluate the probability of this word in this context using Katz backoff.
Parameters: - word (str) – the word to get the probability of
- context (list(str)) – the context the word is in
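To show the idea behind backoff, here is a simplified Katz-style sketch for a bigram model that backs off to unigrams. It is not the NgramModel implementation: it swaps in a fixed absolute discount where Katz's method uses Good-Turing discounting, and the function train_bigram_backoff and its discount parameter are hypothetical names introduced for this example.

```python
from collections import Counter, defaultdict


def train_bigram_backoff(tokens, discount=0.5):
    """Return prob(word, context) for a discounted bigram model with unigram backoff."""
    unigrams = Counter(tokens)
    total = sum(unigrams.values())
    bigrams = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        bigrams[prev][word] += 1

    def unigram_prob(word):
        return unigrams[word] / total

    def prob(word, context):
        prev = context[-1] if context else None
        seen = bigrams.get(prev)
        if not seen:
            # Context never observed with a follower: use the unigram estimate.
            return unigram_prob(word)
        context_count = sum(seen.values())
        if word in seen:
            # Seen bigram: discounted relative frequency.
            return (seen[word] - discount) / context_count
        # Unseen bigram: share the discounted mass among unseen words,
        # in proportion to their unigram probabilities.
        left_over = discount * len(seen) / context_count
        unseen_mass = 1.0 - sum(unigram_prob(w) for w in seen)
        if unseen_mass <= 0.0:
            return 0.0
        return left_over * unigram_prob(word) / unseen_mass

    return prob


tokens = "the cat sat on the mat the cat ran".split()
prob = train_bigram_backoff(tokens)
print(prob("cat", ("the",)))  # seen bigram, discounted estimate
print(prob("sat", ("the",)))  # unseen bigram, backed-off estimate
```

The mass removed from seen bigrams is exactly the mass redistributed to unseen ones, so the probabilities for a given context still sum to 1 over the training vocabulary.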
- unicode_repr()