A good indicator of whether a person knows the meaning of a word is the ability to use it appropriately in a sentence (Miller and Gildea, 1987). Much information about usage can be obtained from quite a limited context: Choueka and Lusignan (1985) found that people can typically recognize the intended sense of a polysemous word by looking at a narrow window of one or two words around it. Statistically-based computer programs have been able to do the same with a high level of accuracy (Kilgarriff and Palmer, 2000). The goal of our work is to automatically identify inappropriate usage of specific vocabulary words in essays by looking at the local contextual cues around a target word. We have developed a statistical system, ALEK (Assessing LExical Knowledge), for this purpose.

A major objective of this research is to avoid the laborious and costly process of collecting errors (or negative evidence) for each word that we wish to evaluate. Instead, we train ALEK on a general corpus of English and on edited text containing example uses of the target word. The system identifies inappropriate usage based on differences between the word's local context cues in an essay and the models of context it has derived from the corpora of well-formed sentences. The problem of error detection does not entail finding similarities to appropriate usage; rather, it requires identifying one element among the contextual cues that simply does not fit. An incorrect usage can contain two or three salient contextual elements as well as a single anomalous element.

A requirement for ALEK has been that all steps in the process be automated, beyond choosing the words to be tested and assessing the results. Once a target word is chosen, preprocessing, building a model of the word's appropriate usage, and identifying usage errors in essays are performed without manual intervention. ALEK has been developed using the Test of English as a Foreign Language (TOEFL) administered by the Educational Testing Service. TOEFL is taken by foreign students who are applying to US undergraduate and graduate-level programs.

Approaches to detecting errors by non-native writers typically produce grammars that look for specific expected error types (Schneider and McCoy, 1998; Park, Palmer and Washburn, 1997). Under this approach, essays written by ESL students are collected and examined for errors. One such system was tested on eight essays, but precision and recall figures are not reported. Comparison of results across systems is difficult because there is no generally accepted test set or performance baseline.

The unsupervised techniques that we present for inferring negative evidence are effective in recognizing grammatical errors in written text. While ALEK is applied here to errors made by non-native writers, its techniques could also be incorporated into a grammar checker for native speakers.
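To make the context-comparison idea concrete, the sketch below is a minimal illustration rather than ALEK's actual statistical measures: it counts the words that appear in a narrow window around a target word in well-formed reference sentences, then flags context words in an essay sentence whose smoothed probability under that model falls below a threshold. The function names, the toy reference sentences, and the threshold value are all assumptions made for illustration only.

    from collections import Counter
    from math import log2

    def train_context_model(sentences, target, window=2):
        # Count the words observed within +/- window tokens of the target
        # word in well-formed reference text (a stand-in for the corpora of
        # edited sentences the system is trained on).
        counts = Counter()
        total = 0
        for sentence in sentences:
            tokens = sentence.lower().split()
            for i, tok in enumerate(tokens):
                if tok == target:
                    context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                    counts.update(context)
                    total += len(context)
        return counts, total

    def flag_unusual_context(sentence, target, counts, total, threshold, window=2):
        # Flag context words around the target whose add-one-smoothed
        # log-probability under the reference model falls below the threshold,
        # i.e. cues that do not fit the word's usual local context.
        tokens = sentence.lower().split()
        vocab = len(counts) + 1
        flagged = []
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            for w in context:
                log_prob = log2((counts[w] + 1) / (total + vocab))
                if log_prob < threshold:
                    flagged.append(w)
        return flagged

    # Toy usage: train on edited sentences containing the target word, then
    # check an essay sentence. The threshold is tuned to this tiny corpus;
    # a real system would set it on held-out well-formed text.
    reference = [
        "the committee will consider the proposal next week",
        "please consider the consequences of this decision",
        "we must consider the matter carefully",
    ]
    counts, total = train_context_model(reference, "consider")
    print(flag_unusual_context("we must consider about the problem",
                               "consider", counts, total, threshold=-3.5))
    # -> ['about'] : the anomalous cue in "consider about" is flagged

A real system would need far more than raw window counts (for example, association statistics over large corpora and part-of-speech generalization) to avoid over-flagging rare but legitimate contexts, but the sketch shows how negative evidence can be inferred from positive examples alone.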