This paper examines a possible solution to the problem of disambiguating polysemous nouns in machine translation. Latent Semantic Analysis (LSA), a statistical method of finding and representing word sense, is used to differentiate between the meanings of an ambiguous word according to its context. A collection of training texts is sorted by polysemous word and by meaning. A word-by-text matrix is created from this data and transformed by the LSA method, creating a vector for each text that defines it in terms of the (non-polysemous) words that appear in it. These representations of textual meaning are compared to the context of an ambiguous word to determine the most similar meaning. The viability of this LSA model is compared with that of a simple Bayesian probability model.
Faculty Advisor: Simon Levy
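
The LSA pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the implementation used in the paper. The toy training texts, the function names (build_matrix, cosine, disambiguate), the use of raw term counts, the choice of k = 2 dimensions, and the rule of scoring each sense by its best-matching text are all assumptions made for illustration. The sketch builds a word-by-text matrix, reduces it with a truncated singular value decomposition, projects the context of the ambiguous word into the same space, and picks the sense whose training text is most similar by cosine.

```python
import numpy as np

# Hypothetical toy training texts for the polysemous noun "bank",
# grouped by sense. The ambiguous word itself is left out of the texts.
TRAINING = {
    "finance": ["deposit money account loan", "money loan interest account"],
    "river": ["water shore fishing boat", "river water boat shore"],
}


def build_matrix(training):
    """Return the word-by-text count matrix, the word index, and text senses."""
    texts = [(sense, t.split()) for sense, ts in training.items() for t in ts]
    vocab = sorted({w for _, words in texts for w in words})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(texts)))
    for j, (_, words) in enumerate(texts):
        for w in words:
            A[index[w], j] += 1.0  # raw counts (an assumption; no weighting)
    senses = [sense for sense, _ in texts]
    return A, index, senses


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def disambiguate(context, training=TRAINING, k=2):
    """Pick the sense whose training text is closest to the context in LSA space."""
    A, index, senses = build_matrix(training)
    # Truncated SVD: keep the k leading left singular vectors.
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    # Each row is one training text expressed in the k-dimensional space.
    text_vecs = A.T @ U_k
    # Fold the context into the same space; unknown words are ignored.
    q = np.zeros(A.shape[0])
    for w in context.split():
        if w in index:
            q[index[w]] += 1.0
    q_vec = q @ U_k
    # Score each sense by its best-matching training text.
    scores = {}
    for sense, vec in zip(senses, text_vecs):
        scores[sense] = max(scores.get(sense, -1.0), cosine(q_vec, vec))
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    sense, scores = disambiguate("interest money deposit")
    print(sense, scores)
```

In this toy data the two senses share no vocabulary, so the reduced space separates them cleanly. Real training texts overlap, and the number of retained dimensions then becomes a parameter that has to be tuned.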