C.I. Guinn, D. Crist, and H. St. Werth (USA)
Keywords: Parsing, semantic grammars, probabilistic natural language processing, speech recognition, text abstraction.
In this paper, we contrast two technologies for extracting meaning from text in an application developed to collect fine-grained data on individuals' daily activities, including time, exertion, specific activity, and location. The Environmental Protection Agency (EPA) uses these data to build models of human exposure to various chemicals. We focus on the automatic classification of activity and location from a spoken-language diary collected with a digital recorder. Two natural language processing paradigms were used to extract information from the spoken diary and are compared here: a semantic grammar approach using minimum distance parsing, and a statistical approach using n-grams and Bayesian statistics.
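To make the statistical paradigm concrete, the following is a minimal sketch of one common way to combine n-grams with Bayesian statistics: a naive Bayes classifier over word unigrams and bigrams with add-one smoothing. It is not the system described in the paper. The class name `NGramNaiveBayes`, the helper `ngrams`, the location labels, and the diary utterances are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased utterance."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(words) - n + 1)
    ]


class NGramNaiveBayes:
    """Hypothetical naive Bayes classifier over n-gram counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # additive smoothing constant
        self.label_counts = Counter()               # number of utterances per label
        self.feature_counts = defaultdict(Counter)  # n-gram counts per label
        self.vocab = set()                          # every n-gram seen in training

    def train(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for gram in ngrams(text):
                self.feature_counts[label][gram] += 1
                self.vocab.add(gram)

    def classify(self, text):
        """Return the label with the highest posterior log-probability."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior P(label)
            denom = (sum(self.feature_counts[label].values())
                     + self.alpha * len(self.vocab))
            for gram in ngrams(text):
                if gram not in self.vocab:
                    continue  # n-grams never seen in training are ignored
                numer = self.feature_counts[label][gram] + self.alpha
                score += math.log(numer / denom)  # log P(gram | label)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up diary utterances with assumed location labels.
training_data = [
    ("i am cooking dinner in the kitchen", "kitchen"),
    ("washing dishes in the kitchen", "kitchen"),
    ("driving to work in the car", "vehicle"),
    ("sitting in the car in traffic", "vehicle"),
]

model = NGramNaiveBayes()
model.train(training_data)
print(model.classify("making lunch in the kitchen"))
print(model.classify("stuck in the car"))
```

A classifier of this kind needs labeled example utterances rather than hand-written grammar rules, which is the basic trade-off against a semantic grammar approach. The smoothing constant `alpha` keeps an n-gram that was never observed with a given label from driving that label's probability to zero.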