"In the past, people have tried to hand code all of this knowledge," explains Katrin Erk, a professor of linguistics (specializing in lexical semantics) at The University of Texas at Austin, US. "I think it's fair to say that this hasn't been successful. There are just too many little things that humans know."
Watching annotators struggle to make sense of conflicting definitions led Erk to try something new. Instead of hard-coding human logic or deciphering dictionaries, her approach mines vast bodies of text (which are a reflection of human knowledge) and uses the implicit connections between words to create a weighted map of relationships.
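The article does not show Erk's actual system, but the idea of mining implicit connections from text can be sketched with a toy example: counting how often words co-occur in the same sentence yields the weighted links such a map is built from. The mini-corpus and the sentence-level co-occurrence window here are illustrative assumptions, not her method.

```python
from collections import Counter
from itertools import combinations

# A tiny stand-in for the "vast bodies of text" (illustrative only).
corpus = [
    "the newspaper published criminal charges",
    "the court heard the accusations",
    "the phone battery charge ran low",
]

# Count how often each word pair shares a sentence: these counts are the
# weights on the edges of a simple relationship map between words.
cooccur = Counter()
for sentence in corpus:
    words = set(sentence.split())
    for a, b in combinations(sorted(words), 2):
        cooccur[(a, b)] += 1

print(cooccur[("charges", "criminal")])  # → 1: the pair shares one sentence
```

Real systems use far larger corpora and smarter weighting (e.g. discounting very frequent words), but the principle is the same: relationships emerge from the text rather than being hand-coded.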
(Photo: Katrin Erk and TACC) A sentence is translated to logic for inference with the Markov Logic Network, and words are translated to points in space.
"An intuition for me was that you could visualize the different meanings of a word as points in space," she said. "You could think of them as sometimes far apart, like a battery charge and criminal charges, and sometimes close together, like criminal charges and accusations ("the newspaper published charges..."). The meaning of a word in a particular context is a point in this space. Then we don't have to say how many senses a word has. Instead we say: ‘This use of the word is close to this usage in another sentence, but far away from the third use.'"
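The "points in space" intuition above can be made concrete with a small sketch. The three-dimensional vectors below are invented for illustration (real models learn hundreds of dimensions from text); cosine similarity then measures whether two uses of a word sit close together or far apart.

```python
import math

# Toy vectors standing in for learned word representations.
# The numbers are illustrative assumptions, not values from Erk's model.
vectors = {
    "charge_battery":  [0.9, 0.1, 0.0],   # electrical sense
    "charge_criminal": [0.1, 0.8, 0.3],   # legal sense
    "accusation":      [0.0, 0.9, 0.2],
}

def cosine(u, v):
    """Cosine similarity: near 1.0 for similar directions, near 0.0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# The legal sense of "charge" lands near "accusation" but far from the
# battery sense, mirroring the intuition in the quote above.
near = cosine(vectors["charge_criminal"], vectors["accusation"])
far = cosine(vectors["charge_criminal"], vectors["charge_battery"])
print(near > far)  # → True
```

Because closeness is measured rather than enumerated, there is no need to decide in advance how many senses a word has, which is exactly the point Erk makes.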
By treating word meaning as relational rather than fixed, Erk's research draws on emerging ideas in psychology about how the mind handles language and concepts in general. Instead of rigid definitions, concepts have fuzzy boundaries: the meaning, value, and limits of an idea can vary considerably with context or conditions.