A program that works out the meaning of newly coined words using the online encyclopaedia Wikipedia could help machines understand the slang used in blogs and other informal texts, say researchers.
The program – called Zeitgeist – hunts through Wikipedia looking for entries about new words that do not appear in an online resource called WordNet, an official linguistics tool that is both a dictionary and a thesaurus. WordNet is used by researchers to help computers understand human language. New words, or neologisms, that do not appear in WordNet inevitably leave computers stumped.
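The article does not say how Zeitgeist is implemented, but the filtering step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `find_neologisms` is hypothetical, a plain list of titles stands in for Wikipedia entries, and a small word list stands in for WordNet. The toy data is invented and says nothing about what WordNet actually contains.

```python
def find_neologisms(encyclopedia_titles, known_lexicon):
    """Return single-word titles that are missing from the known lexicon.

    Hypothetical helper: not Zeitgeist's actual method, only an
    illustration of "in the encyclopaedia but not in the dictionary".
    """
    known = {word.lower() for word in known_lexicon}
    candidates = []
    for title in encyclopedia_titles:
        word = title.strip().lower()
        # Keep purely alphabetic single words; multi-word titles are skipped.
        if word.isalpha() and word not in known and word not in candidates:
            candidates.append(word)
    return candidates


# Toy stand-ins (assumed data) for Wikipedia titles and WordNet entries.
titles = ["Dog", "Blog", "Podcast", "New York", "Thesaurus", "podcast"]
lexicon = ["dog", "thesaurus", "dictionary"]

print(find_neologisms(titles, lexicon))  # ['blog', 'podcast']
```

A real system would still need to work out what each candidate means, for example from the text of its encyclopaedia entry. This sketch covers only the step of spotting words the dictionary lacks.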
"Zeitgeist is a neat tool," adds Carrol. But he points out that its limitations mean it can handle only 75% of the neologisms it finds in Wikipedia. Another technique is to use the context of a new word to guess at its meaning, he says. Adding that ability to Zeitgeist could make it much more powerful.