Do you remember the computer HAL 9000 in Stanley Kubrick’s 1968 science fiction film “2001: A Space Odyssey”? Setting aside the fact that it went a bit OTT and mission crazy towards the end, one of the interesting things about HAL (which stood for Heuristically programmed ALgorithmic computer) was that it could understand and converse in English. No need for inputting via a keyboard, or translation into machine code. Think “lexical semantics”.
Ever since the release of the film, linguists and computer scientists have tried to get computers to understand human language by programming the semantics of language as software. We already have programs that can recognise and distinguish numbers and certain words on our mobiles, when we pay bills over the phone, and even in computer games, but a computer that can understand and be fluent in a human language has eluded us.
That may be changing. Katrin Erk, a linguistics professor at the University of Texas at Austin, is using supercomputers to develop a new method for helping computers learn natural language.
Instead of hard-coding human logic or deciphering dictionaries to try to teach computers language, Erk decided to try a different tactic: feed computers a vast body of texts (as an input of human knowledge) and use the implicit connections between the words to create a map of relationships.
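The idea of mining a text collection for the implicit connections between words can be sketched very roughly as co-occurrence counting: for each word, tally which other words tend to appear near it. This toy snippet (my own illustration, not Erk’s actual system) uses a small sliding window over a tokenised sentence.

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(tokens, window=2):
    """For each word, count how often every other word appears
    within `window` positions of it. The resulting counts act as
    a crude 'map of relationships' between words."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors

# Tiny example corpus (a real system would use millions of words).
tokens = "the newspaper published charges and the court heard the charges".split()
vecs = cooccurrence_vectors(tokens)
print(vecs["charges"])
```

On a corpus of realistic size, these context counts (usually weighted and dimensionality-reduced) become the coordinates of each word in a semantic space.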
“An intuition for me was that you could visualize the different meanings of a word as points in space. You could think of them as sometimes far apart, like a battery charge and criminal charges, and sometimes close together, like criminal charges and accusations (“the newspaper published charges…”). The meaning of a word in a particular context is a point in this space. Then we don’t have to say how many senses a word has. Instead we say: ‘This use of the word is close to this usage in another sentence, but far away from the third use.'”
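Erk’s picture of meanings as points that are “sometimes far apart, sometimes close together” can be made concrete with cosine similarity between context vectors. The vectors below are invented toy values purely for illustration; the dimensions stand for hypothetical context words.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity: close to 1 for nearby points in
    meaning space, close to 0 for distant ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy context-count vectors (made up for this sketch).
# Dimensions: [electricity, police, court, newspaper]
battery_charge  = [9, 0, 0, 1]
criminal_charge = [0, 8, 7, 3]
accusation      = [0, 6, 8, 4]

print(cosine(criminal_charge, accusation))   # high: close together
print(cosine(battery_charge, criminal_charge))  # low: far apart
```

The point of Erk’s framing is that no fixed inventory of senses is needed: each usage simply lands somewhere in the space, near some usages and far from others.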
I have to say that as a human, I had some trouble getting my head round that quote! Perhaps we should be looking at how babies learn language and trying to replicate that learning in a computer. But back to Erk’s work: creating a model that can accurately recreate the intuitive ability to distinguish word meanings will require a lot of text and a lot of analytical crunching power.
“The lower end for this kind of a research is a text collection of 100 million words. If you can give me a few billion words, I’d be much happier. But how can we process all of that information? That’s where supercomputers come in.”
So we need a mega computer to help us devise a computer that will not only understand us, but communicate intelligently with us. If this could be achieved, how close would such a computer be to a sentient entity? What if its first words to us, once we switch on this fully loaded language-conversant computer, are “I’m hungry”! Well, as long as it doesn’t start singing “Daisy Daisy” and switching off life support systems… But perhaps HAL 9001 will be better behaved.