Friday, December 23, 2011

Natural Language Processing

If you were designing an android, one of its most important abilities would be understanding human speech, at least well enough to follow the commands you give it. It would also be nice if it could talk back. Being able to communicate with your computer in a normal, conversational way would be just as useful. You may have noticed that when you call certain businesses, you no longer have to press buttons to enter information into their automated answering systems; some let you speak the required information instead. All of these artificial intelligence tasks fall within the province of natural language processing. Other tasks that require it include translating from one human language to another, converting text to speech, answering questions, and retrieving information.

Natural language processing is the study, and the associated software development, of the automatic generation and understanding of natural human languages. Natural language generation software converts information from computer databases into normal human language. Natural language understanding software converts human language into forms that a computer can understand and manipulate.
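To make the generation half concrete, here is a minimal sketch in Python of the simplest approach I know of: filling in a sentence template from a database-style record. The record fields and the describe_order function are made up for illustration, and real generation software is far more elaborate.

```python
def describe_order(record):
    """Turn one database-style record into an English sentence."""
    item = record["item"]
    quantity = record["quantity"]
    # Naive pluralization: good enough for this sketch, wrong for many nouns.
    noun = item if quantity == 1 else item + "s"
    return f"{record['customer']} ordered {quantity} {noun} on {record['date']}."


# A hypothetical row, as it might come back from a database query.
record = {
    "customer": "Maria Lopez",
    "item": "book",
    "quantity": 3,
    "date": "December 20",
}
print(describe_order(record))  # Maria Lopez ordered 3 books on December 20.
```

Even this toy shows where the difficulty starts: the pluralization rule breaks on the first irregular noun it meets.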

One of the earliest systems, called SHRDLU, worked in a restricted world of blocks. It used a small, restricted vocabulary to manipulate blocks of different shapes and sizes displayed on a computer screen. Because it worked extremely well, researchers became excessively optimistic about developing natural language software. However, it turned out that in the real world, language processing was much more difficult than supposed.
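To see why a restricted world makes the job so much easier, consider this small Python sketch. It is not how SHRDLU itself worked; it is only an illustration, with a made-up vocabulary and a single made-up command pattern, of how little is needed when the program gets to decide in advance what may be said.

```python
import re

# The entire vocabulary of this toy world (an assumption for illustration).
COLORS = {"red", "green", "blue"}
SHAPES = {"block", "pyramid", "box"}
# The only sentence shape the program accepts.
PATTERN = re.compile(r"^put the (\w+) (\w+) on the (\w+) (\w+)$")


def parse_command(text):
    """Return (moved, target) as (color, shape) pairs, or None if not understood."""
    match = PATTERN.match(text.lower().strip().rstrip("."))
    if not match:
        return None
    c1, s1, c2, s2 = match.groups()
    if c1 not in COLORS or c2 not in COLORS or s1 not in SHAPES or s2 not in SHAPES:
        return None
    return (c1, s1), (c2, s2)


def execute(world, text):
    """Update the world (object -> what it rests on) for one command."""
    parsed = parse_command(text)
    if parsed is None:
        return "I don't understand."
    moved, target = parsed
    if moved not in world or target not in world:
        return "I can't find that object."
    if moved == target:
        return "I can't put an object on itself."
    world[moved] = target
    return f"OK, the {moved[0]} {moved[1]} is on the {target[0]} {target[1]}."


world = {("red", "block"): "table", ("green", "pyramid"): "table"}
print(execute(world, "Put the green pyramid on the red block."))
print(execute(world, "Please tidy up the table."))
```

The first command succeeds and the second is rejected, because anything outside the one known pattern simply falls through. That is fine in a world of blocks and hopeless in ordinary conversation.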

Some of the problems are as follows. Ambiguity is everywhere: for example, it is not always clear which word in a sentence an adjective or adverb modifies, and some strings of words can be interpreted in many ways. In speech, the sounds that represent successive letters blend into each other. Some written languages, such as Chinese and Thai, do not mark word boundaries. Many words have several meanings. The grammar of natural languages is itself ambiguous. Input is often marred by typing errors, speech irregularities, and OCR errors. And some sentences don't literally mean what they say.
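The word boundary problem is easy to demonstrate even in English if you take the spaces away. The Python sketch below, using a tiny vocabulary I made up for the purpose, lists every way a string of letters can be split into known words. It is a brute-force illustration, not the method a real segmenter would use.

```python
def segmentations(text, vocabulary):
    """Return every way to split text into words drawn from vocabulary."""
    if not text:
        return [[]]  # One way to split nothing: use no words.
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in vocabulary:
            # Keep this word and try every split of the remainder.
            for rest in segmentations(text[end:], vocabulary):
                results.append([word] + rest)
    return results


# A hypothetical vocabulary, chosen so that the splits overlap.
vocabulary = {"no", "now", "here", "where", "nowhere"}
for words in segmentations("nowhere", vocabulary):
    print(" ".join(words))
# no where
# now here
# nowhere
```

All three readings are made of legitimate words, so the program has no basis for choosing among them without knowing something about the surrounding sentence.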

Many of these problems have been partially or wholly solved, but artificial intelligence experts still have a long way to go before you can have an intelligent conversation with your computer or friendly robot.

I note with interest the various web sites with talking heads called chatbots. I urge you to visit one of these sites to see what a natural language artificial intelligence artifact can do. A popular one is ALICE, hosted by the ALICE A.I. Foundation.
