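Handwritten Arabic OCR by Govindaraju
Govindaraju of the University at Buffalo, who had already developed a software tool that scans and recognizes handwritten English documents, has now developed OCR that recognizes handwritten Arabic. This will make it possible to scan handwritten notes written in books, ancient documents, and similar material and to extract specific keywords from them. Govindaraju and his collaborators have also built a software tool that is a first step toward OCR software for Sanskrit, Hindi, and other South Asian languages written in Devanagari script.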
Computer scientists are developing software to scan Arabic documents, including handwritten ones, for specific words and phrases, filling a void that became apparent following the Sept. 11 attacks.
Besides helping with intelligence gathering, the software should expand access to modern and ancient Arabic manuscripts. It will allow Arabic writings to be digitized and posted on the Web.
For instance, the word mas'uul, meaning "responsible," can be written in more than one way, Govindaraju said, so the software would have to be given instructions about the possible variations.
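The idea of giving the software instructions about spelling variations can be illustrated with a minimal sketch. It is not the researchers' method: the variant table, the function name find_keyword, and the sample token list are all hypothetical, and the two Arabic spellings shown are given only as an example of how one word might appear in two written forms. The sketch assumes that some recognizer has already turned the scanned page into a list of word tokens.

```python
from typing import Dict, List, Tuple

# Hypothetical variant table: one search keyword mapped to every
# spelling the search should accept. The entries are illustrative.
VARIANTS: Dict[str, List[str]] = {
    "mas'uul": ["مسؤول", "مسئول"],
}


def find_keyword(tokens: List[str], keyword: str) -> List[Tuple[int, str]]:
    """Return (position, matched spelling) for each token that matches
    any accepted spelling of the keyword."""
    # Fall back to the keyword itself when no variants are listed.
    accepted = set(VARIANTS.get(keyword, [keyword]))
    return [(i, tok) for i, tok in enumerate(tokens) if tok in accepted]


# Assumed recognizer output for one line of text (made up for the example).
recognized = ["هو", "مسئول", "عن", "ذلك"]
hits = find_keyword(recognized, "mas'uul")
```

A lookup table like this is the simplest possible design; a real system would more likely score variants against the handwriting image rather than match exact strings.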
Govindaraju was involved in the development at the University at Buffalo (UB) of the first comprehensive OCR software for interpreting handwritten addresses in English. That milestone spurred research into handwriting recognition and led to some applications now taken for granted, such as personal digital assistants. He and his UB colleagues also created a software tool that is the first step in developing OCR software for Devanagari script, which will allow digitization of documents in Sanskrit, Hindi and dozens of other Indian and South Asian languages.