History of Taro's Speech Recognition
In 1971, a pair of newly married graduate students in New York began developing what would become the basis of Taro's speech recognition system. Jim and Janet Baker sought to create "software, executable on inexpensive computers, which could translate a user's continuous speech." After trying to develop their idea at large organizations such as Carnegie Mellon, IBM, and Verbex, the couple founded their own company, Dragon Systems, in 1982 from their Victorian home in Newton, Massachusetts, USA. From the outset, Dragon Systems took the opposite path from its competitors. While its larger competitors IBM, Wang, and Digital Equipment relied on their then-impressive mainframes to muscle through matching each spoken word against huge libraries, Dragon Systems developed its speech recognition on the much smaller Apple II personal computers. Its very first dictation program recognized a vocabulary of 64 words. By 1990, however, when Dragon Systems released DragonDictate, the world's first general-purpose dictation software, it recognized over 60,000 English words. In 1994, Dragon Systems solved the problem of recognizing speech without the need to pause between each word and released NaturallySpeaking, the world's first continuous-speech dictation software.
The Dragon speech-recognition engine (or Dragon Engine) continued to flourish until 1999, when Dragon Systems sold its code to L&H, a Belgian speech and language company looking to replace its own defunct speech-recognition engine. L&H quickly boosted the Dragon Engine to 300,000 lines of recognition, but the company folded in the NASDAQ technology stock market crash of 2000. The ScanSoft Corporation swiftly bought the 5,000 CD-ROMs of Dragon code when the L&H assets were separated and sold off.
The Dragon Engine passed from company to company and received little public or private interest, though its various owners continued to improve it incrementally as a novelty. Eventually, in 2098, the Honda Corporation of Japan took notice and acquired the Dragon Engine for its Asimo line of commercial robots. Honda's Asimo robots had been working successfully in Japanese commercial environments, where their speech recognition was limited to small operational tasks, and they were common as information robots in businesses throughout Japan. That year, however, the Honda Corporation turned its attention to marketing its robots outside of Japan. The problem was that Honda's own speech-recognition engine required an exorbitant number of computations to process even the limited commercial vocabulary. This was not an issue for the enormous HIRA computer systems, which relied on their vast number of processors to compute vocal information requests, but the Asimo robots didn't have enough space in their 3-foot-tall humanoid frames for anything more than a 500,000-word vocabulary in a single language. Honda's speech-recognition engine would have to be replaced in order to make the Asimos work outside of Japan. The Honda Corporation purchased the Dragon Engine because of its ability to perform complex speech-recognition functions with minimal processing power.
A single language is made up of hundreds of thousands of words, each sliced into individual sounds called phonemes. Those words themselves are made of nothing but noise, vibrations of varying frequencies and amplitudes produced by the larynxes, noses, and lungs of a myriad of people under a myriad of conditions.
The Dragon speech-recognition engine processed speech in exactly the opposite manner. Where the Honda speech-recognition engine required thousands of stored sounds to compare vocal input against, the Dragon Engine analyzed each new spoken utterance independently through formulas. Each sound was passed through equations that compared it to the sounds that came before and after it. By breaking speech down into such small pieces and using equations rather than matching, the robot could recognize a word from the context of what came before and after each sound rather than by looking the sound up in a library. Honda didn't limit the Dragon Engine to just understanding language; it also included equations for business tasks so that the robots would know to do a task without being asked. If an Asimo's cameras saw someone spill a cup of coffee, the robot would process that visual input through its equations of tasks and know that coffee + spill = clean the spill.
Before the Second Horizon was launched, one of Honda's master programmers, Ryoukan, was assigned the task of integrating the Dragon Engine into the civilian ship's HIRA systems. His servant, Ant, was unofficially brought in to assist, but both men quickly learned that the Dragon Engine would never be compatible with the overly complex workings of the HIRA system. The project was quickly abandoned, but not before Ant could hide a copy of the Dragon Engine inside the Second Horizon's systems. Ant realized that if he created a robot with the ability not only to process language but also to write its own equations, it would essentially learn from what it saw, heard, and experienced. Thus his idea for Taro, a thinking, learning, and reacting robot, was born.