Automatic Speech Recognition (ASR)

Speech recognition (also known as automatic speech recognition or computer speech recognition) converts speech to text. The term refers to technology that recognizes speech without being targeted at a single speaker, such as a call-centre system designed to recognize arbitrary voices.

Speech recognition applications include voice user interfaces such as voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), domotic appliance control, search (e.g., finding a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing and transcription (e.g., word processors or emails), automatic translation, and use in aircraft (usually termed Direct Voice Input).

Accuracy of recognition [%]
The definition of "recognition reliability" can be interpreted in multiple ways. An analysis of a number of representative speech applications confirms that out-of-grammar errors outnumber misrecognition errors by a factor of as much as 5 to 1. Put simply, the problem is not recognizing what the caller said; it is knowing what the caller meant. Reducing out-of-grammar errors is therefore a critical factor in increasing the accuracy of any system, which in turn improves its automation rate, makes customers happier with the system, and lowers overall costs for the contact center.

When a caller believes that a speech system has misrecognized his or her response, it is more likely that the caller said something out-of-grammar: the system simply was not expecting the caller to respond quite like that. For example, callers often provide more information than they were prompted for. A system that can respond to varying amounts of information will have shorter, more productive calls. A system that constantly asks for confirmations creates a disjointed conversation that callers tend to reject. Systems that handle corrections and verifications by dynamically embedding the confirmations in the next prompt, however, are more engaging and achieve better automation rates.
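The difference between an out-of-grammar error and a misrecognition error can be illustrated with a small Python sketch. It is not taken from the product: the grammar, the log format (a human transcript paired with the recognizer output) and the function names are all assumptions made for illustration, and a real grammar would be far richer than a set of phrases.

```python
from collections import Counter

# Hypothetical grammar for a yes/no prompt (assumption for illustration only).
GRAMMAR = {"yes", "no", "yes please", "no thanks"}

def classify(spoken: str, recognized: str) -> str:
    """Label one logged utterance, given a human transcript and the recognizer output."""
    spoken = spoken.strip().lower()
    recognized = recognized.strip().lower()
    if spoken not in GRAMMAR:
        return "out_of_grammar"   # the caller said something the grammar does not cover
    if spoken != recognized:
        return "misrecognition"   # covered by the grammar, but recognized incorrectly
    return "correct"

def error_breakdown(log):
    """Count outcomes over a list of (spoken, recognized) pairs."""
    return Counter(classify(spoken, recognized) for spoken, recognized in log)

# Example log with made-up utterances.
log = [
    ("yes", "yes"),
    ("yeah sure", "yes"),
    ("No", "yes"),
    ("I think so", ""),
]
breakdown = error_breakdown(log)
```

Counting outcomes this way on real call logs would show which of the two error types dominates for a given prompt, and therefore whether the grammar or the acoustic side needs attention first.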

Isolated-word recognition systems have to be trained before they reach a good level of reliability (up to 98 to 99% correct recognition in controlled environments). They can also be speaker-adaptive: at the beginning the recognition accuracy is only moderate, but with intensive use it continues to improve and can likewise reach 98 to 99%.

Dependency of recognition on environmental noise
Background noise, including conversations, coughs, traffic and mobile phone static, poses a key challenge in speech recognition. The Recognizer, however, uses advanced acoustic models that are inherently robust to noisy input to deliver higher accuracy, fewer re-prompts, and superior barge-in. Thanks to strong connections within academia and established relationships with data collection consortia, the Recognizer has extremely broad and deep access to acoustic and linguistic data. As a result, the Recognizer's acoustic models have been trained on a vast range of real-world data, including noisy data from all sorts of environments, to deliver unmatched noise robustness.

The Recognizer is able to separate speech from background noise with remarkable precision due to its superior endpointing and speech detection algorithms. Improved endpoint detection enables the system to determine when speech started and ended, even in extremely noisy mobile environments, for more accurate transcription.
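To make the idea of endpointing concrete, here is a minimal Python sketch of a simple energy-based endpoint detector. This is a textbook-style illustration and not the Recognizer's algorithm, which is not described here; the function names, the frame length, the energy threshold and the minimum run length are all assumptions chosen for the example.

```python
import math

def frame_energies(samples, frame_len):
    """RMS energy of consecutive, non-overlapping frames."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def find_endpoints(samples, frame_len=160, threshold=0.1, min_frames=3):
    """Return (start_sample, end_sample) of detected speech, or None if none is found.

    A frame counts as active when its RMS energy exceeds `threshold`. Only runs of
    at least `min_frames` active frames are kept, so short clicks are ignored.
    """
    active = [e > threshold for e in frame_energies(samples, frame_len)]
    runs, start = [], None
    for i, is_active in enumerate(active + [False]):  # sentinel closes a trailing run
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            if i - start >= min_frames:
                runs.append((start, i))
            start = None
    if not runs:
        return None
    # Speech spans from the first kept run to the end of the last kept run.
    return runs[0][0] * frame_len, runs[-1][1] * frame_len

# Synthetic signal: silence, a one-frame click, silence, "speech", silence.
signal = [0.0] * 320 + [0.9] * 160 + [0.0] * 320 + [0.5] * 1600 + [0.0] * 800
endpoints = find_endpoints(signal)
```

A fixed threshold like this breaks down quickly in noisy conditions, which is why production endpointers typically adapt to the background level or use trained speech/non-speech models.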

Supported languages are:

  • Arabic: Jordan
  • Australian: English
  • Basque: Spain
  • Bengali: India
  • Cantonese: Hong Kong
  • Catalan: Spain
  • Czech: Czech Republic
  • Danish: Denmark
  • Dutch: Belgium, Netherlands
  • English: UK, Singapore
  • Finnish: Finland
  • French: Belgium, Canada, France
  • German: Austria, Germany, Switzerland
  • Greek: Greece
  • Gujarati: India
  • Hebrew: Israel
  • Hindi: India
  • Hungarian: Hungary
  • Icelandic: Iceland
  • Indian: English
  • Italian: Italy
  • Japanese: Japan
  • Kannada: India
  • Korean: Korea
  • Malayalam: India
  • Mandarin: China, Taiwan
  • Marathi: India
  • Norwegian: Norway
  • Oriya: India
  • Polish: Poland
  • Portuguese: Brazil, Portugal
  • Punjabi: India
  • Russian: Russia
  • Slovak: Slovakia
  • Slovenian: Slovenia
  • Spanish: Argentina, Colombia, Sweden
  • Tamil: India
  • Telugu: India
  • Turkish: Turkey
  • US: English, Spanish

If you are looking for a language that is not listed here, contact us.

Copyright© 1984-2019 - Pareteum

Comsys a Pareteum Brand