AICS Enables Better Voice Experience


Before discussing the potential of AICS speech recognition, let us first consider the very beginning of speech recognition.

The first project can be traced to Audrey, a digit recognizer built by Bell Laboratories in 1952. As the first speech recognition product, Audrey recognized only spoken numbers. In 1962, IBM’s ‘Shoebox’ technology was able to understand 16 English words. In the 1980s, the Hidden Markov Model (HMM) brought a breakthrough: instead of simply matching words against stored sound patterns, recognizers began to use statistical methods.

Over the years, technology (e.g., deep learning and neural networks) and computing power have developed tremendously. In the twenty-first century, more than 50 years after Audrey was first created, most technology industry leaders have developed their own Automatic Speech Recognition (ASR) solutions. This technology continues to enrich modern life and open many exciting possibilities for the future.

ASR Solution in AICS: Made in Taiwan

We live in a time when ASR can truly be productized and woven directly into our lives. In an era of endless innovation, the technology industry is filled with companies competing to provide the best user experience to their customers. As public scrutiny and sensitivity continue to increase, companies also need to strive to better protect the privacy and data of their users.

AICS was established in Taipei in January 2019. Its mission is to help businesses solve their most challenging problems with AI-first products and services. ASR is one of AICS’s key differentiators. Just nine months after inception, the team has already achieved remarkable milestones.

World-Class Word Error Rate

Google is often the first name that comes to mind when discussing Word Error Rate (WER). You may be surprised to find, then, that the Word Error Rate of the AICS ASR engine is on par with Google on the English LibriSpeech data set, and actually surpasses Google on a localized data set. This result illustrates how the MIT (Made in Taiwan) ASR engine is a world-class solution.
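For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The function names and the lowercase-and-split normalization are assumptions made for this example; it is not the scoring tool used for the comparison above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref tokens seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # ref token missing from hyp
            insertion = curr[j - 1] + 1  # extra token in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace.
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.1%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.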

Formosa Speech Recognition Challenge 2018

AICS was the champion in the 2018 Formosa Speech Recognition Challenge. The competition primarily focused on Taiwanese speech recognition.

AICS’s speech recognition technology was named the best industrial system for its outstanding CER (Character Error Rate) of 8.1%. (See the detailed papers: p1, p2.) AICS also joined the Grand Challenge and won the second place award in 2018 [1][2]. These awards demonstrate AICS’s excellence in engineering and quality, as well as its solid technology foundation and adaptive ability.
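CER is the character-level counterpart of WER and is commonly used for languages written without spaces between words. As a rough illustration, the sketch below computes it as the character edit distance divided by the reference length. The function name and the choice to ignore whitespace are assumptions for this example, and the challenge's official scoring rules may differ.

```python
def character_error_rate(reference, hypothesis):
    """CER = character edit distance / number of reference characters."""
    # Assumed normalization: whitespace is ignored on both sides.
    ref = [c for c in reference if not c.isspace()]
    hyp = [c for c in hypothesis if not c.isspace()]
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Row-by-row Levenshtein distance over characters.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j - 1] + (0 if r == h else 1),  # match or substitution
                prev[j] + 1,                         # deletion
                curr[j - 1] + 1,                     # insertion
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong character out of six reference characters.
    cer = character_error_rate("今天天氣很好", "今天天汽很好")
    print(f"CER: {cer:.1%}")
```

Under this definition, a CER of 8.1% corresponds to roughly 8 character errors per 100 reference characters.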

AICS ASR API is ready for EVERYONE

In addition to engine performance, the AICS ASR is designed to be accessible to anyone and everyone. The AICS API recognizes English, Chinese, and bilingual (mixed Chinese and English) speech, and performs strongly in all three modes. It has a specific focus on the Taiwanese accent, which helps it capture and address the precise needs of users.

The AICS API is also trained on corpora from different verticals, such as medical or fintech, and allows users to easily customize it for their use cases without diminishing the power of the engine.

Features in AICS ASR API

Bi-lingual (Mandarin + English with Taiwanese accent)
Real-time streaming with ~200 ms latency

Contact us if you have questions

We are excited to help accelerate your AI-powered future. Please feel free to contact us if you need speech solutions in any of the following areas:

Front-end Signal processing/Enhancement/Robustness/De-noise/VAD (Voice Activity Detection)
Speech recognition/Acoustic & Language modeling
Text-to-Speech
Speech applications/Speaker diarization/Wake-up