
Speech recognition technologies have been evolving rapidly over the last couple of years and are moving from the realm of science to engineering. With the growing popularity of voice assistants like Alexa, Siri, and Google Assistant, several apps (e.g., YouTube, Gaana, Paytm Travel, My Jio) are beginning to offer voice-controlled functionality. At Slang Labs, we are building a platform for programmers to easily augment existing apps with voice experiences. We are very interested in Conversational AI for Indic languages.

Automatic Speech Recognition (ASR) is the necessary first step in processing voice. In ASR, an audio file or speech spoken into a microphone is processed and converted to text, which is why it is also known as Speech-to-Text (STT). This text is then fed to a Natural Language Processing/Understanding (NLP/NLU) component, which extracts key information (such as intentions and sentiments), and then an appropriate action is taken. There are also stand-alone applications of ASR, e.g., transcribing dictation or producing real-time subtitles for videos.

We are interested in ASR and NLU in general, and in their efficacy in the voice-to-action loop in apps in particular. Our Android and Web SDKs provide simple APIs suited to app programmers, while the Slang platform handles the complexity of stitching together ASR, NLU, and Text-to-Speech (TTS). But, naturally, we are curious about the state of the art in ASR, NLU, and TTS, even though we do not expose these parts of our tech stack as separate SaaS offerings.

This exploration of existing ASR solutions is the result of that curiosity.
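
As a rough illustration of the voice-to-action loop described above (ASR produces text, NLU extracts an intent, and the app acts on it), here is a minimal Python sketch. Every name in it (`recognize_speech`, `understand`, `voice_to_action`, the `play_music` intent) is hypothetical and is not part of the Slang SDKs. The ASR step is a stub that treats the input bytes as already-transcribed text, and the NLU step is a toy keyword rule rather than a trained model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Intent:
    """Structured result of the NLU step: an intent name plus extracted slots."""
    name: str
    slots: Dict[str, str] = field(default_factory=dict)


def recognize_speech(audio: bytes) -> str:
    """Stand-in for ASR. A real engine would decode audio samples into text.

    Assumption for this sketch: the "audio" is simply UTF-8 encoded text.
    """
    return audio.decode("utf-8").strip().lower()


def understand(text: str) -> Intent:
    """Stand-in for NLU. A toy keyword rule, not a trained model."""
    prefix = "play "
    if text.startswith(prefix):
        return Intent("play_music", {"query": text[len(prefix):]})
    return Intent("unknown")


# Map each intent name to the app action that handles it (hypothetical actions).
ACTIONS: Dict[str, Callable[[Intent], str]] = {
    "play_music": lambda intent: f"Playing {intent.slots['query']}",
    "unknown": lambda intent: "Sorry, I did not understand that.",
}


def voice_to_action(audio: bytes) -> str:
    """Run the full loop: audio -> text (ASR) -> intent (NLU) -> action."""
    text = recognize_speech(audio)
    intent = understand(text)
    return ACTIONS[intent.name](intent)


if __name__ == "__main__":
    print(voice_to_action(b"Play some old Hindi songs"))
```

Keeping each stage behind its own function means a real ASR or NLU service could replace a stub without touching the action handlers, which is roughly the stitching work a platform would otherwise hide from the app programmer.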
