Spokestack is designed to support multiple speech recognition providers so you can decide which is right for your use case. Support varies by mobile platform, however, so we decided to gather the information in one place to make the choice as easy as possible for your app.
Supported ASR providers by platform
|ASR provider||Android||iOS|
|Android ASR (on-device)||✅||❌|
|Apple ASR (on-device)||❌||✅|
|Azure Speech Services||✅||❌|
ASR providers require some configuration, usually in the form of API keys but sometimes as runtime components. You supply this configuration when you first build a Spokestack SpeechPipeline; below is a list of the configuration needed on each platform, along with some usage notes.
For Android, primitive configuration properties are set via a call to
setProperty(propertyName, value) on the speech pipeline’s builder (or a
SpeechConfig object supplied to it); in iOS, they’re set as fields of a
No API keys or configuration properties are required, but a Context (android.content.Context) object must be added to the SpeechPipeline’s builder via the setAndroidContext() method. See the javadoc for AndroidSpeechRecognizer for more information.
Android’s native ASR support is device-dependent. For production apps targeting broad compatibility, we recommend testing for its availability by calling SpeechRecognizer.isRecognitionAvailable() and having a fallback option in place in case it returns false.
This chart lists physical devices on which it has been tested by either the Spokestack team or our community. If you have a device that is not listed, please try it out and submit a PR with your results!
|Device||API Level||ASR working?|
|Moto X (2nd Gen)||22||❌ *|
|Lenovo TB-X340F tablet||27||✅|
|Pixel 3 XL||29||✅|
* ASR fails consistently with a SERVER_ERROR, which seems to indicate that the server used by the device manufacturer to handle these requests is no longer operational.
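The fallback recommendation above can be sketched roughly as follows. This is a minimal illustration, not Spokestack code: the `AsrFallback` class, the provider labels, and the choice of Azure as the fallback are all assumptions made for the example. In a real app, the availability check would be supplied as `() -> SpeechRecognizer.isRecognitionAvailable(context)`; a plain `BooleanSupplier` stands in for it here so the sketch runs without the Android SDK.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper (not part of Spokestack): decides which ASR provider
 * to configure based on whether on-device recognition is available.
 */
final class AsrFallback {
    // Labels are illustrative only; map them to whatever your app uses.
    static final String ANDROID_ASR = "android";
    static final String FALLBACK_ASR = "azure";

    /**
     * @param onDeviceAvailable stand-in for
     *        SpeechRecognizer.isRecognitionAvailable(context)
     * @return the provider label to use when building the pipeline
     */
    static String chooseProvider(BooleanSupplier onDeviceAvailable) {
        return onDeviceAvailable.getAsBoolean() ? ANDROID_ASR : FALLBACK_ASR;
    }
}
```

Running the check once at startup, before the pipeline is built, keeps the decision in one place and lets the rest of the app stay unaware of which provider was chosen.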
None required! 🎉
Azure Speech Services
azure-api-key (string): An API key valid for Azure Cognitive Services. See Microsoft’s documentation for more information.
azure-region (string): A region identifier for Azure Speech Services. See Microsoft’s list.
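Since a missing key or region typically only surfaces once recognition is attempted, it can help to check the two properties before handing them to the pipeline builder. The sketch below is one way to do that; `AzureAsrConfig` and `missingKeys` are hypothetical names, not Spokestack API, and the map simply mirrors the property name/value pairs that would be passed to `setProperty`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check for the Azure properties listed above. */
final class AzureAsrConfig {
    // Property names taken from the list above.
    static final String[] REQUIRED = {"azure-api-key", "azure-region"};

    /** Returns the required property names that are absent or blank. */
    static List<String> missingKeys(Map<String, Object> props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            Object value = props.get(key);
            // Both properties are documented as strings; reject anything else.
            if (!(value instanceof String) || ((String) value).trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put("azure-api-key", "placeholder-key"); // placeholder, not a real key
        System.out.println("Missing: " + missingKeys(props));
    }
}
```

If the returned list is empty, each entry in the map can then be applied to the builder with `setProperty(propertyName, value)`.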
N/A (for now)
google-credentials (string): A JSON-serialized string containing Google account credentials. See Google’s documentation for more information.
locale (string): A BCP-47 language tag identifying the language to use for speech recognition (example: “en-US”). See Google’s documentation for a list of supported codes.
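A malformed `locale` value is an easy mistake to make (for example, an underscore instead of a hyphen), so a quick well-formedness check before building the pipeline may be worthwhile. The sketch below uses the standard `java.util.Locale.Builder`, which throws `IllformedLocaleException` for tags that are not valid BCP-47; the `LocaleCheck` class itself is a hypothetical helper. Note that it only checks syntax: a well-formed tag is not necessarily one that Google's service supports.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

/** Hypothetical helper: syntax check for the `locale` property. */
final class LocaleCheck {
    /** Returns true if the tag is non-empty, well-formed BCP-47. */
    static boolean isWellFormedTag(String tag) {
        // An empty tag would silently reset the builder, so reject it up front.
        if (tag == null || tag.trim().isEmpty()) {
            return false;
        }
        try {
            new Locale.Builder().setLanguageTag(tag);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }
}
```

For instance, `LocaleCheck.isWellFormedTag("en-US")` passes, while an empty string or free-form text such as `"not a locale"` is rejected.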
N/A (for now)