ASR configuration

Spokestack is designed to support multiple speech recognition providers so you can decide which is right for your use case. Support varies by mobile platform, however, so we decided to gather the information in one place to make the choice as easy as possible for your app.

Supported ASR providers by platform

Provider                   Android   iOS
Android ASR (on-device)    yes       no
Apple ASR (on-device)      no        yes
Spokestack Cloud ASR       yes       yes
Azure Speech Services      yes       no
Google Cloud               yes       no

Configuration

Each ASR provider requires some configuration: usually API keys, and sometimes additional runtime components. You supply this configuration when you first build a Spokestack SpeechPipeline (or, in newer versions, a Spokestack object). The sections below list what each provider needs on each platform, along with some usage notes.

On Android, primitive configuration properties are set via a call to setProperty(propertyName, value) on the speech pipeline’s builder (or on a SpeechConfig object supplied to it); on iOS, they’re set as fields of a SpeechConfiguration object.
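
Because a misspelled or missing property usually only shows up when the pipeline starts, it can help to check the property map before handing it to setProperty. The following is a minimal sketch, not part of Spokestack: the class name AsrConfigCheck and the provider labels ("android-asr", "spokestack-cloud", "google-cloud") are hypothetical, and the property names are the ones listed further down this page for Android. Azure is left out because this page does not list its properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Spokestack class): reports which required
// properties are still missing for a given provider.
final class AsrConfigCheck {
    // Hypothetical provider labels mapped to the Android property names from this page.
    private static final Map<String, List<String>> REQUIRED = new HashMap<>();
    static {
        REQUIRED.put("android-asr", Collections.<String>emptyList());
        REQUIRED.put("spokestack-cloud", Arrays.asList("spokestack-id", "spokestack-secret"));
        REQUIRED.put("google-cloud", Arrays.asList("google-credentials", "locale"));
    }

    /** Returns the required property names that are absent or empty in props. */
    static List<String> missing(String provider, Map<String, String> props) {
        List<String> required = REQUIRED.get(provider);
        if (required == null) {
            throw new IllegalArgumentException("Unknown provider: " + provider);
        }
        List<String> result = new ArrayList<>();
        for (String name : required) {
            String value = props.get(name);
            if (value == null || value.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("spokestack-id", "your-client-id"); // placeholder value
        // Prints [spokestack-secret]: the secret has not been supplied yet.
        System.out.println(missing("spokestack-cloud", props));
    }
}
```

Once the returned list is empty, each entry in the map can be passed to setProperty(propertyName, value) on the builder as described above.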


Android ASR

Android

No API keys or configuration properties are required, but a Context (android.content.Context) object must be added to the SpeechPipeline’s builder via the setAndroidContext() method. See the javadoc for AndroidSpeechRecognizer for more information.

Device compatibility

Android’s native ASR support is device-dependent. For production apps targeting broad compatibility, we recommend checking for its availability by calling SpeechRecognizer.isRecognitionAvailable(context) and having a fallback provider ready in case it returns false.
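
The fallback decision itself can be kept separate from Android APIs so it is easy to test. This is a minimal sketch under stated assumptions: the AsrFallback class and its Provider enum are hypothetical, and falling back to Spokestack Cloud ASR is only one possible choice. On a device, the first argument would come from android.speech.SpeechRecognizer.isRecognitionAvailable(context).

```java
// Hypothetical helper (not a Spokestack class): picks an ASR provider
// based on what is available at runtime.
final class AsrFallback {
    enum Provider { ANDROID_ASR, SPOKESTACK_CLOUD, NONE }

    /**
     * @param androidAsrAvailable result of SpeechRecognizer.isRecognitionAvailable(context)
     * @param hasCloudCredentials whether spokestack-id and spokestack-secret are configured
     */
    static Provider choose(boolean androidAsrAvailable, boolean hasCloudCredentials) {
        if (androidAsrAvailable) {
            return Provider.ANDROID_ASR;      // prefer on-device recognition
        }
        if (hasCloudCredentials) {
            return Provider.SPOKESTACK_CLOUD; // assumed fallback; any cloud provider would do
        }
        return Provider.NONE;                 // caller should disable voice input
    }

    public static void main(String[] args) {
        // Device without native ASR, but with cloud credentials configured.
        System.out.println(choose(false, true)); // prints SPOKESTACK_CLOUD
    }
}
```

Note that a true result from isRecognitionAvailable does not guarantee recognition will succeed; the Moto X entry in the device chart that follows shows a runtime failure on a device where ASR is nominally present.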

This chart lists physical devices on which it has been tested by either the Spokestack team or our community. If you have a device that is not listed, please try it out and submit a PR with your results!

Device API Level ASR working?
Moto X (2nd Gen) 22 *
Lenovo TB-X340F tablet 27
Pixel 1 29
Pixel 3 XL 29
Pixel 3a 29
Pixel 4 29

* ASR fails consistently with a SERVER_ERROR, which seems to indicate that the server used by the device manufacturer to handle these requests is no longer operational.

iOS

N/A


Apple ASR

Android

N/A

iOS

None required! 🎉


Spokestack Cloud ASR

Spokestack’s Cloud ASR requires requests to be signed with a Spokestack client ID and API secret. Spokestack accounts are free, and cloud-based ASR currently is as well. If you don’t already have an account, you can sign up for one; if you do, log in to the account portal to get your credentials.

Android
  • spokestack-id (string): A Spokestack client ID, available in the account portal.
  • spokestack-secret (string): A Spokestack API secret, also available in the account portal.

iOS
  • spokestack-id (string): A Spokestack client ID, available in the account portal.
  • spokestack-secret (string): A Spokestack API secret, also available in the account portal.

Azure Speech Services

Android

You’ll also need the following dependency in your app’s build.gradle:

  implementation 'com.microsoft.cognitiveservices.speech:client-sdk:1.9.0'

This will require you to add Microsoft’s Maven repository to your top-level build.gradle, which implies acceptance of their license terms:

repositories {
  // ...
  maven { url 'https://csspeechstorage.blob.core.windows.net/maven/' }
}

iOS

N/A (for now)


Google Cloud

Android
  • google-credentials (string): A JSON-serialized string containing Google account credentials. See Google’s documentation for more information.
  • locale (string): A BCP-47 language identifier to identify the language that should be used for speech recognition (example: “en-US”). See Google’s documentation for a list of supported codes.
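
A malformed locale string (for example, "en_US" with an underscore) is an easy mistake to make. The sketch below uses the standard java.util.Locale.Builder to check that a tag is well-formed BCP-47 and to normalize its case before it is set as the locale property. The LocaleCheck class is hypothetical, and passing this check only means the tag is syntactically valid, not that Google supports that language.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical helper (not a Spokestack class): validates and normalizes
// a BCP-47 language tag using the JDK's own parser.
final class LocaleCheck {
    /** Returns the tag with canonical casing; throws IllformedLocaleException if ill-formed. */
    static String normalize(String tag) {
        Locale locale = new Locale.Builder().setLanguageTag(tag).build();
        return locale.toLanguageTag();
    }

    /** Returns true only for non-empty, well-formed BCP-47 tags. */
    static boolean isWellFormed(String tag) {
        if (tag == null || tag.isEmpty()) {
            return false; // an empty tag parses as the root locale, which is not useful here
        }
        try {
            normalize(tag);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("en-us"));    // prints en-US
        System.out.println(isWellFormed("en_US")); // prints false
    }
}
```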

You’ll also need the following dependencies in your app’s build.gradle:

  implementation 'com.google.cloud:google-cloud-speech:1.22.2'
  implementation 'io.grpc:grpc-okhttp:1.28.0'

iOS

N/A (for now)