The @capgo/capacitor-speech-recognition package provides natural, low-latency speech recognition for Capacitor apps with cross-platform parity across iOS and Android. This tutorial will guide you through installing and using this package to add voice transcription capabilities to your app.
To install the @capgo/capacitor-speech-recognition package, run the following command in your project's root directory:
npm install @capgo/capacitor-speech-recognition
npx cap sync
For iOS, you need to add permission descriptions to your app's Info.plist file:
<key>NSSpeechRecognitionUsageDescription</key>
<string>We need access to speech recognition to transcribe your voice</string>
<key>NSMicrophoneUsageDescription</key>
<string>We need access to your microphone to record audio for transcription</string>
The Android platform automatically includes the required RECORD_AUDIO permission. No additional configuration is needed.
The @capgo/capacitor-speech-recognition package provides the following API methods:
The available() method checks whether speech recognition is available on the device.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function checkAvailability() {
  const { available } = await SpeechRecognition.available();
  console.log('Speech recognition available:', available);
}
The requestPermissions() method requests the necessary microphone and speech recognition permissions.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function requestPermissions() {
  const { speechRecognition } = await SpeechRecognition.requestPermissions();
  console.log('Permission status:', speechRecognition);
}
The checkPermissions() method reports the current permission state without prompting the user.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function checkPermissions() {
  const { speechRecognition } = await SpeechRecognition.checkPermissions();
  console.log('Current permission:', speechRecognition);
}
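Since checkPermissions() does not prompt, it can be combined with requestPermissions() so the user is only asked when needed. The following is a minimal sketch of that idea, not part of the plugin: `ensurePermission` is a hypothetical helper name, and the plugin object is passed in as a parameter so the helper can also be exercised with a stub.

```javascript
// Sketch: request permission only when it has not been granted yet.
// `recognizer` is assumed to be the SpeechRecognition plugin object
// (or a stub exposing the same two methods).
async function ensurePermission(recognizer) {
  const current = await recognizer.checkPermissions();
  if (current.speechRecognition === 'granted') {
    return true; // already granted, no prompt needed
  }
  const requested = await recognizer.requestPermissions();
  return requested.speechRecognition === 'granted';
}
```

In an app this would be called as `await ensurePermission(SpeechRecognition)` before starting recognition.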
The start() method starts speech recognition with an optional configuration object.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function startRecognition() {
  const options = {
    language: 'en-US',
    maxResults: 3,
    partialResults: true,
    addPunctuation: true, // iOS 16+ only
  };
  await SpeechRecognition.start(options);
  console.log('Speech recognition started');
}
Available options:
- language (string): Locale identifier like 'en-US'. Defaults to device language.
- maxResults (number): Maximum number of results. Defaults to 5.
- prompt (string): Prompt text for Android dialog (Android only).
- popup (boolean): Show system dialog on Android. Defaults to false.
- partialResults (boolean): Enable streaming partial results.
- addPunctuation (boolean): Enable automatic punctuation (iOS 16+ only).
- allowForSilence (number): Milliseconds of silence before segment split (Android only).
The stop() method stops speech recognition and cleans up resources.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function stopRecognition() {
  await SpeechRecognition.stop();
  console.log('Speech recognition stopped');
}
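Several start() options are platform-specific: addPunctuation applies to iOS 16+ only, while prompt, popup, and allowForSilence apply to Android. One way to keep a single options object for both platforms is to filter it before calling start(). The sketch below is an illustration under assumptions: `buildStartOptions` is a hypothetical helper, and the plugin may well ignore unsupported keys on its own.

```javascript
// Sketch: keep only the platform-specific keys that apply to `platform`.
// `platform` is expected to be 'ios' or 'android'; any other value drops
// every platform-specific key.
function buildStartOptions(platform, overrides = {}) {
  const options = { partialResults: true, ...overrides };
  if (platform !== 'ios') {
    delete options.addPunctuation; // iOS 16+ only
  }
  if (platform !== 'android') {
    delete options.prompt; // Android only
    delete options.popup; // Android system dialog
    delete options.allowForSilence; // Android only
  }
  return options;
}
```

The platform string could come from `Capacitor.getPlatform()` in `@capacitor/core`, for example `SpeechRecognition.start(buildStartOptions(Capacitor.getPlatform(), { language: 'en-US', addPunctuation: true, allowForSilence: 2000 }))`.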
The getSupportedLanguages() method retrieves the list of supported languages.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function getSupportedLanguages() {
  const { languages } = await SpeechRecognition.getSupportedLanguages();
  console.log('Supported languages:', languages);
}
Note: Android 13+ devices no longer expose this list, so the array may be empty.
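Because the list can be empty, code that validates a locale against it needs a path for the "list unavailable" case. A minimal sketch of one approach follows; `resolveLanguage` and its `fallback` parameter are hypothetical, and it assumes the entries are locale strings in the same form as the `language` option (such as 'en-US').

```javascript
// Sketch: choose a language for start(), tolerating an empty list.
// `recognizer` is assumed to be the SpeechRecognition plugin object.
async function resolveLanguage(recognizer, preferred, fallback = 'en-US') {
  const { languages } = await recognizer.getSupportedLanguages();
  if (!languages || languages.length === 0) {
    // List not exposed (e.g. Android 13+): cannot validate, use preferred as-is.
    return preferred;
  }
  return languages.includes(preferred) ? preferred : fallback;
}
```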
The isListening() method checks whether speech recognition is currently active.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function checkListening() {
  const { listening } = await SpeechRecognition.isListening();
  console.log('Is listening:', listening);
}
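The listening state pairs naturally with start() and stop(), for instance behind a single microphone button. The sketch below shows one possible toggle; `toggleRecognition` is a hypothetical helper, and the plugin object is passed in so the logic can be tested with a stub.

```javascript
// Sketch: stop if a session is active, otherwise start one.
// Returns true when recognition is running after the call.
async function toggleRecognition(recognizer, options = {}) {
  const { listening } = await recognizer.isListening();
  if (listening) {
    await recognizer.stop();
    return false;
  }
  await recognizer.start(options);
  return true;
}
```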
The getPluginVersion() method returns the native plugin version.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
async function getVersion() {
  const { version } = await SpeechRecognition.getPluginVersion();
  console.log('Plugin version:', version);
}
The partialResults event delivers partial transcription updates while recognition is active.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
const listener = await SpeechRecognition.addListener('partialResults', (event) => {
  const transcription = event.matches?.[0];
  console.log('Partial result:', transcription);
});
// Remove listener when done
await listener.remove();
The segmentResults event delivers segmented recognition results.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
const listener = await SpeechRecognition.addListener('segmentResults', (event) => {
  const segment = event.matches?.[0];
  console.log('Segment result:', segment);
});
The endOfSegmentedSession event fires when a segmented session completes.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
const listener = await SpeechRecognition.addListener('endOfSegmentedSession', () => {
  console.log('Segmented session ended');
});
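The two segment events can be combined to assemble a full transcript: collect each segment as it arrives, then hand over the joined text once the session ends. This is a sketch under assumptions rather than a documented pattern; `collectSegments` and `onDone` are hypothetical names, and joining segments with a space is an illustrative choice.

```javascript
// Sketch: gather segment results and report the joined text at session end.
// `recognizer` is assumed to be the SpeechRecognition plugin object.
async function collectSegments(recognizer, onDone) {
  const segments = [];
  const segmentHandle = await recognizer.addListener('segmentResults', (event) => {
    const text = event.matches?.[0];
    if (text) {
      segments.push(text); // skip events without a match
    }
  });
  const endHandle = await recognizer.addListener('endOfSegmentedSession', async () => {
    // Clean up both listeners before reporting the result.
    await segmentHandle.remove();
    await endHandle.remove();
    onDone(segments.join(' '));
  });
}
```

In an app, `await collectSegments(SpeechRecognition, (text) => saveNote(text))` would be called before start(), where `saveNote` stands in for whatever the app does with the finished transcript.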
The listeningState event reports changes in the listening state.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
const listener = await SpeechRecognition.addListener('listeningState', (event) => {
  console.log('Listening state:', event.status); // 'started' or 'stopped'
});
The removeAllListeners() method removes all registered event listeners.
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
await SpeechRecognition.removeAllListeners();
Here's a complete example showing how to implement voice recognition in your app:
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
class VoiceRecorder {
  constructor() {
    this.isRecording = false;
    this.partialListener = null;
  }
  async initialize() {
    // Check availability
    const { available } = await SpeechRecognition.available();
    if (!available) {
      throw new Error('Speech recognition not available');
    }
    // Request permissions
    const { speechRecognition } = await SpeechRecognition.requestPermissions();
    if (speechRecognition !== 'granted') {
      throw new Error('Permission denied');
    }
    return true;
  }
  async startRecording(onTranscript) {
    if (this.isRecording) {
      console.warn('Already recording');
      return;
    }
    // Set up listener
    this.partialListener = await SpeechRecognition.addListener(
      'partialResults',
      (event) => {
        const text = event.matches?.[0] || '';
        onTranscript(text);
      }
    );
    // Start recognition; if start() fails, remove the listener so it does not leak
    try {
      await SpeechRecognition.start({
        language: 'en-US',
        maxResults: 3,
        partialResults: true,
        addPunctuation: true,
      });
    } catch (error) {
      await this.partialListener.remove();
      this.partialListener = null;
      throw error;
    }
    this.isRecording = true;
  }
  async stopRecording() {
    if (!this.isRecording) {
      return;
    }
    // Stop recognition
    await SpeechRecognition.stop();
    // Clean up listener
    if (this.partialListener) {
      await this.partialListener.remove();
      this.partialListener = null;
    }
    this.isRecording = false;
  }
  async getSupportedLanguages() {
    const { languages } = await SpeechRecognition.getSupportedLanguages();
    return languages;
  }
}
// Usage
const recorder = new VoiceRecorder();
async function startVoiceNote() {
  try {
    await recorder.initialize();
    await recorder.startRecording((transcript) => {
      console.log('Current transcript:', transcript);
      // Update UI with transcript
    });
  } catch (error) {
    console.error('Error starting voice note:', error);
  }
}
async function stopVoiceNote() {
  try {
    await recorder.stopRecording();
  } catch (error) {
    console.error('Error stopping voice note:', error);
  }
}
Platform notes: on iOS the plugin uses SFSpeechRecognizer; on Android it uses the SpeechRecognizer API and supports the allowForSilence option.
Always handle errors gracefully when working with speech recognition:
import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';
// Wrapped in a function so the early return is valid
async function startSafely() {
  try {
    const { available } = await SpeechRecognition.available();
    if (!available) {
      console.error('Speech recognition not available');
      return;
    }
    await SpeechRecognition.start({
      language: 'en-US',
      partialResults: true,
    });
  } catch (error) {
    console.error('Speech recognition error:', error);
    // Show user-friendly error message
  }
}
The @capgo/capacitor-speech-recognition package provides a powerful and easy-to-use solution for adding speech recognition to your Capacitor apps. With cross-platform support, real-time partial results, and comprehensive event listeners, you can create sophisticated voice-enabled features for your users.