Skip to content

Speech-To-Text Call

Speech-To-Text

Speech-To-Text calls are calls that can take an audio file and listen to it and generate a textual prompt of it. Famous providers are Deepgram or OpenAI. Sometimes called transcribing.

Example normalized Speech-To-Text call

The following is an example of sending the file hello_there.mp3 into Deepgrams model nova-2 and getting the text back.

use Drupal\ai\OperationType\GenericType\AudioFile;
use Drupal\ai\OperationType\SpeechToText\SpeechToTextInput;

$audio_binary = file_get_contents('hello_there.mp3');
$normalized_file = new AudioFile($audio_binary, 'audio/mp3', 'hello_there.mp3');
$input = new SpeechToTextInput($normalized_file);
/** @var \Drupal\ai\OperationType\SpeechToText\SpeechToTextOutput $return_text */
$return_objet =  \Drupal::service('ai.provider')->createInstance('deepgram')->speechToText($input, 'nova-2', ['my-custom-call']);
echo $return_object->getNormalized();
// Will output "Hello there"

Speech-To-Text Interfaces & Models

The following files defines the methods available when doing a speech-to-text call as well as the input and output.

Speech-To-Text Explorer

If you install the AI API Explorer, you can go configuration > AI > AI API Explorer > Speech-To-Text Generation Explorer under /admin/config/ai/explorers/ai-speech-to-text to test out different calls and see the code that you need for it.