Speech-To-Text Call
Speech-To-Text
Speech-To-Text calls take an audio file and generate a text transcript of the speech in it. This is also known as transcription. Well-known providers include Deepgram and OpenAI.
Example normalized Speech-To-Text call
The following example sends the file hello_there.mp3 to Deepgram's nova-2 model and gets the text back.
use Drupal\ai\OperationType\GenericType\AudioFile;
use Drupal\ai\OperationType\SpeechToText\SpeechToTextInput;
$audio_binary = file_get_contents('hello_there.mp3');
$normalized_file = new AudioFile($audio_binary, 'audio/mp3', 'hello_there.mp3');
$input = new SpeechToTextInput($normalized_file);
/** @var \Drupal\ai\OperationType\SpeechToText\SpeechToTextOutput $return_object */
$return_object = \Drupal::service('ai.provider')->createInstance('deepgram')->speechToText($input, 'nova-2', ['my-custom-call']);
echo $return_object->getNormalized();
// Will output "Hello there"
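The example above hard-codes the provider and model. As a hedged variation, the sketch below looks up the site's default speech-to-text provider and adds basic error handling. It assumes a bootstrapped Drupal site with the AI module enabled, that the provider manager's getDefaultProviderForOperationType() returns an array with provider_id and model_id keys for the speech_to_text operation type, and that my_module is a placeholder logger channel; treat these as assumptions to verify against your installed version.

```php
<?php

use Drupal\ai\OperationType\GenericType\AudioFile;
use Drupal\ai\OperationType\SpeechToText\SpeechToTextInput;

// Assumption: runs inside a bootstrapped Drupal site with the AI module
// enabled and a default speech-to-text provider configured.
$path = 'hello_there.mp3';
$audio_binary = file_get_contents($path);
if ($audio_binary === FALSE) {
  throw new \RuntimeException("Could not read audio file: $path");
}

$provider_manager = \Drupal::service('ai.provider');

// Assumption: returns ['provider_id' => ..., 'model_id' => ...], or an empty
// value when no default is configured for this operation type.
$defaults = $provider_manager->getDefaultProviderForOperationType('speech_to_text');
if (empty($defaults['provider_id']) || empty($defaults['model_id'])) {
  throw new \RuntimeException('No default speech-to-text provider is configured.');
}

// Same normalized input as in the example above.
$input = new SpeechToTextInput(
  new AudioFile($audio_binary, 'audio/mp3', basename($path))
);

try {
  $output = $provider_manager
    ->createInstance($defaults['provider_id'])
    ->speechToText($input, $defaults['model_id'], ['my-custom-call']);
  echo $output->getNormalized();
}
catch (\Exception $e) {
  // 'my_module' is a hypothetical logger channel name.
  \Drupal::logger('my_module')->error('Speech-to-text failed: @message', [
    '@message' => $e->getMessage(),
  ]);
}
```

Resolving the provider from configuration keeps calling code independent of a specific vendor, so switching from Deepgram to another provider would only need a settings change.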
Speech-To-Text Interfaces & Models
The following files define the methods available when making a speech-to-text call, as well as its input and output.
Speech-To-Text Explorer
If you install the AI API Explorer, you can go to Configuration > AI > AI API Explorer > Speech-To-Text Generation Explorer (under /admin/config/ai/explorers/ai-speech-to-text) to test out different calls and see the code you need for them.