Creating a Transcription Job

Create and submit a job to transcribe one or more media files to text files in the Speech service.

Before you begin

  • Store the media files that you want to transcribe in an Object Storage bucket.

  • To compare the Whisper and Oracle ASR models for transcription job creation, see Comparing Whisper and Oracle ASR Models.

Comparing Whisper and Oracle ASR Models

Compare the Whisper and Oracle ASR models for creating transcription jobs.

In addition to the native Oracle ASR speech model, Speech supports the Whisper model from OpenAI. Whisper is trained on a large corpus of multilingual data collected from the web, and it supports file-based speech-to-text transcription for over 50 languages. This model uses the same service endpoints, API, and SDK interfaces as the Oracle ASR model, which gives you flexibility and compatibility. The Whisper model also uses diarization to label individual speakers in the recording.

Use the following comparison of the Whisper and Oracle ASR models to choose the correct model when creating a transcription job.

Feature | Oracle ASR Model | Whisper Model in OCI Speech
Real-time transcription | Supported | Not supported
Maximum file size | Up to 2 GB | Up to 2 GB
Word-level timestamps | Supported | Supported
File format | AAC, AC3, AMR, AU, FLAC, M4A, MKV, MP3, MP4, OGA, OGG, WAV, WEBM | AAC, AC3, AMR, AU, FLAC, M4A, MKV, MP3, MP4, OGA, OGG, WAV, WEBM
Multilingual support | English, Spanish, French, German, Italian, Portuguese, and Hindi | Same as Oracle ASR model plus 50 other languages*
Diarization | Supported | Supported

* OpenAI Whisper FAQ
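
The comparison above can be sketched as a small pre-check that runs before you create a job. This is a minimal illustration only, not part of the Speech API or SDK: the function names choose_model and is_supported_format and the labels "ORACLE_ASR" and "WHISPER" are hypothetical, and the language and format lists are copied from the table. The table does not list the additional Whisper languages, so confirm those against the OpenAI Whisper FAQ.

```python
from pathlib import Path

# Values copied from the comparison table; both models accept the same formats.
SUPPORTED_EXTENSIONS = {
    "aac", "ac3", "amr", "au", "flac", "m4a", "mkv",
    "mp3", "mp4", "oga", "ogg", "wav", "webm",
}
ORACLE_ASR_LANGUAGES = {
    "english", "spanish", "french", "german",
    "italian", "portuguese", "hindi",
}


def is_supported_format(filename: str) -> bool:
    """Return True if the file extension appears in the table's format list."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS


def choose_model(language: str, needs_real_time: bool = False) -> str:
    """Pick a model label (hypothetical, not an API value) from the table."""
    lang = language.strip().lower()
    if lang in ORACLE_ASR_LANGUAGES:
        # Both models cover these languages; this sketch defaults to the
        # native model, which is also the only one with real-time support.
        return "ORACLE_ASR"
    if needs_real_time:
        # Whisper does not support real-time transcription, and the
        # language is outside the Oracle ASR list, so no model fits.
        raise ValueError(f"No model in the table supports real-time {language}")
    # Only Whisper remains; verify the language in the Whisper FAQ.
    return "WHISPER"
```

For example, choose_model("Hindi", needs_real_time=True) selects the Oracle ASR label, while choose_model("Japanese") falls through to the Whisper label. Defaulting to Oracle ASR for shared languages is a design choice of this sketch; either model can transcribe those languages in a file-based job.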