(Preview) Using the Transcript Tab

Powered by Avid Ada, the Transcript tab shows the speech-to-text (Transcribe) output of your asset's audio. At the highest level, a Transcribe workflow works as follows: a request is made for an analysis service to process some audio, convert the spoken words into text, and deliver the text along with timing data that maps the words to offsets into the audio. The transcript text is then added to segments that are automatically divided by speaker.
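To make the shape of this output concrete, the following is a minimal sketch of how timed words might be grouped into per-speaker segments. The class names, field names, and the use of seconds for offsets are assumptions for illustration only; they are not the data model used by Avid Ada or MediaCentral Cloud UX.

```python
from dataclasses import dataclass, field
from itertools import groupby


@dataclass
class Word:
    text: str
    start: float   # offset into the audio (seconds assumed here)
    end: float
    speaker: str   # speaker label predicted by the service


@dataclass
class Segment:
    speaker: str
    words: list = field(default_factory=list)

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)


def build_segments(words):
    """Group consecutive words spoken by the same speaker into one segment."""
    return [
        Segment(speaker, list(group))
        for speaker, group in groupby(words, key=lambda w: w.speaker)
    ]


words = [
    Word("Good", 0.0, 0.3, "Speaker 1"),
    Word("evening", 0.3, 0.8, "Speaker 1"),
    Word("Thanks", 1.1, 1.5, "Speaker 2"),
]
segments = build_segments(words)
```

Because only consecutive words are grouped, a speaker who talks, is interrupted, and talks again ends up with two separate segments in this sketch.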

Transcribe operations on assets can be initiated either by automation (rules defined in the Rules Editor app) or by manual interaction in the Transcript tab.

Note: The Transcript tab is only shown when the feature pack STT is installed and can only be used when the required license (MediaCentral | STT Base Package) is activated.

Using the Transcript tab, you can do the following:

  • Initiate the transcript creation for the entire asset or selected audio tracks

  • Display and inspect transcripts

  • Navigate in the transcript with synchronized playhead in the Media Viewer

  • Rename speakers

  • Add new speakers

  • Assign words or an entire segment to another speaker

  • Edit transcript text


In this release, only Production Management assets are supported for transcripts, as follows:

  • Production Management master clips are supported; other asset types are not supported.

  • Growing/Edit while capture (EWC) files are supported.

  • Remote assets are not supported.

The following limitations apply to transcript creation:

  • Avid Ada Transcribe uses an Artificial Intelligence (AI) / Machine Learning (ML) model to produce transcripts and speaker labels from assets that include non-musical spoken word. In a newsroom workflow, this might be identified as a VO or "talking head". As with all AI/ML models, results might be incorrect because the model predicts the output from the provided input. The Transcript tab therefore provides options to correct these errors.

  • Avid Ada Transcribe results are directly affected by the quality of the audio signal being transcribed. For accurate transcriptions, make sure that the audio signal has a reasonable signal-to-noise ratio (SNR).

  • Creating transcripts is currently not restricted by specific user privileges. It is, however, covered by a quota license that tracks and limits the number of hours of transcribed audio.

  • Only a completely created transcript can be edited. The transcript portions already shown for EWC clips or very long clips are read-only until the transcribe operation is completely finished.

  • If a transcribe operation fails, all portions already shown for EWC clips or very long clips are removed from the Transcript tab and a "Transcription failed" message is shown.

  • If a re-transcribe job for a transcript is triggered, all edits made to the transcript will be overwritten.

Basic Interaction Patterns

Note the following basic interaction patterns:

  • When the STT feature pack is installed but the required license is not activated, the Transcript tab shows the “Your system is not licensed for this feature. Please contact your system administrator for assistance” message.

  • When no asset is loaded in the Asset Editor, the Transcript tab shows an “Asset is not loaded” message.

  • The Transcript tab shows an “Asset is not supported” message:

    • If there are no audio tracks for the asset.

    • If the system type of the asset is not Production Management (interplay-pam).

    • If you open an asset that does not support the transcript feature (for example, a sequence or group clip).

  • If no transcript has been created yet for any of the tracks, the Transcript tab shows a “There is no transcript yet” message and a Transcribe Asset button.

  • If a transcript is available for at least one track, the Transcript tab selects the first audio track that has a transcript.
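The interaction patterns above can be read as a chain of checks. The sketch below illustrates one possible ordering. The `Asset` class, its fields, and the precedence of the checks are assumptions made for this example; only the message texts and the `interplay-pam` system type come from this section.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

NOT_LICENSED = ("Your system is not licensed for this feature. "
                "Please contact your system administrator for assistance")


@dataclass
class Asset:                       # hypothetical model, for illustration only
    system_type: str               # for example "interplay-pam"
    is_master_clip: bool           # False for sequences, group clips, and so on
    audio_tracks: list             # track names in display order
    transcribed: set = field(default_factory=set)  # tracks that have a transcript


def tab_state(licensed: bool, asset: Optional[Asset]) -> Tuple[Optional[str], Optional[str]]:
    """Return (message, selected_track); exactly one of the two is set."""
    if not licensed:
        return NOT_LICENSED, None
    if asset is None:
        return "Asset is not loaded", None
    if (not asset.audio_tracks
            or asset.system_type != "interplay-pam"
            or not asset.is_master_clip):
        return "Asset is not supported", None
    # First audio track (in display order) that already has a transcript.
    first = next((t for t in asset.audio_tracks if t in asset.transcribed), None)
    if first is None:
        return "There is no transcript yet", None
    return None, first
```

For example, an asset with tracks A1 to A3 where only A2 and A3 have transcripts would select A2 in this sketch.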

Right-to-Left Languages

The Transcript tab supports right-to-left languages:

  • You can display and edit text in right-to-left (RTL) languages (for example, Arabic or Hebrew). MediaCentral Cloud UX recognizes RTL characters. If more than 50 percent of the text in a transcript segment, or the entire text, consists of RTL characters, the text direction of the transcript segment changes to right-to-left.

  • The same applies when you merge two transcript segments or move a portion of a segment text into another segment by using the Change Speaker feature: when more than 50 percent of the text in the merged transcript segment consists of right-to-left characters, the text direction in the transcript segment changes to right-to-left as soon as you click outside the field.
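The 50 percent rule can be illustrated with a short sketch based on Unicode bidirectional classes. This is not the product's implementation: the function name is hypothetical, and counting only characters with a strong direction (ignoring spaces, digits, and punctuation) is an assumption, because this section does not say which characters are counted.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}            # Hebrew-type and Arabic-type letters
STRONG_CLASSES = RTL_CLASSES | {"L"}


def segment_direction(text: str) -> str:
    """Return "rtl" if more than 50 percent of the strongly directional
    characters are right-to-left, otherwise "ltr"."""
    classes = [unicodedata.bidirectional(ch) for ch in text]
    strong = [c for c in classes if c in STRONG_CLASSES]
    if not strong:
        return "ltr"                 # assumed default when nothing is directional
    rtl = sum(1 for c in strong if c in RTL_CLASSES)
    return "rtl" if rtl / len(strong) > 0.5 else "ltr"
```

Under this assumption, a segment with exactly half RTL characters stays left-to-right, because the rule requires more than 50 percent. Merging two segments would simply mean calling the function again on the combined text.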

Searching for Transcript Data

After you have transcribed an asset, you can use either the Any or the Markers & Segments pill types in the Search app to search for the Speaker or Transcript text.

If you are using the Timeline view of the Search app's Inline Hits window, the transcript is displayed as one segment per sentence, and not as a 1:1 representation of the Transcript segments. Each segment includes information about the transcribed audio, speaker, language, and track.

For more information on Search, see Using the Search App.

Transcript Tab Layout

The following illustration and table describe the layout of the Transcript tab.





Reload button

Reloads the transcripts for all tracks. After reloading, the top-most transcript in the Transcript selector (usually A1) is selected again.

  • For non-growing clips shorter than 1 hour, only the fully created transcript is shown. Clicking the Reload button updates the tab after the transcribe job has completed.

  • When a transcript is created for a growing (EWC) clip or a non-growing clip that is longer than 1 hour, the Transcript tab display auto-refreshes and new portions of the transcript are automatically shown in read-only mode. When the transcript creation is finished, clicking the Reload button enables editing of the transcript.
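The two cases can be summarized in a small decision function. The sketch below is an illustration only: the function name, parameter names, and the returned labels are hypothetical, and the behavior for a clip of exactly 1 hour is not specified in this section (it is treated as a short clip here).

```python
def view_mode(is_growing: bool, duration_hours: float,
              job_finished: bool, reloaded_after_finish: bool) -> str:
    """Return "editable", "read-only", or "pending" for the Transcript area."""
    # Growing (EWC) clips and long clips show partial results while the job runs.
    progressive = is_growing or duration_hours > 1.0
    if job_finished and reloaded_after_finish:
        return "editable"        # only a completely created transcript can be edited
    if progressive:
        return "read-only"       # portions shown so far, auto-refreshed
    return "pending"             # short clip: nothing new shown until Reload
```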


Transcript selection

Lets you select a transcript of the asset, whether it is completed, failed, or currently being created. Each transcript is named according to the audio track for which it was created. If several audio tracks have been selected for transcript creation at the same time, the tracks are mixed down into one track and a transcript for the mixed-down audio track is created; the name of the transcript shows the included audio track names, for example "A1-A2-A3".

When you select a transcript from the list, it is shown in the Transcript area. If the selected transcript is currently being created, the Transcript area shows a "Transcription in progress" message. If the creation of the selected transcript has failed, the Transcript area shows a "Transcription failed" message.

When no transcript has been created for the asset yet, the list shows the "There are no items in this list" entry.


Word filter

Lets you search for words in the transcript text. The word count to the right of the word filter shows the number of matching words in the form <selected word>/<total number of found words>. Using the word navigation controls to the right of the word count, you can skip between found words. If Sync Playhead is enabled, skipping between found words also moves the playhead in the Media Viewer timeline.

For more information, see Searching in the Transcript.
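As an illustration of the word count and navigation controls, the following sketch keeps a list of matching word positions and a selected index. The class and method names are hypothetical, and the case-insensitive substring matching and the wrap-around at the ends of the match list are assumptions, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class WordFilter:
    words: list          # transcript words in order
    query: str
    selected: int = 0    # position within the list of matches

    def matches(self) -> list:
        """Indexes of transcript words that contain the query."""
        q = self.query.casefold()
        if not q:
            return []
        return [i for i, w in enumerate(self.words) if q in w.casefold()]

    def counter(self) -> str:
        """Format the count as <selected word>/<total number of found words>."""
        total = len(self.matches())
        return f"{self.selected + 1 if total else 0}/{total}"

    def step(self, delta: int):
        """Move to the next (+1) or previous (-1) match; return its word index."""
        hits = self.matches()
        if not hits:
            return None
        self.selected = (self.selected + delta) % len(hits)
        return hits[self.selected]
```

With Sync Playhead enabled, the word index returned by `step` would be the word whose start time the playhead moves to.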


Sync Playhead button

Toggles the synchronization of the position indicator (playhead) and transcript on and off:

  • On (default): During playback in the Asset Editor's Media Viewer or when you position the playhead in the timeline, the corresponding word is highlighted in green in the Transcript area.

    Conversely, when you select a section in the Transcript area, the playhead in the Asset Editor's Media Viewer moves to the corresponding timecode.

  • Off: During playback in the Asset Editor's Media Viewer, corresponding words are not highlighted in the Transcript area. Selecting a section in the Transcript area does not move the playhead in the Asset Editor's Media Viewer.

When you edit the transcript text, synchronization is ignored.

For more information, see Navigating within the Transcript.
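Both directions of the synchronization rely on the per-word timing data. The sketch below shows one way to look up the word under the playhead with a binary search, and the reverse lookup from a word to a playhead position. The tuple layout, the function names, and the use of seconds instead of timecode are assumptions for illustration.

```python
from bisect import bisect_right
from typing import Optional

# Each word is (text, start, end), sorted by start time (seconds assumed).
WORDS = [("Hello", 0.0, 0.4), ("there", 0.4, 0.8), ("friend", 1.0, 1.5)]


def word_at(words, playhead: float) -> Optional[int]:
    """Index of the word to highlight at the playhead, or None during silence."""
    starts = [start for _, start, _ in words]
    i = bisect_right(starts, playhead) - 1   # last word starting at or before playhead
    if i >= 0 and words[i][1] <= playhead < words[i][2]:
        return i
    return None


def playhead_for(words, index: int) -> float:
    """Playhead position when a word is selected: the start of that word."""
    return words[index][1]
```

When Sync Playhead is off, or while the transcript text is being edited, neither lookup would be applied.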


Edit Mode button

Toggles the edit mode for the Transcript area on and off:

  • When toggled on, playback is stopped and synchronization with the player is ignored. You can select a row and double-click a word to edit it.

  • When toggled off, you can select a row, but double-clicking does not enter edit mode. The state of the Sync Playhead toggle is respected.

For more information, see Editing the Transcript Text.


Create New Transcription button

Opens the Create Transcript dialog box, which lets you start the transcript creation for one or more audio tracks of the asset.

For more information, see Creating Transcript for Selected Audio Tracks.


Transcript area

Shows the contents of the selected transcript in a table. Each row consists of the following:

For more information, see Viewing a Transcript.