Supported Files

voxANN Align accepts a wide variety of files for both Scripts and Audio.

Script Import

Accepted Files

| Format | File Extension | MIME Type |
| --- | --- | --- |
| CSV | .csv | text/csv |
| Word Document | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| PDF | .pdf | application/pdf |
| SubRip Subtitle | .srt | application/x-subrip |

Text Encoding

The text within document files can be stored in many encodings, particularly when the content is multilingual. Align currently assumes that script files are UTF-8 encoded. If characters look corrupted or garbled after import, convert the file to UTF-8 and import it again.
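A minimal sketch of that conversion for text-based script files (CSV or SRT), assuming you already know, or can reasonably guess, the original encoding. The function name `to_utf8` and the `cp1252` default are illustrative assumptions and not part of Align.

```python
from pathlib import Path


def to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return the content re-encoded as UTF-8.

    source_encoding is an assumption about the original file; change it
    to match how the file was actually saved.
    """
    try:
        # If the bytes already decode cleanly as UTF-8, leave them untouched.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Otherwise decode with the assumed encoding and re-encode as UTF-8.
        return raw.decode(source_encoding).encode("utf-8")


def convert_file(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Read src, convert its content to UTF-8, and write the result to dst."""
    Path(dst).write_bytes(to_utf8(Path(src).read_bytes(), source_encoding))
```

For example, `convert_file("script.csv", "script-utf8.csv")` would write a UTF-8 copy that can then be uploaded. If the result still shows garbled characters, the assumed source encoding was probably wrong.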

CSV Files

The most reliable Script Import format is CSV, because each phrase is declared explicitly on its own line. Our format also allows multiple languages to be specified in a single file; each language is broken out into a separate Script within the Project. Each phrase can also have an external reference attached, so that exports can be matched back to, and integrated with, other systems.

CSV Format & Example

A CSV file upload must consist of at least two columns – "Phrase" and "Locale" – with the headings included. "Phrase" is the script content for each line, and "Locale" is one of our supported language tag codes. An optional third column with the heading "Reference" allows an external reference to be attached to each phrase, such as an ID from another system.

This is an example of the correct format:

| Phrase | Locale | Reference |
| --- | --- | --- |
| An example script phrase. | en-GB | EXT_REF_001 |
| The second phrase in the script. | en-GB | EXT_REF_002 |

Or you can download a working example with the same content: Download example CSV
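If the script lives in another system, a file in this shape can be generated programmatically. The sketch below uses Python's standard `csv` module; the function name `build_script_csv` is hypothetical, and the comma delimiter and default quoting are assumptions based on common CSV conventions, not a documented Align requirement.

```python
import csv
import io


def build_script_csv(rows):
    """Build CSV text from (phrase, locale, reference) tuples.

    The header names come from the format description above; the
    delimiter and quoting rules are the csv module's defaults (assumed).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Phrase", "Locale", "Reference"])
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_script_csv([
    ("An example script phrase.", "en-GB", "EXT_REF_001"),
    ("The second phrase in the script.", "en-GB", "EXT_REF_002"),
])

# Save as UTF-8; newline="" stops Python from altering the row endings.
with open("script.csv", "w", encoding="utf-8", newline="") as handle:
    handle.write(csv_text)
```

Phrases that contain commas or quotation marks are quoted automatically by `csv.writer`, which keeps each phrase in a single column.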

PDF & Word Files

PDF and Word files can also be provided as a script source. Align parses them and attempts to extract the structured text within. Results will vary depending on the complexity of the document and how logically the content is organised. Extracted text can be edited before being added to the project. See Add Script.

SRT Files

SRT (subtitle) files can be used as a script source. Each phrase is already segmented and can be extracted accurately. Due to the nature of the SRT format, phrases are often more finely segmented than the dialogue itself, so you can edit and collapse lines as needed before adding them to the project. See Add Script.
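To illustrate what over-segmentation looks like, the sketch below pulls the text out of each SRT cue and joins consecutive cues until one ends with sentence punctuation. It is only an illustration of the idea under that simple heuristic; the function names are hypothetical and this is not how Align itself extracts or collapses lines.

```python
import re


def srt_phrases(srt_text: str) -> list[str]:
    """Extract the text of each cue from SRT content."""
    phrases = []
    normalised = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.split("\n")
        # Line 0 is the cue index, line 1 the timing, the rest is text.
        if len(lines) >= 3 and "-->" in lines[1]:
            phrases.append(" ".join(line.strip() for line in lines[2:]))
    return phrases


def collapse(phrases: list[str]) -> list[str]:
    """Join consecutive cues until one ends with '.', '!' or '?'."""
    merged, current = [], []
    for phrase in phrases:
        current.append(phrase)
        if phrase.rstrip().endswith((".", "!", "?")):
            merged.append(" ".join(current))
            current = []
    if current:
        # Keep any trailing text that never reached sentence punctuation.
        merged.append(" ".join(current))
    return merged
```

A sentence split across two cues, such as "This sentence is split" followed by "across two cues.", would come back from `collapse` as a single phrase.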

Max File Size

Each individual script file upload must be less than 10 MB in size.

Audio Import

Accepted Files

| Format | File Extension | MIME Type |
| --- | --- | --- |
| AIFF (Audio Interchange File Format) | .aiff, .aif | audio/aiff, audio/x-aiff |
| FLAC (Free Lossless Audio Codec) | .flac | audio/flac |
| M4A (MPEG-4 Audio / AAC) | .m4a | audio/x-m4a |
| MP3 (MPEG-1 Layer 3) | .mp3 | audio/mpeg, audio/mp3 |
| WAV (Waveform Audio File Format) | .wav | audio/x-wav, audio/wav, audio/vnd.wave |
| AAC (Advanced Audio Codec) | .aac | audio/aac |
| AC3 (Dolby Digital) | .ac3 | audio/ac3 |
| GSM (Global System for Mobile Communications Audio) | .gsm | audio/gsm |
| MKA (Matroska Audio) | .mka | audio/x-matroska |
| OGG / OGA (Ogg Vorbis, Opus, or FLAC audio) | .ogg, .oga | audio/ogg, audio/oga |
| Opus (Opus Audio Codec) | .opus | audio/opus |
| Vorbis (Vorbis Audio Codec) | .vorbis | audio/vorbis |
| WMA (Windows Media Audio) | .wma | audio/x-ms-wma |

Audio Format Detail

There are no constraints on the audio contained within the Accepted Files. Any combination of sample rate, bit depth, bit rate, and number of channels is accepted, as long as it’s in a valid container. That covers everything from 64 kbps MP3 files to 24-bit/96 kHz WAV files and multi-channel mixes. Higher fidelity can improve alignment performance, but clean dialogue with minimal noise and non-speech elements matters more.

Max File Size

Each individual audio file upload must be less than 750 MB in size and no longer than 2 hours in length.
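The extension and size limits on this page can be checked locally before uploading. The sketch below is a hypothetical pre-flight helper, not part of Align: the name `check_upload` is made up, one MB is assumed to mean 1,000,000 bytes, and the 2-hour duration limit is not checked because that requires decoding the audio.

```python
from pathlib import Path

# Extensions taken from the Accepted Files tables on this page.
SCRIPT_EXTENSIONS = {".csv", ".docx", ".pdf", ".srt"}
AUDIO_EXTENSIONS = {
    ".aiff", ".aif", ".flac", ".m4a", ".mp3", ".wav", ".aac", ".ac3",
    ".gsm", ".mka", ".ogg", ".oga", ".opus", ".vorbis", ".wma",
}

MB = 1_000_000  # assumption: decimal megabytes; the page does not say
SCRIPT_LIMIT = 10 * MB
AUDIO_LIMIT = 750 * MB


def check_upload(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks fine."""
    ext = Path(name).suffix.lower()
    if ext in SCRIPT_EXTENSIONS:
        limit = SCRIPT_LIMIT
    elif ext in AUDIO_EXTENSIONS:
        limit = AUDIO_LIMIT
    else:
        return [f"unsupported extension: {ext or '(none)'}"]

    problems = []
    # Uploads must be strictly less than the limit.
    if size_bytes >= limit:
        problems.append(f"file is {size_bytes} bytes; must be under {limit}")
    return problems
```

For a file on disk, the size can be read with `Path(path).stat().st_size` and passed in alongside the file name.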

Make a Request

Don’t see a format you’re looking for? Let us know.