AI audio transcription and translation
SurveyCTO's AI transcription and translation feature lets you turn audio recordings into text and translate text responses from within the Data Explorer. You can transcribe any audio field or audio audit with a single click, and translate transcripts or text responses into the language of your choice.
This feature is useful when:
You collect audio responses (e.g., open-ended interview questions) and want to read them as text instead of listening to each recording.
You use audio audits for quality control and want to quickly scan what was said during an interview without playing the recording.
You receive text responses in a language other than your own and want a quick translation during review.
Transcription and translation send audio and text from your submissions to a third-party cloud AI provider for processing. Even for encrypted forms, the audio is decrypted in your browser and then sent to the provider in plaintext over an encrypted connection. This means the contents of encrypted audio fields are visible to the AI provider while the request is being processed.
Before you use this feature, confirm that sharing respondent audio and text with a third-party AI provider is permitted by your organization's data-handling policies, your IRB protocol or consent agreements, and any applicable data-protection regulations (for example, GDPR or India's DPDP Act). See the Privacy and data handling section below for details.
Your reviewers will see a privacy reminder the first time they click Transcribe or Translate in a given browser, and again roughly once a month afterwards so the considerations stay top-of-mind.
Transcribing audio
To transcribe an audio recording, open the submission in the Data Explorer (or in the review and correction workflow). For every audio field or audio audit in the submission, you'll see a Transcribe button next to the audio player.
Click Transcribe and SurveyCTO will send the audio to a cloud AI provider for transcription. The transcript appears inline as a comment on the audio field, with a microphone icon to distinguish it from regular comments.
When a submission contains audio fields, SurveyCTO displays a banner at the top of the submission listing each audio field as a clickable link. Clicking a link scrolls directly to that field, making it easy to find audio fields in long submissions without scrolling manually.
The transcript is:
Visible inline in the submission view, in the submission's comments panel, and (if the form uses review and correction) alongside reviewer comments.
Exportable via the standard comments column in the SurveyCTO Desktop export. No additional configuration is required.
Saved alongside your other review changes – transcripts are not committed to the server until you click Save changes. See the Saving AI annotations section below for the full workflow.
Translating text
SurveyCTO can translate any free-form text response or AI transcript into the language you select. Look for the Translate link in two places:
Next to every free-form text response on the submission.
Next to every AI transcript that has been produced for an audio field.
Clicking Translate sends the source text to the AI provider and produces a translation. The translation appears inline as a separate comment with a globe icon, labeled with the source and target languages.
Batch translation
When a submission contains translatable text fields, SurveyCTO displays a translation banner at the top of the submission with a checkbox next to each eligible field. You can select multiple fields (or use Select all) and click the Translate button at the bottom of the banner to translate them all in sequence.
Each field's status is shown with an icon: a spinner while translation is in progress, a blue globe on success (indicating a pending unsaved translation), or a red exclamation on failure. A green check appears only for translations that have already been saved. If any selected fields already have a pending translation, SurveyCTO asks you to confirm before replacing them.
A Cancel button appears next to the Translate button while a batch is running. Clicking it lets the translation that is currently in progress finish, but stops the remaining fields from being processed.
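The batch behavior described above can be pictured with a short Python sketch. This is not SurveyCTO's implementation: the names Status, translate_batch, translate, and is_cancelled are hypothetical and only model the rules stated here (fields are processed in sequence, a failure on one field does not stop the others, and cancelling lets the current field finish but skips the rest).

```python
from enum import Enum


class Status(Enum):
    # Icon names follow the descriptions in this article.
    IN_PROGRESS = "spinner"
    PENDING = "blue globe"        # translated, but not yet saved
    FAILED = "red exclamation"
    SAVED = "green check"         # only after Save changes; not set by the batch


def translate_batch(fields, translate, is_cancelled):
    """Translate fields one at a time and return a status for each processed field.

    `translate` and `is_cancelled` are hypothetical callables standing in for
    the provider request and the Cancel button.
    """
    statuses = {}
    for field in fields:
        # Cancel is checked between fields, so a translation that has
        # already started is allowed to finish.
        if is_cancelled():
            break
        statuses[field] = Status.IN_PROGRESS
        try:
            translate(field)
        except Exception:
            statuses[field] = Status.FAILED
        else:
            statuses[field] = Status.PENDING
    return statuses


def fake_translate(field):
    # Stand-in for the provider call; "q2" simulates a failed request.
    if field == "q2":
        raise RuntimeError("provider error")
    return f"translated {field}"


statuses = translate_batch(["q1", "q2", "q3"], fake_translate, is_cancelled=lambda: False)
```

In this sketch, fields that were never reached because of a cancel simply have no status, which corresponds to their checkboxes staying untranslated in the banner.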
Choosing the target language
The target language is set from a Translate to: section inside the Options dropdown of the submission title bar (next to Attachments). Open the dropdown and pick a language – the next Translate click will use that language.
The list of options is built dynamically:
Your browser's language always appears at the top, suffixed with (auto-detected). This is the default selection when you open a submission for the first time.
If the form is a multi-language form, each of its languages is added below the auto-detected entry. The language that the form designer marked as the form's default is suffixed with (form's default).
Custom languages you have added appear below the form languages. See Adding a custom language below.
The auto-detected entry is kept even when it matches one of the form's languages, so you always see the source of each option (e.g. English (auto-detected) alongside English (form's default)). They produce the same translation; pick whichever feels right.
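As an illustration only, the ordering rules above can be written as a small Python function. The function name and parameters are hypothetical, and the sketch simply mirrors the rules in this section; it is not how SurveyCTO builds the menu internally.

```python
def build_language_menu(browser_language, form_languages=(), form_default=None,
                        custom_languages=()):
    """Return the Translate to: labels in display order (illustrative sketch)."""
    # The browser language always comes first and is never de-duplicated
    # against the form's languages.
    menu = [f"{browser_language} (auto-detected)"]
    # Languages of a multi-language form follow; the form designer's default is tagged.
    for language in form_languages:
        suffix = " (form's default)" if language == form_default else ""
        menu.append(f"{language}{suffix}")
    # Custom languages added by the reviewer come last.
    menu.extend(custom_languages)
    return menu


menu = build_language_menu("English", ["English", "Swahili"], "English", ["Tagalog"])
# menu == ["English (auto-detected)", "English (form's default)", "Swahili", "Tagalog"]
```

For a single-language form with no custom languages, the same function returns only the auto-detected entry, which matches what you see in the menu in that case.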
SurveyCTO remembers your last choice for each form. The next time you open a submission for the same form, your selection is reapplied automatically. You can change it at any time by reopening the Options dropdown.
You can translate the same source text into multiple languages – pick a different target in the Options dropdown and click Translate again. Each (field, target language) pair becomes its own comment, so all of your translations are kept.
Adding a custom language
If the language you need is not in the menu, select Add language at the bottom of the Translate to: list. A dialog asks you to type the language name (e.g. Swahili, Tagalog, or Brazilian Portuguese). After you add it:
The custom language becomes the active translation target immediately.
It appears in the Translate to: menu for every form you review on this browser, and persists across sessions.
Duplicates are prevented: if you type a language name that already exists in the menu (case-insensitive), SurveyCTO will let you know.
To remove a custom language, hover over it in the menu and click the icon that appears. Removing the currently active language resets your selection to the browser's auto-detected language.
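The add and remove rules can be summarized in a brief Python sketch. The class and method names are hypothetical, and details such as trimming whitespace are assumptions; the sketch only models the behavior described in this section (case-insensitive duplicate check, immediate activation, and fallback to the auto-detected language on removal).

```python
class CustomLanguages:
    """Illustrative model of the custom-language list kept in one browser."""

    def __init__(self, auto_detected, menu_names=()):
        self.auto_detected = auto_detected
        # Names already in the menu (browser and form languages), without suffixes.
        self.menu_names = list(menu_names) or [auto_detected]
        self.custom = []
        self.active = auto_detected

    def add(self, name):
        """Add a language; return False if it is empty or already in the menu."""
        name = name.strip()  # assumption: surrounding whitespace is ignored
        taken = {entry.casefold() for entry in self.menu_names + self.custom}
        if not name or name.casefold() in taken:
            return False
        self.custom.append(name)
        self.active = name  # a new custom language becomes the target immediately
        return True

    def remove(self, name):
        """Remove a custom language; fall back to auto-detected if it was active."""
        self.custom.remove(name)
        if self.active == name:
            self.active = self.auto_detected


languages = CustomLanguages("English")
added = languages.add("Swahili")          # True, and Swahili becomes active
duplicate = languages.add("swahili")      # False: same name, different case
```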
Why these languages?
A Why these languages? link at the bottom of the Translate to: menu explains how the list is assembled. In short: SurveyCTO surfaces your browser language, the form's authored languages, and any custom languages you have added. The AI provider supports many more languages – use Add language to reach any language the provider supports.
If you only see your browser's language in the menu, it's because the form is single-language – SurveyCTO can't reliably tell which language a single-language form is in just from its definition. To get more options:
Use Add language to type the language you need.
Switch your browser to a different language. SurveyCTO picks up the change automatically the next time you open a submission.
If your form has content in multiple languages, ask the form designer to add those languages to the form definition (see Adding translations to a form). Once the form is redeployed, every authored language will appear under Translate to:.
Translations follow the same save workflow as transcripts (see Saving AI annotations below) and are also included in the comments column when you export via SurveyCTO Desktop.
Saving AI annotations
Transcripts and translations are not saved to the server immediately. They appear as pending edits on the submission, just like regular reviewer comments and corrections, and they are committed when you click Save changes. SurveyCTO warns you about pending edits if you try to navigate away or close the tab without saving, so transcripts and translations cannot be lost silently.
Before saving
While a transcript or translation is still pending, you can replace it as many times as you like:
Clicking Re-transcribe on a field with a pending transcript replaces it with a fresh one. SurveyCTO warns you first – either that the existing pending transcript will be overwritten, or (if the previous attempt detected no speech) that you may consume more of your allowance without producing a useful result. You are free to retry either way.
Re-translating works the same way: clicking Re-translate replaces the pending translation with a fresh one. There is at most one pending translation per field at any time – re-translating never accumulates multiple drafts.
After saving
Once you click Save changes, transcripts and translations on that submission become permanent. They cannot be edited or deleted from the Data Explorer, and the Re-transcribe button reverts to Transcribe. Clicking it again will add a new transcript next to the existing one rather than replace it – which is useful when you want a second pass with a fresh model run. Each new transcription still counts against your allowance.
The same applies to translations: clicking Translate on a field that already has a saved translation – even with the same target language – adds a new translation alongside the existing one rather than replacing it.
Saved transcripts display a Translate link, so you can translate a transcript that was saved in a previous session without needing to re-transcribe it.
If you want to correct or annotate a saved transcript or translation, add a regular reviewer comment on the same field. The original AI-generated annotation remains alongside it for audit purposes.
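The difference between pending and saved annotations can be sketched in Python as follows. The class below is a hypothetical model for a single field, not SurveyCTO's data structure: it assumes one pending draft per kind of annotation ("transcript" or "translation") and does not model target languages.

```python
class FieldAnnotations:
    """Illustrative model of AI annotations on one field."""

    def __init__(self):
        self.pending = {}  # kind -> text; at most one unsaved draft per kind
        self.saved = []    # (kind, text) pairs; append-only once saved

    def generate(self, kind, text):
        # Before saving, a new result replaces the previous pending draft.
        self.pending[kind] = text

    def save_changes(self):
        # Saving makes pending drafts permanent; nothing already saved is replaced.
        self.saved.extend(self.pending.items())
        self.pending.clear()


field = FieldAnnotations()
field.generate("transcript", "first attempt")
field.generate("transcript", "second attempt")  # Re-transcribe: overwrites the draft
field.save_changes()
field.generate("transcript", "third attempt")   # Transcribe after saving: a new entry
field.save_changes()
# field.saved == [("transcript", "second attempt"), ("transcript", "third attempt")]
```

The first attempt never reaches the saved list because it was replaced while still pending, whereas the third attempt sits alongside the second because the second had already been saved.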
Submissions in any review state
You can add transcripts and translations to any submission in the Data Explorer, regardless of its review status:
Submissions that have already been approved or rejected via the review and correction workflow are normally locked against further reviewer edits. AI transcripts and translations are an exception – they can still be added, because the underlying audio and text on the submission do not change.
Adding a transcript or translation does not alter the submission's approval status, the operator who approved or rejected it, or the timestamp of that decision. The reviewer audit trail for the original review stays intact, and the AI annotation is recorded separately.
Encrypted forms
AI transcription and translation work with encrypted forms, but it is important to understand what this means for the confidentiality of your data. When you view an encrypted submission in the Data Explorer, your browser already has the private key loaded to decrypt the audio for playback. When you click Transcribe, your browser sends that decrypted audio to the third-party AI provider over an encrypted connection – not the encrypted file. In practical terms:
Your private key never leaves your browser, and the AI provider never has access to it.
However, the plaintext audio itself is processed by the AI provider for as long as the transcription request takes. The end-to-end encryption guarantee you normally get from encrypted forms – that the raw audio is only ever readable on trusted machines holding the private key – does not apply to audio you choose to transcribe.
The same is true of transcripts and text responses you translate: the plaintext text is sent to the provider for the translation request.
The resulting transcript and translation comments are returned to your browser and stored encrypted on the server using the same mechanism as other reviewer comments on encrypted forms, so they remain protected at rest.
Operators should treat the Transcribe and Translate buttons as deliberate actions that take a given audio file or text response out of the encrypted-form security boundary for the duration of the request. If that is not acceptable for a given submission, don't click them.
Usage limits
Each server has a monthly AI usage allowance. Transcription and translation share the same allowance – both consume AI tokens from a single balance, with the cost of each action scaling with its size:
Transcription consumes tokens roughly in proportion to the duration of the audio that is sent to the AI provider.
Translation consumes tokens roughly in proportion to the length of the source text (whether that text is a free-form text response or a previously-produced transcript).
When the allowance is exhausted, both Transcribe and Translate actions are disabled until your next billing period.
You can see your current AI usage and remaining allowance from the Usage menu icon on the SurveyCTO server console home page. It is shown as AI usage (current period): used / total tokens.
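To make the shared allowance concrete, here is a small Python sketch of the accounting. The token rates are invented for the example (SurveyCTO does not publish them here), and the class and function names are hypothetical. The sketch also assumes that actions are blocked only once the allowance is fully used, which is how the rule is stated above.

```python
from dataclasses import dataclass

# Hypothetical rates, chosen only so the arithmetic is easy to follow.
TOKENS_PER_AUDIO_SECOND = 10
CHARACTERS_PER_TOKEN = 4


def transcription_tokens(audio_seconds):
    """Cost grows with the duration of the audio sent to the provider."""
    return audio_seconds * TOKENS_PER_AUDIO_SECOND


def translation_tokens(source_text):
    """Cost grows with the length of the source text (ceiling division)."""
    return -(-len(source_text) // CHARACTERS_PER_TOKEN)


@dataclass
class Allowance:
    total_tokens: int
    used_tokens: int = 0

    @property
    def exhausted(self):
        return self.used_tokens >= self.total_tokens

    def charge(self, tokens):
        # Both Transcribe and Translate draw on the same balance.
        if self.exhausted:
            raise RuntimeError("AI allowance exhausted until the next billing period")
        self.used_tokens += tokens

    def summary(self):
        return f"AI usage (current period): {self.used_tokens} / {self.total_tokens} tokens"


allowance = Allowance(total_tokens=1000)
allowance.charge(transcription_tokens(60))       # one minute of audio: 600 tokens
allowance.charge(translation_tokens("x" * 400))  # 400 characters of text: 100 tokens
print(allowance.summary())  # AI usage (current period): 700 / 1000 tokens
```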
Privacy and data handling
Transcription and translation are strictly on-demand – audio files, text responses, and transcripts remain on your SurveyCTO server unless a reviewer explicitly clicks Transcribe or Translate on a particular field.
When a reviewer does trigger transcription or translation, the following happens:
The audio file (for transcription) or the text (for translation) is sent from the reviewer's browser to a third-party cloud AI provider over an encrypted connection. For encrypted forms, the audio is decrypted in the reviewer's browser first; see the Encrypted forms section above.
SurveyCTO does not use your data to train AI models. The AI provider does not retain your data beyond what is required to process the request.
The provider returns a transcript or translation, which is stored on your SurveyCTO server as a reviewer comment on the relevant field, under the same access controls as the rest of your submission data. Comments are encrypted at rest for encrypted forms.
Compliance and regional considerations
Because transcription and translation involve sending respondent audio and text to a third-party AI provider, this feature is not appropriate for every dataset. Before using it, consider the following:
Data-protection regulations. Sending respondent audio or text to an AI provider may constitute a cross-border transfer of personal data. This can have implications under the EU's General Data Protection Regulation (GDPR), India's Digital Personal Data Protection Act (DPDP), and similar regimes. SurveyCTO servers hosted in the EU and India regions carry specific residency commitments that may be affected if you use this feature; confirm with your data protection officer (DPO) or legal team before using it with production data.
IRB protocols and consent. If your study is governed by an Institutional Review Board (IRB) or similar ethics body, check your approved protocol and your respondents' consent language. Sending audio recordings or transcripts to a third-party AI provider may be outside the scope of what respondents consented to.
Sensitive-topic data. Audio recordings can be especially sensitive when they capture health information, legal or financial details, identifiable voices, minors, or vulnerable populations. Even if the AI provider does not retain the data, transient exposure during processing may not be acceptable for every dataset.
Organizational policies. If your organization has policies that prohibit sharing respondent data with third-party AI providers, do not use this feature.
Reviewers see a per-browser privacy reminder the first time they click Transcribe or Translate, and roughly once a month afterwards, so the risk is surfaced at the point of action.
Opting out of AI provider data processing
AI transcription and translation are enabled by default on every SurveyCTO server. To function, these features send your submission data to external AI providers for processing. If your organization prefers not to have these features available – for example, because of data protection, IRB, or internal policy requirements – contact the SurveyCTO Support Center and request that these AI features be disabled on your server. The SurveyCTO team will honor the request, after which the Transcribe and Translate controls will no longer appear in the Data Explorer.
Current limitations
This is the first version of AI transcription and translation in SurveyCTO. The following are not yet supported and may be addressed in future releases:
Automatic transcription. All transcription is on-demand – you click per field. Future versions may support automatic transcription of new submissions as they arrive.
Speaker diarization. Transcripts do not distinguish between speakers. This is a known limitation of the current model and is under evaluation.
Batch transcription. You cannot yet transcribe all audio in a submission with a single click; each audio field must be transcribed individually. (Batch translation of text fields is supported – see the Batch translation section above.)
Audio pre-screening. Files that contain only silence or background noise will still be sent for transcription and will count against your allowance, producing an empty or near-empty transcript.
Provider choice. The AI provider is not configurable per server; SurveyCTO selects the provider based on overall platform quality and cost.