SSML - Is it possible to remove automatic break pauses?

541 Views Asked by S5amuel At 27 July 2025 at 19:33

I'm trying to remove the automatic breaks added by the synthesis processor, to create speech files without any "linguistic pauses".

I'm using Microsoft's speech synthesis engine with the SpeechSynthesizer class in C#.

This is the output I get with "This is an example why do automatic breaks occur?" wrapped in <speak> tags with SpeechSynthesizer:

https://clyp.it/4nofhh3n

This is the output I want (achieved by using Oddcast's TTS Demo):

https://clyp.it/m55wt14u

I've read through the W3C's SSML documentation several times. Section 3.2.3 (break element) notes the following:

If the element is not present between tokens, the synthesis processor is expected to automatically determine a break based on the linguistic context. In practice, the break element is most often used to override the typical automatic behavior of a synthesis processor.

This is how my voice is currently behaving. I want to override or turn off this behavior so that the speech is completely uninterrupted. I have tried putting a <break> element with the attributes strength="none" and time="0ms" between the words where the automatic break occurs, as the passage above suggests. I have also tried various other things, such as wrapping the whole text string in <s> tags, to no avail.

I also can't just remove the breaks in post-processing, since the voice uses a different tone on the spoken words when the automatic breaks are added.

I have read through several other sets of SSML documentation which, while often worded a bit differently from the W3C docs, don't explain concretely how to override the automatic breaks, which is my issue.

There is 1 best solution below

Luke On 13 October 2020 at 15:32

In my experiments with SpeechSynthesizer, if you put a break of 50ms at the end, it will respect it; anything shorter is ignored. However, it always treats the content of a <speak> element as its own clause, so it speaks it as a complete sentence/clause rather than carrying the prosody through as in the 2nd example. You need to send all your text in a single <speak> element (and voice) to have it treated as a single linguistic utterance.
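
A minimal sketch of that, assuming the System.Speech.Synthesis API on Windows: all the text goes into one <speak> element, with a 50ms break at the end. The output file name is made up for illustration, and the 50ms threshold is only what I observed, not documented behavior, so treat it as a starting point rather than a guaranteed fix.

```csharp
using System.Speech.Synthesis; // Windows only; requires a reference to System.Speech

class Program
{
    static void Main()
    {
        // One <speak> element for the whole utterance, so the engine
        // does not treat each fragment as a separate clause.
        // The trailing 50ms break is the shortest value observed to be
        // respected in my tests (an assumption, not a documented limit).
        const string ssml =
            "<speak version=\"1.0\" " +
            "xmlns=\"http://www.w3.org/2001/10/synthesis\" " +
            "xml:lang=\"en-US\">" +
            "This is an example why do automatic breaks occur?" +
            "<break time=\"50ms\"/>" +
            "</speak>";

        using (var synth = new SpeechSynthesizer())
        {
            // "output.wav" is a hypothetical path for this example.
            synth.SetOutputToWaveFile("output.wav");
            synth.SpeakSsml(ssml);
        }
    }
}
```

If you were previously making one SpeakSsml call per fragment, merging everything into a single call like this is the part that matters for prosody; the break value only controls the pause at the end.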
