Java API for SRT subtitles


Is there a Java API for SRT subtitles?


There are 4 best solutions below


A modified version of the regex from @Panayotis that supports multi-line subtitle text looks like this:

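import java.util.regex.Pattern;

protected static final String nl = "\\n";
protected static final String sp = "[ \\t]*";
// Groups: 1 = index, 2-5 = start time, 6-9 = end time,
// 10 = optional "X1:..." suffix, 11 = subtitle text
protected static final Pattern pattern = Pattern.compile(
                    "(\\d+)" + sp + nl
                    + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
                    + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
                    + "(X1:\\d.*?)??" + nl + "([^\\|]*?)" + nl + nl);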

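In ([^\\|]*?), replace "|" with any character that is unlikely to appear in subtitle text. I currently use a negated character class on "|", so the text group matches everything, including line breaks, up to the blank line that ends the entry.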
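As a rough illustration of how this pattern behaves on multi-line text, here is a minimal sketch. The class name SrtMultiLineDemo, the method cueTexts, and the sample input are hypothetical and only serve the demonstration; the regex itself is the one quoted above. Normalizing "\r\n" to "\n" first is an assumption on my part, since the pattern only matches "\n".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex comes from the answer above.
class SrtMultiLineDemo {
    static final String NL = "\\n";
    static final String SP = "[ \\t]*";
    static final Pattern CUE = Pattern.compile(
            "(\\d+)" + SP + NL
            + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP
            + "-->" + SP + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP
            + "(X1:\\d.*?)??" + NL + "([^\\|]*?)" + NL + NL);

    /** Returns the text (group 11) of every entry the pattern finds. */
    static List<String> cueTexts(String srt) {
        // Assumption: convert Windows line endings, because the pattern expects "\n".
        String normalized = srt.replace("\r\n", "\n");
        List<String> texts = new ArrayList<>();
        Matcher m = CUE.matcher(normalized);
        while (m.find()) {
            texts.add(m.group(11));
        }
        return texts;
    }

    public static void main(String[] args) {
        String sample =
                "1\n00:00:01,000 --> 00:00:04,000\nHello,\nworld!\n\n"
                + "2\n00:00:05,500 --> 00:00:07,250\nSecond cue\n\n";
        // The first entry keeps its internal line break.
        System.out.println(cueTexts(sample));
    }
}
```

Note that the pattern ends with two newlines, so a final entry that is not followed by a blank line is not matched; appending "\n\n" to the input before matching is one way around that.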


The actual SRT parsing is done with regular expressions, which Java supports through the java.util.regex package.

The actual regexp is:

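import java.util.regex.Pattern;

protected static final String nl = "\\n";
protected static final String sp = "[ \\t]*";
// (?s) lets "." match line breaks, so the text group can span several lines
protected static final Pattern pattern = Pattern.compile(
        "(?s)(\\d+)" + sp + nl
        + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
        + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
        + "(X1:\\d.*?)??" + nl + "(.*?)" + nl + nl);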

Group 1 is the subtitle index. Groups 2, 3, 4, and 5 are the start time (hours, minutes, seconds, milliseconds), groups 6, 7, 8, and 9 are the end time, and group 11 is the subtitle text.
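To make the group numbering concrete, here is a small sketch that turns the time groups into milliseconds. The class SrtGroupsDemo and the helpers toMillis and firstCueTimes are hypothetical names introduced for illustration, and the newline fragment is written as "\\n" here, which is my assumption about the intended form.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class built around the regex quoted above.
class SrtGroupsDemo {
    static final String NL = "\\n";
    static final String SP = "[ \\t]*";
    static final Pattern CUE = Pattern.compile(
            "(?s)(\\d+)" + SP + NL
            + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP
            + "-->" + SP + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP
            + "(X1:\\d.*?)??" + NL + "(.*?)" + NL + NL);

    /** Combines four consecutive groups (h, min, s, ms), starting at 'first', into milliseconds. */
    static long toMillis(Matcher m, int first) {
        long h = Long.parseLong(m.group(first));
        long min = Long.parseLong(m.group(first + 1));
        long s = Long.parseLong(m.group(first + 2));
        long ms = Long.parseLong(m.group(first + 3));
        return ((h * 60 + min) * 60 + s) * 1000 + ms;
    }

    /** Returns {startMillis, endMillis} of the first entry, or null if nothing matches. */
    static long[] firstCueTimes(String srt) {
        Matcher m = CUE.matcher(srt);
        if (!m.find()) {
            return null;
        }
        // Groups 2-5 hold the start time, groups 6-9 the end time.
        return new long[] { toMillis(m, 2), toMillis(m, 6) };
    }

    public static void main(String[] args) {
        String sample = "1\n00:01:02,500 --> 00:01:05,000\nHello\n\n";
        Matcher m = CUE.matcher(sample);
        while (m.find()) {
            System.out.println(m.group(1) + ": " + toMillis(m, 2)
                    + " -> " + toMillis(m, 6) + " | " + m.group(11));
        }
    }
}
```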


There is another basic, open-source API that can handle SRT and ASS subtitles here

Parsing SRT:

File file = Paths.get("subtitle.srt").toFile();
SRTSub subtitle = new SRTParser().parse(file);

I have written Java code that parses and reads several subtitle formats, including the popular SRT. The code is released under the MIT open-source license (free to use for any purpose) in my Git repository:

https://github.com/JDaren/subtitleConverter

You probably need only the basic classes and the SRTFormat class. With those you can read SRT files from an InputStream, or get the complete files back as String[] once you have finished editing them.

If you find this useful, or if I can help you with anything, please contact me.

PS: Other formats supported, either partially or fully, are .ASS, .SSA, .STL, .SCC, and .XML (W3C's TTAF-DFXP, also known as TTML 1.0).

EDIT:

You can see the code at work at www.subtitleconverter.net