Elena & Fabrice's Web

kdenlive uses an accuracy of 1/25 of a second for the millisecond timestamp $t$, which thus gets converted to the nearest integer $\lfloor{t/40}\rceil$ (each step of 1/25 s lasts 40 ms). So if you have, say:

00:00:02,764 --> 00:00:03,705 So now...

Then this will be rounded as follows:

i.e., the 764 ms have been rounded down to 19 (it should be 19.1, so we start earlier than we should) and the 705 ms have been rounded up to 18 (it should be 17.625, so we stop later than we should). Kdenlive goes for the nearest integer, as opposed to taking the ceiling of the first timestamp and the floor of the second one.
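To make the arithmetic concrete, here is a minimal Python sketch of that rounding. It is only an illustration of the nearest-integer rule described above, not kdenlive's actual code; the name `nearest_frame` is made up, and rounding exact halves upward is my assumption.

```python
FRAME_MS = 40  # one step of 1/25 s lasts 1000 / 25 = 40 ms


def nearest_frame(ms: int) -> int:
    """Round a millisecond offset to the nearest 1/25 s step.

    Exact halves are rounded up here; how kdenlive breaks ties is an assumption.
    """
    return (ms + FRAME_MS // 2) // FRAME_MS


start_frame = nearest_frame(764)  # 764 / 40 = 19.1   -> 19 (rounded down)
stop_frame = nearest_frame(705)   # 705 / 40 = 17.625 -> 18 (rounded up)
print(start_frame, stop_frame)
```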

The result, if not precisely an overlap of the text itself because the times involved are so short, is an upward translation of the line [1]:

This is disconcerting and annoying, so we need to remove such overlaps. My first attempt from the previous day failed because even if the times do not overlap, we may still get this artifact if they do overlap after being rounded to the nearest 1/25 s.

So here is the combined sed/Mathematica procedure to do that (but again, see [2]):

- First, I export the timestamps alone:

sed -n '/-->/p' tom-iop-lecture-edited.srt > timestamps

Then I run the content of the **notebook fixingOverlapping-srt.nb**, which I won't go into in detail now; I only highlight the main points.

- I define an $\epsilon t$ that is slightly larger than the rounding step of kdenlive:

\[Epsilon]t = 1/24.

- I find the subtitles that overlap within that rounding uncertainty:

loverlap = Drop[Flatten[ Position[ltimesStart - RotateRight[ltimesStop, 1], val_ /; val < \[Epsilon]t]], 1]

In the case of Tom's subtitles, that was around 412 instances, or about 42% of the cases! Not all of them resulted in such displacements, but enough of them did to be annoying.

- I then use the following module, which takes a time string in srt format and removes a $\delta t$ (a numerical time in seconds) from it, which will be the actual proximity plus the $\epsilon t$, to be extra safe:
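For readers without Mathematica, the same test can be sketched in Python. This is a rough equivalent under my own assumptions, not the notebook's code: the helper names are hypothetical, the three timing lines are invented sample data, and indices are 0-based (the notebook's are 1-based). A cue is flagged when its start is less than $\epsilon t$ after the previous cue's stop.

```python
import re

EPSILON_T = 1 / 24  # seconds; slightly more than one 1/25 s step

TIMESTAMP = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def to_seconds(stamp):
    """Convert an srt timestamp such as '00:00:02,764' to seconds."""
    h, m, s, ms = map(int, TIMESTAMP.search(stamp).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def find_overlaps(timing_lines, eps=EPSILON_T):
    """Return the 0-based indices of cues that start less than eps after the previous cue stops."""
    cues = []
    for line in timing_lines:
        start, stop = line.split("-->")
        cues.append((to_seconds(start), to_seconds(stop)))
    return [i for i in range(1, len(cues)) if cues[i][0] - cues[i - 1][1] < eps]


# Invented sample data: the second cue starts only 14 ms after the first one stops.
sample = [
    "00:00:01,000 --> 00:00:02,750",
    "00:00:02,764 --> 00:00:03,705",
    "00:00:04,000 --> 00:00:05,000",
]
print(find_overlaps(sample))
```

Starting the range at 1 plays the role of the `Drop[..., 1]` above: the first cue has no predecessor to collide with.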

RemoveTime[strtime_, \[Delta]t_] := Module[{datelist}, datelist = DateList[{strtime, {"Hour", ":", "Minute", ":", "Second", ",", "Millisecond"}}]; DateString[ DateObject[ DatePlus[datelist, Quantity[-\[Delta]t - \[Epsilon]t, "Second"]]], {"Hour", ":", "Minute", ":", "Second", ",", "Millisecond"}] ]

This is how this module works:

RemoveTime["00:00:02,959", .237] "00:00:02,680"

Here we removed 237 ms from the timestamp at 2 s 959 ms. The module actually removed more (279 ms in total, i.e., the 237 ms plus $\epsilon t$) to take into account the rounding and the 1/25 s accuracy. The final goal here is to avoid overlaps in kdenlive while keeping the tampering to a small fraction of a second. Note that it is enough to systematically remove time from the ending timestamp.
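The same operation can be sketched in Python. This is a stand-in for the Mathematica module, not a translation of it: the name `remove_time` is hypothetical, and truncating to whole milliseconds and clamping at zero are my assumptions.

```python
EPSILON_T = 1 / 24  # seconds; same safety margin as in the notebook


def remove_time(stamp, delta_t, eps=EPSILON_T):
    """Shift an srt timestamp earlier by delta_t + eps seconds."""
    hms, ms = stamp.split(",")
    h, m, s = map(int, hms.split(":"))
    total_ms = ((h * 60 + m) * 60 + s) * 1000 + int(ms)
    # Truncate to whole milliseconds and never go below zero (both are assumptions).
    total_ms = max(0, int(total_ms - (delta_t + eps) * 1000))
    s, ms = divmod(total_ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


print(remove_time("00:00:02,959", 0.237))  # prints 00:00:02,680
```

Working in integer milliseconds sidesteps date objects entirely, which is enough here since srt timestamps are plain offsets from the start of the video.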

- Next we generate a list of sed rules that we export to a file:

Export["rules.txt",Table["s/--> " <> ltimess[[All, 2]][[i - 1]] <> "/--> " <> RemoveTime[ltimess[[All, 2]][[i - 1]], l\[Delta]t[[i]] + \[Epsilon]t] <> "/", {i, loverlap}]]
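Each exported rule is a plain sed substitution that rewrites one ending timestamp. As a hedged illustration of the format, here is a small Python sketch; `sed_rules` is a hypothetical helper, and the single (old, new) pair is taken from the RemoveTime example above rather than from the real subtitle file.

```python
def sed_rules(replacements):
    """Build one sed substitution per (old_stop, new_stop) pair.

    srt timestamps contain only digits, colons and commas, so nothing
    needs escaping inside the sed pattern.
    """
    return [f"s/--> {old}/--> {new}/" for old, new in replacements]


rules = sed_rules([("00:00:02,959", "00:00:02,680")])
print("\n".join(rules))  # prints s/--> 00:00:02,959/--> 00:00:02,680/
```

Anchoring the pattern on `--> ` restricts the substitution to ending timestamps. Note also that in the Export line the shift passed to RemoveTime is `l\[Delta]t[[i]] + \[Epsilon]t`, and the module subtracts a further `\[Epsilon]t` on top of that.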

- And back in the shell, we now invoke sed with these rules on the subtitles:

sed -f rules.txt tom-iop-lecture-edited.srt > FINAL-tom-iop-lecture-edited.srt

We have now removed any risk of overlap: