kdenlive stores times with an accuracy of 1/25 s, so a millisecond timestamp $t$ gets converted to the nearest integer $\lfloor{t/40}\rceil$ (each step of 1/25 s lasts 40 ms, and there are 25 of them in a second). So if you have, say:
00:00:02,764 --> 00:00:03,705 So now...
Then this will be rounded as follows:
i.e., the 764ms have been rounded down to 19 (should be 19.1, so we start earlier than we should) and the 705ms have been rounded up to 18 (should be 17.625, so we stop later than we should). Kdenlive goes for the nearest integer, as opposed to taking the ceiling of the first timestamp and the floor of the second one.
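The rounding can be checked with a few lines of Python. This is only a sketch of the arithmetic described above, not kdenlive's actual code: the name `nearest_frame` is made up for the illustration, and rounding exact halves upward is an assumption (the two values here are not ties, so it does not matter for this example).

```python
FRAME_MS = 40  # one step of 1/25 s lasts 1000/25 = 40 ms


def nearest_frame(ms: int) -> int:
    """Round a millisecond offset to the nearest 1/25 s step.

    Exact halves are rounded up (an assumption; kdenlive may differ).
    Integer arithmetic avoids floating-point surprises.
    """
    return (2 * ms + FRAME_MS) // (2 * FRAME_MS)


# The two millisecond parts of the example cue.
for ms in (764, 705):
    print(ms, "ms ->", ms / FRAME_MS, "->", nearest_frame(ms))
```

The start lands on step 19 (slightly early) and the stop on step 18 (slightly late), which is how two cues that are disjoint in the file can end up touching.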
The result, if not precisely an overlap of the text itself because the times involved are so short, is an upward translation of the line [1]:
This is disconcerting and annoying, so we need to remove such overlaps. My first attempt from the previous day failed because even if the times do not overlap in the file, they can still overlap once rounded to the nearest 1/25 s, and then we get this artifact.
So here is the combined sed/Mathematica procedure to do that (but again, see [2]):
cat tom-iop-lecture-edited.srt | sed -n '/-->/p' > timestamps
Then I run the content of the notebook fixingOverlapping-srt.nb, which I won't go through in detail now; I only highlight the main points.
\[Epsilon]t = 1/24.
loverlap = Drop[Flatten[ Position[ltimesStart - RotateRight[ltimesStop, 1], val_ /; val < \[Epsilon]t]], 1]
In the case of Tom's subtitles, that was around 412 instances, or about 42% of the cases! Not all of them resulted in such displacements, but enough did to be annoying.
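For readers without Mathematica, the same test can be sketched in Python: flag every cue whose start comes less than εt after the previous cue's stop. This is a rough equivalent under stated assumptions, not the notebook itself. The helper names (`to_seconds`, `parse_line`, `find_overlaps`) and the three sample lines are invented for the illustration, and indices are 0-based here whereas Mathematica's are 1-based.

```python
import re

EPS = 1 / 24  # same threshold as εt in the notebook, in seconds

TIME = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def to_seconds(stamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    h, m, s, ms = map(int, TIME.search(stamp).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_line(line):
    """Split a 'start --> stop' line into (start, stop) in seconds."""
    start, stop = line.split("-->")
    return to_seconds(start), to_seconds(stop)


def find_overlaps(lines, eps=EPS):
    """Indices of cues starting less than eps after the previous stop."""
    times = [parse_line(line) for line in lines]
    return [i for i in range(1, len(times))
            if times[i][0] - times[i - 1][1] < eps]


# Hypothetical timestamp lines, in the format of the 'timestamps' file.
sample = [
    "00:00:01,000 --> 00:00:02,750",
    "00:00:02,764 --> 00:00:03,705",  # starts 14 ms after the previous stop
    "00:00:04,500 --> 00:00:05,000",
]
print(find_overlaps(sample))
```

Starting the range at 1 plays the role of the `Drop[..., 1]` in the notebook: the first cue has no predecessor to compare against.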
RemoveTime[strtime_, \[Delta]t_] := Module[{datelist}, datelist = DateList[{strtime, {"Hour", ":", "Minute", ":", "Second", ",", "Millisecond"}}]; DateString[ DateObject[ DatePlus[datelist, Quantity[-\[Delta]t - \[Epsilon]t, "Second"]]], {"Hour", ":", "Minute", ":", "Second", ",", "Millisecond"}] ]
This is how this module works:
RemoveTime["00:00:02,959", .237] "00:00:02,680"
Here we removed 237ms from the timestamp at 2s959ms. It actually removed more (an extra \[Epsilon]t = 1/24 s, about 42ms, so about 279ms in total) to take into account roundings and the 1/25 s accuracy. The final goal here is to avoid overlaps in kdenlive while keeping the tampering to a small fraction of a second. Note that it is enough to systematically remove time from the ending timestamp.
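The same subtraction can be sketched in Python, working in integer milliseconds. This is an illustrative stand-in for the Mathematica module, not a port of it: the name `remove_time` is made up, the shift is rounded to the nearest millisecond (the notebook's date formatting may behave differently by 1 ms in other cases), and clamping at zero is an added assumption.

```python
EPS = 1 / 24  # same margin as εt in the notebook, in seconds


def remove_time(stamp, dt, eps=EPS):
    """Shift an 'HH:MM:SS,mmm' timestamp earlier by dt + eps seconds."""
    h, m, rest = stamp.split(":")
    s, ms = rest.split(",")
    total_ms = ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)
    # Subtract the requested shift plus the safety margin.
    total_ms -= round((dt + eps) * 1000)
    total_ms = max(total_ms, 0)  # assumption: never go before 00:00:00,000
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


print(remove_time("00:00:02,959", 0.237))  # 00:00:02,680
```

On the example above this reproduces the notebook's result: 2959 ms minus 279 ms gives 2680 ms.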
Export["rules.txt",Table["s/--> " <> ltimess[[All, 2]][[i - 1]] <> "/--> " <> RemoveTime[ltimess[[All, 2]][[i - 1]], l\[Delta]t[[i]] + \[Epsilon]t] <> "/", {i, loverlap}]]
sed -f rules.txt tom-iop-lecture-edited.srt > FINAL-tom-iop-lecture-edited.srt
We have now removed any risk of overlap: