I have a large MPEG transport stream (.ts) binary file whose size is usually a multiple of 188 bytes. Using Python 3, I read 188 bytes at a time and parse each packet to get the value I need, but this is really slow. I have to traverse every 188-byte packet to get the value of the PID (binary data).
- By comparison, any professional offline MPEG analyzer gets the list of all PID values and their total counts within 45 seconds for a 5-minute TS file, whereas my program takes more than 10 minutes to do the same.
- I don't understand how they can be that much faster, even if they are written in C or C++.
- I tried Python multiprocessing, but it does not help much. This suggests that the way I parse and process each 188 bytes of data is wrong and is causing the huge delay.
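`from bitstring import BitStream  # BitStream comes from the bitstring package

cnt = 0  # byte offset of the next packet
with open(file2, 'rb') as f:
    while True:
        data = f.read(188)
        if len(data) == 0:
            break
        b = BitStream(data)
        ...  # parse b to get the required value
        ...  # and increase count when needed
        cnt = cnt + 188
        f.seek(cnt)`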
It's your code, man.
I tried BitStream for a while too, and it's slow.
The cProfile module is your friend.
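As a rough illustration of what that looks like, here is a minimal sketch of profiling a parsing function with `cProfile` and `pstats` from the standard library. `count_packets` and `profile_call` are hypothetical names made up for this example, and the in-memory `sample` data stands in for a real file; you would swap in your own parsing loop.

```python
import cProfile
import io
import pstats


def count_packets(data, packet_size=188):
    """Hypothetical stand-in for the real parsing loop: count packets
    that start with the MPEG-TS sync byte 0x47."""
    count = 0
    for offset in range(0, len(data) - packet_size + 1, packet_size):
        if data[offset] == 0x47:
            count += 1
    return count


def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and show only the top 10 entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()


# 1000 synthetic packets: a sync byte followed by 187 zero bytes.
sample = bytes([0x47] + [0] * 187) * 1000
result, report = profile_call(count_packets, sample)
print(report)
```

The report lists where the time goes per function, which should make it obvious whether the cost is in file reads, in building the bit-level objects, or in the parsing itself.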
With pypy3, I can parse 3.7GB of MPEG-TS in 2.9 seconds, single process.
With Go, I can parse 3.7GB in 1.2 seconds.
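For what it's worth, one way to count PIDs without a bit-level wrapper is to read many packets per call and pull the 13-bit PID out with plain integer masks. This is only a sketch under some assumptions, not the code behind the timings above: it assumes the stream starts on a packet boundary with no resync needed, and `count_pids`, `make_packet` and `packets_per_read` are hypothetical names chosen for this example.

```python
import io
from collections import Counter

PACKET_SIZE = 188
SYNC_BYTE = 0x47


def count_pids(stream, packets_per_read=10000):
    """Count packets per PID in a binary stream of 188-byte TS packets.

    Assumes the stream begins on a packet boundary.
    """
    counts = Counter()
    while True:
        # Read a large block of whole packets instead of 188 bytes per call.
        chunk = stream.read(PACKET_SIZE * packets_per_read)
        if not chunk:
            break
        # A trailing partial packet, if any, is ignored.
        for offset in range(0, len(chunk) - PACKET_SIZE + 1, PACKET_SIZE):
            if chunk[offset] != SYNC_BYTE:
                continue  # skip packets without a valid sync byte
            # PID is 13 bits: low 5 bits of byte 1, then all of byte 2.
            pid = ((chunk[offset + 1] & 0x1F) << 8) | chunk[offset + 2]
            counts[pid] += 1
    return counts


def make_packet(pid):
    """Build a synthetic packet for the demo: 4-byte header, zero payload."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(PACKET_SIZE - len(header))


demo = make_packet(0x100) * 3 + make_packet(0x1FFF) * 2
counts = count_pids(io.BytesIO(demo))
print(dict(counts))
```

For a real file, you would pass an object opened with `open(path, 'rb')` instead of the `io.BytesIO` demo stream.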