I am trying to measure the data transfer speed achievable through GPIO bit-banging, with the values read being stored in memory. I have created a writeup on the setup; it uses an FPGA connected to GPIO 1.
I use the following approach: GPIO0 to GPIO20 are dedicated to 'data', while GPIO25 to GPIO27 are designated for 'control signals' such as 'data ready', 'data request', and 'reset'.
The loop is simple:
- RPi sets "data request" to HIGH
- FPGA detects "data request" and drives the data onto GPIO0-20 (the data is the value of a counter clocked at 50 MHz, which lets me measure the speed).
- FPGA sets "data ready" to HIGH
- RPi detects "data ready" HIGH, reads the data and writes it into a buffer in RAM
- RPi sets "data request" to LOW. Upon seeing "data request" LOW, FPGA sets "data ready" to LOW.
- Go back to the first step
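The handshake above can be modelled in software to check the protocol logic. This is only a sketch: `FakeFpga`, `capture`, and the `ticks_per_cycle` parameter are hypothetical names invented for illustration, and the real loop would access the GPIO registers directly rather than call Python methods. The wait for "data ready" to go LOW at the end of each cycle is an assumption; the description does not say whether the RPi side does this.

```python
class FakeFpga:
    """Hypothetical software stand-in for the FPGA side of the handshake."""

    def __init__(self, ticks_per_cycle=19):
        self.counter = 0                      # models the free-running 50 MHz counter
        self.ticks_per_cycle = ticks_per_cycle
        self.data = 0                         # value presented on GPIO0-20
        self.ready = False                    # "data ready" line

    def set_request(self, level):
        if level and not self.ready:
            # "data request" HIGH: present the counter value, then raise "data ready".
            self.counter += self.ticks_per_cycle
            self.data = self.counter & 0x1FFFFF   # 21 data lines
            self.ready = True
        elif not level:
            # "data request" LOW: drop "data ready".
            self.ready = False

    def read_ready(self):
        return self.ready

    def read_data(self):
        return self.data


def capture(bus, n_reads):
    """Run the request/ready handshake n_reads times and return the buffer."""
    buffer = []
    for _ in range(n_reads):
        bus.set_request(True)                 # RPi: "data request" HIGH
        while not bus.read_ready():           # wait for "data ready" HIGH
            pass
        buffer.append(bus.read_data())        # read GPIO0-20 into RAM
        bus.set_request(False)                # RPi: "data request" LOW
        while bus.read_ready():               # assumption: wait for "data ready" LOW
            pass
    return buffer
```

Calling `capture(FakeFpga(), 1000)` returns a list of 1000 simulated counter values, which has the same shape as the buffer the real loop fills.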
I've found that the first ~60-70K reads take ~0.88 µs per cycle, but after that the loop becomes more than twice as fast, at ~0.38 µs per cycle.
I am wondering what the reason behind this uneven speed is. Is there a way to start the transfer at the higher speed of 0.38 µs per cycle and be sure that it won't fall back to the slower 0.88 µs mode?
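For reference, per-cycle times like the ones quoted above can be derived from the captured counter values roughly as follows. This is a minimal sketch, assuming the buffer holds the raw 21-bit values from GPIO0-20 and that each cycle is much shorter than one counter wrap (2^21 ticks, about 42 ms); `cycle_times_us` is a hypothetical helper name.

```python
COUNTER_BITS = 21          # GPIO0..GPIO20 carry 21 data bits
COUNTER_HZ = 50_000_000    # the FPGA counter runs at 50 MHz (20 ns per tick)


def cycle_times_us(samples):
    """Return the time between consecutive reads, in microseconds."""
    modulus = 1 << COUNTER_BITS
    times = []
    for prev, cur in zip(samples, samples[1:]):
        ticks = (cur - prev) % modulus        # modulo handles counter wraparound
        times.append(ticks * 1e6 / COUNTER_HZ)
    return times
```

At 50 MHz, a difference of 44 ticks between consecutive samples corresponds to 0.88 µs and a difference of 19 ticks to 0.38 µs.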
I have confirmed that it is due to DVFS (dynamic voltage and frequency scaling).
I found that by default, when a core is idle on the RPi4, its frequency is set to 600 MHz, and it is raised to 1800 MHz when the core is under load. It looks like it takes approximately 50-60 ms for the system to react to the increased load, which matches the length of the slow phase: 60-70K reads at ~0.88 µs each is roughly 53-62 ms.
With the scaling governor set to "performance" instead of the default "ondemand" value, the exchange speed is fast from the very beginning.
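One way to switch the governor before starting the transfer is through the Linux cpufreq sysfs files. The sketch below assumes the standard `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` layout and must run as root to write to the real files; `set_governor` and `current_governors` are hypothetical helper names, and the `cpu_root` parameter exists only so the functions can be pointed at a test directory.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")
GOVERNOR_GLOB = "cpu[0-9]*/cpufreq/scaling_governor"


def set_governor(governor, cpu_root=CPU_ROOT):
    """Write `governor` to every core's scaling_governor file.

    Returns the list of files that were written.
    """
    written = []
    for gov_file in sorted(Path(cpu_root).glob(GOVERNOR_GLOB)):
        gov_file.write_text(governor + "\n")
        written.append(gov_file)
    return written


def current_governors(cpu_root=CPU_ROOT):
    """Return a mapping such as {"cpu0": "ondemand", ...}."""
    return {
        gov_file.parent.parent.name: gov_file.read_text().strip()
        for gov_file in sorted(Path(cpu_root).glob(GOVERNOR_GLOB))
    }
```

Calling `set_governor("performance")` before the capture and `set_governor("ondemand")` afterwards limits the change to the duration of the measurement. A setting made this way does not survive a reboot.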