I am testing CPU performance on two ARMv7 boards with SMP support: a dual-core cortexa15@1.5GHz and a dual-core cortexa7@1GHz.
I ran the simple loop below on each board and measured its execution time:
#define DEFAULT_CALC_LOOPS 1000
#define LOOPS_MULTIPLIER 4.2
...
loops = DEFAULT_CALC_LOOPS;
...
void *calc(int loops)
{
    int i, j;
    for (i = 0; i < loops * LOOPS_MULTIPLIER; i++) {
        for (j = 0; j < 125; j++) {
            // Sum of the numbers up to j
            volatile int temp = j * (j + 1) / 2;
            (void)temp;
        }
    }
    return NULL;
}
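The post does not show how the time was measured. The following is only a minimal sketch of one way to do it, assuming a POSIX system with clock_gettime(CLOCK_MONOTONIC); the helper name time_calc_ms and the averaging over several runs are my own assumptions, not the original test code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose clock_gettime() under strict ISO C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define DEFAULT_CALC_LOOPS 1000
#define LOOPS_MULTIPLIER 4.2

/* Same loop as in the question. */
static void *calc(int loops)
{
    int i, j;
    for (i = 0; i < loops * LOOPS_MULTIPLIER; i++) {
        for (j = 0; j < 125; j++) {
            volatile int temp = j * (j + 1) / 2;
            (void)temp;
        }
    }
    return NULL;
}

/* Hypothetical helper (not from the post): average wall-clock time of
 * one calc(loops) call in milliseconds, measured over `runs` calls. */
double time_calc_ms(int loops, int runs)
{
    struct timespec start, end;
    double total_ms = 0.0;
    int r;

    assert(runs > 0);
    for (r = 0; r < runs; r++) {
        clock_gettime(CLOCK_MONOTONIC, &start);
        calc(loops);
        clock_gettime(CLOCK_MONOTONIC, &end);
        total_ms += (double)(end.tv_sec - start.tv_sec) * 1e3
                  + (double)(end.tv_nsec - start.tv_nsec) / 1e6;
    }
    return total_ms / runs;
}
```

Calling time_calc_ms(DEFAULT_CALC_LOOPS, 100) from main and printing the result gives one number per board. Averaging over several runs reduces noise from the scheduler, and the numbers also depend on the compiler flags and on the CPU frequency at the time of the run.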
Results on the two boards, after a variety of test runs:
cortexa15: ~1.2 ms
cortexa7: ~5 ms
There is a big difference between these results.
Is there any dependency or limitation that affects the results? Can anyone with experience in this share some ideas? Thanks.
As far as I know, a cortexa15 delivers roughly 2x to 3x the performance of a cortexa7. In addition, my boards run at cortexa15@1.5GHz and cortexa7@1GHz. Combined with the higher clock, the measured ratio of about 4x (5 ms vs. 1.2 ms) looks reasonable to me.
Below is a case study for the cortexa15 that estimates the execution time.
Formula to calculate CPU time:
CPU execution time = I x CPI x C
I: instruction count
CPI: cycles per instruction (CPI = 1/IPC)
C: clock cycle time in seconds (1 / CPU clock frequency)
Take the dual-core cortexa15 (the same as on the iWave G1M/N).
The cortexa15 executes 9,900 MIPS at 1.5 GHz, which gives an average IPC of 6.6 (9,900 / 1,500).
Translate the C code to ARM assembly to get the instruction count:
for (i = 0; i < loops * LOOPS_MULTIPLIER; i++) {
}
Reference: https://godbolt.org
The resulting CPU execution time is 0.000962 seconds, i.e. approximately 0.962 ms to execute the loop in the best case.
In the worst case (at 1.3 GHz), the CPU time for the loop is around 1.109 ms.
My measurements match these values.
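The formula above can be written out in C as a rough sketch. The post does not state the instruction count it used, so the sketch also includes the inverse calculation; both function names are my own and are only for illustration.

```c
#include <assert.h>

/* CPU time in seconds = I x CPI x C,
 * with CPI = 1 / IPC and C = 1 / clock_hz. */
double cpu_time_s(double instr_count, double ipc, double clock_hz)
{
    double cpi, cycle_s;

    assert(ipc > 0.0 && clock_hz > 0.0);
    cpi = 1.0 / ipc;            /* cycles per instruction */
    cycle_s = 1.0 / clock_hz;   /* seconds per clock cycle */
    return instr_count * cpi * cycle_s;
}

/* Inverse of the formula: the instruction count implied by a given
 * CPU time at a given IPC and clock frequency. */
double implied_instr_count(double time_s, double ipc, double clock_hz)
{
    assert(ipc > 0.0 && clock_hz > 0.0);
    return time_s * ipc * clock_hz;
}
```

With the figures from the post (0.000962 s, IPC 6.6, 1.5 GHz), implied_instr_count gives about 9.5 million instructions. Since the loop runs 1000 x 4.2 x 125 = 525,000 inner iterations, that is roughly 18 instructions per inner iteration. Feeding the same count back into cpu_time_s at 1.3 GHz gives about 1.110 ms, which agrees with the worst-case figure above to within rounding.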
--
A similar case study can be done for the cortexa7@1GHz.