I am searching in a few arrays of struct stored in program memory with lengths of about 30-200 with a linear search like this:
Glyph getGlyph(Font font, uint16_t code) {
for (size_t i = 0; i < font.length; i++) {
if (pgm_read_word(&font.glyphs[i].code) == code) {
static Glyph glyph;
memcpy_P(&glyph, &font.glyphs[i], sizeof (Glyph));
return glyph;
}
}
// return question mark if unknown code point
return getGlyph(font, 0x003f);
}
Assuming I am correctly reading the disassembly, there are 20 instructions per iteration, 3 of them rather expensive as far as I know: brcc, brne and rjmp.
Just evaluating if this could be done more (power) efficient, I implemented a binary search:
Glyph getGlyph(Font font, uint16_t code) {
// https://en.wikipedia.org/wiki/Binary_search_algorithm
int16_t l = 0;
int16_t r = font.length - 1;
while (l <= r) {
uint8_t m = (l + r) / 2;
uint16_t c = pgm_read_word(&font.glyphs[m].code);
if (c < code) {
l = m + 1;
} else if (c > code) {
r = m - 1;
} else {
// found code point, get and return glyph
static Glyph glyph;
memcpy_P(&glyph, &font.glyphs[m], sizeof (Glyph));
return glyph;
}
}
// return question mark if unknown code point
return getGlyph(font, 0x003f);
}
This results in 35 instructions per iteration in the disassembly, of which 5 are what I think rather expensive: brlt, brcc (2x) and rjmp (2x).
Given that the arrays are rather short, I suppose the benefit of the binary search is not a lot and might be compensated by the increased complexity?
Is it worth or does it make sense at all trying to benchmark this?
The program is compiled with -Os. Compiling with -O3 instead as suggested in comments, it seems the amount of instructions is a multiple, but maybe I am just not interpreting the disassembly correctly.
It is running on an ATmega328p and I'm currently building it with gcc 5.4.0 and avr-libc 2.0. I tried also with gcc 13 and avr-libc 2.1 but I find the disassembly quite hard to read.
This is just a simple thermometer and hygrometer with an e-ink display, it runs well and I am satisfied with speed and power consumption, though I lack experience and can't compare with anything else. I'm really just curious and want to learn.
There are lots of big bottlenecks in this program and they aren't necessarily related to the searching as such. To address them you'll rather need a code review:
Why are you passing structs etc by value? That's extremely inefficient on 8 bitters, even in case of small structs.
When you return a glyph, you first copy it from somewhere (flash) into the
staticvariable (.bss) and then copy it again onto the function call stack and then finally you copy it again in the caller. This is massively inefficient when you could just have returned aconstpointer pointing at the original item, or alternatively if these need to be modified in run-time, just copy it once - probably still return aconstpointer and let the caller copy the data.(Might be necessary to use some Harvard architecture tweaks to access flash data directly - whatever your compiler uses to handle that, "PROGMEM" macro or some such.)
Avoid needlessly large counters and indices. If you don't plan to go beyond value 255 then don't use
uint16_t. And pretty much never usesize_twhich will be a 32 bit type and therefore horribly slow.Recursive functions should pretty much never be used in any context and certainly never in embedded systems. That part must be rewritten as a loop - I don't even understand why you are using recursion at all.
Using standard lib
bsearchis not necessarily that much slower than rolling out your own version manually. Benchmark with an oscilloscope.In general, binary search is a pretty sound algorithm for 8-bitters. We neither need to worry about branching nor alignment, just the raw amount of instructions generated. On the other hand you literally picked one of the slowest CPUs still in production, so why worry about performance.
Overall I think you would benefit from picking a modern MCU instead, such as a Cortex M. Programming old 8-bitters in C is actually pretty hard and it's easy to make mistakes. AVR is 1990s technology, there's not many reasons for using them any longer. (They are a pretty good choice for learning assembler but that's about it.)