Golang: benchmark Radix Tree Lookup

974 Views Asked by At

I've been trying to benchmark a Radix Tree implementation I wrote for sake of practice with Golang.

But I encountered a problem on "How should I benchmark it?". In the code below shows two cases or lets say different ways I would like to benchmark the LookUp func.

  • Case 1: Use one single slice of bytes which exist on the tree meaning it will be successful LookUp through all children nodes etc...

  • Case 2: Use a func to generate that random slice from the existing data in the tree meaning it will be successful LookUp as well...

I know the time expend will depend on the tree depth... I think Case 2 is close to a real world implementation or not?

QUESTION: Which case is more efficient or useful to benchmark?

Benchmark:

func BenchmarkLookUp(b *testing.B) {
    radix := New()
    insertData(radix, sampleData2)

    textToLookUp := randomBytes()

    for i := 0; i < b.N; i++ {
        radix.LookUp(textToLookUp) // Case 1 
        //radix.LookUp(randomBytes()) // Case 2
    }
}

func randomBytes() []byte {
    strings := sampleData2()
    return []byte(strings[random(0, len(strings))])
}

func sampleData2() []string {
    return []string{
        "romane",
        "romanus",
        "romulus",
        ...
    }
}

Result Case 1:

PASS
BenchmarkLookUp-4       10000000               146 ns/op
ok      github.com/falmar/goradix       2.068s
PASS
BenchmarkLookUp-4       10000000               149 ns/op
ok      github.com/falmar/goradix       2.244s

Result Case 2:

PASS
BenchmarkLookUp-4        3000000               546 ns/op
ok      github.com/falmar/goradix       3.094s
PASS
BenchmarkLookUp-4        3000000               538 ns/op
ok      github.com/falmar/goradix       4.481s

Results when there is no match:

PASS
BenchmarkLookUp-4       10000000               194 ns/op
ok      github.com/falmar/goradix       3.189s
PASS
BenchmarkLookUp-4       10000000               191 ns/op
ok      github.com/falmar/goradix       3.243s
1

There are 1 best solutions below

2
On BEST ANSWER

If your benchmark is random, that would make it very difficult to compare the performance between different implementations from one run to the next.

Instead, statically implement a few different benchmark cases that stress different areas of your algorithm. The cases should represent different scenarios, such as the case when there are no matches (as you already have), the case where there are many items in the source data that will be returned in a lookup, the case where there are many items and only 1 item will be returned, etc etc.