I have a basic understanding of how 2-3-4 trees maintain the height balance property operation after operation to make sure even the worst case operations are O(n logn).
But I do not understand it well enough to know why only 2-3-4?
Why not 2-3 or 2-3-4-5 etc?
At my algoritm course they told us that they are commonly used for acessing memory from hard disc - known as B/B+ trees. You make tree that store sizeof your avabile ram and by doing so you minimalize number of reed from disc operation (if you made B with node that store for example 10^8 elements you only need log_10^8(n) reed from disc operation to find something on hard disc which is nothing. So something that you called 2-3-4-5-... trees is in fact widespread solution.