Why is von Neumann faster than Harvard architecture?


I read about these two types of architecture, and somewhere on the internet someone said that systems using the von Neumann architecture are faster than ones using the Harvard architecture. I tried searching for why this is the case, but I have yet to find an explanation that clears things up for me.

In my understanding:

- In a von Neumann architecture the CPU can do only one memory access at a time, meaning it can fetch either data or an instruction from memory in one cycle. So to perform some operation on data it needs 2 cycles (one to fetch the instruction and one to fetch the data).
- In a Harvard architecture the CPU can fetch both data and an instruction in the same clock cycle, since there are two separate memory blocks and two separate sets of address and data buses.
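To make that concrete, here is roughly how I'm counting it (just a toy picture I made up, not any real chip):

```
# My (probably oversimplified) count for one "operate on a value from memory":
instruction_fetch = 1   # cycles to fetch the instruction
data_fetch = 1          # cycles to fetch the operand

von_neumann = instruction_fetch + data_fetch   # one shared memory: 2 cycles
harvard = max(instruction_fetch, data_fetch)   # separate memories: 1 cycle
print(von_neumann, harvard)                    # prints: 2 1
```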

So if the Harvard architecture can do in one cycle what the von Neumann architecture needs two cycles for, why would it be slower? Doesn't using fewer cycles for the same work mean it should be faster? Please go easy on me, I'm a noob in embedded systems. Thank you for reading my post!


There are 2 answers below.

Best answer (5 votes)

In a von Neumann architecture, the CPU operates sequentially: it fetches an instruction, decodes it, fetches the operands (data), computes the result, and stores it. All of these steps use the same memory channel.
A Harvard architecture has two memory channels, one for instructions and one for data. It has an advantage over the von Neumann architecture if the CPU supports pipelining: while instruction x, which has already been decoded, is fetching its operands (data) over the data channel, instruction x+1 is fetched at the same time over the instruction channel.
So, if the CPU is pipelined, a Harvard architecture is faster than a von Neumann architecture.
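A minimal cycle-count sketch of that effect (my own toy model, assuming one cycle per instruction fetch and one extra cycle for each load/store; the program and numbers are made up, and real pipelines are far more involved):

```
# Toy model: every instruction costs one instruction-fetch cycle; instructions
# that touch data memory cost one extra data-access cycle. With one shared
# memory channel (von Neumann) the data access blocks the next fetch; with
# split channels (Harvard) it overlaps the next instruction's fetch.

def cycles(program, shared_channel):
    """program: list of bools, True = instruction accesses data memory."""
    total = 0
    for touches_data in program:
        total += 1                       # instruction fetch
        if touches_data and shared_channel:
            total += 1                   # data access holds the shared channel
        # with split channels the data access overlaps the next fetch
        # (pipeline fill/drain is ignored in this toy count)
    return total

prog = [True, False, True, True, False]   # 3 of 5 instructions are loads/stores
print("von Neumann (shared channel):", cycles(prog, shared_channel=True))   # 8
print("Harvard (split channels):    ", cycles(prog, shared_channel=False))  # 5
```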

Answer (1 vote)

This is all purely academic, and very dated. From the textbook point of view, the Harvard architecture can perform a data transaction and an instruction transaction at the same time, whereas von Neumann can only do one or the other at a time.

True Harvard has the problem that you can't actually use it. You can't have a bootloader, and you can't have an operating system (one that loads programs), because you can't use data transactions to put instructions into memory and then branch to those instructions and run them: the two memory systems are separate. Once you cross the paths it isn't Harvard any more; it's a modified Harvard or a von Neumann.

Looking at how Wikipedia defines it, the modern buses you see today make for modified Harvard machines, because by definition a von Neumann machine can't do data and instruction accesses at the same time, yet these buses can, even though instructions and data share them. You will see a read address bus, a read data bus, a write address bus, and a write data bus; both instructions and data cross the read buses, while data goes across the write buses. Many transactions can be happening at the same time: a multi-bus-width-sized instruction fetch can start on one clock cycle with a read address request, the next clock cycle a data read address request can start on the same bus, and some number of clocks later the instruction address request is acknowledged, then the data read address request is acknowledged (they don't necessarily have to come back in the same order; it depends on the design). Then the read data bus delivers the data and the processor acknowledges it. The write buses can be handling multiple data transactions in flight at the same time as well. And the read and write buses, being independent, can be doing things at the same time, not just each having multiple transactions in flight.
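To picture that, here is a made-up timeline of the kind of overlap being described, with an instruction fetch and a data read sharing the read address/data buses while a write uses the separate write buses. This is only an illustration of split-transaction behaviour under assumptions I've invented; it is not any specific bus protocol or real timing:

```
# Each entry: (clock, bus, event). Several transactions are in flight at once:
# the data-read address goes out before the instruction fetch's data has come
# back, and the write proceeds on its own buses the whole time.
timeline = [
    (0, "read addr",  "instruction fetch: address request"),
    (1, "read addr",  "data read: address request"),
    (1, "write addr", "data write: address request"),
    (3, "read addr",  "instruction fetch: address acked"),
    (4, "read addr",  "data read: address acked"),
    (5, "write data", "data write: write data transferred"),
    (6, "read data",  "instruction fetch: data returned, CPU acks"),
    (8, "read data",  "data read: data returned, CPU acks"),
]
for clock, bus, event in sorted(timeline):
    print(f"clk {clock:2}: [{bus:10}] {event}")
```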

None of this has anything to do with the instruction set; the same instruction set can be, and has been, implemented with different buses behind it. Depending on the instruction set and on how the fetching, pipelining, and caching work, you can have a pure textbook von Neumann come close to matching the performance of a pure textbook Harvard. But if you think of pre-cache, pre-pipeline, one-instruction-at-a-time architectures, then you can say either 1) neither wins, because the instruction fetch has to wait for the data transaction of loads and stores (or other instructions with memory access) to complete before the next fetch happens, so the Harvard machine doesn't actually get to do data and instruction at the same time; or 2) the Harvard machine is allowed to do things in parallel and the von Neumann isn't, so Harvard wins, since it can complete a simple data transaction and do the next fetch in the same cycle, periodically beating von Neumann by a cycle.

In a pure sense, though, one instruction at a time, the von Neumann cannot be faster than a Harvard; it can tie but can't win. Harvard has two buses that can operate in parallel, and with all other factors held constant (instruction set, pipeline design, prefetching, etc.) that difference gives Harvard a slight performance advantage.

Note that one instruction at a time with no pipeline means it takes multiple clock cycles to perform most instructions, as you see with pre-cache, pre-pipeline processors: their manuals have tables of how many clocks each instruction takes, and you can look at the instruction itself and see how and why it takes that many. Even with a pipeline, Harvard has a slight advantage. But if you, say, double the width of the von Neumann bus compared to the Harvard, you can fetch two instructions at a time and perform data operations on two sequential data locations at a time; now you have better bandwidth than the Harvard and can tie or beat it at times. But that isn't a pure comparison.
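A back-of-the-envelope transaction count for that widening argument (a sketch under rough assumptions I've chosen: one transaction per bus per cycle, straight-line code, no caches or pipelining effects):

```
from math import ceil

def harvard_cycles(n_instr, n_data):
    # One transaction per instruction fetch and per data access,
    # on two buses that run in parallel.
    return max(n_instr, n_data)

def wide_von_neumann_cycles(n_instr, n_data, pairable_data=False):
    # A single bus, but twice as wide: two sequential instructions per fetch,
    # and (optionally) two sequential data words per data transaction.
    fetches = ceil(n_instr / 2)
    data = ceil(n_data / 2) if pairable_data else n_data
    return fetches + data               # everything shares the one bus

# e.g. 8 straight-line instructions, 3 of which access data memory
print(harvard_cycles(8, 3))                               # 8
print(wide_von_neumann_cycles(8, 3))                      # 7 -- the wide bus wins here
print(wide_von_neumann_cycles(8, 3, pairable_data=True))  # 6
```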

Again, these notions are very dated. There are a very small number of Harvard-ish processors, but to make them useful they are really modified Harvard, as there is a way to bridge the gap between the memory systems.