These are previous homework problems, but I am using them as exam review. I am changing numbers around from what is actually in the problem. I just want to make sure I have a grasp on the concepts. I already have the answers, just need clarification that I understand them. This is not homework but review work.
Anyway, this focuses on aspects of CPI
The fist problem:
An application running on a 1GHz processor has 30% load-store instructions, 30% arithmetic, and 40% branch instructions. The individual CPIs are 3 for load-store, 4 for arithmetic, 5 for branch instructions. Determine the overall CPI of this program on the given processor.
My answer: The overall CPI is the sum of the sub-CPIs, multiplied by the percentages in which they occur i.e. 3*0.3 + 4*0.3 + 5*0.4 = 0.9 + 1.2 + 2 = 4.1
Now, the processor is enhanced to run at 1.6GHz. The CPIs of the branch instructions remain the same but load-store and arithmetic instruction CPIs both increase to 6 cycles. A new compiler is in use which eliminates 30% of branch instructions and 10% of load-stores. Determine the new overall CPI and the factor by which the application will be faster or slower.
My answer: Once again, the new CPI is just the sum of its parts. However, the parts have changed and this must be accounted for. Branch instructions will drop by 30% (0.4*0.7=0.28) and load-stores will drop by 10% (0.3*0.9=0.27); arithmetic instructions will now account for the rest of the instructions (1-0.28-0.27=0.45), or 45%. These will be multiplied by the new sub-CPIs to get: 6*0.45+6*0.27+5*0.28=5.72.
Now, the processor enhancement is 60% faster, and the CPI is greater by (5.72-4.1)/4.1 = 39.5%. Thus, the application will run roughly 0.6*0.395 = 23.7% faster.
Now, the second problem:
A new processor with a load/store architecture has an ideal CPI of 1.25. Typical applications on this processor are a mix of 50% arithmetic and logic, 25% conditional branching and 25% load/store. Memory is accessed via a separate data and instruction cache, with a 5% instruction cache miss rate and 10% data miss rate. The penalty of any cache miss is 100 cycles and hits don't produce any penalties.
What is the effective CPI?
My answer: The effective CPI is the ideal CPI, plus the stalled cycles per instruction due to cache access. The ideal CPI is, as given, 1.25. The stalled cycles per instruction is (0.1*100*0.25) + (0.05*100*1) = 7.5. 0.1*100*0.25 is the data miss rate multiplied by the stalled cycle penalty which is also multiplied by the load/store percentage (which is where the data accesses take place); 0.05*100*1 is the instruction miss rate, which is the instruction cache miss rate times the stalled cycle penalty, instruction access take place in 100% of the program, so this is multiplied by 1. Following from this, the effective CPI is 1.25 + 7.5 = 8.75.
What is the misses per 1000 instruction for typical applications and what is the average memory access time (in clock cycles) for typical applications?
My answers: The misses per 1000 instructions is equal to the stalled cycles per instruction due to cache access (as given above: 7.5), divided by 1000, which equals 7.5/1000 = 0.0075
When discussing the average memory access time (AMAT), we first must talk about the total number of accesses here, which is the percentage of data accesses (25%) plus the percentage of instruction accesses (100%), or 125%=1.25. The data accesses are .25/1.25 and the instruction accesses are 1/1.25.
The AMAT equals the percentage of data accesses (.25/1.25) multiplied by the sum of the hit time (1) and the data miss rate multiplied by the miss penalty (0.1*100), or (.25/1.25)(1+0.1*100) and this is added to the percentage of instruction accesses (1/1.25) multiplied by the sum of the hit time (1) and the instruction miss rate multiplied by the miss penalty (0.05*100), or (1/1.25)(1+0.05*100). Put together, the AMAT is (.25/1.25)(1+0.1*100)+(1/1.25)(1+0.05*100)=7.
Once again, sorry for the wall of text. If I am wrong, please try to help me understand how I am wrong. I tried to show all my work to make it as easy as possible to understand. Thanks in advance.
There's an error in the lat part of your question. When they ask:
what's needed here is the number of misses you will get for every 1000 instructions, which in this case would be 1000*1*0.05 for instruction cache misses and 1000*0.25*0.1 for data cache misses. This equals 75 misses per 1000 instructions.
To calculate the AMAT, you use the formula AMAT = hit time + (miss rate*miss penalty)
In this case, your miss rate is 75/1000 and your miss penalty is 100 cycles. The hit time is given as 1.25 cycles (your ideal CPI!).
Hope this helps and all the best for your exam!