I was trying to understand how Amdahl's law works, but I got confused in the process. Consider the following problem:
Suppose
a program has a part at the beginning that is sequential in nature (it must be executed by only one processor) and takes 3 ms. There is also a sequential part at the end of the program (again executed by only one processor) that takes 4 ms. The rest of the code is divided into 5 equal parts that are executed in parallel on 5 processors, and each of these parts takes 16 ms. Calculate the speedup using Amdahl's law.
Here is how I approached this problem. I first calculated the serial and parallel fraction, where 0.3 is the serial part and 0.7 is the parallel part calculated from the following logic:
Serial Part = 3 ms + 4 ms = 7 ms
Parallel Part = 16 ms (taken only once, since the code executes in parallel on 5 processors)
Total = 7 ms + 16 ms = 23 ms
Serial Fraction = 7 ms / 23 ms = 0.3 (approx)
Parallel Fraction = 16 ms / 23 ms = 0.7 (approx)
Now putting values in Amdahl's law:
Speedup = 1 / (S + P/N) (where N = Processors, S = Serial Fraction, P = Parallel Fraction)
Speedup = 1 / (0.3 + 0.7/5) = 2.27 (approx)
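For reference, the plug-in above reproduced as a quick Python sketch (this only checks the arithmetic, not whether the 0.3 / 0.7 fractions are the right ones to use):

```python
# Amdahl's law: Speedup = 1 / (S + P/N), with the fractions as
# computed above (S and P taken from the 23 ms total).
S, P, N = 0.3, 0.7, 5
speedup = 1 / (S + P / N)
print(round(speedup, 2))   # 2.27
```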
So is my approach correct or is there any other value of speedup for this problem?
Let's start with a basic flow-of-work schedule, as if there were no additional resources, so the whole amount of work runs as a single, pure-[SERIAL] stream. This baseline schedule, not using any sort of concurrent or parallel orchestration, shows that an initial 3 ms sprint is followed by the consecutive execution of five independent 16 ms sprints (the P-able blocks), and the whole workflow terminates after a final 4 ms sprint, completing the baseline computing topology in 3 + 5 × 16 + 4 = 87 ms.

Amdahl's law defines the maximum speedup that is fair to expect if all [PARALLEL]-able units of work can and do run on sufficient, free-in-time additional processing resources (the five CPUs given in the problem). A schedule that uses those 5 free CPU resources on an otherwise non-blocking processing fabric completes the same amount of work in only 3 + 16 + 4 = 23 ms, for a speedup of 87 / 23 ≈ 3.78. This is not the 2.27 obtained above: the serial and parallel fractions in Amdahl's law must be taken from the 87 ms serial baseline, S = 7/87 ≈ 0.08 and P = 80/87 ≈ 0.92, not from the 23 ms parallel run time.
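Both schedules can be reproduced numerically; a minimal Python sketch, using only the timings from the problem statement:

```python
# Compare the pure-[SERIAL] baseline schedule against the 5-CPU
# schedule and derive the speedup (timings from the problem).
pre_ms, post_ms = 3, 4        # sequential head and tail sprints
chunk_ms, n_chunks = 16, 5    # five independent P-able blocks

# Baseline: one processor runs everything back to back.
t_serial = pre_ms + n_chunks * chunk_ms + post_ms    # 87 ms

# With 5 CPUs the five 16 ms blocks overlap completely.
t_parallel = pre_ms + chunk_ms + post_ms             # 23 ms

speedup = t_serial / t_parallel
print(t_serial, t_parallel, round(speedup, 2))       # 87 23 3.78
```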
This is due to the advantage of running all the P-able blocks in true [PARALLEL] fashion (each having, in due time, free and non-blocking access to one of the 5 CPU resources). Further, we can see that no matter how many additional CPU resources were made available beyond those very 5 CPUs for the very said 5 P-able sections, no further speedup would ever appear: the P-able sections are already mapped onto CPU resources [A:E], and any other CPU will not help them do anything faster or complete the whole computing topology any sooner.
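To illustrate the diminishing returns, here is a small sketch that computes the speedup for growing CPU counts, assuming each 16 ms block is atomic (it cannot be split across CPUs), so extra CPUs beyond 5 simply sit idle:

```python
import math

def speedup(n_cpus, pre=3, post=4, chunk=16, n_chunks=5):
    # With n_cpus CPUs, the atomic chunks run in ceil(n_chunks / n_cpus)
    # consecutive waves of 16 ms each.
    waves = math.ceil(n_chunks / n_cpus)
    t_parallel = pre + waves * chunk + post
    t_serial = pre + n_chunks * chunk + post
    return t_serial / t_parallel

for n in (1, 2, 5, 10, 100):
    print(n, round(speedup(n), 2))
# speedup plateaus at ~3.78 for every n >= 5
```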
Q.E.D.
For more details on Amdahl's law of diminishing returns (adding more CPUs brings zero additional speedup), on the effects of the atomicity of P-able work-unit execution, and on the effects of setup/termination add-on overheads, you might want to read this.