My question is probably trivial. I parallelised a CFD code using MPI, and I am now investigating its parallel efficiency. To start, I created a test case that gives every rank an equal load and keeps the ratio of computation volume to transferred data constant. My expectation was therefore that, as I increase the number of ranks, any change in runtime would be attributable to communication delays only. However, I found that subroutines that involve no communication between ranks (they only do domain calculations, so they handle the same load on every rank) contribute significantly to the runtime increase, in fact the most. What am I missing here? Does this even make sense?
Parallel efficiency drops inconsistently
195 Views Asked by makmarios At
Yes!
The more processes you create (every process has a rank), the closer you get to the limit of how many processes your system can execute truly in parallel.
Your system (e.g. your computer) can run only a certain number of processes in parallel. Once this limit is exceeded, some processes have to wait to be executed (so not all processes run in parallel), which harms performance.
For example, assume a computer has 4 cores and you create 4 processes. Every core can then execute one process, so performance is harmed only by the communication between the processes, if any.
Now, on the same computer, you create 8 processes. What will happen?
4 of the processes will start executing in parallel, but the other 4 will wait for a core to become available so that they can run too. This is not truly parallel execution (some processes effectively run one after another). Moreover, depending on the OS scheduling policy, processes may be interleaved, causing overhead at every context switch.
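To put numbers on the 4-core example above, here is a minimal sketch of an idealized model, not a measurement of any real code. It assumes every process does the same amount of compute-only work and it ignores context-switch overhead entirely; the function names `rounds_needed` and `ideal_compute_time` are made up for illustration.

```python
import math
import os


def rounds_needed(num_processes: int, num_cores: int) -> int:
    """Number of sequential 'waves' needed to give every process a core once."""
    if num_processes < 1 or num_cores < 1:
        raise ValueError("both counts must be positive")
    return math.ceil(num_processes / num_cores)


def ideal_compute_time(num_processes: int, num_cores: int, work_seconds: float) -> float:
    """Idealized wall time of a compute-only phase.

    Assumption: each process needs `work_seconds` of CPU time and
    switching between processes costs nothing.
    """
    return rounds_needed(num_processes, num_cores) * work_seconds


if __name__ == "__main__":
    # The example from the answer: 4 cores, then 4 and 8 processes.
    for ranks in (4, 8):
        seconds = ideal_compute_time(ranks, num_cores=4, work_seconds=1.0)
        print(f"{ranks} processes on 4 cores: about {seconds} s of compute")

    # os.cpu_count() reports logical CPUs, which may include hyperthreads,
    # so the number of physical cores can be smaller than this value.
    print("logical CPUs on this machine:", os.cpu_count() or 1)
```

In this model the compute-only time doubles when going from 4 to 8 processes on 4 cores, even though no communication is involved. Comparing the number of ranks you launch against the cores actually available is therefore a reasonable first check.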