I am working on a large fortran code and before to compile with fast options (in order to perform test on large database), I usually compile with "warnings" options in order to detect and backtrace all the problems.
So with the gfortran -fbacktrace -ffpe-trap=invalid,zero,overflow,underflow -Wall -fcheck=all -ftrapv -g2
compilation, I get the following error:
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
#0 0x7fec64cdfef7 in ???
#1 0x7fec64cdf12d in ???
#2 0x7fec6440e4af in ???
#3 0x7fec64a200b4 in ???
#4 0x7fec649dc5ce in ???
#5 0x4cf93a in __f_mod_MOD
at /f_mod.f90:132
#6 0x407d55 in main_loop_
at main.f90:419
#7 0x40cf5c in main_prog
at main.f90:180
#8 0x40d5d3 in main
at main.f90:68
And the portion of the code f_mod.f90:132 is containing a where loop:
! Compute s parameter
do i = 1, Imax
where (dprim .ne. 1.0)
s(:,:,:, :) = s(:,:,:, :) +vprim(:,:,:, i,:)*dprim(:,:,:, :)*dprim(:,:,:, :)/(1.0 -dprim(:,:,:, :))
endwhere
enddo
But I do not see any mistake here. All the other locations are the calls of the subroutine leading to this part. And of course, since it is a SIGFPE error, I have to problem at the execution when I compile gfortran -g1
. (I use gfortran 6.4.0 on linux)
Moreover, this error appears and disappears with the modifications of completely different part of the code. Thus, the problem comes from this where loop ? Or from somewhere else and the backtrace is wrong ? If it is the case how can I find this mistake?
EDIT:
Since, I can not reproduce this error in a minimal example (they are working), I think that the problem comes for somewhere else. But how to find the problem in a large code ?
As the code is dying with a SIGFPE, use each of the individual possible traps to learn if it is a FE_DIVBYZERO, FE_INVALID, FE_OVERFLOW, or FE_UNDERFLOW. If it is an underflow, change your mask to '1 - dprim .ne. 0'.
PS: Don't use array section notation when a whole array reference can be used instead.
PPS: You may want to compute dprim*drpim / (1 - dprim) outside of the do-loop as it is loop invariant.