I appear to be getting inconsistent results when I use R's p.adjust function to calculate the false discovery rate. Based on the paper cited in the documentation, with the p-values sorted in increasing order, the adjusted p-value at rank i should be calculated like this:
adjusted_p_at_index_i = p_at_index_i * (total_number_of_tests / i)
Now when I run p.adjust(c(0.0001, 0.0004, 0.0019),"fdr")
I get the expected results of
c(0.0003, 0.0006, 0.0019)
but when I run p.adjust(c(0.517479039, 0.003657195, 0.006080152),"fdr")
I get this
c(0.517479039, 0.009120228, 0.009120228)
Instead of the result I calculate:
c(0.517479039, 0.010971585, 0.009120228)
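The hand calculation above corresponds to applying the raw formula with no monotonicity step; as a sketch in R:

```r
# Raw BH formula p_(i) * n / i on the sorted p-values, with NO
# monotonicity correction -- this reproduces the hand calculation,
# not what p.adjust(..., "fdr") actually returns.
p <- c(0.517479039, 0.003657195, 0.006080152)
n <- length(p)
o <- order(p)                  # indices that sort p ascending
raw <- p[o] * n / seq_len(n)   # p_(i) * n / i
raw[order(o)]                  # map back to the original order
# 0.517479039 0.010971585 0.009120228
```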
What is R doing to the data that can account for both of these results?
The reason is that the FDR calculation ensures that FDR never increases as the p-value decreases. That's because you can always choose to set a higher threshold for your rejection rule if that higher threshold will get you a lower FDR.
In your case, your second hypothesis had a p-value of 0.003657195 and a raw adjusted value of 0.010971585, but the third hypothesis had a larger p-value (0.006080152) and a smaller raw adjusted value (0.009120228). If you can achieve an FDR of 0.009120228 by setting your p-value threshold to 0.006080152, there is never a reason to set a lower threshold just to get a higher FDR.

You can see this in the code by typing p.adjust at the R prompt: the cummin function takes the cumulative minimum of the vector, going backwards in the order of p.
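From memory of the R sources, the "BH"/"fdr" branch does roughly the following (a standalone paraphrase, not a verbatim copy; check p.adjust in your own R version):

```r
# Standalone reproduction of the BH ("fdr") branch of p.adjust.
bh_adjust <- function(p) {
  n  <- length(p)
  i  <- n:1L                          # ranks, largest p first
  o  <- order(p, decreasing = TRUE)   # indices of p, largest to smallest
  ro <- order(o)                      # map back to the original order
  # cummin enforces that the adjusted value never increases
  # as the p-value decreases
  pmin(1, cummin(n / i * p[o]))[ro]
}

bh_adjust(c(0.517479039, 0.003657195, 0.006080152))
# 0.517479039 0.009120228 0.009120228  (matches p.adjust)
```

Here cummin runs over the p-values from largest to smallest, so each raw value n/i * p is replaced by the smallest raw value among all hypotheses with an equal or larger p-value.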
You can also see this in the Benjamini-Hochberg paper you link to, in the definition of the procedure on page 293.
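Paraphrasing that definition: with p_(1) <= ... <= p_(m) the ordered p-values and q* the desired FDR level, the step-up procedure is

```latex
k = \max\left\{\, i : p_{(i)} \le \frac{i}{m}\, q^{*} \right\},
\qquad \text{reject } H_{(1)}, \dots, H_{(k)} .
```

Because the rule rejects every hypothesis up to the largest qualifying rank k, a hypothesis with a smaller p-value can never end up with a larger adjusted value than one with a bigger p-value; that is exactly what the cummin step implements.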