Seeking a statistical JavaScript function to return a p-value from a z-score


I need to convert z-scores to percentiles. I found a reference to a function in the jStat library that I could use (jstat.ztest), but the jStat documentation seems to be ahead of the available library, because there is no such function in the currently available version.

I think there is a more recent version of the library on GitHub, which may include the ztest function, but I am a Linux novice and could not figure out how to build the library from the instructions. I spent most of a day learning about Git Bash and Cygwin while trying to build it; I finally decided I'd be better off asking here.

So, could anyone point me toward a JavaScript function that would do what I need? Alternatively, could anyone point me toward a built version of the jStat library with the ztest function included?


1
On BEST ANSWER

I found this in a forum online and it works like a charm.

function GetZPercent(z)
{
  // z == number of standard deviations from the mean.
  // Returns the cumulative probability (the percentile as a fraction, 0 to 1).

  // Beyond 6.5 standard deviations the result is indistinguishable from
  // 0 or 1 at the precision this series reaches, so return early.
  if (z < -6.5)
    return 0.0;
  if (z > 6.5)
    return 1.0;

  var factK = 1;                 // running value of k!
  var sum = 0;
  var term = 1;
  var k = 0;
  var loopStop = Math.exp(-23);  // stop once a term is smaller than about 1e-10

  // Taylor series of the standard normal CDF around 0;
  // 0.3989422804 approximates 1 / Math.sqrt(2 * Math.PI).
  while (Math.abs(term) > loopStop)
  {
    term = 0.3989422804 * Math.pow(-1, k) * Math.pow(z, k) / (2 * k + 1) / Math.pow(2, k) * Math.pow(z, k + 1) / factK;
    sum += term;
    k++;
    factK *= k;
  }
  sum += 0.5;

  return sum;
}

And I don't need to include a large library just for the one function.
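
Since the question title asks for a p-value rather than a percentile, here is a minimal sketch of how the cumulative probability could be turned into one. The names normalCdf and pValueFromZ and the tail labels are made up for this illustration; normalCdf is only a compact restatement of the same series so that the snippet runs on its own.

```javascript
// Compact restatement of the Taylor-series CDF above (hypothetical name).
function normalCdf(z) {
  if (z < -6.5) return 0;
  if (z > 6.5) return 1;
  const loopStop = Math.exp(-23);
  let sum = 0;
  let term = 1;
  let factK = 1; // holds k! for the current k
  for (let k = 0; Math.abs(term) > loopStop; k++) {
    term = 0.3989422804 * Math.pow(-1, k) * Math.pow(z, 2 * k + 1) /
      ((2 * k + 1) * Math.pow(2, k) * factK);
    sum += term;
    factK *= k + 1;
  }
  return 0.5 + sum;
}

// Hypothetical helper: tail is "left", "right", or "two-sided" (assumed labels).
function pValueFromZ(z, tail = "two-sided") {
  if (tail === "left") return normalCdf(z);
  if (tail === "right") return normalCdf(-z); // equals 1 - CDF(z) by symmetry
  return 2 * normalCdf(-Math.abs(z));
}

console.log(pValueFromZ(1.96));           // close to 0.05
console.log(pValueFromZ(1.645, "right")); // close to 0.05
```

Using normalCdf(-z) for the right tail, instead of subtracting from 1, avoids losing digits when the tail probability is tiny.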

1
On

Just editing the code from Paul's answer for a two-sided z-test:

function GetZPercent(z)
{
  // z == number of standard deviations from the mean.
  // Returns the two-sided p-value.

  // Fold onto the lower tail first so that both signs of z are treated alike.
  if (z > 0) { z = -z; }

  // Beyond 6.5 standard deviations the two-sided p-value is effectively 0.
  if (z < -6.5)
    return 0.0;

  var factK = 1;                 // running value of k!
  var sum = 0;
  var term = 1;
  var k = 0;
  var loopStop = Math.exp(-23);  // stop once a term is smaller than about 1e-10
  while (Math.abs(term) > loopStop)
  {
    term = 0.3989422804 * Math.pow(-1, k) * Math.pow(z, k) / (2 * k + 1) / Math.pow(2, k) * Math.pow(z, k + 1) / factK;
    sum += term;
    k++;
    factK *= k;
  }
  sum += 0.5;                    // lower-tail probability for -|z|

  return (2 * sum);
}
0
On

As already correctly stated by Shane, the equation is an implementation of the Taylor expansion of the normal CDF. The sum oscillates above and below the "real" value with increasing precision. If the value is close to 1 or 0, there is a very low but nonzero probability that sum will be >1 or <0, because of the (relatively) early break caused by loopStop. The error is further increased by rounding 1/Math.sqrt(2*Math.PI) to 0.3989422804 and by the precision limits of JavaScript floating-point numbers. Additionally, the provided solution will not work for z-scores >7 or <-7.
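
For reference, the series that the loop in the accepted answer sums term by term can be written out as follows. This is the standard Maclaurin series of the standard normal CDF, with k matching the loop counter in the code:

```latex
\Phi(z) = \frac{1}{2} + \frac{1}{\sqrt{2\pi}} \sum_{k=0}^{\infty} \frac{(-1)^k \, z^{2k+1}}{(2k+1)\, 2^k \, k!}
```

Because the terms alternate in sign, successive partial sums fall on alternating sides of the limit once the terms start shrinking, which is the oscillation described above.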

I updated the code to be more accurate using the decimal.js npm library and to directly return the p-value:

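// Requires the decimal.js package (npm install decimal.js).
const Decimal = require("decimal.js");

function GetpValueFromZ(_z, type = "twosided") 
{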
    if(_z < -14)
    {
        _z = -14
    }
    else if(_z > 14)
    {
        _z = 14
    }
    Decimal.set({precision: 100});

    let z = new Decimal(_z);
    var sum = new Decimal(0);

    var term = new Decimal(1);
    var k = new Decimal(0);

    var loopstop = new Decimal("10E-50");
    var minusone = new Decimal(-1);
    var two = new Decimal(2);

    let pi = new Decimal("3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067982148086513282306647")

    while(term.abs().greaterThan(loopstop)) 
    {
        term = new Decimal(1)
    
        for (let i = 1; i <= k; i++) {
            term = term.times(z).times(z.dividedBy(two.times(i)))
        }
    
        term = term.times(minusone.toPower(k)).dividedBy(k.times(2).plus(1))        
        sum = sum.plus(term);
        k = k.plus(1);
    }
    
    sum = sum.times(z).dividedBy(two.times(pi).sqrt()).plus(0.5);

    if(sum.lessThan(0))
        sum = sum.abs();
    else if(sum.greaterThan(1))
        sum = two.minus(sum);

    switch (type) {
        case "left":
            return parseFloat(sum.toExponential(40));
        case "right":
            return parseFloat((new Decimal(1).minus(sum)).toExponential(40));
        case "twosided":
            return sum.lessThan(0.5)? parseFloat(sum.times(two).toExponential(40)) : parseFloat((new Decimal(1).minus(sum).times(two)).toExponential(40))
        default:
            // Fail loudly on an unrecognized type instead of silently returning undefined.
            throw new Error("Unknown type: " + type);
    }

}

By increasing the Decimal.js precision value and decreasing the loopstop value, you can get accurate p-values for very small (or very large) z-scores at the cost of calculation time.

0
On

This seems like such a simple ask, but I had a hard time tracking down a library that does this, as opposed to copying some random code snippet. As best I can tell, the following calculates a z-score from a percentile using the simple-statistics library.

I took their documentation for cumulativeStdNormalProbability and worked backward to the following formula. It feels like there should be an easier way, but who knows.

https://simplestatistics.org/docs/#cumulativestdnormalprobability

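// inverseErrorFunction comes from simple-statistics; percentile_value is a fraction strictly between 0 and 1.
const z_score = inverseErrorFunction((percentile_value - 0.5) / 0.5) * Math.sqrt(2);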
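
For the direction the question actually asks about (z-score to percentile), a dependency-free sketch is possible with a polynomial approximation of the error function. The version below uses Abramowitz and Stegun formula 7.1.26 (absolute error around 1.5e-7). That is a swapped-in technique, not necessarily what simple-statistics does internally, and the names erfApprox and zToPercentile are hypothetical.

```javascript
// Abramowitz & Stegun 7.1.26 approximation of erf(x) (hypothetical helper name).
function erfApprox(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  // Polynomial in t, evaluated with Horner's scheme.
  const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
    - 0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Standard normal CDF: returns the percentile as a fraction between 0 and 1.
function zToPercentile(z) {
  return 0.5 * (1 + erfApprox(z / Math.SQRT2));
}

console.log(zToPercentile(1.96)); // roughly 0.975
console.log(zToPercentile(0));    // roughly 0.5
```

Multiply the result by 100 for a percentile on the 0 to 100 scale. Because the approximation's error is fixed at around 1e-7, it is a poor fit for far-tail p-values, where the arbitrary-precision approach above is more appropriate.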