Problems with CORDIC for Logarithm in C

274 Views Asked by At

In order to get started with CORDIC for log10, I implemented the algorithm derived in this PDF, pp6:

#include <stdio.h>
#include <math.h>

// https://www.mikrocontroller.net/attachment/31117/cordic1.pdf
float logf_cordic (float x, int B)
{
    float z = 0.0f;

    for (int i = 1; i <= B; i++)
    {
        if (x > 1.0f)
        {
            printf ("-");
            x = x - ldexpf (x, -i);
            z = z - log10f (1.0f - ldexpf (1.0f, -i));
        }
        else
        {
            printf ("+");
            x = x + ldexpf (x, -i);
            z = z - log10f (1.0f + ldexpf (1.0f, -i));
        }
    }
    return z;
}

This is a straight forward C99 implementation of the code from the paper, where I used ldexpf(x,n) to compute x·2n. The paper claims that the method converges for 0.4194 < x < 3.4627. The program uses 1 ≤ x ≤ 2. The complete C99 code below prints:

+--+--+--+-+-+++--++ x = 1.00000: y = -0.000000, dy = -1.487272e-07
-+++++++++++++++++++ x = 1.12500: y = +0.099773, dy = +4.862081e-02
-+++++++++++++++++++ x = 1.25000: y = +0.099773, dy = +2.863325e-03
-+++-+--+-+--+-++--+ x = 1.37500: y = +0.138302, dy = -4.023314e-07
-++-+--++----+++--+- x = 1.50000: y = +0.176091, dy = -2.831221e-07
-+-+++++-++-++-++++- x = 1.62500: y = +0.210854, dy = +2.831221e-07
-+-+-+-+++++--+---++ x = 1.75000: y = +0.243038, dy = +2.235174e-07
-+--++-+--+---+---+- x = 1.87500: y = +0.273001, dy = +0.000000e+00
-+---+--+++--------- x = 2.00000: y = +0.301030, dy = -5.960464e-08

So it works as expected, except for x = 1.125, 1.25 where the error is big and doesn't decrease when computing with more iterations. I am now staring at that code for hours, but cannot find what I am missing...


Complete C99 code

#include <stdio.h>
#include <math.h>

float logf_cordic (float x, int B)
{
    float z = 0.0f;

    for (int i = 1; i <= B; i++)
    {
        if (x > 1.0f)
        {
            printf ("-");
            x = x - ldexpf (x, -i);
            z = z - log10f (1.0f - ldexpf (1.0f, -i));
        }
        else
        {
            printf ("+");
            x = x + ldexpf (x, -i);
            z = z - log10f (1.0f + ldexpf (1.0f, -i));
        }
    }
    return z;
}

int main (int argc, char *argv[])
{
    int ex = 3;
    int B = 20;

    if (argc >= 2)
        sscanf (argv[1], "%i", &ex);

    if (argc >= 3)
        sscanf (argv[2], "%i", &B);

    if (ex < 0) ex = 0;
    if (ex > 16) ex = 16;
    if (B > 100) B = 100;

    int n = 1 << ex;
    float dx = 1.0f / n;
    for (int i = 0; i <= n; ++i)
    {
        float x = 1.0f + i * dx;
        float y = logf_cordic (x, B);
        float dy = y - log10f (x);
        printf (" x = %.5f: y = %+f, dy = %+e\n",
                (double) x, (double) y, (double) dy);
    }
    return 0;
}

For reference, here is the algo as presented in the paper:

log10(x){
    z = 0;
    for ( i=1;i=<B;i++ ){
        if (x > 1)
            x = x - x*2^(-i);
            z = z - log10(1-2^(-i));
        else
            x = x + x*2^(-i);
            z = z - log10(1+2^(-i));
    }
return(z)
}
3

There are 3 best solutions below

0
emacs drives me nuts On BEST ANSWER

The algorithm is bogus. The problem is that not all numbers can be presented in the form

x = Π (1 ± 2−k)

See this discussion on MSE. There doesn't even exist an interval [a,b] such that all x's in that interval can be represented in that form.


What does exist though is a representation of the form

x = Π (1 + ak·2−k)

with ak in { 0, 1 } for all 1 ≤ x ≤ 2.38 = x0, and this is enough to fix the algorithm:

To compute logB(x), first normalize the argument x such that 1 / x0 ≤ x ≤ 1.

Then apply the following algorithm to the reduced x:

logB_cordic (x, N)
    z = 0
    xmax = 1 + 2^{−N−1}
    FOR k = 1 ... N
        xk = x + x·2^{-k}
        IF xk ≤ xmax
            x = xk;
            z = z - logB (1 + 2^{−k})
    return z

In practice, N is fixed and the N logB values come from a lookup-table. Apart from that, the method requires additions, comparisions, and shifts by variable offsets.

The comparison IF xk ≤ xmax makes the final absolute error (almost) symmetric around 0. With IF xk ≤ 1 the absolute error would be asymmetric and twice as high.

Results

For reference, below is an implementation in C99 that adds some confidence that the fixed algorithm is actually working. It's B = 2, N = 10, and the plot is for 16385 values of x evenly spaced over [0.5, 1].
(click to enlarge)

Absolute error of a C99 implementation with N=10

C99 Code

#include <math.h>

double log2_cordic (double x, int N)
{
    double xmax = 1.0 + ldexp (1.0, -N - 1);
    double z = 0.0;

    for (int k = 1; k <= N; k++)
    {
        double x_bk = x + ldexp (x, -k);

        if (x_bk <= xmax)
        {
            x = x_bk;
            z = z - log2 (1.0 + ldexp (1.0, -k));
        }
    }
    return z;
}

Edit:

Just found out that I reinvented the wheel... According to Wikipedia, it's a method Feynman already used during the Manhattan Project.

2
Piolie On

This is not a complete answer. The question simply catched my eye and I decided to take a dive and have some fun.

I've tested the algorithm as given in the paper in Python using floats and also converting to fixed point and my results are the same as yours. I've also found some other problematic points, such as x = 1.60000, x = 2.00625, x = 2.03750 and x = 2.06875.

I'll use the following convention:

b_i+ = (1 + 2^-i)
b_i- = (1 - 2^-i)

The algorithm can thus be written:

log10(x) {
    z = 0;
    for (i = 1; i =< B; i++) {
        if (x > 1)
            x *= b_i-;
            z -= log10(b_i-);
        else
            x *= b_i+;
            z -= log10(b_i+);
    }
    return(z)
}

I'm not sure the claim that we can choose b_i of the form (1 ± 2^-i) is in itself wrong, but I've noticed that the criteria for choosing each b_i might be. For example, when x == 1.125, the sequence currently is

x *= b_1- == 0.562500, z -= log10(b_1-) == 0.301030
x *= b_2+ == 0.703125, z -= log10(b_2+) == 0.204120
x *= b_3+ == 0.791016, z -= log10(b_3+) == 0.152967
x *= b_4+ == 0.840454, z -= log10(b_4+) == 0.126639
x *= b_5+ == 0.866718, z -= log10(b_5+) == 0.113275
x *= b_6+ == 0.880261, z -= log10(b_6+) == 0.106541
x *= b_7+ == 0.887138, z -= log10(b_7+) == 0.103161
x *= b_8+ == 0.890603, z -= log10(b_8+) == 0.101468
                        ...

x never reaches 1 and the value of z is wrong. However, if we start with b_1+ (instead of b_1-), the sequence is:

x *= b_1+ == 1.687500; z -= log10(b_1+) == -0.176091
x *= b_2- == 1.265625; z -= log10(b_2-) == -0.051152
x *= b_3- == 1.107421; z -= log10(b_3-) ==  0.006839
x *= b_4- == 1.038208; z -= log10(b_4-) ==  0.034868
x *= b_5- == 1.005764; z -= log10(b_5-) ==  0.048656
x *= b_6- == 0.990048; z -= log10(b_6+) ==  0.055495
x *= b_7+ == 0.997783; z -= log10(b_7+) ==  0.052116
x *= b_8+ == 1.001681; z -= log10(b_8-) ==  0.050422
                        ...

which converges to the correct result.

For x == 2.00625, right now the algorithm uses

b_1- b_2- b_3+ b_4+ b_5+ b_6+ b_7+ b_8+, ...

and x settles to about x == 0.957. But if we change to b_2+ instead, the sequence is

b_1- b_2+ b_3- b_4- b_5- b_6+ b_7- b_8-

which makes x == 1.0002 and z == 0.323317 (correct again).

My intuition[1] is that we can always choose a series of b_is of the form (1 ± 2^-i) that satisfies x * b_1 * ... * b_B == 1,[2] but using

b_i = 1 + 2^-i if x * b1 * b2 * ... * b_i-1 < 1
b_i = 1 - 2^-i if x * b1 * b2 * ... * b_i-1 > 1

is not the appropriate criteria for choosing. The condition (x > 1.0) should be replaced by a smarter one which has so far escaped me.


[1] It turned out my intuition was wrong (again!). See OP's answer.

[2] And so use lookup tables and simple shift-and-add operations to calculate the logarithms, which is the purpose of the algorithm.

4
njuffa On

I would quibble with calling this CORDIC computation. CORDIC in hyperbolic vectoring mode can compute tanh-1, and that can be used to compute logarithms with a bit of pre-processing and post-processing. This logarithm computation is closely patterned on binary longhand division and the algorithm has therefore been referred to as pseudo-division. The general approach dates back to how Henry Briggs computed his tables of logarithms.

In binary division, various digit sets can be utilized, for example the digit set {0, 1} in the restoring or "non-performing" variant or {-1, 1} if one uses the non-restoring variant (I am using -1 here in lieu of a 1 with overline). It seems the author of the document linked in the question incorrectly assumed that the non-restoring division algorithm would translate directly into a corresponding pseudo-division for logarithms, using the digit set {-1, 1} to represent factors (1-2-i) and (1+2-i) during the multiplicative normalization process. However, as other answers have pointed out, this does not work as not all arguments can be normalized to unity in this way.

The simplest approach to compute logarithms by pseudo-division is to use the digit set {0, 1} in a "non-performing" algorithm as I demonstrated in my answer to this question. It is also possible to use the digit set {-1, 0, 1} corresponding to normalization factors (1+2-i), 1, (1-2-i), as De Lugish showed in his Ph.D. thesis. He designed shared hardware for the bit-wise computation of multiplications, divisions, square root, and various transcendental functions:

Bruce Gene De Lugish, A class of Algorithms for Automatic Evaluation of Certain Elementary Functions in a Binary Computer, Report No. 399, Department of Computer Science, University of Illinois at Urbana-Champaign, June 1970.

Below is an ISO-C99 implementation of log() based directly on the description in this report. Since I had been aware of De Lugish's work but never utilized it until now, I wrote the code for clarity of exposition for my own benefit rather than attempting to maximize performance.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

// ln(1+2*(-i))
float tabp [24];
// ln(1-2*(-i))
float tabm [24];

float logf_delugish (float a)
{
    const float three_eigth = 0.375f;
    const float ln2 = 0.69314718056f;
    int sk, expo;
    float sum = tabm[0], x = frexpf (a, &expo), ex2 = 1.0f;
    x = 2.0f * x;
    for (int k = 0; k < 24; k++) {
        sk = 0;
        if ((x - 1.0f) <  (-three_eigth * ex2)) sk = +1;
        if ((x - 1.0f) >= (+three_eigth * ex2)) sk = -1;
        ex2 *= 0.5f;
        if (sk == 1) {
            x = x + x * ex2;
            sum = sum - tabp [k];
        } 
        if (sk == -1) {
            x = x - x * ex2;
            sum = sum - tabm [k];
        }
    }
    return expo * ln2 + sum;
}

float tabp [24] =
{ 
    0.40546510810816f,
    0.22314355131421f,
    0.11778303565638f,
    0.06062462181643f,
    0.03077165866675f,
    0.01550418653597f,
    0.00778214044205f,
    0.00389864041566f,
    0.00195122013126f,
    0.00097608597306f,
    0.00048816207950f,
    0.00024411082753f,
    0.00012206286253f,
    0.00006103329368f,
    0.00003051711247f,
    0.00001525867265f,
    0.00000762936543f,
    0.00000381468999f,
    0.00000190734681f,
    0.00000095367386f,
    0.00000047683704f,
    0.00000023841855f,
    0.00000011920928f,
    0.00000005960464f,
};

// ln(1-2*(-i))
float tabm [24] =
{
    -0.69314718055995f,
    -0.28768207245178f,
    -0.13353139262452f,
    -0.06453852113757f,
    -0.03174869831458f,
    -0.01574835696814f,
    -0.00784317746103f,
    -0.00391389932114f,
    -0.00195503483580f,
    -0.00097703964783f,
    -0.00048840049811f,
    -0.00024417043217f,
    -0.00012207776369f,
    -0.00006103701897f,
    -0.00003051804380f,
    -0.00001525890548f,
    -0.00000762942364f,
    -0.00000381470454f,
    -0.00000190735045f,
    -0.00000095367477f,
    -0.00000047683727f,
    -0.00000023841861f,
    -0.00000011920930f,
    -0.00000005960465f,
};

#define LIMIT (256)

int main (void)
{
    unsigned int count = 0;
    float sumerrsq = 0;
    float maxerr = 0, maxerr_loc = INFINITY;
    float a = 1.0f / LIMIT;
    do {
        float res = logf_delugish (a);
        double ref = log ((double)a);
        float err = fabsf (res - ref);
        sumerrsq += err * err;
        if (err > maxerr) {
            maxerr = err;
            maxerr_loc = a;
        }
        a = nextafterf (a, INFINITY);
        count++;
    } while (a < LIMIT);
    printf ("maxerr = %15.8e @ %15.8e  RMS err = %15.8e\n", 
            maxerr, maxerr_loc, sqrt(sumerrsq / count));
    return EXIT_SUCCESS;
}