Multiply by supernodal L in CHOLMOD?

216 Views Asked by At

How can I multiply by the cholmod_factor L in a supernodal L L^T factorisation? I'd prefer not to convert to simplicial since the supernodal representation results in faster backsolves, and I'd prefer not to make a copy of the factor since two copies might not fit in RAM.

1

There are 1 best solutions below

0
On BEST ANSWER

I wound up understanding the supernodal representation from a nice comment in the supernodal-to-simplicial helper function in t_cholmod_change_factor.c. I paraphrase the comment and add some details below:

A supernodal Cholesky factorisation is represented as a collection of supernodal blocks. The entries of a supernodal block are arranged in column-major order like this 6x4 supernode:

t - - -    (row s[pi[snode+0]])
t t - -    (row s[pi[snode+1]])
t t t -    (row s[pi[snode+2]])
t t t t    (row s[pi[snode+3]])
r r r r    (row s[pi[snode+4]])
r r r r    (row s[pi[snode+5]])
  • There are unused entries (indicated by the hyphens) in order to make the matrix rectangular.
  • The column indices are consecutive.
  • The first ncols row indices are those same consecutive column indices. Later row indices can refer to any row below the t triangle.
  • The super member has one entry for each supernode; it refers to the first column represented by the supernode.
  • The pi member has one entry for each supernode; it refers to the first index in the s member where you can look up the row numbers.
  • The px member has one entry for each supernode; it refers to the first index in the x member where the entries are stored. Again, this is not packed storage.

The following code for multiplication by a cholmod_factor *L appears to work (I only care about int indices and double-precision real entries):

cholmod_dense *mul_L(cholmod_factor *L, cholmod_dense *d) {
  int rows = d->nrow, cols = d->ncol;
  cholmod_dense *ans = cholmod_allocate_dense(rows, cols, rows,
      CHOLMOD_REAL, &comm);
  memset(ans->x, 0, 8 * rows * cols);

  FOR(i, L->nsuper) {
    int *sup = (int *)L->super;
    int *pi = (int *)L->pi;
    int *px = (int *)L->px;
    double *x = (double *)L->x;
    int *ss = (int *)L->s;

    int r0 =  pi[i], r1 =  pi[i+1], nrow = r1 - r0;
    int c0 = sup[i], c1 = sup[i+1], ncol = c1 - c0;
    int px0 = px[i];

    /* TODO: Use BLAS instead. */
    for (int j = 0; j < ncol; j++) {
      for (int k = j; k < nrow; k++) {
        for (int l = 0; l < cols; l++) {
          ((double *)ans->x)[l * rows + ss[r0 + k]] +=
              x[px0 + k + j * nrow] * ((double *)d->x)[l*rows+c0 + j];
        }
      }
    }
  }
  return ans;
}