Dynamically define a SAS hash

602 Views Asked by At

So here is the problem.

I have a dataset and for each record, depending on a criteria, I want to load a different hash. I do not know the exact hash structure for each hash I will be loading during runtime. So I want to be able to execute the definedata statement conditionally. But since I do not know the hash structure, I thought of passing the parameter to the definedata statement via a variable, but it does not work. How can I accomplish this? Here is what I have so far:

/* Hashes have the same key field */
data hash1;
  key = '1';  a = 10; b = 20; output;
  key = '2';  a = 30; b = 40; output;
run;

/* Hash objects can have different data members and  types */
data hash2;
  key = '1';  x = 'AAA'; y = 'BBB'; output;
  key = '2';  x = 'CCC'; y = 'DDD'; output;
run;

/* This the dataset I want to process */
/* hid specifies which hash I should lookup */
/* key contains the key value to use for the lookup */
/* def is the hash data definition piece of the hash. 
   In practice I will use another hash to retrieve this definition
   But for simplicity we can assume that is part of the have dataset itself */

data have;
  hid = '1'; key = '2'; def = "'a', 'b'"; output;
  hid = '2'; key = '1'; def = "'x', 'y'"; output;
run;

/* This is what I want */

data want;
  set have;

  /* Though I don't know the structure of each hash, I can get a list of all hashes at the onset via some macro processing. So this statement is doable */
  if _N_ = 0 then set hash1 hash2;

  /* This part is OK. The hash declaration is able to accept a variable for the dataset name */

  hashname = "hash" || hid;
  declare hash hh(dataset: dsname);
  hh.definekey('key');

  /* The following line is the problematic piece */
  hh.definedata(def);

  hh.definedone();

  rc = hh.find();
  /* Do something with the values */

  /* Finally delete the object so that it can be redefined again on the next record */
  hh.delete();

run;

The error I get is: ERROR: Undeclared data symbol 'a', 'b' for hash object. I think the issue here is that the definedata method parses the variables one by one and ends up treating the entire string 'a', 'b' as one variable.

If I were to define the hash to be a superset of all possible variables, then it complains when I load a dataset that contains a subset of those variables. Also I cannot have the hashes defined to contain a superset of all variables (i.e. I cannot create all hashes to contain a, b, x and y and leave the extraneous elements missing).

So my question is how can I accomplish what I am trying to do here? Is it possible to do macro %do like iterations just using datastep constructs to provide each variable one by one? Or are there other ways to do it?

Constraints

  1. I cannot rely on macro processing since I only know which hash I am going to use during runtime.
  2. I cannot load all the definitions ahead of time for memory reasons.

Any help will be appreciated.

2

There are 2 best solutions below

5
On BEST ANSWER

Your program can be made to work but I think performance will be poor.

Notice I changed the value of DEF so it would be easier to SCAN.

data have;
   hid = '1'; key = '2'; def = "a b"; output;
   hid = '2'; key = '1'; def = "x y"; output;
   run;

/* This is what I want */

data want;
   if _N_ = 0 then set hash1 hash2;
   call missing(of _all_);
   set have;
   hashname = "hash" || hid;
   declare hash hh(dataset: hashname);
   hh.definekey('key');
   /* The following line is the problematic piece */
   length v $32;
   do i = 1 by 1;
      v = scan(def,i,' ');
      putlog v= i=;
      if missing(v) then leave;
      *hh.definedata(def);
      hh.definedata(v);
      end;
   hh.definedone();
   *hh.output(dataset: cats('X',hashname));

   rc = hh.find();
   /* Do something with the values */

   /* Finally delete the object so that it can be redefined again on the next record */
   hh.delete();
   run;
3
On

You can store hash references in a separate hash. This is called a hash of hashes. Load the hash of hashes with the references to the individual hashes that are loaded only one time at the start of the step.

Example:

data hash1;length key $1;input
key a b; datalines;
1 10 20
2 30 40
3 50 60
4 70 80  
run;

data hash2;length key $1;input
key x $3. y: $3.; datalines;
1 AAA BBB
2 CCC DDD
3 EEE FFF
4 GGG HHH
run;

data hashdataspec; length hid $1;input
hid datavars&: $15.;datalines;
1   a,b
2   x,y
run;

data have;
  do rowid = 1 to 100;
    p = floor (100*ranuni(123));
    q = 100 + ceil(100*ranuni(123));

    length r s $15;
    r = scan ("One of these will become the R value", ceil(8*ranuni(123)));
    s = scan ("How much wood would a woodchuck chuck if ...", ceil(9*ranuni(123)));

    length hid key $1;
    hid = substr('12',   ceil(2*ranuni(123)));
    key = substr('1234', ceil(4*ranuni(123)));

    output;
  end; 
run;

data want;
  sentinel0 = ' ';
  if 0 then set hash1-hash2 hashdataspec; * prep pdv for hash host variables;
  sentinel1 = ' ';

  * prep hashes, one time only;
  if _n_ = 1 then do;
    * load hash data specifiers;
    declare hash hds(dataset:'hashdataspec');
    hds.defineKey('hid');
    hds.defineData('hid', 'datavars');
    hds.defineDone();

    * prep hash of hashes;
    declare hash h;      /* dynamic hash that will be added to hoh */
    declare hash hoh();  /* hash of hashes */
    hoh.defineKey ('hid');
    hoh.defineData ('h');
    hoh.defineDone();

    * loop over hashdataspec, loading dynamically created hashes;
    declare hiter hi('hds');
    do while(hi.next() = 0);
      h = _new_ hash(dataset:cats('hash',hid));    * create dynamic hash;
      h.defineKey('key');
      do _n_ = 1 to countw(datavars);
        h.defineData(scan(datavars,_n_,','));      * define data vars, one at a time;
      end;
      h.defineDone();
      hoh.add();  * add the dynamic hash to the hash of hashes;
    end;
  end;

  * clear hash host variables;
  call missing (of sentinel0--sentinel1);

  set have;

  * lookup which hash (hid) to use
  * this will select the appropriate dynamic hash from hoh and update hash variable h;
  hoh.find();

  * lookup data for key in the hids hash;
  h.find();

  drop datavars;
run;