Creating a new flag based on multiple conditions on SAS

23 Views Asked by At

I'm struggling to find out how to create a flag. My data set looks like this:

CPT    PRODUCT         DATE         A        B           C        D       etc.
1         A            date1        .        .           .        .
1         A            date2        .        .           .        .
1         C            date2        .        .           .        .
1         B            date3        .        .           .        .
1         B            date3        .        .           .        .
2         A            date3        .        .           .        .
2         B            date1        .        .           .        .
2         B            date1        .        .           .        .
2         B            date2        .        .           .        .
2         C            date2        .        .           .        .
etc.

where cpt(i) represents each counterparty, product(i) represents the product bought by each cpt(i), date(i) represents the purchase date and A, B, C, D, etc. are other categorical/numerical variables of the data set. I'd like to create a flag every time if and only if there are two or more rows having the same cpt(i), the same product(i) and the same date(i). Thus, the other columns should not be considered. What I'd like to get is shown below:

CPT    PRODUCT         DATE         A        B           C        D       flag
1         A            date1        .        .           .        .        0
1         A            date2        .        .           .        .        0
1         C            date2        .        .           .        .        0  
1         B            date3        .        .           .        .        1
1         B            date3        .        .           .        .        1
2         A            date3        .        .           .        .        0
2         B            date1        .        .           .        .        1
2         B            date1        .        .           .        .        1
2         B            date2        .        .           .        .        0
2         C            date2        .        .           .        .        0
etc.

Any tips on how to get what I want? Cheers

1

There are 1 best solutions below

0
Richard On

Use a BY statement. The statement will implicitly create automatic flag variables first.<byvar> and last.<byvar>. When both variables are =1 the implied condition is that the by group contains a single row. You are looking to flag the inverse condition (when a by group contains more than one row).

data want ;
  set have ;
  by cpt product date ;
  flag = not (first.date and last.date) ;
run ;