I'm getting an Out Of Memory error when running a program with an input of ~5M obs. and 153 variables.
When running the same program on ~40,000 obs. - program executes and completes great. When I try to increment number of input obs. to more than 100K - I get an error : Out of memory.
I tried to follow the below tips and uncheck few boxes:
1.Tools ► Options ► Results General ► deselect all Result Formats.
2.Tools ► Options ► Results General ► deselect Open generated data/results automatically.
3.Tools ► Options ► Results General ► deselect Link handcoded ODS results.
4.Tools ► Options ► SAS Programs ► deselect Automatically direct results back to SAS Enterprise Guide.
Still - Not working.
I use Enterprise Guide 5.1 but eventually will insert the working code into a User Written Code transformation on Data Integration Studio 4.6.
Does anyone have a clue how to get this program work?
edited
This is the piece of code causes the error messaage:
data _null_;
set all_include_Stornos ;
IF TREATMENT_IND = 1 AND DDS_ROW_IND NE 1 THEN DO;
CALL EXECUTE ('%STORNO_TKUFA ('||POLICY_RK||');');
CALL EXECUTE ('%UPDATE (STORNO_TKUFA_CUMULATE);');
END;
IF TREATMENT_IND in (4) AND DDS_ROW_IND NE 1 THEN DO;
CALL EXECUTE ('%HAKPAA ('||POLICY_RK||','||POLICY_VERSION||');');
call execute ('%UPDATE(HAKPAA_CUMULATE);');
END;
IF TREATMENT_IND = 5 AND DDS_ROW_IND NE 1 and count_bitul_no <2 or (count_bitul_no >1 and max_bitul_ver = policy_verSion ) THEN DO;
CALL EXECUTE ('%BITUL ('||POLICY_RK||','||POLICY_VERSION||');');
CALL EXECUTE ('%UPDATE(BITUL_CUMULATE);');
END;
IF TREATMENT_IND = 6 AND DDS_ROW_IND NE 1 THEN DO;
CALL EXECUTE ('%LAST_STATE ('||POLICY_RK||','||POLICY_VERSION||');');
CALL EXECUTE ('%UPDATE (LAST_STATE_calc);');
END;
IF DDS_ROW_IND NE 1 and ((PREV_TREATMENT_IND = 4 AND TREATMENT_IND NOT IN (1,2,5)) or treatment_ind = 3) THEN DO;
CALL EXECUTE ('%HAFSHARA ('||POLICY_RK||','||POLICY_VERSION||');');
CALL EXECUTE ('%UPDATE (HAFSHARA_CUMULATE);');
END;
IF TREATMENT_IND = 2 AND POLICY_VERSION - 1 = max_bitul_ver AND DDS_ROW_IND NE 1 THEN DO;
CALL EXECUTE ('%STORNO_BITUL ('||POLICY_RK||','||POLICY_VERSION||');');
CALL EXECUTE ('%UPDATE (STORNO_BITUL_CUMULATE);');
END;
RUN;
and these are 2 macros (shortened) executed when treatment_ind = 5:
%macro BITUL (pol_rk , pol_ver );
proc sql;
create table macro_BITUL
as select * from all_include_Stornos
where policy_rk = &pol_rk
and treatment_ind_5 = &pol_ver
order by policy_rk, policy_version;
quit;
data BITUL_calc;
set macro_BITUL;
BY POLICY_RK;
IF LAST.policy_rk THEN ACT_DAILY_AMT_END_DT = POLICY_VERSION_END_DT;
ELSE ACT_DAILY_AMT_END_DT = NEXT_POLICY_VERSION_START_DT;
VERSION_EXPOSURE_DAYS_NO = ACT_DAILY_AMT_END_DT - ACT_DAILY_AMT_START_DT + 1;
BITUL_DURATION = (POLICY_EXPIRATION_DT_5 - POLICY_VERSION_START_DT + 1) / (POLICY_VERSION_END_DT_5 - POLICY_VERSION_START_DT + 1);
GAINED_NET_PREMIUM_V_AMT = NET_PREMIUM_V_AMT *BITUL_DURATION;
.
.
.
GAINED_NET_COMMISION_B_IB_V_AMT = PRODUCR_B_NET_COMM_IB_V_AMT * BITUL_DURATION;
run;
/**/
PROC SQL;
CREATE TABLE TOTAL_GAINED AS
SELECT POLICY_RK,
SUM(GAINED_NET_PREMIUM_V_AMT) AS TOT_GAINED_NET_PREMIUM_V_AMT,
SUM(GAINED_NET_FEES_V_AMT) AS TOT_GAINED_NET_FEES_V_AMT,
.
.
.
SUM(GAINED_NET_COMMISION_B_IB_V_AMT) AS TOT_GAINED_NET_COMMN_B_IB_V_AMT
FROM BITUL_calc
GROUP BY POLICY_RK;
QUIT;
PROC SQL;
CREATE TABLE BITUL_calc_AND_TOTALS AS
SELECT A.* ,
TOT_GAINED_NET_PREMIUM_V_AMT,
.
.
JOIN TOTAL_GAINED AS B
ON
(A.POLICY_rK = B.POLICY_RK
)order by policy_rk, policy_version;
QUIT;
DATA bitul_CALCULATED;
SET BITUL_calc_AND_TOTALS;
IF TOT_GAINED_NET_PREMIUM_V_AMT = 0 THEN CALCULATED_NET_PREMIUM_V_AMT = 0;
ELSE CALCULATED_NET_PREMIUM_V_AMT = NET_PREMIUM_AMT_5 * GAINED_NET_PREMIUM_V_AMT / TOT_GAINED_NET_PREMIUM_V_AMT;
.
.
ELSE CALC_NET_COMMISION_B_IB_V_AMT = PRODUCR_B_NET_COMM_IB_AMT_5 * GAINED_NET_COMMISION_B_IB_V_AMT / TOT_GAINED_NET_COMMN_B_IB_V_AMT;
.
.
.
.
.
run;
DATA BITUL_CUMULATE;
SET bitul_CALCULATED;
.
.
.
.
.
.
run;
%mend ;
%MACRO UPDATE (TABLE_NAME);
PROC SQL;
DELETE FROM all_include_Stornos
WHERE CATS(POLICY_RK,POLICY_VERSION) IN ( SELECT CATS(POLICY_RK,POLICY_VERSION) FROM &TABLE_NAME);
INSERT INTO all_include_Stornos
select * FROM &TABLE_NAME
;QUIT;
/* PROC SORT DATA=all_include_Stornos OUT=all_include_Stornos; BY POLICY_RK POLICY_VERSION;RUN;*/
%MEND;
Based on the edit and your comment, i suspect that the 'order by' and/or 'proc sort' parts just cannot handle the volume on the system it is running.
50M is a lot of records, but still manageable by most. But if your data grows to 500M records, it might be too much.
I recall a client I had a while ago where the dataset was 120M rows and (iirc) +-300 variables and we did get similar issues.
Luckily, there are some ways to handle larger amounts of data in a sort.
1. Use the tagsort option
You can add tagsort as an option to the proc sort statement. In that case, SAS only sorts using the columns that are in the by statement and attaches the remaining part of the records afterwards. It is less resource-intensive, but of course, it is slower.
2. Use indexes for by statements
If you need the data to be in a particular order for a by statement later on in a proc/data step, you can create an index on these variables instead of sorting the data. My experience is that this will allow SAS to cope with larger datasets as well. Of course, it is slower, it consumes more space (for the index) and you need to take into account the existence (and destruction) of indexes.
3. Split the dataset, sort the parts and stitch everything together afterwards
A final method (which is probably preferrable to option 2 in many cases) is to:
This list method (and tagsort) are explained here: http://www.nesug.org/Proceedings/nesug12/cc/cc36.pdf
Note that all these methods require you to get rid of any order by and follow up those proc sql pieces with the appropriate sort procedure.