How to use the Percentile Function for Ranking purpose? (Matlab)

Question

How to use the Percentile Function for Ranking purpose? (Matlab)

786 Views Asked by John At 22 December 2016 at 11:07

I have the following 606 x 274 table: see here

Goal:

For every date calculate lower and upper 20% percentiles and, based on the outcome, create 2 new variables, e.g. 'L' for "lower" and 'U' for "upper", which contain the ticker names as seen in the header of the table.

Step by step:

% Replace NaNs with 'empty' for the percentile calculation (error: input to be cell array)
     T(cellfun(@isnan,T)) = {[]}
% Change date format
     T.Date=[datetime(T.Date, 'InputFormat', 'eee dd-MMM-yyyy')];
% Take row by row 
     for row=1:606
% If Value is in upper 20% percentile create new variable 'U' that contains the according ticker names.
% If Value is in lower 20% percentile create new variable 'L' that contains the according ticker names.
     end;

So far, experimenting with 'prctile' only yielded a numeric outcome, for a single column. Example:

Y = prctile(T.A2AIM,20,2);

Thanks for your help and ideas!

Original Q&A

There are 1 best solutions below

**lucianopaz** · Accepted Answer · 2016-12-22T18:02:17.243000

Generally speaking, if you have an array of numbers:

a = [4 2 1 8 -2];

percentiles can be computed by first sorting the array and then attempting to access the index supplied in the percentile. So prctile(a,20)'s functionality could in principle be replaced by

b = sort(a);
ind = round(length(b)*20/100);
if ind==0
    ind = 1;
end
b = b(ind);
% b = -2

However, prctile does a bit more of fancy magic interpolating the input vector to get a value that is less affected by array size. However, you can use the idea above to find the percentile splitting columns. If you chose to do it like I said above, what you want to do to get the headers that correspond to the 20% and 80% percentiles is to loop through the rows, remove the NaNs, get the indeces of the sort on the remaining values and get the particular index of the 20% or 80% percentile. Regrettably, I have an old version of Matlab that does not support tables so I couldn't verify if the header names are returned correctly, but the idea should be clear.

L = cell(size(T,1),1);
U = cell(size(T,1),1);
for row=1:size(T,1)
    row_values = T{row,:};
    row_values = row_values(2:end); % Remove date column
    non_nan_indeces = find(~isnan(row_values));
    if not(isempty(non_nan_indeces))
        [row_values,sorted_indeces] = sort(row_values(non_nan_indeces));
        % The +1 is because we removed the date column
        L_ind = non_nan_indeces(sorted_indeces(1:round(0.2*length(row_values))))+1;
        U_ind = non_nan_indeces(sorted_indeces(round(0.8*length(row_values)):end))+1;
        % I am unsure about this part
        L{row} = T.Properties.VariableNames(L_ind);
        U{row} = T.Properties.VariableNames(U_ind);
    else
        L{row} = nan;
        U{row} = nan;
    end
end;

If you want to use matlab's prctile, you would have to find the returned value's index doing something like this:

L = cell(size(T,1),1);
U = cell(size(T,1),1);
for row=1:size(T,1)
    row_values = T{row,:};
    row_values = row_values(2:end); % Remove date column
    non_nan_indeces = find(~isnan(row_values));
    if not(isempty(non_nan_indeces))
        [row_values,sorted_indeces] = sort(row_values(non_nan_indeces));
        L_val = prctile(row_values(non_nan_indeces),20);
        U_val = prctile(row_values(non_nan_indeces),80);
        % The +1 is because we removed the date column
        L_ind = non_nan_indeces(sorted_indeces(find(row_values<=L_val)))+1;
        U_ind = non_nan_indeces(sorted_indeces(find(row_values>=U_val)))+1;
        % I am unsure about this part
        L{row} = T.Properties.VariableNames(L_ind);
        U{row} = T.Properties.VariableNames(U_ind);
    else
        L{row} = nan;
        U{row} = nan;
    end
end;

How to use the Percentile Function for Ranking purpose? (Matlab)

There are 1 best solutions below

Related Questions in MATLAB

Related Questions in LOOPS

Related Questions in FOR-LOOP

Related Questions in RANKING

Related Questions in PERCENTILE

Trending Questions

Popular # Hahtags

Popular Questions