Optimised EmEditor Macro to return Min/Max column lengths on large delimited data


I have large delimited data sets and need to return the min/max lengths for each column. I'm currently using the following code in EmEditor v20.3. It works, but it is slow when there are millions of lines of data and hundreds of columns, so I'm wondering whether there is a quicker way.

Any quicker approaches or ideas that could be wrapped into a JavaScript macro would be much appreciated.


for( col = colStart; col <= MaxCol; col++ ) {
    sTitle = document.GetCell( 1, col, eeCellIncludeNone );
    min = -1;
    max = 0;
    for( line = document.HeadingLines + 1; line < MaxLines; line++ ) {
        str = document.GetCell( line, col, eeCellIncludeQuotesAndDelimiter );
        if( min == -1 || min > str.length ) {
            min = str.length;
        }
        if( max < str.length ) {
            max = str.length;
        }
    }
    // Separate col from min with a string so the two numbers are concatenated, not added
    OutputBar.writeln( col + " : " + min + "    " + max + " " + sTitle );
}

Best answer

Please update EmEditor to 20.3.906 or later, and run this macro:

colStart = 1;
MaxCol = document.GetColumns();
document.selection.EndOfDocument();
yLastLine = document.selection.GetActivePointY( eePosCellLogical );

min = -1;
max = 0;
for( col = colStart; col <= MaxCol; col++ ) {
    sTitle = document.GetCell( 1, col, eeCellIncludeNone );
    document.selection.SetActivePoint( eePosCellLogical, col, 1 );
    
    editor.ExecuteCommandByID( 4064 );  // Find Empty or Shortest Cell
    y = document.selection.GetActivePointY( eePosCellLogical );
    if( y < yLastLine ) {  // check if not the last empty line
        str = document.GetCell( y, col, eeCellIncludeQuotes );
        min = str.length;
    }
    else {  // if the last empty line
        document.selection.SetActivePoint( eePosCellLogical, col, 1 );
        editor.ExecuteCommandByID( 4050 );  // Find Non-empty Shortest Cell
        y = document.selection.GetActivePointY( eePosCellLogical );
        str = document.GetCell( y, col, eeCellIncludeQuotes );
        min = str.length;
    }
    
    document.selection.SetActivePoint( eePosCellLogical, col, 1 );
    editor.ExecuteCommandByID( 4049 );  // Find Longest Cell
    y = document.selection.GetActivePointY( eePosCellLogical );
    str = document.GetCell( y, col, eeCellIncludeQuotes );
    max = str.length;
    OutputBar.writeln( col + " : " + min + "    " + max + " " + sTitle);
}
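
For comparison, here is a minimal sketch of a different idea: visit each row once and track the min/max length of every column as you go, instead of scanning the whole document once per column. It is plain JavaScript and does not call the EmEditor API. The helper name columnLengthStats and the sample data are hypothetical, and the sketch assumes a simple delimiter with no quoted cells that contain the delimiter. How the rows are read inside a macro, and whether this is faster than the built-in commands above on very large files, is not covered here.

```javascript
// Hypothetical helper (not part of EmEditor): one pass over the rows,
// tracking the shortest and longest cell length seen in each column.
// Assumption: cells never contain the delimiter (no quoted-field handling).
function columnLengthStats( rows, delimiter ) {
    var stats = [];  // stats[i] holds { min, max } for column i + 1
    for( var r = 0; r < rows.length; r++ ) {
        var cells = rows[r].split( delimiter );
        for( var c = 0; c < cells.length; c++ ) {
            var len = cells[c].length;
            if( !stats[c] ) {
                // First time this column is seen
                stats[c] = { min: len, max: len };
            }
            else {
                if( len < stats[c].min ) stats[c].min = len;
                if( len > stats[c].max ) stats[c].max = len;
            }
        }
    }
    // Note: a row with fewer cells than others does not count as length 0
    // for the columns it is missing.
    return stats;
}

// Made-up sample rows (heading row already excluded)
var sample = [ "a,bbb,", "aaaa,b,cc" ];
var result = columnLengthStats( sample, "," );
for( var i = 0; i < result.length; i++ ) {
    console.log( ( i + 1 ) + " : " + result[i].min + "    " + result[i].max );
}
```

Inside a macro, console.log would be replaced with OutputBar.writeln as in the code above.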