How to flatten a non-perfect loop in Vitis HLS

My project encodes an input string into an integer vector. I already have an encoding method: I create a lookup table and stream the input string in. Each char of the input string is compared against the char keys of the lookup table; on a match, the corresponding vector is fetched and its values are added up into the result. Here is an example:

lookup table
$       {1, -1, -1, 1, -1, ...., 1}
c       {1, -1, 1, 1, 1, ...., 1}
C       {1, 1, -1, 1, -1, ...., 1}
....
*       {-1, 1, -1, 1, -1, ...., 1}
&       {1, 1, -1, 1, 1, ...., 1}

input[in_len] = {'*','C','(','=','O',')','[','C','@','H',']','(','C','C','C','C','N','C','(','=','O',')','O','C','C','O','C',')','N','C','(','=','O',')','O','C','C','O','C'};

Here is my code:

void ENCODING_HV(FPGA_DATA input[in_len], 
                 FPGA_DATA lookup_key[NUM_TOKEN], 
                 LOOKUP_DATA lookup_HV[NUM_TOKEN],
                 int result_HV[CHUNK_NUM][CHUNK_SIZE]){
    
    for(int i = 0; i < in_len; i++){
        for(int j = 0; j < NUM_TOKEN; j++){
            if(input[i] == lookup_key[j]){
                for(int k = 0; k < CHUNK_NUM; k++){
    //#pragma HLS PIPELINE II=1
                    for(int l = 0; l < CHUNK_SIZE; l++){
    #pragma HLS ARRAY_RESHAPE variable=lookup_HV complete dim=3
    #pragma HLS ARRAY_RESHAPE variable=result complete dim=2
                        if(lookup_HV[j].map[k][l] == -1)
                            result_HV[k][l] = result_HV[k][l] + 1;
                        else
                            result_HV[k][l] = result_HV[k][l] - 1;
                    }
                }
            }
        }
    }
}

In the second for loop I have an if statement that compares the input char with the key char of the lookup table. Vitis HLS reports "Cannot flatten loop 'VITIS_LOOP_60_2'", and synthesis takes a long time. Could anyone give me an idea of how to fix this? Thank you.

WARNING: [HLS 200-960] Cannot flatten loop 'VITIS_LOOP_60_2' (Stream_Interface/HLS_scholarly/data_tokenize_2.cpp:60:28) in function 'create_sample_HV' the outer loop is not a perfect loop because there is nontrivial logic before entering the inner loop.
Resolution: For help on HLS 200-960 see www.xilinx.com/cgi-bin/docs/rdoc?v=2021.1;t=hls+guidance;d=200-960.html

There are 1 best solutions below

Disclaimer: since you haven't provided the full code, I could not test this solution myself.

Performance-wise, as a rule of thumb, if-statements are "cheap" in hardware and HLS, since most of the time they resolve into multiplexers (MUXes). That is not true in software, where branches interrupt the instruction flow.
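
To illustrate the MUX point, here is a minimal sketch in plain C++. The helper names update_branch and update_select are hypothetical and do not appear in the question. Both functions compute the same accumulator update; the second one spells out the select form that an HLS tool would typically infer from the first.

```cpp
// Hypothetical helpers, for illustration only (not from the question).

// Guarded update written with if-statements, mirroring the question's logic.
int update_branch(int acc, int hv_bit, bool match) {
    if (match) {
        if (hv_bit == -1)
            acc = acc + 1;
        else
            acc = acc - 1;
    }
    return acc;
}

// The same update written as nested selects.
// Each ?: corresponds to a 2:1 multiplexer in hardware.
int update_select(int acc, int hv_bit, bool match) {
    int delta = (hv_bit == -1) ? 1 : -1;  // inner select: +1 or -1
    return match ? acc + delta : acc;     // outer select: update or keep acc
}
```

In both versions the adder always exists in hardware; the condition only decides which value is written back.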

In your algorithm, the if-statement guards two nested for-loops and may skip them entirely. In HLS, however, for-loops cannot simply be "skipped", because they end up as physical hardware components. Looking at your code, I would therefore move the if-statement inside the innermost for-loop, since doing so doesn't change the result of the algorithm. It also leaves no logic between the loop levels, which is exactly what the HLS 200-960 warning complains about.

A final solution might therefore be:

void ENCODING_HV(FPGA_DATA input[in_len], 
                 FPGA_DATA lookup_key[NUM_TOKEN], 
                 LOOKUP_DATA lookup_HV[NUM_TOKEN],
                 int result_HV[CHUNK_NUM][CHUNK_SIZE]){
#pragma HLS ARRAY_RESHAPE variable=lookup_HV complete dim=3
#pragma HLS ARRAY_RESHAPE variable=result_HV complete dim=2
    for(int i = 0; i < in_len; i++){
        for(int j = 0; j < NUM_TOKEN; j++){
            for(int k = 0; k < CHUNK_NUM; k++){
#pragma HLS PIPELINE II=1
                for(int l = 0; l < CHUNK_SIZE; l++){
                    if(input[i] == lookup_key[j]){
                        if(lookup_HV[j].map[k][l] == -1)
                            result_HV[k][l] = result_HV[k][l] + 1;
                        else
                            result_HV[k][l] = result_HV[k][l] - 1;
                    }
                }
            }
        }
    }
}

Side note: because all the if-statements now sit inside the innermost for-loop, you can also move the PIPELINE pragma up to an outer loop; Vitis HLS then fully unrolls every loop nested beneath the pipelined one.
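
As a rough sketch of that side note, the version below places the PIPELINE pragma on the j loop and compares its output against the original guard-outside structure in plain C++. The sizes, the FPGA_DATA and LOOKUP_DATA definitions, the function names and the sample data are all assumptions made up for this example, since the question does not show them. I have not synthesized this, so whether II=1 is actually achieved will depend on array partitioning and on the read-modify-write of result_HV.

```cpp
// All sizes and type definitions are assumptions for illustration;
// the question does not show the real ones.
constexpr int in_len = 4;
constexpr int NUM_TOKEN = 3;
constexpr int CHUNK_NUM = 2;
constexpr int CHUNK_SIZE = 4;
typedef char FPGA_DATA;
struct LOOKUP_DATA { int map[CHUNK_NUM][CHUNK_SIZE]; };

// Original structure: the guard sits between loop levels (imperfect nest).
void encode_guard_outside(const FPGA_DATA input[in_len],
                          const FPGA_DATA lookup_key[NUM_TOKEN],
                          const LOOKUP_DATA lookup_HV[NUM_TOKEN],
                          int result_HV[CHUNK_NUM][CHUNK_SIZE]) {
    for (int i = 0; i < in_len; i++)
        for (int j = 0; j < NUM_TOKEN; j++)
            if (input[i] == lookup_key[j])
                for (int k = 0; k < CHUNK_NUM; k++)
                    for (int l = 0; l < CHUNK_SIZE; l++)
                        result_HV[k][l] += (lookup_HV[j].map[k][l] == -1) ? 1 : -1;
}

// Guard moved into the innermost loop; PIPELINE placed on the j loop,
// so the k and l loops beneath it are unrolled by the tool.
void encode_guard_inside(const FPGA_DATA input[in_len],
                         const FPGA_DATA lookup_key[NUM_TOKEN],
                         const LOOKUP_DATA lookup_HV[NUM_TOKEN],
                         int result_HV[CHUNK_NUM][CHUNK_SIZE]) {
    for (int i = 0; i < in_len; i++) {
        for (int j = 0; j < NUM_TOKEN; j++) {
#pragma HLS PIPELINE II=1
            for (int k = 0; k < CHUNK_NUM; k++) {
                for (int l = 0; l < CHUNK_SIZE; l++) {
                    if (input[i] == lookup_key[j])
                        result_HV[k][l] += (lookup_HV[j].map[k][l] == -1) ? 1 : -1;
                }
            }
        }
    }
}

// Runs both versions on the same made-up data and reports whether they agree.
bool results_match() {
    const FPGA_DATA input[in_len] = {'C', '*', 'C', '?'};  // '?' matches no key
    const FPGA_DATA keys[NUM_TOKEN] = {'$', 'C', '*'};
    LOOKUP_DATA hv[NUM_TOKEN];
    for (int t = 0; t < NUM_TOKEN; t++)
        for (int k = 0; k < CHUNK_NUM; k++)
            for (int l = 0; l < CHUNK_SIZE; l++)
                hv[t].map[k][l] = ((t + k + l) % 2 == 0) ? 1 : -1;  // arbitrary +1/-1 pattern

    int a[CHUNK_NUM][CHUNK_SIZE] = {};
    int b[CHUNK_NUM][CHUNK_SIZE] = {};
    encode_guard_outside(input, keys, hv, a);
    encode_guard_inside(input, keys, hv, b);

    for (int k = 0; k < CHUNK_NUM; k++)
        for (int l = 0; l < CHUNK_SIZE; l++)
            if (a[k][l] != b[k][l]) return false;
    return true;
}
```

A check like results_match() can go into the C simulation testbench to confirm that moving the guard leaves the output unchanged before looking at the synthesis report.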