Understanding ICU ubidi. Direction is always UBIDI_LTR

I wrote a piece of sample code, based on the ICU reference, that reads a line from a file, gets its base direction, and shows the result of the Unicode Bidi algorithm on it.

In my input file I have written فارسی, which is a sequence of right-to-left characters, but this line std::cout << ubidi_getBaseDirection(us.getBuffer(), us.length()) << std::endl; prints 0, which is UBIDI_LTR. No matter what combination of RTL and LTR characters I put in the input file, the result always has one run with direction UBIDI_LTR.

Is there something wrong with my code?

#include "unicode/utypes.h"
#include "unicode/uchar.h"
#include "unicode/localpointer.h"
#include "unicode/ubidi.h"
#include <unicode/unistr.h>
#include <string>
#include <iostream>
#include <fstream>
#include "unicode/ustream.h"


using namespace icu;
using icu::UnicodeString;

int main(int argc, char* argv[])
{
    std::string input;
    std::string output;

    std::ifstream MyReadFile("in.txt");
    getline(MyReadFile, input);


    UnicodeString us(input.c_str());
    UBiDi* bidi = ubidi_open();
    UErrorCode errorCode = U_ZERO_ERROR;
    ubidi_setPara(bidi, us.getBuffer(), us.length(), UBIDI_RTL, nullptr, &errorCode);

    std::cout << (ubidi_getBaseDirection(us.getBuffer(), us.length()) == UBIDI_LTR) << std::endl;

    std::ofstream MyFile;
    MyFile.open("out.txt");

    if (U_SUCCESS(errorCode))
    {
        UnicodeString Ustring(ubidi_getText(bidi));
        std::string Ustr;
        Ustring.toUTF8String(Ustr);
        int32_t count = ubidi_countRuns(bidi, &errorCode);
        int32_t logicalStart, length;

        if (count > 0)
            MyFile << "VisualRun \t" << "direction" << "\t" << "s" << '\t' << "l" << '\t' << "output" << std::endl;

        for (int32_t i = 0; i < count; i++) {

            UBiDiDirection dir = ubidi_getVisualRun(bidi, i, &logicalStart, &length);
            std::string dirstr = "UBIDI_LTR";
            if (dir == UBIDI_RTL)
                dirstr = "UBIDI_RTL";

            UnicodeString temp = Ustring.tempSubString(logicalStart, length);

            // Convert this run's text to UTF-8 for the output file.
            output.clear();
            temp.toUTF8String(output);

            MyFile << "VisualRun \t" << dirstr << "\t" << logicalStart << '\t' << length << '\t' << output << std::endl;

        }
    }
    else
    {
        std::cout << "Failed" << std::endl;
    }
    MyFile.close();
    ubidi_close(bidi);  // Release the UBiDi object allocated by ubidi_open().


    return 0;
}
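
One possible explanation, sketched without ICU so it compiles on its own: ubidi_getBaseDirection reports the direction of the first strong (L, R, or AL) character. If the UTF-8 bytes read from in.txt are converted with a single-byte default codepage (which UnicodeString(const char*) may do, depending on the platform), the first character becomes a Latin letter and the result is LTR. The code below is a simplified stand-in, not ICU's implementation. The names decodeUtf8, decodeSingleByte, firstStrongDirection, and demo, as well as the small character ranges, are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified result type, mirroring the three values ICU can report.
enum class BaseDir { LTR, RTL, Neutral };

// Decode UTF-8 into code points. Assumes well-formed input; no validation.
std::vector<char32_t> decodeUtf8(const std::string& s) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        int extra = b < 0x80 ? 0 : b < 0xE0 ? 1 : b < 0xF0 ? 2 : 3;
        // Keep only the payload bits of the lead byte.
        char32_t cp = (extra == 0) ? b : (b & (0x3F >> extra));
        for (int k = 1; k <= extra && i + k < s.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        out.push_back(cp);
        i += static_cast<std::size_t>(extra) + 1;
    }
    return out;
}

// Treat every byte as one character, as a Latin-1 style codepage would.
std::vector<char32_t> decodeSingleByte(const std::string& s) {
    std::vector<char32_t> out;
    for (unsigned char b : s) out.push_back(b);
    return out;
}

// Toy classification (assumption): only a few letter ranges, not the full
// Unicode Bidi_Class data that ICU uses.
bool isStrongRtl(char32_t c) {
    return (c >= 0x05D0 && c <= 0x05EA) ||   // Hebrew letters
           (c >= 0x0620 && c <= 0x064A) ||   // Arabic letters
           (c >= 0x0671 && c <= 0x06D3);     // extended Arabic letters
}

bool isStrongLtr(char32_t c) {
    return (c >= U'A' && c <= U'Z') || (c >= U'a' && c <= U'z') ||
           (c >= 0x00C0 && c <= 0x00FF && c != 0x00D7 && c != 0x00F7);
}

// First-strong rule: the first strong character decides the direction.
BaseDir firstStrongDirection(const std::vector<char32_t>& text) {
    for (char32_t c : text) {
        if (isStrongRtl(c)) return BaseDir::RTL;
        if (isStrongLtr(c)) return BaseDir::LTR;
    }
    return BaseDir::Neutral;
}

// Usage: the same bytes give different answers depending on the decoder.
void demo() {
    // UTF-8 bytes of the word from the question (five Arabic-script letters).
    const std::string farsi = "\xD9\x81\xD8\xA7\xD8\xB1\xD8\xB3\xDB\x8C";
    assert(firstStrongDirection(decodeUtf8(farsi)) == BaseDir::RTL);
    // Byte 0xD9 read as a single-byte character is U+00D9, a Latin letter.
    assert(firstStrongDirection(decodeSingleByte(farsi)) == BaseDir::LTR);
}
```

If this is what is happening, constructing the string with UnicodeString::fromUTF8(input) and passing UBIDI_DEFAULT_LTR as the paragraph level to ubidi_setPara would be the first things to try. This is an assumption about the cause, not a confirmed diagnosis.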