UTF-8 encoding of Polish characters - C#

My program reads names containing Polish characters from an Excel file and should convert them to UTF-8, but I get a question mark (?) in place of each Polish letter, e.g. Le?na instead of Leśna. Below is a code snippet. I would appreciate any suggestions as to why it is not working.

List<Adress> list = new List<Adress>();

using (var reader = new StreamReader(filePath, Encoding.UTF8))
using (var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.CurrentCulture) { Encoding = Encoding.UTF8 }))
{
    while (csv.Read())
    {
        if (csv.GetField(3) == "Miasto" || csv.GetField(4) == "Ulica")
        {
            continue;
        }
        string consumerId = csv.GetField(0);
        string investmentId = csv.GetField(6);

        if (string.IsNullOrWhiteSpace(investmentId))
        {
            investmentId = " ";
        }
        if (string.IsNullOrWhiteSpace(consumerId))
        {
            consumerId = " ";
        }

        list.Add(new Adress
        {
            ConsumerId = consumerId,
            AccountGroup = csv.GetField(1),
            PostalCode = csv.GetField(2),
            City = csv.GetField(3),
            Street = csv.GetField(4),
            Number = csv.GetField(5),
            InvestmentId = investmentId
        });
    }

    var records = list.ToArray();
    return records;
}

I tried:

using (var reader = new StreamReader(filePath, Encoding.UTF8))
using (var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.CurrentCulture) { Encoding = Encoding.UTF8 }))

and

Street = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(csv.GetField(4))),

I also tried the Windows-1252 encoding:

string charset = detector.Charset.ToLowerInvariant();
Encoding encoding;

if (charset == "windows-1252")
{
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    encoding = Encoding.GetEncoding(1252);
}

After all of this, I still can't get the Polish characters.
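
For reference, here is a minimal sketch of what the attempts above do at the byte level. It assumes the file was saved in Windows-1250 (the Central European code page), which is only an assumption since the actual encoding of the file is not known. `EncodingDemo` and `DecodeAs` are hypothetical names used for illustration. The sketch shows that the same bytes decode to different text depending on the code page, that Windows-1252 has no `ś`, and that a UTF-8 encode/decode round trip on an already decoded string changes nothing.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: decodes raw bytes using the given Windows code page.
    public static string DecodeAs(byte[] bytes, int codePage)
    {
        // Code-page encodings such as 1250 need this provider on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(bytes);
    }

    public static void Main()
    {
        // "Leśna" as it would be stored in a Windows-1250 file (ASSUMED encoding):
        // in that code page 'ś' is the single byte 0x9C.
        byte[] bytes = { 0x4C, 0x65, 0x9C, 0x6E, 0x61 };

        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(DecodeAs(bytes, 1250));          // Leśna
        Console.WriteLine(DecodeAs(bytes, 1252));          // Leœna (0x9C is 'œ' in Windows-1252)
        Console.WriteLine(Encoding.UTF8.GetString(bytes)); // Le\uFFFDna (0x9C alone is invalid UTF-8)

        // Encoding a string to UTF-8 and decoding it straight back is a no-op,
        // so it cannot repair a string that was already decoded incorrectly.
        string damaged = "Le?na";
        string roundTrip = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(damaged));
        Console.WriteLine(roundTrip == damaged);           // True
    }
}
```

If the file itself already contains a literal `?` (for example because it was exported with a code page that has no Polish letters), no choice of encoding on the reading side can recover the original character. Inspecting the raw bytes of the file with a hex viewer would show which case applies.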
