My program reads names containing Polish characters from an Excel file and should convert them to UTF-8, but I get a question mark (?) in place of each Polish letter, e.g. Le?na instead of Leśna. Below is a code snippet; I would appreciate any suggestions as to why it is not working.
    List<Adress> list = new List<Adress>();
    using (var reader = new StreamReader(filePath, Encoding.UTF8))
    using (var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.CurrentCulture) { Encoding = Encoding.UTF8 }))
    {
        while (csv.Read())
        {
            if (csv.GetField(3) == "Miasto" || csv.GetField(4) == "Ulica")
            {
                continue;
            }
            string consumerId = csv.GetField(0);
            string investmentId = csv.GetField(6);
            if (string.IsNullOrWhiteSpace(investmentId))
            {
                investmentId = " ";
            }
            else if (string.IsNullOrWhiteSpace(consumerId))
            {
                consumerId = " ";
            }
            list.Add(new Adress
            {
                ConsumerId = consumerId,
                AccountGroup = csv.GetField(1),
                PostalCode = csv.GetField(2),
                City = csv.GetField(3),
                Street = csv.GetField(4),
                Number = csv.GetField(5),
                InvestmentId = investmentId
            });
        }
        var records = list.ToArray();
        return records;
    }
I tried:
    using (var reader = new StreamReader(filePath, Encoding.UTF8))
    using (var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.CurrentCulture) { Encoding = Encoding.UTF8 }))
and
    Street = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(csv.GetField(4))),
I also tried the Windows-1252 encoding:
    string charset = detector.Charset.ToLowerInvariant();
    Encoding encoding;
    if (charset == "windows-1252")
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        encoding = Encoding.GetEncoding(1252);
    }
After all of this, I still can't get the Polish characters.
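To narrow the problem down, here is a minimal sketch of a check that decodes the same raw bytes with two different encodings so the results can be compared. It assumes the file might actually be saved in Windows-1250 (the Central European code page), which I have not confirmed. The class name EncodingCheck, the method DecodeBoth, and the sample bytes are made up for illustration and are not part of my program.

```csharp
using System;
using System.Text;

public static class EncodingCheck
{
    // Decodes the same bytes as UTF-8 and as Windows-1250 so the two
    // results can be compared side by side. (Hypothetical helper.)
    public static (string AsUtf8, string AsCp1250) DecodeBoth(byte[] raw)
    {
        // Code-page encodings such as 1250 need this provider on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string asUtf8 = Encoding.UTF8.GetString(raw);
        string asCp1250 = Encoding.GetEncoding(1250).GetString(raw);
        return (asUtf8, asCp1250);
    }

    public static void Main()
    {
        // Sample input: "Leśna" encoded as Windows-1250, where ś is the single byte 0x9C.
        byte[] raw = { 0x4C, 0x65, 0x9C, 0x6E, 0x61 };

        var (asUtf8, asCp1250) = DecodeBoth(raw);
        Console.WriteLine("UTF-8:        " + asUtf8);
        Console.WriteLine("Windows-1250: " + asCp1250);
    }
}
```

If the Windows-1250 result shows the Polish letters and the UTF-8 result does not, that would suggest the file is not UTF-8 to begin with. In that case, passing Encoding.UTF8 to StreamReader would be the wrong choice, and re-encoding the already decoded string (as in my Street attempt above) could not bring the lost characters back.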