I have an HTML string containing Chinese and Korean characters, and I want to convert it to PDF using iText. I have read that the font must be embedded in the PDF for the Unicode characters to show up.
When I embed wts11.ttf (with encoding IDENTITY_H) or STSong-Light (with encoding UniGB-UCS2-H), I see only the Chinese characters, not the Korean ones. I also tried arialuni.ttf (with encoding IDENTITY_H), but I still see only the Chinese characters.
Can someone please tell me which font I should use, or whether I am missing something?
Below is the code snippet:
Document document = new Document();
Paragraph paragraph = new Paragraph();
PdfWriter.getInstance(document, baos);
document.open();
BaseFont bff = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.EMBEDDED);
Font f = new Font(bff);
// FontFactory.registerDirectories();
// Font f = FontFactory.getFont("Arial Unicode MS", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
document.add(new Paragraph());
HTMLWorker htmlWorker = new HTMLWorker(document);
List<Element> objects = htmlWorker.parseToList(new StringReader(message), null);
paragraph.setFont(f);
for (Element elem : objects) {
    paragraph.add(elem);
}
document.add(paragraph);
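One way to see why a single CJK font may not be enough is to split the string by Unicode script. The sketch below is only an illustration and not part of the code above: `ScriptRuns`, `Kind`, `Run`, and `cjkFontFor` are hypothetical names. The mapping assumes iText's built-in CJK font names, where STSong-Light with UniGB-UCS2-H targets Simplified Chinese and HYGoThic-Medium with UniKS-UCS2-H targets Korean; both need the iText Asian font pack on the classpath. The sketch uses only the JDK, so it does not produce a PDF itself.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

public class ScriptRuns {

    /** Coarse script classes that matter when choosing a CJK font. */
    enum Kind { HAN, HANGUL, OTHER }

    /** A stretch of text whose characters all fall into one Kind. */
    static final class Run {
        final Kind kind;
        final String text;

        Run(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Maps a code point to HAN, HANGUL, or OTHER (Latin, spaces, punctuation, ...). */
    static Kind kindOf(int codePoint) {
        UnicodeScript script = UnicodeScript.of(codePoint);
        if (script == UnicodeScript.HAN) return Kind.HAN;
        if (script == UnicodeScript.HANGUL) return Kind.HANGUL;
        return Kind.OTHER;
    }

    /** Splits text into consecutive runs of the same Kind. */
    static List<Run> split(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Kind currentKind = null;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            Kind kind = kindOf(cp);
            if (currentKind != null && kind != currentKind) {
                // The script changed: close the current run and start a new one.
                runs.add(new Run(currentKind, current.toString()));
                current.setLength(0);
            }
            currentKind = kind;
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (currentKind != null) {
            runs.add(new Run(currentKind, current.toString()));
        }
        return runs;
    }

    /** Assumed font name / encoding pair, as the strings passed to BaseFont.createFont. */
    static String[] cjkFontFor(Kind kind) {
        switch (kind) {
            case HAN:    return new String[] {"STSong-Light", "UniGB-UCS2-H"};
            case HANGUL: return new String[] {"HYGoThic-Medium", "UniKS-UCS2-H"};
            default:     return new String[] {"Helvetica", "Cp1252"};
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only: two Han characters, then two Hangul syllables.
        String message = "Hello \u4F60\u597D \uC548\uB155";
        for (Run run : split(message)) {
            String[] font = cjkFontFor(run.kind);
            System.out.println(run.kind + " [" + run.text + "] -> " + font[0] + " / " + font[1]);
        }
    }
}
```

Each run could then be wrapped in its own Chunk with a Font built from the matching pair. iText's FontSelector class appears to automate a similar per-character choice, so it may be worth trying before writing this by hand.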
There are different ways to solve this problem if you upgrade from HTMLWorker to XML Worker.
I reused the code from the official examples, more specifically the ParseHtmlAsian example, and adapted the HTML that serves as its source like this:
The result looks like this:
As you can see, all the text is rendered correctly, so please do not spread incorrect messages such as "iText not rendering Chinese/Korean characters" ;-)
Please forward this answer to your management so that your CTO understands that investing time in an old iText version is more expensive than buying a license to use the new iText version.