How do I add line breaks (e.g. \n) in an Apache POI HWPF document?


I have to modify a Word document in the old .doc format, using Apache POI with the HWPF representation of the document. I am struggling to insert line breaks into a table cell: in the modified document, the line breaks show up as empty boxes.

table cell with added line break

This is the code I used after selecting the specific cell:

cell.insertBefore("Test "+System.lineSeparator()+" Test");

The following also doesn't work:

cell.insertBefore("Test "+System.getProperty("line.separator")+" Test");
cell.insertBefore("Test \n Test");
cell.insertBefore("Test \r\n Test");

Everything I tried was turned into boxes.

I also tried writing the document to a temp file and then replacing a placeholder with HWPF, which also produced empty boxes. Does anybody know a solution to this? Thanks in advance.

Best answer

Forget about Apache POI HWPF. It is in the scratchpad and has seen no progress for decades. There are no usable methods to insert or create new paragraphs. All Range.insertBefore and Range.insertAfter methods that take more than plain text are private and deprecated, and they have not worked properly for just as long. The reason may be that the binary Microsoft Word format, which HWPF handles, is the most horrible of all the horrible binary Office file formats, such as those handled by HSSF and HSLF. So who wants to bother with this?

But to answer your question:

In word processing, text is structured in paragraphs containing text runs. Each paragraph starts on a new line by default. A "Text\nText", "Text\rText" or "Text\r\nText" stored in a text run would only mark a line break within that text run, not a new paragraph. "Would", because of course Microsoft Word has its own rules: there, \u000B marks a line break within a text run.
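If the text to insert already contains ordinary line separators, as in the question, they could be converted to \u000B first. The following is only a minimal sketch using plain Java string handling; the class name WordLineBreaks and the method toWordLineBreaks are hypothetical names chosen for illustration and are not part of Apache POI.

```java
public class WordLineBreaks {

    // Word's line break within a text run: the vertical tab, U+000B.
    public static final String WORD_LINE_BREAK = "\u000B";

    // Hypothetical helper (not a POI method): replaces Windows, old Mac and
    // Unix line separators with Word's in-run line break character.
    // The two-character Windows separator is listed first so that it
    // becomes a single break instead of two.
    public static String toWordLineBreaks(String text) {
        return text.replaceAll("\\r\\n|\\r|\\n", WORD_LINE_BREAK);
    }

    public static void main(String[] args) {
        String converted = toWordLineBreaks("Test" + System.lineSeparator() + "Test");
        // Print the code points, because the break character itself is not visible.
        converted.chars().forEach(c -> System.out.printf("U+%04X ", c));
        System.out.println();
    }
}
```

The converted string can then be passed to insertBefore in place of a hand-written "Test\u000BTest".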

So what you could do is the following:

import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.poi.hwpf.*;
import org.apache.poi.hwpf.usermodel.*;

public class ReadAndWriteDOCTable {

 public static void main(String[] args) throws Exception {

  // try-with-resources closes the input stream and the document even if an exception is thrown
  try (FileInputStream in = new FileInputStream("TemplateDOC.doc");
       HWPFDocument document = new HWPFDocument(in)) {

   Range bodyRange = document.getRange();
   System.out.println(bodyRange);

   // visit every table in the document body
   TableIterator tableIterator = new TableIterator(bodyRange);
   while (tableIterator.hasNext()) {
    Table table = tableIterator.next();
    System.out.println(table);
    TableCell cell = table.getRow(0).getCell(0); // first cell in table
    System.out.println(cell);
    Paragraph paragraph = cell.getParagraph(0); // first paragraph in cell
    System.out.println(paragraph);
    // \u000B is Word's line break within a text run
    CharacterRun run = paragraph.insertBefore("Test\u000BTest");
    System.out.println(run);
   }

   // write the modified document to a new file
   try (FileOutputStream out = new FileOutputStream("ResultDOC.doc")) {
    document.write(out);
   }
  }
 }
}

That places the text run "Test\u000BTest" before the first paragraph in the first cell of each table in the document. The \u000B marks a line break within that text run.

Maybe that is what you wanted to achieve? But, as I said, forget about Apache POI HWPF. The next unsolvable problem is only a step away.