I am using NodeCollection and LayoutCollector to get the text and the start page number of each node:
import com.aspose.words.*;

Document doc = new Document("input.docx");
// The collector must be attached to the same document instance that is enumerated.
LayoutCollector layoutCollector = new LayoutCollector(doc);
NodeCollection nodes = doc.getChildNodes(NodeType.ANY, true);
for (Node node : (Iterable<Node>) nodes)
{
    // Page on which the node begins.
    System.out.println("Start PageNumber : " + layoutCollector.getStartPageIndex(node));
    switch (node.getNodeType())
    {
        case NodeType.PARAGRAPH:
            System.out.println(node.getText());
            break;
    }
}
Here I want to get the start line number of each node along with its page number. How can I achieve this?
As you know, MS Word documents have no concept of a page or a line because of their flow nature. Consumer applications build the document layout on the fly, and Aspose.Words does the same using its own layout engine.
The LayoutCollector and LayoutEnumerator classes provide limited access to document layout information. Unfortunately, there is no direct way to get the line index of a node using these classes. However, you can get the layout entity of type LayoutEntityType.LINE for a particular node. For example, the following code demonstrates the basic technique of splitting document content into lines:
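Separately from the Aspose.Words code itself: once you have the page index and the top coordinate of the line entity for each node (for example from LayoutEnumerator.getPageIndex() and LayoutEnumerator.getRectangle(), as I understand the API), a line number can be derived by ranking the line positions on each page. Below is a minimal sketch in plain Java. LineNumbering and LinePosition are hypothetical names used only for illustration and are not part of Aspose.Words. It assumes a single-column page, that nodes on the same line report the same top coordinate, and that a position is supplied for every line on the page (otherwise the numbers are relative to the collected lines only).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class LineNumbering {

    /** Hypothetical holder: page index and top Y coordinate of the line that contains a node. */
    public static final class LinePosition {
        final int page;
        final double top;

        public LinePosition(int page, double top) {
            this.page = page;
            this.top = top;
        }
    }

    /**
     * Returns the 1-based line number of each position within its own page.
     * Lines are ranked by their top coordinate; equal tops share one number.
     */
    public static List<Integer> lineNumbers(List<LinePosition> positions) {
        // Collect the distinct line tops of every page, sorted ascending.
        Map<Integer, TreeSet<Double>> topsByPage = new HashMap<>();
        for (LinePosition p : positions) {
            topsByPage.computeIfAbsent(p.page, k -> new TreeSet<>()).add(p.top);
        }

        List<Integer> result = new ArrayList<>();
        for (LinePosition p : positions) {
            // headSet(top, true) holds every distinct top at or above this line.
            result.add(topsByPage.get(p.page).headSet(p.top, true).size());
        }
        return result;
    }

    public static void main(String[] args) {
        List<LinePosition> positions = Arrays.asList(
                new LinePosition(1, 72.0),   // first line of page 1
                new LinePosition(1, 72.0),   // another node on the same line
                new LinePosition(1, 86.5),   // second line of page 1
                new LinePosition(2, 72.0));  // first line of page 2
        System.out.println(lineNumbers(positions)); // prints [1, 1, 2, 1]
    }
}
```

Ranking by the top coordinate keeps the sketch independent of how the line entities were gathered; for multi-column pages you would need to rank by column first, which this sketch does not attempt.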