I want to implement a feature which allows the user to double-click to highlight a word in a PDF document using the JPedal library. This would be trivial to do if I could get a word's bounding rectangle and see if the MouseEvent location falls within it; the following snippet demonstrates how to highlight a region:
    private void highlightText() {
        Rectangle highlightRectangle = new Rectangle(firstPoint.x, firstPoint.y,
                secondPoint.x - firstPoint.x, secondPoint.y - firstPoint.y);
        pdfDecoder.getTextLines().addHighlights(new Rectangle[]{highlightRectangle}, false, currentPage);
        pdfDecoder.repaint();
    }
However, I can only find plain-text extraction examples in the documentation.
After looking at Mark's examples I managed to get it working. There are a few quirks, so I'll explain how it all works in case it helps someone else. The key method is extractTextAsWordlist, which, when given a region to extract from, returns a List<String> of the form {word1, w1_x1, w1_y1, w1_x2, w1_y2, word2, w2_x1, ...}. Step-by-step instructions are listed below.
Firstly, you need to transform the MouseEvent's Component/screen coordinates to PDF page coordinates and correct for scaling.
Next, create a box to scan for text. I chose to make this the width of the page and +/- 20 page units vertically (a fairly arbitrary number), centered at the MouseEvent.
Then I parsed the extracted word list into a sequence of Rectangles.
Then I identified which Rectangle the MouseEvent fell within.
For some reason, just passing this Rectangle to the highlighting method didn't work. After some tinkering, I found that shrinking the Rectangle by a point on each side resolved the problem.
Then I just passed it to a method that adds the highlights.
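
For illustration, the parsing, hit-testing, and shrinking steps described above might look roughly like the plain-Java sketch below. It makes no JPedal calls and only assumes the list layout quoted earlier, {word, x1, y1, x2, y2, ...}, with coordinates given as numeric strings. The class and method names (WordBoxes, parse, findAt, shrink) are hypothetical, and since the order of the two corner points is an assumption, the sketch normalises them with min/abs.

```java
import java.awt.Point;
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

final class WordBoxes {

    /** Turns {word, x1, y1, x2, y2, ...} into one Rectangle per word. */
    static List<Rectangle> parse(List<String> wordList) {
        List<Rectangle> boxes = new ArrayList<>();
        // Each word occupies five consecutive entries; a trailing partial group is ignored.
        for (int i = 0; i + 4 < wordList.size(); i += 5) {
            int x1 = coord(wordList, i + 1);
            int y1 = coord(wordList, i + 2);
            int x2 = coord(wordList, i + 3);
            int y2 = coord(wordList, i + 4);
            // Normalise so the rectangle is valid whichever corner comes first.
            boxes.add(new Rectangle(Math.min(x1, x2), Math.min(y1, y2),
                    Math.abs(x2 - x1), Math.abs(y2 - y1)));
        }
        return boxes;
    }

    /** Returns the first box containing the page point, or null if none does. */
    static Rectangle findAt(List<Rectangle> boxes, Point pagePoint) {
        for (Rectangle box : boxes) {
            if (box.contains(pagePoint)) {
                return box;
            }
        }
        return null;
    }

    /** Shrinks a box by the given margin on every side. */
    static Rectangle shrink(Rectangle box, int margin) {
        return new Rectangle(box.x + margin, box.y + margin,
                box.width - 2 * margin, box.height - 2 * margin);
    }

    // Coordinates may arrive as "12" or "12.5"; parse as float, then truncate.
    private static int coord(List<String> wordList, int index) {
        return (int) Float.parseFloat(wordList.get(index));
    }
}
```

With these helpers, shrink(findAt(parse(words), pagePoint), 1) would give the Rectangle to highlight, after a null check for clicks that miss every word.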
Finally, all the above calls are packed into this convenient method:
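
A rough sketch of how such a wrapper could be organised is shown below. It is not the original method: the JPedal extraction call is hidden behind a hypothetical WordExtractor interface so the example stays self-contained, and the coordinate conversion assumes a uniform inset around the page, a single scaling factor, and a page space whose origin is at the bottom-left with y growing upward (AWT mouse coordinates grow downward). All names here (DoubleClickHighlighter, toPagePoint, scanBox, wordBoxAt) are illustrative.

```java
import java.awt.Point;
import java.awt.Rectangle;
import java.util.List;

final class DoubleClickHighlighter {

    /** Hypothetical stand-in for the extraction call; returns {word, x1, y1, x2, y2, ...}. */
    interface WordExtractor {
        List<String> wordsIn(Rectangle pageRegion);
    }

    // +/- 20 page units around the click, as in the description above.
    static final int SCAN_HALF_HEIGHT = 20;

    /** Converts component coordinates to page coordinates (assumed y-up, bottom-left origin). */
    static Point toPagePoint(Point mouse, int inset, float scaling, int pageHeight) {
        int x = Math.round((mouse.x - inset) / scaling);
        int y = pageHeight - Math.round((mouse.y - inset) / scaling);
        return new Point(x, y);
    }

    /** Full-width strip centered vertically on the page point. */
    static Rectangle scanBox(Point pagePoint, int pageWidth) {
        return new Rectangle(0, pagePoint.y - SCAN_HALF_HEIGHT, pageWidth, 2 * SCAN_HALF_HEIGHT);
    }

    /** Returns the shrunken box of the word under the mouse, or null if no word was hit. */
    static Rectangle wordBoxAt(Point mouse, int inset, float scaling,
                               int pageWidth, int pageHeight, WordExtractor extractor) {
        Point p = toPagePoint(mouse, inset, scaling, pageHeight);
        List<String> words = extractor.wordsIn(scanBox(p, pageWidth));
        for (int i = 0; i + 4 < words.size(); i += 5) {
            int x1 = coord(words, i + 1), y1 = coord(words, i + 2);
            int x2 = coord(words, i + 3), y2 = coord(words, i + 4);
            Rectangle box = new Rectangle(Math.min(x1, x2), Math.min(y1, y2),
                    Math.abs(x2 - x1), Math.abs(y2 - y1));
            if (box.contains(p)) {
                // Shrink by one unit per side, mirroring the workaround described above.
                return new Rectangle(box.x + 1, box.y + 1, box.width - 2, box.height - 2);
            }
        }
        return null;
    }

    private static int coord(List<String> words, int index) {
        return (int) Float.parseFloat(words.get(index));
    }
}
```

In a real viewer, the extractor would delegate to extractTextAsWordlist, and a non-null result would be handed to a highlighting call such as the addHighlights snippet in the question.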