I have designed and built an algorithm in my VB.NET VSTO project to break up any text into its sentences. It's used by a language specialist to determine the readability of text (I'm also automating the readability analysis). I do not rely on Find with wildcards, as that is too limited. The algorithm has to deal with quoted text (even when it is part of a larger sentence), lists and headings.
So far so good. The algorithm works as intended.
I thought it would be useful to show the outcome of the 'break up text into sentences' algorithm in color by highlighting the individual sentences in the document. While doing this I encountered an inconsistency in Word's behaviour (using Office 2016 Professional 32-bit on Win10). I'd like to share this and see if anyone can provide more insight before I devise a solution. Am I missing anything?
Outside a table, I can set a range to any text and then set the .HighlightColorIndex property and the color is changed and visible in the Word editor.
Inside a table cell, it works the same AS LONG AS there is no paragraph marker (vbCr) immediately following the text (I'm not including the vbCr in my range). When a paragraph marker does follow the text, the .HighlightColorIndex property is changed (confirmed in the debugger), but the color is not visibly changed in the Word editor. It only works when I include the paragraph marker in my range. Outside a table that is not required.
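A minimal sketch of how the table case described above might be reproduced in isolation. The module name, the Sub name, the `app` and `includeParagraphMark` parameters, and the sample text are all hypothetical and not taken from the actual project; it assumes a reference to the Word interop assembly and a running Word instance.

```vbnet
Imports Microsoft.VisualBasic
Imports Word = Microsoft.Office.Interop.Word

Module HighlightRepro
    ' Hypothetical helper: builds a one-cell table in a new document and
    ' highlights a sentence that is directly followed by a paragraph mark.
    Sub HighlightInCell(app As Word.Application, includeParagraphMark As Boolean)
        Dim doc As Word.Document = app.Documents.Add()
        Dim tbl As Word.Table = doc.Tables.Add(doc.Range(0, 0), 1, 1)
        tbl.Cell(1, 1).Range.Text =
            "First sentence. Second sentence." & vbCr & "Third sentence."

        ' Search only the first paragraph of the cell.
        Dim rng As Word.Range = tbl.Cell(1, 1).Range.Paragraphs(1).Range
        If rng.Find.Execute(FindText:="Second sentence.") Then
            ' rng now spans the found text, which ends just before the vbCr.
            If includeParagraphMark Then
                ' Extend the range by one character to take in the vbCr.
                rng.MoveEnd(Word.WdUnits.wdCharacter, 1)
            End If
            rng.HighlightColorIndex = Word.WdColorIndex.wdYellow
        End If
    End Sub
End Module
```

Calling it once with `includeParagraphMark` set to False and once with True, each in a fresh document, should make it possible to compare the two cases side by side without the sentence-splitting logic in the way.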
Basic code flow (partly pseudocode), with some additional comments for clarity:
For Each para As Paragraph In selectedRange.Paragraphs
    ' Identify a sentence by looping the para.range.text
    ' looking at punctuation marks, quotes, abbreviations, false positives etc.
    ... complicated logic ...
        ' If a sentence is identified, find it so we have the right range.
        ' This find is actually in a Sub FindSentence (rng, sentence),
        ' shown here for readability.
        ' The sub includes some complicated logic to overcome the Find text length limit.
        para.range.find (sentence.text)
        If para.range.found Then
            ' This code is actually in Sub Highlight(rng, sentence),
            ' shown here in the main code for readability.
            ' The debugger shows that the properties are changed.
            ' If the range is in a table then the highlight is not shown in Word
            ' unless the found para.range includes the vbCr.
            ' The exact same logic works fine when the text is not in a table.
            ' The table behaviour is problematic because a table cell
            ' can contain multiple sentences and only the last one would have the vbCr.
            Select Case sentence.type
                Case EtSentenceType.Normal
                    rng.HighlightColorIndex = WdColorIndex.wdGray25
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.Question
                    rng.HighlightColorIndex = WdColorIndex.wdBrightGreen
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.Exclamation
                    rng.HighlightColorIndex = WdColorIndex.wdRed
                    rng.Font.ColorIndex = WdColorIndex.wdWhite
                Case EtSentenceType.SingleQuote, EtSentenceType.DoubleQuote
                    rng.HighlightColorIndex = WdColorIndex.wdTurquoise
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.Heading
                    rng.HighlightColorIndex = WdColorIndex.wdYellow
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.List
                    rng.HighlightColorIndex = WdColorIndex.wdPink
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
            End Select
        End If
    End loop
Next