I want to use SyntaxNet to get the POS tags of tweets (and, more specifically, to extract named entities from the text). However, Parsey McParseface is case-sensitive by default. Since tweets often do not use capitalization, I was thinking of using a case-less tagger. I found something about capitalization in the code, but I was not sure whether or how to use it.
Let me give an example to make this clearer. Consider the sentences "John gave the money to Maria" and "john gave the money to maria" (with and without capitalization):
With caps:
gave VBD ROOT
+-- John NNP nsubj
+-- money NN dobj
| +-- the DT det
+-- to IN prep
+-- Maria NNP pobj
Without caps:
gave VBD ROOT
+-- john NNP nsubj
+-- money NN dobj
| +-- the DT det
+-- to TO prep
+-- maria NN pobj
As you can see, Maria is tagged NNP, whereas maria (without caps) is tagged NN. When extracting named entities, it matters whether a word is tagged NN or NNP.
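To make the effect concrete, here is a minimal Python sketch. It assumes that named-entity candidates are simply the tokens tagged NNP or NNPS, and that the parser output has the tree format shown above. The helper name `proper_nouns` is made up for illustration and is not part of SyntaxNet.

```python
import re

# Matches one node of the printed tree: an optional tree-drawing prefix
# ("+--", "|", spaces), followed by "word TAG relation".
NODE = re.compile(r"^[|+\-\s]*(\S+)\s+(\S+)\s+(\S+)\s*$")


def proper_nouns(tree_text):
    """Return the words tagged NNP or NNPS in a printed parse tree."""
    found = []
    for line in tree_text.splitlines():
        match = NODE.match(line)
        if match and match.group(2) in ("NNP", "NNPS"):
            found.append(match.group(1))
    return found


# The two parser outputs from above, copied verbatim.
with_caps = """gave VBD ROOT
+-- John NNP nsubj
+-- money NN dobj
| +-- the DT det
+-- to IN prep
+-- Maria NNP pobj"""

without_caps = """gave VBD ROOT
+-- john NNP nsubj
+-- money NN dobj
| +-- the DT det
+-- to TO prep
+-- maria NN pobj"""

print(proper_nouns(with_caps))     # ['John', 'Maria']
print(proper_nouns(without_caps))  # ['john']
```

With this naive filter, the lowercase sentence loses "maria" as a candidate entity, which is exactly the problem I am running into.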
Is there a way to improve this? Is there a case-less Parsey McParseface for SyntaxNet?