Sunspot Solr Reindexing failing due to illegal characters

I'm having an issue where Solr is failing to reindex my site, due to the following error from my production log:

bundle exec rake sunspot:solr:reindex
rake aborted!
RSolr::Error::Http: RSolr::Error::Http - 400 Bad Request
Error: Illegal character ((CTRL-CHAR, code 12))
 at [row,col {unknown-source}]: [155,1]

I am not sure where this illegal character is coming from or how to locate it. Any help is greatly appreciated, as it is currently causing a 500 server error in my app. Let me know if more information is needed.

(Rails 3.2, RSolr 1.0.10)

Best answer

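Usually this is caused by bad data in your database. Control character 12 is the ASCII form feed (0x0C). If you're using MySQL, you can find any rows containing it with a query like this, substituting your own table and column names for table and col: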

SELECT * FROM table WHERE col REGEXP CHAR(12);

Then you can remove the character from the content of any matched rows and proceed to reindex.

You could also strip the character from every affected row in a single statement, like this:

UPDATE table SET col=REPLACE(col, CHAR(12), '') WHERE col REGEXP CHAR(12);
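
If cleaning the database once is not enough (for example, because users keep pasting in text that contains form feeds), another option is to strip the characters on the Ruby side before the text reaches Solr. The following is only a minimal sketch: the constant and both helper names are made up for illustration, and it assumes the error comes from control characters that XML 1.0 forbids, which is what the "CTRL-CHAR, code 12" message suggests.

```ruby
# Control characters that XML 1.0 does not allow. Tab (0x09), newline (0x0A)
# and carriage return (0x0D) are legal, so they are deliberately left out.
XML_ILLEGAL_CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

# Hypothetical helper: returns a copy of the text without the illegal
# control characters. nil is passed through unchanged.
def strip_control_chars(text)
  return text if text.nil?
  text.gsub(XML_ILLEGAL_CONTROL_CHARS, '')
end

# Hypothetical helper: true if the text contains at least one illegal
# control character. Handy for locating the offending records.
def contains_control_chars?(text)
  !text.nil? && !(text =~ XML_ILLEGAL_CONTROL_CHARS).nil?
end
```

One way to use this, assuming a model with a text attribute called body (a made-up name), would be to index the cleaned value through a block in the searchable definition, e.g. text(:body) { strip_control_chars(body) }, or to clean the attribute in a before_save callback so the bad data never gets stored in the first place.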