searching "-" in websolr


websolr is returning

 RSolr::Error::Http - 400 Bad Request
Error: Apache Tomcat/6.0.28 - Error report
HTTP Status 400 - org.apache.lucene.queryParser.ParseException: Cannot parse '----': Encountered " "-" "- "" at line 1, column 1.
Was expecting one of:
   "(" ...
   "*" ...
   <QUOTED> ...
   <TERM> ...
   <PREFIXTERM> ...
   <WILDTERM> ...
   "[" ...
   "{" ...
   <NUMBER> ...

whenever I try to search for the "-" character.

Other special characters, such as ":", work fine. I have tried using CGI.escape, but it does not escape these characters.


There are 2 best solutions below

BEST ANSWER

Have you tried escaping it with backslash?

Normally when you index your documents, the tokenizer will remove dash characters on their own, so you may want to just strip the dash anyway, unless you mean for it to be a negative query.
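If stripping the dash is acceptable, a minimal sketch of that idea in Ruby might look like the following. The helper name `strip_dashes` is made up for illustration and is not part of RSolr; it replaces each dash with a space rather than deleting it, on the assumption that hyphenated words should still split into separate terms.

```ruby
# Hypothetical helper (not an RSolr method): replace dashes with spaces
# before the text is placed into a Solr query, then tidy the whitespace.
def strip_dashes(text)
  text.tr('-', ' ').squeeze(' ').strip
end

cleaned = strip_dashes('----')
# A query made only of dashes becomes empty, so the caller should
# decide what to do instead of sending an empty query to Solr.
puts cleaned.empty? ? 'nothing left to search for' : cleaned

puts strip_dashes('well-known issue')
```

This throws away the negative-query meaning of a leading dash, so it only fits when users are not expected to type `-term` to exclude results.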

The full Solr query syntax is here: http://wiki.apache.org/solr/SolrQuerySyntax


As Chris correctly notes, you need to escape the dash with a backslash.

Depending on which query parser you're using, there are some special characters that have meaning. As of this writing, the Lucene (and thus Solr) query parser assigns special meaning to these characters:

+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

You should refer to the docs for Lucene query parser syntax for their full meaning. The default Solr query parser offers a superset of the Lucene query parser syntax, as described by the SolrQueryParser wiki page.
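As a rough sketch of backslash-escaping the characters listed above, something like this could be applied to user input before it goes into the `q` parameter. The name `escape_lucene` is an assumption made up for this example, not an RSolr or Solr API. For simplicity it escapes single `&` and `|` characters as well as the `&&` and `||` pairs, which should be harmless because an escaped character is taken literally.

```ruby
# Hypothetical helper (not part of RSolr): put a backslash in front of
# every character the Lucene query parser treats as special.
LUCENE_SPECIAL = /[+\-&|!(){}\[\]^"~*?:\\]/

def escape_lucene(text)
  # The block form avoids the backslash-reference rules of gsub's
  # replacement-string form.
  text.gsub(LUCENE_SPECIAL) { |char| "\\#{char}" }
end

puts escape_lucene('----')
puts escape_lucene('foo:bar')
```

Note that `CGI.escape` does a different job: it URL-encodes a string and leaves "-" untouched, so it cannot stop the query parser from treating the dash as an operator.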

If you don't want to worry about escaping things, the DisMax Query Parser is designed to accept input that's closer to what a user might type into a search box. I haven't tested the various special characters against it recently, but as a rule it's probably more graceful in the input that it accepts.
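For illustration, selecting DisMax from Ruby could be as simple as adding `defType` to the request parameters. This is only a sketch: the helper `dismax_params` and the field names are invented for the example, and how DisMax treats an input like "----" is something to verify against your own index.

```ruby
# Hypothetical helper: build request parameters that ask Solr to parse
# the query with DisMax. The field names are placeholders.
def dismax_params(user_input, fields)
  {
    q: user_input,          # raw user input, passed through unescaped
    defType: 'dismax',      # selects the DisMax query parser
    qf: fields.join(' ')    # fields DisMax should search
  }
end

params = dismax_params('----', %w[title body])
puts params.inspect

# With RSolr, the hash would be passed along the lines of:
#   solr = RSolr.connect(url: 'http://localhost:8983/solr')
#   solr.get('select', params: params)
```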