We are migrating from CF 4.51 to CF 2023 and have run into the following problem with Solr collection searches.

I'm assuming that the default Request-Handler used by the Solr Admin Query Tool to query our collection "Resumes" is the same default Request-Handler that a CF 2023 CFSEARCH uses for that collection. The only Solr parameter I have modified is the min/max for Java memory; I have not touched the .xml file.

It may be that I need to modify the Request-Handler so that the searches return the same results as the Solr Admin query tool.

Basically, we are now taking a variable value from a forms code as shown below:

Then we are populating CFSEARCH with the syntax for a given query by using encodeForURL & decodeFromURL where appropriate in the application.

Our CFSEARCH code is shown here. Note: we have included a line of output code for testing.


<CFSEARCH NAME="applicants_#var#"
COLLECTION="Resumes"
TYPE="standard"
CRITERIA="#encodeForURL( Form.keywords)#">

<cfoutput>#applicants_test.recordcount#</cfoutput><cfabort>

The output shows we are using the Solr syntax needed for a given search.

However, the only time where we get the correct number of hits is when we use a single keyword in the search. More complex searches involving multiple keywords and/or special characters give us ridiculously high results.

Obviously, I am missing something that is key to producing the accurate results that the Solr Admin query tool always provides.

To be more precise, I have discovered that any quotes beyond those required at the beginning and end of a given query, as well as spaces or special characters such as ? * " ~ + or -, will either give us inflated hit counts or cause the search to throw an error.

For example a search on "unix" AND "linux" renders 3685 hits using the Solr Admin query tool & on our CF 4.51 production server running Verity searches. So we know the results are accurate.

However, using the CF 2023 CFSEARCH code described above we get 43586 hits! About 80% of our collection!

Frankly, at this point I am dead in the water. Can someone please tell me where my problem is and how to overcome it? Thank you very much in advance.

Alex Craig, General Manager

I tried the solution I previously described and got query return hits in many cases that were far greater than the actual values should have been.

In response to Eric Lavault's kind advice:


The Request-Handler (qt) field I am using is the default /select. FYI, I made no changes whatsoever to the default values of the Solr Admin UI. I simply type in a valid query such as "tribology" AND "friction" and I get 19 hits, which mirrors the number I get on the production server running CF 4.51.

If I understand your URL question: I am entering "tribology" AND "friction" in the Q field in the tool, and the "Q" response is: ""tribology" AND "friction"",

In cfsearch I am entering "tribology" AND "friction" in the form, and the criteria variable is also "tribology" AND "friction". I don't know how I would ascertain what the Solr "Q" response is when running a cfsearch.

I am not using the (e)dismax parser in the Solr Admin Query. I am using the default lucene parser. To reiterate, I get valid results from the Solr Admin Query tool using all the default parameters. The only entry I am making is the Q field, with a valid Solr query string.

I am afraid I am going to need to be spoonfed, as Solr is new to me. I do not see (parsed) params on the page.

Sorry for the need to be spoonfed. But if you can provide some additional guidance I will be happy to try and answer your questions.


Edit 2: Found some Q data that may be helpful.

OK. The Q criteria value of the http link generated by the Solr Admin Query tool is: q=%22tribology%22%20AND%20%22friction%22

That makes sense, as it is the URL-encoded (percent-encoded) equivalent of the string.

The Q encodeForURL value of the CFSearch criteria output is:

%2522tribology%2522%2BAND%2B%2522friction%2522

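Doing some research, I discovered that %22 is the code for a double quote and %25 is the code for the percent sign itself, so %2522 looks like a double quote that has been encoded a second time. %2B is the code for a + sign.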

So why is the same string, "tribology" AND "friction", producing different encoded outputs? And more importantly, how do I get CFSearch to generate the same encoding that the Solr Admin tool uses?
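
For comparison, here is a minimal sketch that reproduces both strings outside ColdFusion. It is written in Python purely for illustration (it is not CFML and says nothing definite about what CFSEARCH does internally). The assumption, which I have not confirmed, is that the criteria string gets URL-encoded twice somewhere in our application: once form-style (spaces become +) and then once more on top of that.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# The raw query as typed into the form and into the Solr Admin Q field.
query = '"tribology" AND "friction"'

# Encoded once, percent-style: this matches the q= value in the Solr Admin link.
encoded_once = quote(query)

# ASSUMPTION (unconfirmed): the value is first encoded form-style
# (spaces become '+') and is then encoded a second time.
form_encoded = quote_plus(query)       # '"' -> %22, ' ' -> '+'
encoded_twice = quote(form_encoded)    # '%' -> %25, '+' -> %2B

print(encoded_once)
print(encoded_twice)

# Undoing both layers recovers the original query text.
recovered = unquote_plus(unquote(encoded_twice))
print(recovered)
```

If that assumption holds, the string that reaches Solr would contain literal %22 and + characters rather than quotes and spaces, which might explain why anything beyond a single keyword behaves differently.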
