I have Solr 9 installed and working on Windows 10 after following the official tutorial: https://solr.apache.org/guide/solr/latest/getting-started/solr-tutorial.html
I'm using the techproducts_config that ships with the install, which is supposed to handle multiple file types, as the output states when I try to index using the built-in post.jar:
java -jar -Dc=cd2 -Dauto .\post.jar /pathTo/myFiles
Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
However, for each of my files, the post tool reports a 404:
POSTing file example1.txt (text/plain) to [base]/extract - SimplePostTool: WARNING: Solr returned an error #404
Indexing only succeeds if I target a specific file type (text files, here):
java -jar -Dc=cd2 -Dauto .\post.jar /pathTo/myFiles/*.txt
The built-in solrconfig.xml that I'm using has an update handler:
<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" >
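One way to see whether that handler is actually registered on the running collection, rather than only present in the file on disk, is to ask Solr's Config API for its request handlers. The following is a minimal sketch, not a definitive check: it assumes Solr is listening on the default http://localhost:8983/solr, that the collection is cd2 as in the commands above, and that the response nests handlers under config.requestHandler. The function names are hypothetical helpers. Note that a handler declared with startup="lazy" may be listed here and still fail when first used.

```python
import json
import urllib.request


def config_url(base_url: str, collection: str) -> str:
    """Build the Config API URL that lists a collection's request handlers."""
    return f"{base_url.rstrip('/')}/{collection}/config/requestHandler"


def handler_registered(config: dict, name: str = "/update/extract") -> bool:
    """Return True if `name` appears among the configured request handlers.

    Assumes the response shape {"config": {"requestHandler": {...}}}.
    """
    handlers = config.get("config", {}).get("requestHandler", {})
    return name in handlers


def check(base_url: str = "http://localhost:8983/solr",
          collection: str = "cd2") -> bool:
    """Fetch the live configuration and look for /update/extract.

    The default base URL and collection name are assumptions.
    """
    with urllib.request.urlopen(config_url(base_url, collection)) as resp:
        return handler_registered(json.load(resp))
```

Calling check() against a running Solr returns False when /update/extract is not in the live configuration, which would be consistent with the 404 from the post tool.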
It works after removing everything and starting clean.