When does Solr 9 update managed-schema.xml on the file system?


In the out-of-the-box standalone configuration with a custom configset, I observed that managed-schema.xml sometimes changes and sometimes does not when I create a new core and fill it with data. I noticed this because the configset is under version control, so the diff is easy to see and the automatic changes are easy to revert, which I did several times. I have also deleted cores and created new ones with the same names.

In case it matters, solrconfig.xml is the same as in the default configset. New cores are added with a POST to /api/cores (only name and configSet are given) and filled through the /update API.
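For reference, the core-creation call described above might look like the following sketch. The host, port, core name, and configset name are placeholders, and the nested "create" wrapper is an assumption about the request body that may differ between Solr 9 releases. The function only builds the request, so nothing is sent until it is passed to urlopen.

```python
import json
import urllib.request


def build_create_core_request(
    base_url: str, core_name: str, config_set: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/cores for a new core.

    Assumption: the body wraps the parameters in a "create" object.
    Adjust the payload if your Solr 9 release expects a different shape.
    """
    payload = {"create": {"name": core_name, "configSet": config_set}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/cores",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical usage against a local standalone Solr:
# request = build_create_core_request("http://localhost:8983", "core1", "my_configset")
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```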

The goal is to have configset as a starting point for each core's own schema, but not to change the managed schema all the time.

The question is: under what conditions does Solr change the managed schema on disk? How can I prevent that and use per-core schemas derived from one configset?

What confuses me most is that the behavior is inconsistent. For example, after all those manipulations (creating and deleting cores, restoring the original managed schema, restarting Solr) managed-schema.xml no longer changes, but I can't trust that it will stay that way because I do not know when it is supposed to change automatically.
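One way to pin down which operation triggers the rewrite would be to record a hash of the file before each step and compare it afterwards. This is only a diagnostic sketch: the function names are made up for illustration, and the path to managed-schema.xml is whatever it is in your installation.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's current contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_since(path: Path, baseline: str) -> bool:
    """Return True if the file's contents no longer match the baseline digest."""
    return file_digest(path) != baseline


# Hypothetical usage (the path is a placeholder):
# schema = Path("/path/to/configsets/my_configset/conf/managed-schema.xml")
# before = file_digest(schema)
# ... create a core, index documents, edit a field in the admin UI ...
# print("schema rewritten:", changed_since(schema, before))
```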

If it matters, when the schema changes, this appears near the top:

   <!-- Solr managed schema - automatically generated - DO NOT EDIT -->

Note that, compared to previous versions, Solr 9 changed the file name from managed-schema to managed-schema.xml.

I've found that changing a schema through the Solr admin web interface changes the managed schema file. The odd thing, however, is that cores still retain their own schemas (while the Solr admin web interface shows the same files for each core), so I still do not understand the logic or how to use a configset only to bootstrap cores' settings. Even conflicting dynamic fields seem to be fine when destined for different cores, but it is hard to say whether that is a coincidence or part of the normal logic.

In short the desired logic is:

  • the specific configset sets an initial config for each new core (it mostly defines dynamic and copy fields)
  • the configset itself never changes except through manual edits to the schema XML file
  • however, each core's own schema can change automatically as new data comes in (each core runs in schemaless mode)

So far it has mostly worked like that, until I noticed that the configset itself was being changed from time to time. It is of course possible to make one configset per core, but that is extra hassle if there is a better way.
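The one-configset-per-core fallback could at least be automated. The sketch below copies the template configset into a new directory named after the core, which could then be given as the configset when the core is created. The function name, the naming scheme, and the directory layout are assumptions for illustration, and it only performs the copy; it does not explain why the shared configset gets rewritten.

```python
import shutil
from pathlib import Path


def clone_configset(configsets_dir: Path, template_name: str, core_name: str) -> Path:
    """Copy the template configset to '<template_name>_<core_name>'.

    Raises FileExistsError if the target directory already exists, so an
    existing per-core configset is never overwritten.
    """
    source = configsets_dir / template_name
    target = configsets_dir / f"{template_name}_{core_name}"
    shutil.copytree(source, target)
    return target


# Hypothetical usage (paths and names are placeholders):
# clone_configset(Path("/path/to/solr/configsets"), "my_configset", "core1")
```

With this approach the version-controlled template is only ever read, and any automatic schema updates would land in the per-core copy.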
