How do you configure the /export requestHandler in SolrCloud to use all shards?

Question

916 Views Asked by E Paiz At 15 August 2025 at 10:37

I'm using Solr 4.10.2. I got an /export handler working to export large datasets. When I deployed the config to my Solr cluster environment, I noticed that the export was missing some records.

If I run the same query string through /select and /export, I get fewer records from the /export call.

Is there anything special you need to do to get /export to work in a SolrCloud environment?

  <requestHandler name="/export" class="solr.SearchHandler">
    <lst name="invariants">
      <str name="rq">{!xport}</str>
      <str name="wt">xsort</str>
      <str name="distrib">false</str>
    </lst>

    <arr name="components">
      <str>query</str>
    </arr>
  </requestHandler>

I tried changing the "distrib" parameter to true, hoping that would help, but that caused other errors.

Any suggestions?

There are 2 best solutions below

E Paiz On 26 December 2016 at 22:45

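Here is some SolrJ code that does what the accepted answer describes. It reads the cluster state from ZooKeeper and collects the leader core of each shard, so that /export can be called on each one: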

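import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.Replica;

// Connect through ZooKeeper and read the current cluster state.
final CloudSolrServer server = new CloudSolrServer(zooKeeperEndpoint);
server.connect();
final ClusterState clusterState = server.getZkStateReader().getClusterState();
// Get the leader replica of each shard in the collection.
Replica leader1 = clusterState.getLeader("search_index", "shard1");
Replica leader2 = clusterState.getLeader("search_index", "shard2");
Replica leader3 = clusterState.getLeader("search_index", "shard3");

// Collect the core name of each leader.
List<String> listOfNodes = new ArrayList<String>();
listOfNodes.add((String) leader1.get("core"));
listOfNodes.add((String) leader2.get("core"));
listOfNodes.add((String) leader3.get("core"));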

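Then loop over the list, calling /export on each core of the Solr index:

for (String nodeEndpoint : listOfNodes) {
    // This assumes every core is reachable through the same host; otherwise use each leader's "base_url" property.
    // The parameter values are shown unencoded for readability; URL-encode them before sending the request.
    String solrURL = "http://mysolrserver/solr" + "/" + nodeEndpoint + "/export?q=*:*" + "&fq=text:\"*SEARCHSTRING*\"&fl=field1,field2&sort=sortFieldId asc";
    // Issue an HTTP GET to solrURL here and collect the returned documents.
}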
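The URL above contains raw spaces and double quotes, which generally need to be percent-encoded before the request is sent. The following is a minimal sketch of one way to build the per-core /export URL with encoded parameter values, using only the Java standard library. The class name `ExportUrlBuilder`, its methods, and the core name in `main` are hypothetical and only for illustration; the host and parameters are taken from the snippet above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class ExportUrlBuilder {

    /** Encodes a single query-string value as UTF-8 form data (space becomes '+'). */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds baseUrl/core/export?key=value&... with every value encoded. */
    static String exportUrl(String baseUrl, String core, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl).append('/').append(core).append("/export");
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator).append(param.getKey()).append('=').append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "*:*");
        params.put("fq", "text:\"*SEARCHSTRING*\"");
        params.put("fl", "field1,field2");
        params.put("sort", "sortFieldId asc");
        // Hypothetical core name, for illustration only.
        System.out.println(exportUrl("http://mysolrserver/solr", "search_index_shard1_replica1", params));
    }
}
```

Calling `exportUrl` once per leader core (and, if the leaders live on different hosts, once per base URL) yields one request URL per shard.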

**MatsLindh** · Accepted Answer

The /export endpoint is only relevant to the local node, but the Streaming Expressions API (available under /stream without any further configuration) is built on top of the /export endpoint and is meant to be the Cloud alternative.

This also allows you to process the content when requesting it, if applicable.

The required parameters for /stream are the same as for /export.
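For readers on a newer Solr release that ships /stream (this does not apply to 4.10.2), a request could look roughly like the sketch below. It builds a `search(...)` streaming expression that reads from /export and passes it in the `expr` parameter. The class and method names are hypothetical; the collection, field names, and host are borrowed from the other answer. Check the exact expression syntax against the documentation for your Solr version.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class StreamRequestSketch {

    /**
     * Builds a search() streaming expression that uses the /export handler.
     * Values are inserted as-is, so they must not contain double quotes.
     */
    static String searchExpression(String collection, String q, String fl, String sort) {
        return "search(" + collection
                + ", q=\"" + q + "\""
                + ", fl=\"" + fl + "\""
                + ", sort=\"" + sort + "\""
                + ", qt=\"/export\")";
    }

    /** Builds the /stream URL with the expression encoded into the expr parameter. */
    static String streamUrl(String baseUrl, String collection, String expr) {
        try {
            return baseUrl + "/" + collection + "/stream?expr=" + URLEncoder.encode(expr, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String expr = searchExpression("search_index", "*:*", "field1,field2", "sortFieldId asc");
        System.out.println(expr);
        System.out.println(streamUrl("http://mysolrserver/solr", "search_index", expr));
    }
}
```

The `q`, `fl`, and `sort` values are the same ones an /export request would carry.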

But since you're on 4.10.2, you'll have to request clusterstate.json from ZooKeeper and then query each node individually, merging the results locally.

You can retrieve this file by connecting to ZooKeeper:

zkCli.sh -server ip:2181

and then retrieve the clusterstate:

get /clusterstate.json

You'll find a list of shards and their replicas for each collection, and you can then iterate over those values and retrieve your results from the /export handler on each server.
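Since each /export call returns its shard's documents already sorted by the requested sort field, the local merge step can be a k-way merge. Below is a minimal sketch, assuming the per-shard results have already been fetched and parsed into in-memory lists. The class `ShardResultMerger` is hypothetical, and plain integers stand in for documents ordered by their sort field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

final class ShardResultMerger {

    /** Holds the next unmerged element of one shard plus the rest of that shard's results. */
    private static final class Cursor<T> {
        final T head;
        final Iterator<T> rest;

        Cursor(T head, Iterator<T> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    /** Merges per-shard lists, each already sorted by order, into one sorted list. */
    static <T> List<T> merge(List<List<T>> shardResults, Comparator<? super T> order) {
        // The queue always holds at most one cursor per shard, ordered by its head element.
        PriorityQueue<Cursor<T>> queue = new PriorityQueue<>(
                Math.max(1, shardResults.size()),
                (a, b) -> order.compare(a.head, b.head));
        for (List<T> shard : shardResults) {
            Iterator<T> it = shard.iterator();
            if (it.hasNext()) {
                queue.add(new Cursor<>(it.next(), it));
            }
        }
        List<T> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            // Take the smallest head, then advance that shard's cursor.
            Cursor<T> smallest = queue.poll();
            merged.add(smallest.head);
            if (smallest.rest.hasNext()) {
                queue.add(new Cursor<>(smallest.rest.next(), smallest.rest));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> shards = Arrays.asList(
                Arrays.asList(1, 4, 7),
                Arrays.asList(2, 5),
                Arrays.asList(3, 6));
        // prints [1, 2, 3, 4, 5, 6, 7]
        System.out.println(merge(shards, Comparator.<Integer>naturalOrder()));
    }
}
```

For exports too large to hold in memory, the same approach works with iterators that read each shard's response incrementally instead of lists.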
