Elasticsearch assigning primary and replica nodes for indices

I have set up an Elasticsearch cluster with 1 master, 1 client and 2 data nodes. The client and the 2 data nodes are on one machine and the master is on a separate machine. The IPs are as follows:

192.168.1.3 - master
192.168.1.2:9201 - client
192.168.1.2:9200 - data1
192.168.1.2:9202 - data2

I have data belonging to two indices (movie-ame and movie-eur) and want to keep the data on the nodes as shown below. I imported the data into the data nodes using Logstash.

movie-ame

primary shards in data1
1 replica in data2

logstash.conf

input {  
  file {
    path => "C:\Users\azinneera\Desktop\logstash-5.1.1\bin\data.csv"
    start_position => "beginning"    
  }
}

filter {  
  csv {
      separator => ","
      columns => ["ID","MovieName","ReleaseYear","Country","Genres"]
  }
}

output {  
    elasticsearch {
        action => "index"
        hosts => ["192.168.1.2:9200"] 
        index => "movie-ame"
    }
    stdout {codec => rubydebug}
}

movie-eur

primary shards in data2
1 replica in data1

logstash.conf

input {  
  file {
    path => "C:\Users\azinneera\Desktop\logstash-5.1.1\bin\movieeur.csv"
    start_position => "beginning"    
  }
}

filter {  
  csv {
      separator => ","
      columns => ["ID","MovieName","ReleaseYear","Country","Genres"]
  }
}

output {  
    elasticsearch {
        action => "index"
        hosts => ["192.168.1.2:9202"] 
        index => "movie-eur"
    }
    stdout {codec => rubydebug}
}

But it seems that data1 acts as primary for both indices and the replicas for both indices are in data2.

This is what the cluster state shows: [two screenshots of the cluster state]

Best answer

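There is nothing wrong. Elasticsearch never puts a primary shard and its replica on the same node, and it keeps the copies in sync on a per-shard basis. The node listed under hosts in the Logstash output only receives the indexing requests; it does not decide which node holds the primary.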
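To check the per-shard placement yourself, you can look at the output of `GET _cat/shards`, which lists one line per shard copy with the columns index, shard, prirep, state, docs, store, ip and node. Below is a minimal sketch that groups such output by shard and reports any shard whose primary and replica share a node. The sample text is invented for illustration (it is not the asker's real cluster output), and the helper names `copies_by_shard` and `shards_sharing_a_node` are hypothetical. It assumes the output was fetched without the `?v` header row.

```python
from collections import defaultdict

# Invented sample of `GET _cat/shards` output (no ?v header).
# Columns: index shard prirep state docs store ip node
SAMPLE = """\
movie-ame 0 p STARTED 120 50kb 192.168.1.2 data1
movie-ame 0 r STARTED 120 50kb 192.168.1.2 data2
movie-eur 0 p STARTED 98 40kb 192.168.1.2 data1
movie-eur 0 r STARTED 98 40kb 192.168.1.2 data2
"""


def copies_by_shard(cat_shards_text):
    """Map (index, shard number) -> {"p": [nodes], "r": [nodes]}."""
    copies = defaultdict(lambda: {"p": [], "r": []})
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Unassigned copies have no docs/store/ip/node columns; skip them,
        # along with blank lines and anything that is not a shard row.
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        index, shard, prirep, node = fields[0], int(fields[1]), fields[2], fields[7]
        if prirep in ("p", "r"):
            copies[(index, shard)][prirep].append(node)
    return dict(copies)


def shards_sharing_a_node(copies):
    """Return the shards whose primary and a replica sit on the same node."""
    return [key for key, c in copies.items() if set(c["p"]) & set(c["r"])]


if __name__ == "__main__":
    placement = copies_by_shard(SAMPLE)
    for (index, shard), c in sorted(placement.items()):
        print(index, shard, "primary:", c["p"], "replicas:", c["r"])
    print("shards with primary and replica on one node:",
          shards_sharing_a_node(placement))
```

In this sample both primaries sit on data1 and both replicas on data2, which matches what the question describes and is still a valid layout: no shard has two copies on the same node.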

When you query data out of Elasticsearch, the search can be served by either the primary or one of its replicas, because they are considered identical copies. Elasticsearch handles this load balancing itself, so you don't have to worry about it.

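If you really do want to worry about it, there are some settings that you can tweak, such as shard allocation filtering (the index.routing.allocation.* index settings) and the preference parameter on search requests.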
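As one illustration of such a setting, index-level shard allocation filtering restricts which nodes may hold copies of an index, via `index.routing.allocation.include._name` (or `require` / `exclude`). Note that this filters nodes for all copies of the index; it does not choose which copy becomes the primary. The sketch below only builds the request (method, path and JSON body) and does not send it. The helper name `allocation_filter_request` is hypothetical, and the node names are assumed to match the `node.name` values of the data nodes.

```python
import json


def allocation_filter_request(index, node_names):
    """Build (method, path, body) for an index settings update that limits
    the index's shard copies to the named nodes. Nothing is sent here."""
    if not node_names:
        raise ValueError("at least one node name is required")
    body = {
        # Comma-separated list: a copy may be allocated to any listed node.
        "index.routing.allocation.include._name": ",".join(node_names)
    }
    return "PUT", "/{}/_settings".format(index), json.dumps(body)


if __name__ == "__main__":
    method, path, body = allocation_filter_request("movie-ame", ["data1", "data2"])
    print(method, path)
    print(body)
```

The resulting body can be sent to any node in the cluster with a tool such as curl. With only two data nodes and one replica, both nodes need to stay in the list, otherwise the replica has nowhere to go and remains unassigned.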