ClickHouse cluster returns multiple records after inserting a single row


I use ReplicatedMergeTree and Distributed tables in ClickHouse to build an HA cluster. I expect it to store two replicas of the data in the cluster, so that everything keeps working when one of the nodes has problems. This is part of my configuration (config.xml): ...

    <logs>
        <shard>
            <weight>1</weight>
            <internal_replication>true</internal_replication>
            <replica>
                <host>node1</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>node2</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <weight>1</weight>
            <internal_replication>true</internal_replication>
            <replica>
                <host>node2</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>node3</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <weight>1</weight>
            <internal_replication>true</internal_replication>
            <replica>
                <host>node3</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>node1</host>
                <port>9000</port>
            </replica>
        </shard>
    </logs>
...
<!-- each node is different -->
<macros>
    <layer>01</layer>
    <shard>01</shard>
    <replica>node1</replica>
</macros>
<!-- below is node2 and node3 configuration 

<macros>
    <layer>02</layer>
    <shard>02</shard>
    <replica>node2</replica>
</macros>

<macros>
    <layer>03</layer>
    <shard>03</shard>
    <replica>node3</replica>
</macros>
-->
...

Then I create the table on each node using the clickhouse-client --host command:

create table if not exists game(uid Int32,kid Int32,level Int8,datetime Date) 
ENGINE = ReplicatedMergeTree('/clickhouse/data/{shard}/game','{replica}') 
PARTITION BY  toYYYYMMDD(datetime)  
ORDER BY (uid,datetime);
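
To see how the {shard} and {replica} macros were actually expanded for this table, queries along the following lines can be run on each node. This is only a diagnostic sketch: it assumes the system.macros and system.replicas tables are available in the installed ClickHouse version and that the table lives in the default database. Tables replicate with each other only when they are registered under the same ZooKeeper path with different replica names, so comparing zookeeper_path across the three nodes shows which tables form a replica set.

```sql
-- Run on every node (node1, node2, node3) and compare the results.

-- Macro values this server substitutes into {layer}, {shard}, {replica}.
SELECT macro, substitution
FROM system.macros;

-- ZooKeeper path and replica name the table registered under.
-- Assumption: the table was created in the `default` database.
SELECT
    database,
    table,
    zookeeper_path,
    replica_name,
    total_replicas,   -- replicas known under this ZooKeeper path
    active_replicas   -- replicas that currently have a ZooKeeper session
FROM system.replicas
WHERE database = 'default' AND table = 'game';
```

If each node reports a different zookeeper_path and total_replicas = 1, the three game tables are independent of each other rather than replicas of one another.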

After creating the ReplicatedMergeTree table, I create the Distributed table on each node (this is only so that every node has the table; in fact it is created on just one node):

CREATE TABLE game_all AS game  
ENGINE = Distributed(logs, default, game ,rand())

So far everything seems fine, and inserting data into game_all also appears to work. But when I query the game table and the game_all table, something is clearly wrong. I inserted one record into game_all, but querying it returns 3 records when it should return 1, and when I query the game table on each node, only one of them has the record. Finally, I checked each node's disk, and the table does not seem to have any replicas: only one node uses more than 4 KB of disk for it, while the others use just 4 KB.
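
One way to narrow down where the extra records come from is to compare what each host returns through the Distributed table with what each local table holds. The queries below are a hedged sketch, not a fix: they assume the cluster name logs and the default database from the configuration above, and that the system.clusters table and the hostName() function are available in the installed version.

```sql
-- Run on any node: which hosts answered for game_all, and how many rows each returned.
SELECT hostName() AS host, count() AS cnt
FROM game_all
GROUP BY host;

-- Run on every node: rows stored in the local table only.
SELECT hostName() AS host, count() AS cnt
FROM game
GROUP BY host;

-- Run on any node: the shard/replica layout as this server sees it.
SELECT cluster, shard_num, replica_num, host_name, port, is_local
FROM system.clusters
WHERE cluster = 'logs'
ORDER BY shard_num, replica_num;
```

In the layout above every host is listed in two shards, and game_all points at the same default.game table on each of them, so the same local table may be read for more than one shard. If the first query reports a host with a count larger than its local count, that is a sign the record is being counted once per shard rather than being stored several times.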
