I have some trouble when trying to add a column to a Cassandra (4.0.5) table (ColumnFamily).
I use a 3-node cluster in 1 DC:
./nodetool -h cass-host status
Datacenter: PERF-DC
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.24.40.53 391.57 KiB 16 100.0% 2e8207df-1954-4ba6-b438-1c1215cd264f PERF-DC-RACK1
UN 10.24.40.133 388.83 KiB 16 100.0% faaa220b-83ec-4489-8814-83739cf0c6c5 PERF-DC-RACK7
UN 10.24.40.132 402.47 KiB 16 100.0% ef89f057-a315-4451-83e3-a9c27a0ee0c9 PERF-DC-RACK8
10.24.40.53 - cass-host1
10.24.40.133 - cass-host2
10.24.40.132 - cass-host3
I have one table:
CREATE TABLE table_01
(
tdsrId BIGINT,
id BIGINT,
contents VARCHAR,
PRIMARY KEY ((tdsrId),id)
) WITH default_time_to_live = 3600
AND gc_grace_seconds = 3600;
First, I started a read workload from my Java app:
HOST: cass-host3
Consistency: LOCAL_QUORUM
QUERY: SELECT tdsrId,id,contents FROM table_01 WHERE tdsrId=1234567
load: 1000 operations per second
Then I tried to add a column to the table:
$cqlsh cass-host1
Connected to PERF-CLUSTER at cass-host1:9042
[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cassandra@cqlsh> alter table ks.table_01 add t01 int;
As a result:
1) On cass-host3 I got 5 errors in cassandra/log/debug.log:
ERROR [Messaging-EventLoop-3-1] 2023-12-26 18:03:01,094 InboundMessageHandler.java:182 - /10.24.40.53:7000->/10.24.40.132:7000-SMALL_MESSAGES-ff4dd09b unexpected exception caught while deserializing a message
java.lang.RuntimeException: Unknown column t01 during deserialization
at org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:489)
at org.apache.cassandra.db.filter.ColumnFilter$Serializer.deserializeRegularAndStaticColumns(ColumnFilter.java:1072)
at org.apache.cassandra.db.filter.ColumnFilter$Serializer.deserialize(ColumnFilter.java:1021)
at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:928)
at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:833)
at org.apache.cassandra.net.Message$Serializer.deserializePost40(Message.java:782)
at org.apache.cassandra.net.Message$Serializer.deserialize(Message.java:642)
at org.apache.cassandra.net.InboundMessageHandler.processSmallMessage(InboundMessageHandler.java:168)
at org.apache.cassandra.net.InboundMessageHandler.processOneContainedMessage(InboundMessageHandler.java:151)
at org.apache.cassandra.net.AbstractMessageHandler.processFrameOfContainedMessages(AbstractMessageHandler.java:242)
at org.apache.cassandra.net.AbstractMessageHandler.processIntactFrame(AbstractMessageHandler.java:227)
at org.apache.cassandra.net.AbstractMessageHandler.process(AbstractMessageHandler.java:218)
at org.apache.cassandra.net.FrameDecoder.deliver(FrameDecoder.java:321)
at org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:285)
at org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:269)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
2) In my Java app I got 5 errors:
2023-12-26 18:02:53.941 ERROR 1821233 --- [ scheduling-1] console_out : err count: 5, message:Query;
CQL [SELECT tdsrId,id,contents FROM table_01 WHERE tdsrId=?];
Cassandra failure during read query at consistency LOCAL_QUORUM (2 responses were required but only 1 replica responded, 1 failed);
nested exception is com.datastax.oss.driver.api.core.servererrors.ReadFailureException:
Cassandra failure during read query at consistency LOCAL_QUORUM (2 responses were required but only 1 replica responded, 1 failed)
And after that, subsequent queries work fine without any errors.
I suspect that ALTER TABLE is a dangerous operation and that I lost the results of some queries which were executing simultaneously with the ALTER.
Can someone explain: is this behavior correct, or can I tune my Cassandra so that ALTER TABLE ... ADD COLUMN runs without these errors?
Are you running the ALTER and SELECT statements via some automation job that issues them in quick succession? Could you run nodetool describecluster as soon as you execute the ALTER statement, to confirm the schema change has propagated to all the nodes in the cluster (in which case every node will report the same schema UUID), before issuing a SELECT on the newly added column of that table? If the output of the command shows multiple UUIDs, the schema change has not propagated to all nodes in the cluster, and you need to investigate by looking at the system.log and debug.log file(s).
On the other issue you ran into: this is a case where the C* cluster isn't properly sized for the load you're attempting to put on it. Have you already gone through proper cluster sizing during the initial planning stages, testing the anticipated load (plus some buffer for cushion) that will actually run against this cluster? If not, I'd strongly encourage you to check out this documentation and perform the necessary testing/sizing of the cluster. Cheers!
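As a rough sketch, the schema-agreement check described above could be scripted like this. It assumes the Cassandra 4.x `nodetool describecluster` output format, where the "Schema versions:" section lists each schema UUID followed by the nodes on it; the host name in the usage comment is hypothetical:

```shell
# Count the distinct schema-version UUIDs in `nodetool describecluster` output.
# In the Cassandra 4.x format, each schema version is printed as a line like:
#     2207c2a9-f598-3971-986b-2926e09e239d: [10.24.40.53, 10.24.40.133]
# under the "Schema versions:" heading. A count of 1 means the cluster agrees.
count_schema_versions() {
    # stdin: output of `nodetool describecluster`
    grep -E -c '^[[:space:]]*[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}:'
}

# Hypothetical usage: after the ALTER, poll until the cluster converges on a
# single schema version before querying the new column.
#   while [ "$(nodetool -h cass-host1 describecluster | count_schema_versions)" -ne 1 ]; do
#       sleep 1
#   done
```

Once the count drops to 1, all nodes are on the same schema version and reads touching the new column should no longer trip the "Unknown column ... during deserialization" error.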