Cassandra COPY FROM query error, with a CSV file

The problem:

I'm trying to get Cassandra working properly with Python. I've been practicing with a toy dataset, trying to upload a CSV file into Cassandra, with no luck. Cassandra seems to work fine when I'm not using COPY FROM for CSV files.

My intention is to use this dataset as a test to confirm that I can load a CSV file into Cassandra, so that I can then load the 5 CSV files (2 GB in total) for my actual project.

Note: Whenever I run CREATE TABLE and then SELECT * FROM tvshow_data, the columns don't appear in the order I declared them. Will this affect anything, or does it not matter?

Info about my installations and usage:

  • I've tried running both cqlsh and cassandra with an admin powershell.
  • I have Python 2.7 installed inside of the apache-cassandra-3.11.6 folder.
  • I have Cassandra version 3.11.6 installed.
  • I have cassandra-driver 3.18.0 installed, with conda.
  • I use Python 3.7 installed for everything other than Cassandra's directory.
  • I have tried both CREATE TABLE tvshow and CREATE TABLE tvshow.tvshow_data.

My Python script:

from cassandra.cluster import Cluster

cluster = Cluster()
session = cluster.connect()

create_and_add_file_to_tvshow = [
    "DROP KEYSPACE tvshow;",
    "CREATE KEYSPACE tvshow WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};",
    "USE tvshow;",
    "CREATE TABLE tvshow.tvshow_data (id int PRIMARY KEY, title text, year int, age int, imdb decimal, rotten_tomatoes int, netflix int, hulu int, prime_video int, disney_plus int, is_tvshow int);",
    "COPY tvshow_data (id, title, year, age, imdb, rotten_tomatoes, netflix, hulu, prime_video, disney_plus, is_tvshow) FROM 'C:tvshows.csv' WITH HEADER = true;"
]

print('\n') 
for query in create_and_add_file_to_tvshow:
    session.execute(query)
    print(query, "\nsuccessful\n")

Resulting python error:

This is the error I get when I run my code in PowerShell with the command python cassandra_test.py:

cassandra.protocol.SyntaxException: <Error from server: code=2000 [Syntax error in 
CQL query] message="line 1:0 no viable alternative at input 'COPY' ([
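
For context, COPY is a cqlsh shell command rather than a CQL statement, so the server's parser rejects it when it arrives through session.execute. One possible alternative, sketched below under the assumption that the CSV has the layout shown in this question, is to read the file with Python's csv module and insert each row through a prepared statement. The helper names convert_row and load_csv, and the decision to store "18+" as 18 and "96%" as 96, are illustrative assumptions and not part of the driver.

```python
import csv
from decimal import Decimal

# Column order of the CSV header and of the INSERT statement.
COLUMNS = ["id", "title", "year", "age", "imdb", "rotten_tomatoes",
           "netflix", "hulu", "prime_video", "disney_plus", "is_tvshow"]


def convert_row(row):
    """Convert one csv.DictReader row into values matching the table schema.

    Assumption: a trailing "+" or "%" is dropped so the value fits an int
    column. Empty or otherwise non-numeric fields are not handled here.
    """
    values = []
    for column in COLUMNS:
        raw = row[column].strip()
        if column == "title":
            values.append(raw)            # text column
        elif column == "imdb":
            values.append(Decimal(raw))   # decimal column
        else:
            values.append(int(raw.rstrip("+%")))  # int columns
    return tuple(values)


def load_csv(session, csv_path):
    """Insert every row of csv_path into tvshow.tvshow_data; return the row count."""
    insert = session.prepare(
        "INSERT INTO tvshow.tvshow_data ({}) VALUES ({})".format(
            ", ".join(COLUMNS), ", ".join("?" for _ in COLUMNS)))
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            session.execute(insert, convert_row(row))
            count += 1
    return count
```

With the session object from the script above, this would be called as load_csv(session, csv_path), where csv_path is the full path to the CSV file. For the 2 GB case, row-by-row synchronous inserts may be slow, so this is a starting point rather than a tuned loader.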

Resulting cqlsh error:

Running the statements from the create_and_add_file_to_tvshow variable directly in cqlsh (started from PowerShell in the apache-cassandra-3.1.3/bin/ directory) produces the following error.

Note: The output below shows only the first few and the last few lines of the error. I chose not to include the rest, since it was several hundred lines long. If necessary, I will include it.

Starting copy of tvshow.tvshow_data with columns [id, title, year, age, imdb, rotten_tomatoes, netflix, hulu, prime_video, disney_plus, is_tvshow].
Failed to import 0 rows: IOError - Can't open 'C:tvshows.csv' for reading: no matching file found,  given up after 1 attempts
Process ImportProcess-44: 
PTrocess ImportProcess-41:
raceback (most recent call last):
PTPProcess ImportProcess-42:
...
...
...
AA   cls._loop.add_timer(timer)
AAttributeError: 'NoneType' object has no attribute 'add_timer'
ttributeError: 'NoneType' object has no attribute 'add_timer'
AttributeError: 'NoneType' object has no attribute 'add_timer'
ttributeError: 'NoneType' object has no attribute 'add_timer'
ttributeError: 'NoneType' object has no attribute 'add_timer'
Processed: 0 rows; Rate:       0 rows/s; Avg. rate:       0 rows/s
0 rows imported from 0 files in 1.974 seconds (0 skipped).
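
The "no matching file found" line suggests that cqlsh could not resolve 'C:tvshows.csv'. On Windows, a drive letter with no separator after it refers to a path relative to that drive's current directory, not to the drive root. The small sketch below uses the standard library's ntpath module, which applies Windows path rules on any platform, to show the difference. The second path is a hypothetical location used only for comparison, and describe is an illustrative helper name.

```python
import ntpath


def describe(path):
    """Split a Windows-style path and report whether it is absolute."""
    drive, rest = ntpath.splitdrive(path)
    return {"drive": drive, "rest": rest, "absolute": ntpath.isabs(path)}


# The path used in the COPY command: drive letter, but no separator.
relative = describe("C:tvshows.csv")

# Hypothetical absolute path, for comparison only.
absolute = describe("C:\\tvshows.csv")

print(relative)
print(absolute)
```

If the path turns out to be the problem, passing the full absolute path of the file to COPY in cqlsh may be worth trying.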

A sample of the first 10 lines of the CSV file used for the import:

I also tried creating a CSV file containing just the first two of these lines, as an even smaller test, since I couldn't get anything else to work.

id,title,year,age,imdb,rotten_tomatoes,netflix,hulu,prime_video,disney_plus,is_tvshow
0,Breaking Bad,2008,18+,9.5,96%,1,0,0,0,1
1,Stranger Things,2016,16+,8.8,93%,1,0,0,0,1
2,Money Heist,2017,18+,8.4,91%,1,0,0,0,1
3,Sherlock,2010,16+,9.1,78%,1,0,0,0,1
4,Better Call Saul,2015,18+,8.7,97%,1,0,0,0,1
5,The Office,2005,16+,8.9,81%,1,0,0,0,1
6,Black Mirror,2011,18+,8.8,83%,1,0,0,0,1
7,Supernatural,2005,16+,8.4,93%,1,0,0,0,1
8,Peaky Blinders,2013,18+,8.8,92%,1,0,0,0,1
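
One thing worth checking in this sample is that the age and rotten_tomatoes columns are declared as int in the table, while the file contains values such as "18+" and "96%". The sketch below, which assumes the header and first two data rows shown above, lists every field that Python's int() cannot parse. It does not reproduce cqlsh's own parsing, so it is only a rough pre-check, and find_non_integer_fields is an illustrative helper name.

```python
import csv
import io

# Columns declared as int in the CREATE TABLE statement.
INT_COLUMNS = ["id", "year", "age", "rotten_tomatoes", "netflix",
               "hulu", "prime_video", "disney_plus", "is_tvshow"]

# Header and first two data rows from the sample above.
SAMPLE = """id,title,year,age,imdb,rotten_tomatoes,netflix,hulu,prime_video,disney_plus,is_tvshow
0,Breaking Bad,2008,18+,9.5,96%,1,0,0,0,1
1,Stranger Things,2016,16+,8.8,93%,1,0,0,0,1
"""


def find_non_integer_fields(csv_text, int_columns):
    """Return (row id, column, value) for each field that int() rejects."""
    problems = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in int_columns:
            try:
                int(row[column])
            except ValueError:
                problems.append((row["id"], column, row[column]))
    return problems


problems = find_non_integer_fields(SAMPLE, INT_COLUMNS)
for row_id, column, value in problems:
    print("row {}: column {} has non-integer value {!r}".format(row_id, column, value))
```

If such values are reported, either the CSV needs cleaning before import or the affected columns could be declared as text instead.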