How to join multiple data sources in Cassandra


I'm working with Cassandra for the first time and have some questions. My data sources are three CSV files: flights, airplane and airport. Their structures are shown below to give context for my problem.

Airport

ID_airport | airport | city | state | country | latitude | longitude

Airplane

ID_airplane | type | manufacturer | issue_date | model | engine_type | aircraft_type

Flights

ID_flight | date | Flight_Numb | ID_airplane | ID_airport_origin | ID_airport_dest | DepartureTime | Arrival_time | airline | distance | DepDelay | ArrivalDelay

The Flights file is the main one and has millions of records. The other two hold supplemental data.

From what I have read about Cassandra, you should first define the queries you need and then create column families that serve them. However, Cassandra does not support JOINs. How can I relate data in one CSV file to data in another, so that I can create a column family with fields from different CSV files?
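One approach commonly used with Cassandra is to do the join in client code while loading, so that each row written already carries the fields a given query needs. The following is only a minimal sketch of that idea, not a definitive loader: the inline CSV samples contain made-up values, the helper names `load_lookup` and `denormalize_flights` are hypothetical, and it assumes the small airplane file fits in memory while the large flights file is streamed row by row.

```python
import csv
import io

# Tiny inline samples standing in for the real files (all values are made up).
AIRPLANES_CSV = """ID_airplane,type,manufacturer,issue_date,model,engine_type,aircraft_type
N1,Corporation,BOEING,1998-01-01,737-300,Turbo-Fan,Fixed Wing Multi-Engine
N2,Corporation,AIRBUS,2001-05-10,A320,Turbo-Fan,Fixed Wing Multi-Engine
"""
FLIGHTS_CSV = """ID_flight,date,Flight_Numb,ID_airplane,ID_airport_origin,ID_airport_dest,DepartureTime,Arrival_time,airline,distance,DepDelay,ArrivalDelay
1,2008-01-03,335,N1,IAD,TPA,2003,2211,WN,810,8,-14
2,2008-01-03,3231,N2,IAD,TPA,754,1002,WN,810,19,2
3,2008-01-04,448,N1,IND,BWI,628,804,WN,515,-4,-6
"""


def load_lookup(csv_text, key):
    """Read a small supplemental CSV into a dict keyed by its ID column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def denormalize_flights(flights_text, airplanes):
    """Stream flights and attach the airplane model to each one.

    This replaces the JOIN: the lookup happens once, at load time.
    """
    for flight in csv.DictReader(io.StringIO(flights_text)):
        plane = airplanes.get(flight["ID_airplane"], {})
        yield {
            "model": plane.get("model", "UNKNOWN"),
            "id_flight": flight["ID_flight"],
            "date": flight["date"],
            "dep_delay": int(flight["DepDelay"]),
            "arrival_delay": int(flight["ArrivalDelay"]),
        }


airplanes = load_lookup(AIRPLANES_CSV, "ID_airplane")
rows = list(denormalize_flights(FLIGHTS_CSV, airplanes))
```

Each dictionary in `rows` would then be inserted into a query-specific column family (for instance one partitioned by `model`), using whatever Cassandra driver or bulk loader is available. With real files, `io.StringIO(...)` would be replaced by an open file handle, and rows would be written as they are produced rather than collected in a list.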

For example, suppose I want to know which airplane model has the most delayed flights. In the relational model this is possible with JOINs, but in Cassandra I think it's impossible.
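To make that example concrete, here is a rough sketch of how the question could be answered once every flight row already carries its airplane model. It assumes the rows have been denormalized beforehand, treats a flight as delayed when `arrival_delay` is greater than zero (an assumption, since the files do not define "delay"), and uses made-up sample rows and a hypothetical function name.

```python
from collections import Counter

# Hypothetical denormalized rows: each flight already includes its model.
rows = [
    {"model": "737-300", "arrival_delay": 12},
    {"model": "A320", "arrival_delay": 0},
    {"model": "737-300", "arrival_delay": 45},
    {"model": "A320", "arrival_delay": -3},
    {"model": "A320", "arrival_delay": 7},
]


def delayed_flights_by_model(rows, threshold_minutes=0):
    """Count flights per model whose arrival delay exceeds the threshold."""
    counts = Counter()
    for row in rows:
        if row["arrival_delay"] > threshold_minutes:
            counts[row["model"]] += 1
    return counts


counts = delayed_flights_by_model(rows)
# most_common(1) returns the (model, count) pair with the highest count.
worst_model, worst_count = counts.most_common(1)[0]
print(worst_model, worst_count)
```

The resulting per-model counts are small, so they could be stored in their own column family keyed by model and read back with a single query, with no JOIN involved at read time.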

Is there any way to do this in Cassandra? How can I have a column family with fields from different CSV files?
