I'm working with Cassandra for the first time and have some questions. My data sources are CSV files. I have three: flights, airplane and airport. Below is the structure of each CSV file, to give context for my problem.
Airport
ID_airport | airport | city | state | country | latitude | longitude
Airplane
ID_airplane | type | manufacturer | issue_date | model | engine_type | aircraft_type
Flights
ID_flight | date | Flight_Numb | ID_airplane | ID_airport_origin | ID_airport_dest | DepartureTime | Arrival_time | airline | distance | DepDelay | ArrivalDelay
The Flights file is the main one and has millions of records. The other two hold supplemental data.
From what I have read about Cassandra, you should first define the queries you need and then create column families that serve those queries. However, Cassandra does not support JOINs. How can I relate data in one CSV file to data in another, so that I can create a column family containing fields from different CSV files?
For example, suppose I want to know which airplane model has the most delayed flights. In the relational model this is possible with JOINs, but in Cassandra I think it's impossible.
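To make the example concrete, here is a minimal sketch in plain Python of the result I'm after. It is only an illustration, not something I run against Cassandra: the sample rows are invented, the comma delimiter is an assumption, only the column names (`ID_airplane`, `model`, `ArrivalDelay`) come from my real files, and I'm assuming "delayed" means `ArrivalDelay > 0`. The function name `delayed_flights_per_model` is made up for this sketch.

```python
import csv
import io
from collections import Counter

# Invented sample rows; only the column names come from the real files.
AIRPLANE_CSV = """ID_airplane,model
1,A320
2,B737
"""

FLIGHTS_CSV = """ID_flight,ID_airplane,ArrivalDelay
10,1,15
11,1,0
12,2,30
13,2,5
14,2,-3
"""


def delayed_flights_per_model(airplane_file, flights_file):
    """Count delayed flights per airplane model by joining on ID_airplane."""
    # Lookup table built from the small supplemental file: ID_airplane -> model.
    model_by_airplane = {
        row["ID_airplane"]: row["model"] for row in csv.DictReader(airplane_file)
    }
    counts = Counter()
    # Stream the large flights file row by row and resolve the model for each flight.
    for flight in csv.DictReader(flights_file):
        # Assumption: a flight counts as delayed when ArrivalDelay is positive.
        if int(flight["ArrivalDelay"]) > 0:
            counts[model_by_airplane[flight["ID_airplane"]]] += 1
    return counts


counts = delayed_flights_per_model(
    io.StringIO(AIRPLANE_CSV), io.StringIO(FLIGHTS_CSV)
)
print(counts.most_common(1))  # [('B737', 2)]
```

This is the JOIN on `ID_airplane` done by hand in client code. What I don't know is how to get the same combined view into a Cassandra column family.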
Is there any way to do this in Cassandra? How can I have a column family with fields from different CSV files?