Apache Spark on cloud infrastructure


How can I process a large dataset efficiently using Apache Spark on a cloud Infrastructure as a Service (IaaS) platform? I have a dataset of over 60 million records that I need to process efficiently.


There is 1 best solution below

Subash On

There are many ways to do this. On Azure, you can use Azure Synapse Analytics or Azure Data Factory. On GCP, you can use a Dataproc cluster with Cloud Composer. It would help if you described the whole scenario: what is your exact source (CSV, RDBMS table, IoT) and what is the target/sink? With that information it would be easier to give a specific answer.
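Whichever service you choose, it can help to estimate up front how many partitions Spark should split the data into. The following is only a rough sketch in plain Python: the function name `estimate_partitions`, the 200-byte average row size, and the 128 MiB target are illustrative assumptions rather than measured values. The target is chosen to match the default of `spark.sql.files.maxPartitionBytes`.

```python
import math

# Assumption: aim for about 128 MiB per partition, which matches the
# default of spark.sql.files.maxPartitionBytes. Tune it for your workload.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def estimate_partitions(row_count: int, avg_row_bytes: int,
                        target_bytes: int = TARGET_PARTITION_BYTES) -> int:
    """Return a rough partition count for a dataset of the given size.

    This is a hypothetical helper for illustration, not part of Spark.
    """
    if row_count < 0 or avg_row_bytes <= 0 or target_bytes <= 0:
        raise ValueError("row_count must be >= 0 and sizes must be positive")
    total_bytes = row_count * avg_row_bytes
    # Always keep at least one partition, even for an empty dataset.
    return max(1, math.ceil(total_bytes / target_bytes))


if __name__ == "__main__":
    # Hypothetical figures: 60 million rows at about 200 bytes each.
    print(estimate_partitions(60_000_000, 200))
```

In a PySpark job the result could be passed to `df.repartition(n)`. Measure the real average row size from a sample of your source first, since the estimate is only as good as that number.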