You would need to perform quite a lot of compute to create a Spark table from the source DataFrame, no? Or are the DataFrame and the table both pointers to the same data (i.e. when creating a table you are not duplicating the data)?
I guess what I'm trying to figure out is whether you can 'switch on and off' between a Spark DataFrame and a table, or whether doing so is (very) computationally expensive (it's big data, after all...).
DataFrames and tables are different things in Spark.
A DataFrame is an immutable, distributed collection of data.
A table is metadata that points to the physical location from which the data has to be read.
When you convert a Spark DataFrame to a table, you are physically writing the data to storage, which could be HDFS, S3, an Azure container, etc. Once you have the data saved as a table, you can read it from anywhere: from a different Spark job, or through any other workflow.
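For concreteness, here is a minimal PySpark sketch of that round trip (the table name `events` and the source path are hypothetical):

```python
from pyspark.sql import SparkSession

# Hive support so saveAsTable registers the table in a persistent metastore.
spark = (SparkSession.builder
         .appName("df-to-table")
         .enableHiveSupport()
         .getOrCreate())

df = spark.read.parquet("/data/raw/events")  # hypothetical source path

# saveAsTable physically writes the data to the warehouse location
# (HDFS, S3, etc.) and records the table's metadata in the catalog.
df.write.mode("overwrite").saveAsTable("events")

# Any later job or session that can reach the same metastore can read it back:
events_tbl = spark.table("events")
```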
A DataFrame, on the other hand, is only valid within the specific Spark session in which you created it; once you close that session, you can no longer read the DataFrame or access its values. A DataFrame does not have a dedicated memory location or physical path where it gets saved. It is just a representation of the data that you read from some specific location.
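A rough sketch of that session-scoped, lazy behaviour (the source path and the `status` column are assumptions for illustration):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("df-lifetime").getOrCreate()

# A DataFrame is lazy: these lines build a query plan, they do not copy data.
df = spark.read.parquet("/data/raw/events")      # hypothetical source path
active = df.where(df["status"] == "active")      # "status" column is assumed

active.explain()  # prints the query plan; nothing has been materialized yet

spark.stop()
# Once the session is closed, `df` and `active` are gone for good;
# only the source files (and any tables you saved) still exist.
```

So to the original cost question: 'switching' from a DataFrame to a table means a full physical write of the data, while going from a table back to a DataFrame is cheap, since it is just a (lazy) read.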