I used Auto Loader to read data files and write them to a table periodically (without partitioning at first), using the code below:
.writeStream\
.option("checkpointLocation", "path") \
.format("delta")\
.outputMode("append")\
.start("table")
Now the data size is growing, and I want to partition the data by adding the option .partitionBy("col1"):
.writeStream\
.option("checkpointLocation", "path") \
.partitionBy("col1")\
.format("delta")\
.outputMode("append")\
.start("table")
Will the partitionBy("col1") option also partition the data already in the table? If not, how can I partition all the data (both the existing data and newly ingested data)?
No, it won't partition existing data automatically; you need to do that explicitly. Something like this (test it first on a small dataset):
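A minimal sketch of the explicit rewrite, assuming the streaming job is stopped first and that `spark` is an active SparkSession (as it is in a Databricks notebook). The function name `repartition_delta_table` and its parameters are hypothetical, chosen for illustration; "table" and "col1" are the placeholder path and column from the question. It reads the current contents of the Delta table and overwrites the table in place with the new partitioning:

```python
def repartition_delta_table(spark, table_path, partition_col):
    """Rewrite an existing Delta table in place, partitioned by partition_col.

    `spark` is assumed to be an active SparkSession with Delta Lake available.
    """
    # Read the current snapshot of the unpartitioned table.
    existing = spark.read.format("delta").load(table_path)

    # Overwrite the same location with the partitioned layout.
    # overwriteSchema=true lets Delta accept the change in partitioning.
    (
        existing.write
        .format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")
        .partitionBy(partition_col)
        .save(table_path)
    )
```

Calling `repartition_delta_table(spark, "table", "col1")` once should leave all existing rows laid out by col1. After that, the stream can be restarted with .partitionBy("col1") so that newly ingested data follows the same layout.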