DEV Community


Streaming Data in Databricks Delta Tables

Will Velida on July 23, 2018

Databricks Delta uses both Apache Spark and Databricks File System (DBFS) to provide a transactional storage layer that can do incredible things fo...
Swati Arora • Edited

Hi Will,
Thanks for the amazing write-up.
But I am facing an issue while executing the command:
tweets.write.format("delta").mode("append").saveAsTable("tweets")
The first time, the data is stored in the Delta table, but executing it again gives me the error:
"org.apache.spark.sql.AnalysisException: Cannot create table ('default.tweets'). The associated location ('dbfs:/user/hive/warehouse/tweets') is not empty.;"

How can I make sure the data continuously gets stored in table format as well?

Thanks in advance
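One way around this error is to only let saveAsTable create the table on the first write and append thereafter. Below is a minimal sketch, assuming a `spark` session and a `tweets` DataFrame are in scope and Spark 3.3+ (for `spark.catalog.tableExists`); the helper name is hypothetical.

```python
def append_tweets(spark, tweets, table_name="tweets"):
    # Sketch: append if the table is already registered, create it otherwise.
    if spark.catalog.tableExists(table_name):
        tweets.write.format("delta").mode("append").saveAsTable(table_name)
    else:
        # First write creates the table. If a previous table was dropped but
        # its warehouse directory was left behind, that directory must be
        # cleared (e.g. with dbutils.fs.rm) before this create will succeed.
        tweets.write.format("delta").saveAsTable(table_name)
```

The "location is not empty" error typically means the metastore entry is gone (e.g. the table was dropped) but files remain at `dbfs:/user/hive/warehouse/tweets`, so the create step refuses to reuse the directory.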

Muhammad Bilal Shafqat

Hey Will, nice post. I think I would write the data directly to the Delta table instead of writing it to Parquet files first. If I write it as Parquet and then read it into the Delta table, only the rows present in the Parquet files on DBFS at that first read get ingested into the table; rows arriving after that are not ingested, and I would have to manually run read.saveAsTable() again to get them into the Delta table. Try it, and please share your thoughts. Thanks.
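The direct-write approach described above can be sketched as a single streaming write into the Delta table, with no intermediate Parquet step. This is only an illustration: the source stream and checkpoint path are assumptions, not from the original post.

```python
def stream_tweets_to_delta(tweets_stream,
                           checkpoint_path="/tmp/checkpoints/tweets"):
    # Continuously append the streaming DataFrame into the managed Delta
    # table "tweets"; the checkpoint tracks progress across restarts.
    return (tweets_stream.writeStream
            .format("delta")
            .outputMode("append")
            .option("checkpointLocation", checkpoint_path)
            .table("tweets"))
```

Because the stream writes straight to the table, new rows are ingested as they arrive, rather than only on an explicit re-read of the Parquet files.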

Parath kumar Sabesan

Hi Will,

When I try to write streaming data to a partitioned managed Delta table, it doesn't load any data into the table and also doesn't show any error. The same thing works fine with a non-partitioned managed Delta table. What am I missing here?
dfWrite.writeStream \
    .partitionBy("submitted_yyyy_mm") \
    .format("delta") \
    .outputMode("append") \
    .queryName("orders") \
    .option("checkpointLocation", orders_checkpoint_path) \
    .table(user_db + "." + orders_table)
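Two things commonly cause this silent no-op: the target table already exists with a different (or no) partitioning than the stream's `partitionBy`, or the checkpoint directory is left over from an earlier run, so the stream resumes from old offsets and finds nothing new to write. A hedged sketch of one fix, pre-creating the table with a matching partition column and a fresh checkpoint; the column schema here is an assumption for illustration only:

```python
def start_partitioned_orders_stream(spark, dfWrite, user_db, orders_table,
                                    checkpoint_path):
    full_name = user_db + "." + orders_table
    # Create the table up front so its partitioning matches the stream's
    # partitionBy column (schema below is a placeholder, not from the post).
    spark.sql(f"""
        CREATE TABLE IF NOT EXISTS {full_name} (
            order_id STRING,
            submitted_yyyy_mm STRING
        )
        USING DELTA
        PARTITIONED BY (submitted_yyyy_mm)
    """)
    # Use a checkpoint path that has not been used by a previous,
    # differently-configured query.
    return (dfWrite.writeStream
            .partitionBy("submitted_yyyy_mm")
            .format("delta")
            .outputMode("append")
            .queryName("orders")
            .option("checkpointLocation", checkpoint_path)
            .table(full_name))
```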