Databricks Delta uses both Apache Spark and Databricks File System (DBFS) to provide a transactional storage layer that can do incredible things fo...
Hi Will,
Thanks for the amazing write-up.
I am facing an issue while executing this command:
tweets.write.format("delta").mode("append").saveAsTable("tweets")
The first time, the data is stored in the Delta table, but executing it again gives me this error:
"org.apache.spark.sql.AnalysisException: Cannot create table ('default.tweets'). The associated location ('dbfs:/user/hive/warehouse/tweets') is not empty.;"
How can I make sure the data keeps getting stored in table format as well?
Thanks in advance
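One possible workaround, sketched below with hypothetical names (`append_tweets`, the `/delta/tweets` path) and assuming a Databricks notebook where `spark` (a SparkSession) and the `tweets` DataFrame are already in scope: append to an explicit Delta path and register the table metadata only once, so re-runs append to the existing location instead of trying to re-create the table over it.

```python
# Hedged sketch, not the post author's code: assumes `spark` and `tweets`
# exist in a Databricks notebook; the path and function name are illustrative.
def append_tweets(spark, tweets, delta_path="/delta/tweets"):
    """Append a batch to a Delta table at an explicit path, registering
    the table metadata only if it does not exist yet."""
    # Re-runs append new rows to the same Delta location.
    tweets.write.format("delta").mode("append").save(delta_path)
    # CREATE TABLE IF NOT EXISTS is a no-op on later runs, avoiding the
    # "associated location is not empty" error.
    spark.sql(
        f"CREATE TABLE IF NOT EXISTS tweets USING DELTA LOCATION '{delta_path}'"
    )
```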
Hey Will, nice post. I think I would write the data directly to the Delta table instead of writing it to Parquet files first. If I write the rows as Parquet and then read them into a Delta table, only the rows present in the Parquet files on DBFS at that moment get ingested into the table; rows arriving after that are not ingested, and I would have to manually re-run read/saveAsTable() to get them into the Delta table. Try it, and please share your thoughts. Thanks.
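A minimal sketch of the direct-to-Delta approach this comment describes; the function and parameter names are my own, and it assumes `tweets_stream` is a streaming DataFrame. It mirrors the `.table()` streaming-writer call used elsewhere in this thread.

```python
# Hedged sketch, assuming a Databricks notebook with a streaming DataFrame
# `tweets_stream`; names here are illustrative, not from the original post.
def stream_tweets_to_delta(tweets_stream, checkpoint_path, table_name="tweets"):
    # Stream straight into a managed Delta table so new rows are ingested
    # continuously, with no intermediate Parquet step to re-read manually.
    return (tweets_stream.writeStream
            .format("delta")
            .outputMode("append")
            .option("checkpointLocation", checkpoint_path)
            .table(table_name))
```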
Hi Will,
When I try to write streaming data to a partitioned managed Delta table, it doesn't load any data into it, and it doesn't show any error either.
The same thing works fine with a non-partitioned managed Delta table.
What am I missing here?
dfWrite.writeStream\
.partitionBy("submitted_yyyy_mm")\
.format("delta")\
.outputMode("append")\
.queryName("orders")\
.option("checkpointLocation", orders_checkpoint_path)\
.table(user_db+"."+orders_table)
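A hedged guess at one common cause, not a confirmed fix: if the checkpoint directory was already used by an earlier (non-partitioned) run of the same query, the restarted stream can sit idle without errors because it believes the input was already processed. The sketch below derives a fresh, per-table checkpoint path; the helper names are illustrative.

```python
# Hypothetical helper: keep the partitioned query's streaming state separate
# from any earlier run's checkpoint so the new query starts cleanly.
def partitioned_checkpoint(checkpoint_root, orders_table):
    return f"{checkpoint_root}/{orders_table}_partitioned"

# Sketch of the commenter's write, assuming `dfWrite` is a streaming DataFrame.
def start_orders_stream(dfWrite, user_db, orders_table, checkpoint_root):
    return (dfWrite.writeStream
            .partitionBy("submitted_yyyy_mm")
            .format("delta")
            .outputMode("append")
            .queryName("orders")
            .option("checkpointLocation",
                    partitioned_checkpoint(checkpoint_root, orders_table))
            .table(user_db + "." + orders_table))
```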