DEV Community

Chris


How to handle complex JSON schema

I am working in Databricks. I have a mounted external directory that is an S3 bucket with multiple subdirectories containing call log files in JSON format. The files are irregular and complex; when I try to use spark.read.json or spark.sql (SELECT *), I get the UNABLE_TO_INFER_SCHEMA error. The files are too complex to build a schema for manually, and there are thousands of them. What is the best approach for creating a DataFrame from this data?
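To show what I mean by irregular: different files have overlapping but non-identical fields and nesting. One idea I've been sketching is to sample a handful of files and merge their key structures into a superset, which could then be translated by hand into an explicit Spark schema. The snippet below is a plain-Python illustration of that merging step only; the records, field names, and helper functions are hypothetical stand-ins, not my real data or any Databricks API.

```python
import json

def merge_structure(a, b):
    """Recursively union two JSON document structures.

    Dicts merge key-by-key, lists fold their elements together, and
    irreconcilable scalar types widen to "string" as a safe common
    denominator.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, val in b.items():
            merged[key] = merge_structure(merged[key], val) if key in merged else val
        return merged
    if isinstance(a, list) and isinstance(b, list):
        elems = a + b
        if not elems:
            return []
        out = elems[0]
        for e in elems[1:]:
            out = merge_structure(out, e)
        return [out]
    if type(a) is type(b):
        return a
    return "string"  # mismatched types: fall back to string

def infer_structure(docs):
    """Fold a sample of parsed JSON documents into one superset structure."""
    out = {}
    for doc in docs:
        out = merge_structure(out, doc)
    return out

# Two hypothetical call-log records with overlapping but different fields.
sample = [
    json.loads('{"call_id": 1, "agent": {"name": "a"}, "tags": ["x"]}'),
    json.loads('{"call_id": 2, "agent": {"name": "b", "team": "t"}, "duration": 3.5}'),
]
superset = infer_structure(sample)
print(superset)
```

Running this on the two sample records yields one structure containing every field seen in either record (call_id, agent.name, agent.team, tags, duration), which is the shape an explicit StructType would need to cover.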

