Spark: Replace collect()[][]

#spark

I have the following code:

new_df=spark.sql("Select col1,col2 from table1 where id=2").collect()[0][0]

I have tried toLocalIterator(), but I get an error saying the result is not subscriptable.
Please suggest a better way to replace collect()[0][0].

Top comments (1)

Darren Fuller • Jul 10 '21

So that looks like you're after the value in the first column of the first row. If that's the case then you could use the following.

spark.sql("SELECT col1, col2 FROM table1 WHERE id=2").first()[0]

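That will return the first row as a Row object, which you can then access by index. You should find it performs better as well, because collect() pulls every matching row to the driver before you index into the first one, whereas first() only fetches a single row. Note that first() returns None when the query matches no rows, so indexing the result directly will fail in that case.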
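If you want to guard against the empty-result case, something like the following might work. It is only a sketch: the helper names first_value and first_value_via_iterator are made up for illustration and are not part of PySpark. The second helper also shows why toLocalIterator() was not subscriptable: it returns a plain Python iterator, so you step through it with next() rather than indexing it.

```python
def first_value(df, default=None):
    """Return the first column of the first row, or `default` if there are no rows.

    `df` is assumed to be a PySpark DataFrame (anything with a .first() method works).
    """
    row = df.first()  # a Row, or None when the DataFrame is empty
    return default if row is None else row[0]


def first_value_via_iterator(df, default=None):
    """Same result, but using toLocalIterator(), which yields rows one at a time."""
    # An iterator cannot be indexed with [0]; next() pulls its first item instead.
    row = next(df.toLocalIterator(), None)
    return default if row is None else row[0]
```

With the query from the question you would call it as first_value(spark.sql("SELECT col1, col2 FROM table1 WHERE id=2")). For a single value, first() is the simpler choice; the iterator version is mostly useful when you want to walk through many rows without collecting them all at once.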