DEV Community

Discussion on: Tutorial: Intro to Apache Iceberg with Apache Polaris and Apache Spark

Timur K.

Unfortunately, I get the following error when I try to execute the configuration block printed by bootstrap.py. Something seems to be wrong with the S3 credentials. Any ideas how to fix this?

25/11/24 21:51:54 ERROR Utils: Aborting task
java.io.UncheckedIOException: Failed to close current writer
    at org.apache.iceberg.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:128)
    at org.apache.iceberg.io.RollingFileWriter.close(RollingFileWriter.java:156)
    at org.apache.iceberg.io.RollingDataWriter.close(RollingDataWriter.java:32)
    at org.apache.iceberg.spark.source.SparkWrite$UnpartitionedDataWriter.close(SparkWrite.java:778)
    at org.apache.iceberg.spark.source.SparkWrite$UnpartitionedDataWriter.commit(SparkWrite.java:760)
    at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$1(WriteToDataSourceV2Exec.scala:470)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1397)
    ...
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
    at software.amazon.awssdk.services.s3.DefaultS3Client.putObject(DefaultS3Client.java:11883)
    at org.apache.iceberg.aws.s3.S3OutputStream.completeUploads(S3OutputStream.java:443)
    at org.apache.iceberg.aws.s3.S3OutputStream.close(S3OutputStream.java:269)
    at org.apache.iceberg.aws.s3.S3OutputStream.close(S3OutputStream.java:255)
    at org.apache.iceberg.shaded.org.apache.parquet.io.DelegatingPositionOutputStream.close(DelegatingPositionOutputStream.java:40)
    ... 26 more
25/11/24 21:51:54 ERROR DataWritingSparkTask: Aborting commit for partition 0 (task 0, attempt 0, stage 0.0)
25/11/24 21:51:54 WARN AuthManagers: Inferring rest.auth.type=oauth2 since property credential was provided. Please explicitly set rest.auth.type to avoid this warning.
25/11/24 21:51:54 WARN S3FileIO: Encountered failure when deleting batch
java.lang.IllegalStateException: Invalid S3 Credentials: s3.access-key-id not set
    at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkState(Preconditions.java:603)
    at org.apache.iceberg.aws.s3.VendedCredentialsProvider.checkCredential(VendedCredentialsProvider.java:185)
    at org.apache.iceberg.aws.s3.VendedCredentialsProvider.refreshCredential(VendedCredentialsProvider.java:158)
    at java.base/java.util.Optional.orElseGet(Optional.java:369)
    at org.apache.iceberg.aws.s3.VendedCredentialsProvider.lambda$new$0(VendedCredentialsProvider.java:63)
    at software.amazon.awssdk.utils.cache.CachedSupplier.lambda$jitteredPrefetchValueSupplier$8(CachedSupplier.java:300)
    at software.amazon.awssdk.utils.cache.CachedSupplier$PrefetchStrategy.fetch(CachedSupplier.java:448)
    at software.amazon.awssdk.utils.cache.CachedSupplier.refreshCache(CachedSupplier.java:208)
    at software.amazon.awssdk.utils.cache.CachedSupplier.get(CachedSupplier.java:135)
    at org.apache.iceberg.aws.s3.VendedCredentialsProvider.resolveCredentials(VendedCredentialsProvider.java:72)
    at software.amazon.awssdk.auth.credentials.AwsCredentialsProvider.resolveIdentity(AwsCredentialsProvider.java:54)
    at software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.lambda$trySelectAuthScheme$6(S3AuthSchemeInterceptor.java:169)
    at software.amazon.awssdk.core.internal.util.MetricUtils.reportDuration(MetricUtils.java:81)
    at software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.trySelectAuthScheme(S3AuthSchemeInterceptor.java:169)
    at software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.selectAuthScheme(S3AuthSchemeInterceptor.java:87)
    at software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.beforeExecution(S3AuthSchemeInterceptor.java:67)
    at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.lambda$beforeExecution$1(ExecutionInterceptorChain.java:59)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
    at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.beforeExecution(ExecutionInterceptorChain.java:59)
    at software.amazon.awssdk.awscore.internal.AwsExecutionContextBuilder.runInitialInterceptors(AwsExecutionContextBuilder.java:315)
    at software.amazon.awssdk.awscore.internal.AwsExecutionContextBuilder.invokeInterceptorsAndCreateExecutionContext(AwsExecutionContextBuilder.java:151)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.invokeInterceptorsAndCreateExecutionContext(AwsSyncClientHandler.java:67)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:76)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
    at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
    at software.amazon.awssdk.services.s3.DefaultS3Client.deleteObjects(DefaultS3Client.java:4020)
    at org.apache.iceberg.aws.s3.S3FileIO.deleteBatch(S3FileIO.java:340)
    at org.apache.iceberg.aws.s3.S3FileIO.lambda$deleteFiles$3(S3FileIO.java:276)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
25/11/24 21:51:54 WARN S3FileIO: Failed to delete object at path s3://lakehouse/db/example/data/00000-0-9167437d-312f-4df6-8a65-19d91e04e5f9-0-00001.parquet


...


---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
Cell In[2], line 19
     17 spark.sql("CREATE NAMESPACE IF NOT EXISTS polaris.db").show()
     18 spark.sql("CREATE TABLE IF NOT EXISTS polaris.db.example (name STRING)").show()
---> 19 spark.sql("INSERT INTO polaris.db.example VALUES ('example value')").show()
     20 spark.sql("SELECT * FROM polaris.db.example").show()
 
Alex Merced

java.lang.IllegalStateException: Invalid S3 Credentials: s3.access-key-id not set
This seems to be the main problem. Make sure that your S3 credentials are set in your environment and in your Polaris server's environment variables.
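
A quick way to check the environment side of this is a small sketch like the one below. It assumes the standard AWS SDK variable names (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`); whether the tutorial's Spark and Polaris containers read exactly these names is an assumption, and the helper `missing_s3_credentials` is hypothetical. Note that it only inspects the process environment, not the credentials that the catalog vends to Spark.

```python
import os

# Standard AWS SDK variable names. That the tutorial's containers rely on
# exactly these is an assumption, not something the thread confirms.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_s3_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Report which variables, if any, are missing in the current process.
missing = missing_s3_credentials()
if missing:
    print("Missing S3 credential variables:", ", ".join(missing))
else:
    print("Both S3 credential variables are set.")
```

Running this inside both the Spark container and the Polaris container would show whether either one is missing the variables.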

Timur K.

Alex, thanks a lot for your reply. I think I found the cause: it is the option "stsUnavailable": True in the storageConfigInfo properties. I believe it should not be there, since MinIO supports STS, and removing it fixed the issue.
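
For anyone applying the same fix, here is a minimal sketch of stripping that option before the catalog is created. Only the `stsUnavailable` key and the `storageConfigInfo` name come from this thread; the other fields in the example dictionary and the helper `drop_sts_unavailable` are hypothetical placeholders, and the real payload built by bootstrap.py may look different.

```python
import copy


def drop_sts_unavailable(storage_config_info):
    """Return a copy of the storage config without the stsUnavailable key."""
    cleaned = copy.deepcopy(storage_config_info)
    # pop with a default so a config that never had the key passes through
    cleaned.pop("stsUnavailable", None)
    return cleaned


# Hypothetical storageConfigInfo; every field except stsUnavailable is an
# illustrative assumption, not the tutorial's actual configuration.
storage_config_info = {
    "storageType": "S3",
    "allowedLocations": ["s3://lakehouse/"],
    "stsUnavailable": True,
}

fixed = drop_sts_unavailable(storage_config_info)
print(fixed)
```

The helper returns a copy rather than mutating its argument, so the original configuration stays available for comparison.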