A note for the TempDir in Glue ETL Pyspark case: the fact that is temporary could probably let understimate its role in the process. Instead, from my experience, it is preferred to be pointed to a specific prefix in a bucket to avoid reaching the limit of 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second.
So it is be better to use i.e. s3://aws-glue-temporary--us-east-1/ds1_raw_to_refined/, specifying a prefix for each job.
In fact an error like "503 Slow Down" may appear, without however indicating the source of the problem, which it can be the temporary folder, , not the destination one.
For further actions, you may consider blocking this person and/or reporting abuse
We're a place where coders share, stay up-to-date and grow their careers.
A note for the TempDir in Glue ETL Pyspark case: the fact that is temporary could probably let understimate its role in the process. Instead, from my experience, it is preferred to be pointed to a specific prefix in a bucket to avoid reaching the limit of 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second.
So it is be better to use i.e. s3://aws-glue-temporary--us-east-1/ds1_raw_to_refined/, specifying a prefix for each job.
In fact an error like "503 Slow Down" may appear, without however indicating the source of the problem, which it can be the temporary folder, , not the destination one.