I recently passed the GCP Professional Data Engineer certification exam and some people asked me for tips for the exam and study materials, so I decided to write this post to explain the path I took.
This is certainly not the only possible way to prepare for this test, but it was the one that worked for me. So be open-minded as each person has different ways of studying and learning.
Unlike some other certifications, the Professional Data Engineer (PDE) is definitely not a simple exam where you can study for 8 hours and be ready. Google writes its questions in a way that only someone with hands-on experience and a real understanding of its services can get through.
It is important to note that a certification is the validation of the knowledge you gain. The goal is not to memorize questions, but to actually understand the services the cloud offers so that you can apply them in your day-to-day work. Keep this in mind as you prepare.
For this exam, Google recommends the following background: 3+ years of industry experience, including 1+ years designing and managing solutions using Google Cloud. When I took the test I had 3 years of experience in data engineering but no background with data-oriented cloud services. I did have some general cloud knowledge, having used AWS for 6 months at my previous job, but that was in a DevOps context rather than data engineering. Either way, I think it's important to have basic knowledge of cloud architecture before attempting a specialized exam like the GCP PDE. Given that starting point, I needed 4 months of intense study to feel minimally confident about taking the exam.
Before starting your preparation, it is worth exploring the official website. It contains all up-to-date information about the test and its content. It is worth mentioning that on the official website the cost of the exam is 200 dollars, but because I live in Brazil, I only paid 120 dollars. Google provides a section of the site, called Exam Guide, to explain what is on the test. So it's worth reading before starting your study and also taking a look when you feel you've studied a considerable amount of material, to see where you stand on the requirements.
If your company is a Google Cloud partner, it is very likely that it has the benefit of vouchers for its employees. These vouchers are usually available to anyone who completes a knowledge path made available by Google on the Qwiklabs platform.
The exam has a total of 50 questions and is 2 hours long. I found this to be enough time to answer all the questions and even review the ones I was unsure about.
One last comment about the test: due to the COVID-19 pandemic, it is possible to take the exam online. This option is a bit of a hassle, though, because Google requires you to have a well-prepared environment for the test. Personally, I prefer to take it in person at an accredited center, as this eliminates the risk of computer and internet problems at home and allows a greater level of concentration.
There are many online and face-to-face courses that prepare you for the GCP Professional Data Engineer exam. I will list here the ones I used and whose quality I can attest to.
This was the main course I used to prepare for the exam. This is probably the main study platform for Cloud certifications. This course pretty much covers everything you need to pass the exam and is very detailed. Although more expensive, this platform also includes labs to practice what is taught in class.
This was another course I took and I really enjoyed it. The instructor connects the services to one another in a very didactic way. It is also worth mentioning that Udemy frequently runs promotions on its courses. I managed to buy this one for 25 reais. The course structure is constantly updated along with the test content.
This is the platform where Google makes its own training available. Companies that are Google Cloud partners typically have access to it. I took most of the courses on the data engineering track, but I consider them to be of average quality. The best thing about this platform is the large number of labs to practice with. Cloud Guru also has labs, but Qwiklabs has a lot more.
As I mentioned at the outset, practical experience is very important when studying for the certification. While Google recommends a year of hands-on Google Cloud work before taking the exam, you can build that experience in other ways. The first and most recommended is to create an account in the so-called Free Tier. This account will allow you to use Google Cloud Platform resources to practice what you learn in class. All the courses mentioned teach how to create this account. But if you want to check it out, you can take a look at this video tutorial.
The other way to practice is through platforms that offer labs. As I mentioned in the last section, the two platforms I've used that offer this feature are Cloud Guru and Qwiklabs, with Qwiklabs having a larger number of labs to practice with.
To be clear, I prefer creating a Free Tier account, as it allows for a much more complete learning experience than working in a ready-made environment.
In addition to the courses, other resources were essential for passing the exam.
This excellent book written by Dan Sullivan (yes, the same author as the second course listed) contains almost all the information needed to pass the exam. It has well-organized chapters on GCP services and highlights many key points to remember when answering exam questions. At the end of each section, there are tests on the content presented. To tell you the truth, after reading this book I felt my knowledge had improved by about 40-50%, and my results on the practice tests improved significantly.
The YouTube channel The Cloud Girl and the book Visualizing Google Cloud are produced by Priyanka Vergadia, a Google Cloud Developer Advocate. She manages to explain GCP's products and services through beautifully crafted illustrations. Both the book and the channel are great teaching resources and helped me a lot in understanding important concepts for the exam.
I consider reading the documentation of each GCP service and product very important, not only for the test but also for doing the job of a GCP Data Engineer day to day. I recommend consulting the documentation whenever you have a doubt about a Google Cloud service; it is a resource you should use from the very start of your preparation until the end.
An extremely important step in your preparation is taking practice exams. This is the classic case of the more the better. Identify the areas of study where you are getting the most questions wrong and read the corresponding documentation again. Prepare yourself for many business scenario questions asking which technology to use, and for questions with more than one correct answer. In my exam, several questions asked me to select two alternatives. Below I have listed the platforms I used throughout my preparation. An important detail is that some of the platforms let you pause the practice exam, so you don't have to do all 50 questions at once. Even so, I recommend that you do at least three practice exams without interruptions.
- Exam Topics
- Practice exam at the end of the Cloud Guru course
- Practice exam at the end of the Dan Sullivan course
- Practice exams at the beginning of the Dan Sullivan book and at the end of each chapter
- Google's sample questions
I also watched the videos on the AwesomeGCP YouTube channel by Sathish VJ. It is an excellent study resource because, in addition to tips and the content itself, it has several videos that walk through questions with commentary.
These are the services that came up most on my test, that is, those with two or more questions each:
- ML concepts + AI on GCP (Vertex AI, Vision, NLP, and other APIs).
- BigQuery (some questions covering details about the tool, like backup, and SQL constructs, like window functions).
- Bigtable (the main detail is designing tables to avoid hot spots).
- Cloud SQL.
- Dataproc (some questions addressing Dataproc + HDFS).
- Pub/Sub (especially for scenarios where systems need to be decoupled).
- Cloud Storage (the main topic was storage classes).
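To make the Bigtable hot spot point more concrete, here is a minimal sketch in plain Python (no Bigtable client involved). It is not taken from the exam or from any of the courses; it only illustrates the general idea that a row key starting with a timestamp sends all new writes to the same key range, while putting an identifier first spreads them out. The function names, the `#` separator, and the `MAX_MILLIS` constant are my own assumptions for illustration.

```python
from datetime import datetime, timezone

# Hypothetical upper bound used only to reverse timestamps in this sketch.
MAX_MILLIS = 10**13


def hotspot_prone_key(device_id: str, ts: datetime) -> str:
    """Timestamp-first key: rows written at the same time sort next to
    each other, so new writes concentrate on one key range."""
    millis = int(ts.timestamp() * 1000)
    return f"{millis:013d}#{device_id}"


def promoted_key(device_id: str, ts: datetime) -> str:
    """Identifier-first key: writes from different devices land in
    different key ranges. The reversed timestamp makes the newest
    reading of a device sort first within that device's rows."""
    millis = int(ts.timestamp() * 1000)
    return f"{device_id}#{MAX_MILLIS - millis:013d}"


if __name__ == "__main__":
    now = datetime(2022, 1, 1, tzinfo=timezone.utc)
    for device in ("sensor-a", "sensor-b", "sensor-c"):
        print(hotspot_prone_key(device, now), "|", promoted_key(device, now))
```

Running it shows that the timestamp-first keys all share the same prefix, while the identifier-first keys differ from the first character. In exam scenarios, this is usually the kind of reasoning being asked for rather than any specific API call.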
Other services appeared with specific questions, with a maximum of two questions each:
- Cloud Spanner
- Compute Engine
- Data Loss Prevention
It is also very important to be familiar with cloud concepts such as IAM, which governs who has permission to do what on cloud resources.
In this post I presented the steps I followed to prepare for the GCP Professional Data Engineer exam. As explained at the beginning, there is no “silver bullet” to succeed in the test. Some things work better with one person than another, and I'm sure there are other (and better) ways to prepare.
Honestly, even after studying very intensively for 4 months, by the third question my feeling was that I only knew a little bit about everything and that it wouldn't be enough to pass, since several questions asked about small details of the tools. At several moments during the test I was sure I wouldn't pass, and in those moments anxiety and despair took over. I wasted at least 5 minutes of exam time staring at nothing and regretting that I wasn't doing well with the answers. But I tried to stay calm and concentrate. At the end of the exam I received the result of "Pass". I was extremely happy and relieved to know that all the effort and sacrifices had paid off. After 3 days Google finally sent me the result confirmation and the GCP Professional Data Engineer certificate.
If you have any questions about the preparation, feel free to message me. I hope I was able to convey useful information to you, and I wish you good luck in the exam!