<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Peter Hanssens #BlackLivesMatter</title>
    <description>The latest articles on DEV Community by Peter Hanssens #BlackLivesMatter (@petehanssens).</description>
    <link>https://dev.to/petehanssens</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F432483%2Fdcab2528-d822-4b9b-b139-13a8c33a438a.jpg</url>
      <title>DEV Community: Peter Hanssens #BlackLivesMatter</title>
      <link>https://dev.to/petehanssens</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/petehanssens"/>
    <language>en</language>
    <item>
      <title>Welcome, DataEngHack online!</title>
      <dc:creator>Peter Hanssens #BlackLivesMatter</dc:creator>
      <pubDate>Thu, 28 Apr 2022 01:57:00 +0000</pubDate>
      <link>https://dev.to/dataengbytes/welcome-dataenghack-online-300</link>
      <guid>https://dev.to/dataengbytes/welcome-dataenghack-online-300</guid>
      <description>&lt;p&gt;Hey folks,&lt;/p&gt;

&lt;p&gt;Peter Hanssens here... welcome to the online DataEngHack blogging competition that we are running in the month of May!&lt;/p&gt;

&lt;p&gt;This blogging competition is designed to get you hands-on with some cutting-edge data engineering technology and to earn public exposure for your awesome work in the process. You will be featured on the &lt;a href="https://dataengconf.com.au/"&gt;DataEngAu&lt;/a&gt; website and you have the chance to win an awesome set of prizes. You can submit your blog right away; to be eligible for a prize, it must be submitted by the 31st of May.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prizes
&lt;/h2&gt;

&lt;p&gt;So first up, anyone who submits a blog (with a few caveats around it being an appropriate data engineering blog) will be sent a free DataEngHack t-shirt.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0evi36Rh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y98ifi9pd00ut5ec747u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0evi36Rh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y98ifi9pd00ut5ec747u.png" alt="DataEngHack T-Shirt" width="591" height="842"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The top 10 blogs, as determined by weighted popularity, will each be given one of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;one of 5 Lego kits valued at around $100, including the Lego Vespa 125&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Zhamak Dehghani's book on Data Mesh&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Technology sponsors
&lt;/h2&gt;

&lt;p&gt;This is a sponsored event, and as such we encourage participants to use at least one of our technology partners' products in their solution. These vendors are leaders in their space and often provide really fun and innovative ways of achieving great data engineering outcomes, so why not give them a go:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5eJlyUYk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rvkyhge1s2d9dryxb44e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5eJlyUYk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rvkyhge1s2d9dryxb44e.png" alt="DataEngHack Sponsors" width="880" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our sponsors are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://datastax.com/"&gt;datastax&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://neo4j.com/"&gt;neo4j&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.snowflake.com/"&gt;snowflake&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://fivetran.com/"&gt;fivetran&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://databricks.com/"&gt;databricks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://azure.microsoft.com/en-au/"&gt;microsoft azure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/"&gt;aws&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://imply.io/"&gt;imply&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to get involved
&lt;/h2&gt;

&lt;p&gt;See the comments below for the way to register!&lt;/p&gt;

&lt;p&gt;This blog was originally published on &lt;a href="https://dataengconf.com.au/blog/2022-04/welcome-to-dataenghack-online"&gt;DataEngAu&lt;/a&gt;&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>dataarchitecture</category>
      <category>dataops</category>
      <category>hackathon</category>
    </item>
    <item>
      <title>My favourite re:Invent data announcements</title>
      <dc:creator>Peter Hanssens #BlackLivesMatter</dc:creator>
      <pubDate>Thu, 17 Dec 2020 03:18:28 +0000</pubDate>
      <link>https://dev.to/aws-heroes/my-favourite-re-invent-data-announcements-488a</link>
      <guid>https://dev.to/aws-heroes/my-favourite-re-invent-data-announcements-488a</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fuvxgl3iu9mojoy04iz9p.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fuvxgl3iu9mojoy04iz9p.jpeg" alt="Rahul Pathak, VP of Analytics, Amazon Web Services"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hi everyone,&lt;/p&gt;

&lt;p&gt;What a &lt;strong&gt;re:Invent&lt;/strong&gt; it has been so far with so many announcements across the board. My name is Peter Hanssens and I am a Serverless Hero based out of Sydney, Australia where I also run a Data Engineering meetup. I thought I'd spend some time talking about some announcements that are of interest to folks working within the data ecosystem.&lt;/p&gt;

&lt;p&gt;Many of the announcements listed below are from Rahul Pathak's leadership session on &lt;a href="https://virtual.awsevents.com/media/1_gb1zr8jb" rel="noopener noreferrer"&gt;harnessing the power of data with AWS analytics&lt;/a&gt; - well worth a watch if you haven't done so already.&lt;/p&gt;

&lt;h1&gt;
  
  
  Redshift
&lt;/h1&gt;

&lt;p&gt;Redshift is a cloud data warehouse that, up until last re:Invent, coupled compute and storage. The RA3 instances have now been around for a year, and the new XLPlus instances are available at a much lower price point, which makes it easier for established startups to take advantage of scaling compute and storage independently.&lt;/p&gt;

&lt;p&gt;Here are my top announcements for Redshift:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-redshift-launches-ra3-xlplus-nodes-managed-storage/" rel="noopener noreferrer"&gt;Amazon Redshift launches RA3.xlplus nodes with managed storage&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-redshift-announces-automatic-table-optimization/" rel="noopener noreferrer"&gt;Automatic Table Optimization&lt;/a&gt; - this is huge as you no longer need to think about distribution or sort keys!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/aws-announces-aqua-for-amazon-redshift-preview/" rel="noopener noreferrer"&gt;Preview - Aqua for Redshift&lt;/a&gt; - game changing query performance - this looks to be a quantum leap forward for Redshift.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-redshift-announces-support-native-json-semi-structured-data-processing/" rel="noopener noreferrer"&gt;Preview - native JSON support&lt;/a&gt; - JSON and semi-structured data are a feature of many modern data sources and being able to parse this natively within Redshift means less pre-work in landing data into your warehouse.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-redshift-now-includes-amazon-rds-for-mysql-and-amazon-aurora-mysql-databases-as-new-data-sources-for-federated-querying-preview/" rel="noopener noreferrer"&gt;Preview - Federated query support for RDS and Aurora MySQL&lt;/a&gt; - this makes it even easier ingest data into your data warehouse.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/aws-announces-amazon-redshift-ml-preview/" rel="noopener noreferrer"&gt;Preview - Amazon Redshift ML&lt;/a&gt; - another feature enabling data engineers to do more within the comforts of a data warehouse using SQL - very keen to see what folks can build with this great functionality.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-redshift-introduces-data-sharing-preview/" rel="noopener noreferrer"&gt;Preview - Data Sharing&lt;/a&gt; - a great new feature that allows companies to share data with other third parties.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-redshift-announces-native-console-integration-with-partners-preview/" rel="noopener noreferrer"&gt;Preview - Native console integration with partners&lt;/a&gt; - another preview aimed at making data integration much faster with third parties such as Salesforce and Slack.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Glue
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd1.awsstatic.com%2FreInvent%2Fre20-pdp-tier1%2FHarrier%2FAWS-Glue-Elastic-Views_Diagram_V2%402x.f4e61472a99a87c7df15171009dda6923d30eae4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd1.awsstatic.com%2FreInvent%2Fre20-pdp-tier1%2FHarrier%2FAWS-Glue-Elastic-Views_Diagram_V2%402x.f4e61472a99a87c7df15171009dda6923d30eae4.png" alt="AWS Glue Elastic Views"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AWS Glue is a serverless (Yay!) ETL tool with a data catalogue baked in. There have been some wonderful announcements across re:Invent as well as pre:Invent!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/glue/features/elastic-views/" rel="noopener noreferrer"&gt;Preview - Elastic Views&lt;/a&gt; - source data from RDS, Aurora, and DynamoDB using SQL to query across them and surface the results continuously in a materialised view to a variety of destinations including Redshift, S3 and Elasticsearch Service.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/11/control-evolution-data-streams-using-aws-glue-schema-registry/" rel="noopener noreferrer"&gt;Pre:Invent - Schema Registry&lt;/a&gt; - this service allows better collaboration across teams maintaining data schemas which allows for schema evolution. It integrates with MSK, Kinesis and Lambda out of the box!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/11/introducing-aws-glue-databrew-visual-data-preparation-tool-to-clean-and-normalize-data-up-to-80-percent-faster/" rel="noopener noreferrer"&gt;Pre:Invent - DataBrew&lt;/a&gt; - making data preparation easier is what this service is all about with the idea that it solves the challenge that data scientists using 80% of their time doing data prep - very much looking forward to exploring this service further.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Lake Formation
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd1.awsstatic.com%2FreInvent%2Fre20-pdp-tier1%2Fcolossus%2FAmazon-HealthLake_HIW-Diagram%402X.147eb6f1d8fa81bf98125ad26ab489a038ddd3ea.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd1.awsstatic.com%2FreInvent%2Fre20-pdp-tier1%2Fcolossus%2FAmazon-HealthLake_HIW-Diagram%402X.147eb6f1d8fa81bf98125ad26ab489a038ddd3ea.png" alt="HealthLake"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Lake Formation is a set of best practices for rolling out a data lake on AWS, including security and governance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/announcing-preview-aws-lake-formation-features/" rel="noopener noreferrer"&gt;Preview - Transactions, Row-level Security, and Acceleration&lt;/a&gt; - bringing lakehouse features to the data lake.&lt;br&gt;
&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/introducing-amazon-healthlake-to-make-sense-of-health-data/" rel="noopener noreferrer"&gt;HealthLake&lt;/a&gt; - using the FHIR industry standard to bring together lots of disparate and unstructured data sources allowing for powerful querying and search capabilities.&lt;/p&gt;

&lt;h1&gt;
  
  
  EMR
&lt;/h1&gt;

&lt;p&gt;EMR is a big data processing platform that gives you access to open source tools such as Presto, Spark, Flink and Hive to name a few.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-emr-introduces-amazon-emr-studio-makes-it-easier-for-data-scientists-to-build-and-deploy-code/" rel="noopener noreferrer"&gt;EMR Studio&lt;/a&gt; - is a fully managed JupyterNotebook with a rich feature set that you can log into using SSO and your corporate credentials.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/simplify-running-apache-spark-jobs-amazon-emr-amazon-eks/" rel="noopener noreferrer"&gt;EMR on EKS&lt;/a&gt; - now you can run spark jobs on EKS with the rich feature set that EMR brings to the table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-emr-now-provides-up-to-30-lower-cost-and-up-to-15-improved-performance/" rel="noopener noreferrer"&gt;Graviton2 instances&lt;/a&gt; - Graviton2 has been a revolution in compute performance and now its doing its thing with EMR with up to 30% lower cost and up to 15% improved performance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  AppFlow
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd1.awsstatic.com%2Fproduct-marketing%2Fsandstone%2Fproduct-page-diagram_Amazon-AppFlow%402x.8d2898573c44a4e4e602b28772caad7a6b13e8c9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd1.awsstatic.com%2Fproduct-marketing%2Fsandstone%2Fproduct-page-diagram_Amazon-AppFlow%402x.8d2898573c44a4e4e602b28772caad7a6b13e8c9.png" alt="AppFlow"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AppFlow allows you to securely transfer data between SaaS apps such as Salesforce, Marketo, and Slack and AWS Services such as S3 and Redshift.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-appflow-now-provides-amazon-lookout-for-metrics-connectivity-to-several-cloud-applications/" rel="noopener noreferrer"&gt;Lookout for Metrics integration&lt;/a&gt; - you can now detect anomalies and unexpected changes in your metrics without needing to have machine learning expertise.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Batch
&lt;/h1&gt;

&lt;p&gt;Batch is a service that optimally provisions the type and quantity of compute for batch processes that you would like to run.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/severless-batch-scheduling-with-aws-batch-and-aws-fargate/" rel="noopener noreferrer"&gt;Fargate support&lt;/a&gt; - now you can submit your Batch jobs without needing to worry about patching your EC2 instances!&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Neptune
&lt;/h1&gt;

&lt;p&gt;Neptune is a fast and reliable managed graph database service - many data teams are using graph databases to store metadata and lineage for their data lakes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-announces-amazon-neptune-ml-easy-fast-and-accurate-predictions-for-graphs/" rel="noopener noreferrer"&gt;ML Integration&lt;/a&gt; - this allows you to run Graph Neural Networks over your data and return results within hours as opposed to weeks with traditional tabular methods.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Managed Airflow
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2908q01vomqb2.cloudfront.net%2Fda4b9237bacccdf19c0760cab7aec4a8359010b0%2F2020%2F11%2F17%2Fmwaa-graph-view-1024x466.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2908q01vomqb2.cloudfront.net%2Fda4b9237bacccdf19c0760cab7aec4a8359010b0%2F2020%2F11%2F17%2Fmwaa-graph-view-1024x466.png" alt="Airflow Dag"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Last but definitely not least, we have Airflow, a workflow orchestration service that allows data engineers to create DAGs (directed acyclic graphs) to manage dependencies across various data pipelines. Managing an Airflow cluster can easily require a lot of effort, so having this as a managed service is a huge win for data engineering teams already managing their own clusters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/aws/introducing-amazon-managed-workflows-for-apache-airflow-mwaa/" rel="noopener noreferrer"&gt;Pre:Invent - MWAA&lt;/a&gt; - is a new serverless service that allows you to deploy airflow at scale rapidly.&lt;/p&gt;

&lt;p&gt;Thanks for sticking with me for the long read - I hope you enjoyed the wrap - and let me know what your pick of the lot is!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>dataengineering</category>
      <category>redshift</category>
      <category>reinvent2020</category>
    </item>
  </channel>
</rss>
