The discourse on digital transformation is rich with contributions on how technological tools, cloud computing, a culture of digitalisation and, more recently, the adoption of Artificial Intelligence (AI) can drive performance, innovation, growth and development in business organisations and corporate entities in general. Economists, social scientists, and business and technology analysts have proposed and relied on a number of key indicators to evaluate the phenomenon of Corporate Digital Transformation. These include metrics that span operations, innovation, and organisational structure, to mention but a few.
A dataset that captures these metrics will be pulled from Kaggle (a data science and machine learning community platform) for a data ingestion and analysis project using the Python programming language and Amazon SageMaker, an ML service offered by AWS.
The value of data analysis and visualisation is not restricted to gathering information and insights from the data. Those insights can also drive innovation when AI and ML tools are used to train models that make useful predictions from the analysed data. Amazon SageMaker, a managed ML service offered by AWS for building, training and testing models in the cloud, helps to achieve this. Performing data analysis and visualisation in an Amazon SageMaker notebook gives you a central hub for your data analytics workload, which ML engineers can then use to train and build models that make insightful predictions from the same data. To get started with creating a SageMaker notebook, follow the steps provided in the SageMaker developer guide.
NB: In the configuration window of your SageMaker notebook instance, select the VPC option with an internet gateway attached so that the underlying EC2 instance has internet access and can reach the Kaggle API to pull the dataset used in this project. Note, however, that depending on your security and workload requirements, you should choose between the No VPC and VPC only options with care to limit exposure to external traffic.
Additionally, you must have a Kaggle account to generate the API access keys needed to pull the dataset that I will use for the demonstration in the notebook below.
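As a rough illustration of the Kaggle step, here is a minimal sketch of how the API keys could be stored and a dataset pulled from inside the notebook. It assumes the `kaggle` package is installed (`pip install kaggle`) and that the instance has internet access. The helper names (`write_kaggle_credentials`, `build_download_command`, `download_dataset`), the `data` folder, and the `owner/dataset-slug` placeholder are all hypothetical and are not the dataset or code used in this post.

```python
import json
import stat
import subprocess
from pathlib import Path


def write_kaggle_credentials(username, key, config_dir=None):
    """Save Kaggle API credentials as kaggle.json (default: ~/.kaggle)."""
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    cred_path = config_dir / "kaggle.json"
    cred_path.write_text(json.dumps({"username": username, "key": key}))
    # Make the file readable and writable by the current user only.
    cred_path.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return cred_path


def build_download_command(dataset, dest="data"):
    """Build the Kaggle CLI call that downloads and unzips a dataset."""
    return ["kaggle", "datasets", "download", "-d", dataset, "-p", dest, "--unzip"]


def download_dataset(dataset, dest="data"):
    """Run the download (needs the kaggle package and network access)."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_download_command(dataset, dest), check=True)
    # Return any CSV files that were extracted into the destination folder.
    return sorted(Path(dest).glob("*.csv"))


# Example with placeholder values (replace with your own keys and dataset):
# write_kaggle_credentials("your-username", "your-api-key")
# csv_files = download_dataset("owner/dataset-slug")
```

Writing the keys to a file that only the notebook user can read keeps them out of the notebook cells themselves. For anything beyond a quick demonstration, a secrets store is probably a better home for them.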
This simple analysis of a digital transformation metrics dataset is a brief demonstration of how integrating Kaggle into your Amazon SageMaker Studio notebook can aid data ingestion and analysis. It gives you ready access to a plethora of datasets for any data analysis project, and for training and building models that make the predictions your project requires.