DEV Community

Mujahida Joynab


Fundamentals of Machine Learning

Features -> Input variables or attributes that describe each data sample. In a dataset used to predict house prices, the features could be things like area, number of bedrooms, location, etc.
Weights -> Represent the importance of each feature in making the prediction. The model learning process involves iteratively adjusting these weights to minimize the error between the predicted and actual outputs.

Labels -> The target variables or desired outputs that the model is trying to predict. In a house price prediction task, the label would be the actual sale price of the house.

f(x) = a0*x0 + a1*x1 + a2*x2 + ... (each a is a weight, each x is a feature value)
e.g. f(x) = 0.4 * 0.8 + 0.5 * 0.3 = 0.47
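
The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the function name `predict` is made up here, and the notes do not say which of the two numbers in each product is the weight and which is the feature, so the split below is an assumption.

```python
def predict(weights, features):
    """Return the weighted sum a0*x0 + a1*x1 + ... (hypothetical helper)."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(a * x for a, x in zip(weights, features))


# Same numbers as the example: 0.4 * 0.8 + 0.5 * 0.3
# (assumed split: the first list holds the weights, the second the features)
prediction = predict([0.4, 0.5], [0.8, 0.3])
print(round(prediction, 2))  # 0.47
```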

Structured Data -> Data that is organized in a predefined format
Tabular Data -> Arranged in rows and columns
Time-Series Data -> Chronological observations

Unstructured Data -> Data that lacks a predefined structure or format
Image Data -> Consists of visual pixels
Text Data -> Composed of written language

Supervised Learning -> Learns from labeled data
Unsupervised Learning -> Learns from unlabeled data

Within supervised learning, the output in classification is categorical and in regression it is numerical.

Regression :
Predicting a continuous numerical output variable.
e.g. predicting house prices
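
As a rough illustration of regression, the sketch below fits a straight line to a handful of houses with ordinary least squares. The areas and prices are made-up numbers, and `fit_line` is a hypothetical helper, not something taken from a library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: area (feature) and sale price (label), here price = 3 * area
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept  # continuous numerical output
print(round(predicted_price, 2))  # 270.0
```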

Clustering :
Grouping data points into clusters based on similarities, i.e. putting similar things together. E.g. grouping customers into frequent shoppers, bargain hunters and occasional visitors.
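
One way to picture clustering is a tiny k-means on a single feature. Everything in this sketch is an assumption made for illustration: the feature (store visits per month), the numbers, and the helper `kmeans_1d` with its simple deterministic starting points. Note that no labels are given; the groups come only from how close the values are.

```python
def kmeans_1d(values, k, iterations=10):
    """Very small k-means for one numeric feature (illustrative sketch)."""
    ordered = sorted(values)
    # Deterministic start: take the middle value of k equal slices of the data
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return clusters


# Made-up data: store visits per month for nine customers
visits = [1, 2, 2, 10, 11, 12, 25, 27, 30]
clusters = kmeans_1d(visits, k=3)
print(clusters)  # [[1, 2, 2], [10, 11, 12], [25, 27, 30]]
```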

Anomaly Detection :
Identification of rare items, events or observations, i.e. finding uncommon things. E.g. an apple that doesn't fit the usual pattern.

Example data : web traffic, heart rate
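
A simple way to flag uncommon values is a z-score check: measure how many standard deviations each reading is from the mean. This is just one possible technique, chosen here for illustration; the heart-rate readings and the threshold of 2 are made up.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold (sketch)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up heart-rate readings in beats per minute
heart_rates = [72, 75, 71, 74, 73, 140, 72, 76]
anomalies = find_anomalies(heart_rates)
print(anomalies)  # [140]
```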

Reinforcement Learning :
Environment Interaction -> Learning by interacting with an environment
Reward / Penalty System -> Receiving rewards or penalties for actions and learning by trial and error
E.g. self-driving cars, robots, games.
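
The trial-and-error idea can be sketched with a toy agent that chooses between two actions and keeps a running average of the reward each one gave. The environment (a fixed reward of +1 for "right" and a penalty of -1 for "left"), the epsilon-greedy strategy and all names below are assumptions for illustration, not a full reinforcement learning algorithm.

```python
import random


def run_trial_and_error(rewards, steps=100, epsilon=0.1, seed=0):
    """Learn action values from rewards and penalties (illustrative sketch)."""
    rng = random.Random(seed)
    actions = list(rewards)
    estimates = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)  # explore: try something at random
        else:
            action = max(actions, key=lambda a: estimates[a])  # exploit best so far
        reward = rewards[action]  # the environment answers with reward or penalty
        counts[action] += 1
        # Running average of the rewards seen for this action
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


# Toy environment: "left" is penalised, "right" is rewarded
estimates = run_trial_and_error({"left": -1.0, "right": 1.0})
best_action = max(estimates, key=estimates.get)
print(best_action)  # right
```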

ML development lifecycle:
Business Problem:
Problem Formulation : Frame the business problem in ML terms
Data Collection and integration :
Data Preprocessing and visualization : Handle missing values and outliers in the dataset
Model Training : Learn parameter values that reduce the difference between the predicted value and the actual value
Model Evaluation : Measure performance on a test dataset
Model Tuning : Hyperparameter tuning
Model Deployment :
Model Monitoring : Monitoring the data the deployed model receives
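
The training and evaluation steps of the lifecycle can be sketched as follows: gradient descent adjusts the parameters on a training set, and the error is then measured on a separate test set. The data (y = 2x + 1), the learning rate and the number of epochs are made-up choices; learning rate and epochs are examples of the hyperparameters that model tuning would adjust.

```python
def train(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on mean squared error (sketch)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]  # predicted minus actual
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data following y = 2x + 1, split into training and test sets
train_x, train_y = [1, 2, 3, 4], [3, 5, 7, 9]
test_x, test_y = [5, 6], [11, 13]

w, b = train(train_x, train_y)                     # model training
test_error = mean_squared_error(w, b, test_x, test_y)  # model evaluation
print(round(w, 3), round(b, 3))
```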
Amazon SageMaker AI:
Data Prep
Train
Build
Deploy
Monitoring

Features to simplify ML :
SageMaker Studio - An IDE that allows users to build, train and deploy machine learning models.
SageMaker Canvas - A no-code/low-code visual interface that empowers business users to build, train and deploy machine learning models.
SageMaker Data Wrangler - For cleansing, preparing and exploring data from a single visual interface.

Model Deployment :
Amazon SageMaker Endpoint
Amazon CloudWatch
Model Artifacts
Hosting the model

Amazon SageMaker Inference :
Real-time Inference :
Live predictions
Sustained traffic
Low latency

Asynchronous Inference -> Near real time
Long processing times, up to 1 hour

Loan app - 20

Serverless Inference :
Intermittent traffic
Periods of no traffic
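
The three inference options above can be summarised as a small decision helper. This is only a memory aid based on the notes, not an AWS API: the function name, the `traffic` labels and the `long_processing` flag are all hypothetical.

```python
def suggest_inference_option(traffic, long_processing=False):
    """Map a workload description to an inference option (hypothetical helper).

    traffic: "sustained" or "intermittent" (labels assumed for this sketch)
    long_processing: True when a request needs long processing (up to 1 hour)
    """
    if long_processing:
        return "asynchronous"  # near real time, long processing
    if traffic == "sustained":
        return "real-time"     # live predictions, low latency
    if traffic == "intermittent":
        return "serverless"    # intermittent traffic, periods of no traffic
    raise ValueError(f"unknown traffic pattern: {traffic!r}")


print(suggest_inference_option("sustained"))                         # real-time
print(suggest_inference_option("intermittent"))                      # serverless
print(suggest_inference_option("sustained", long_processing=True))   # asynchronous
```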

AWS Managed AI/ML Services

Some AI/ML Services
Amazon Textract
Amazon Rekognition
Amazon Polly
Amazon Transcribe
Amazon Comprehend
Amazon Translate
Amazon Lex
