<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hulk Pham</title>
    <description>The latest articles on DEV Community by Hulk Pham (@hulk-pham).</description>
    <link>https://dev.to/hulk-pham</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F565593%2Ff6bf8991-397a-440f-b73a-e7aaf95fc22a.jpeg</url>
      <title>DEV Community: Hulk Pham</title>
      <link>https://dev.to/hulk-pham</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hulk-pham"/>
    <language>en</language>
    <item>
      <title>Required Skills For Developers In The AI IDE Era</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Tue, 27 May 2025 05:23:42 +0000</pubDate>
      <link>https://dev.to/hulk-pham/required-skills-for-developers-in-the-ai-ide-era-1eec</link>
      <guid>https://dev.to/hulk-pham/required-skills-for-developers-in-the-ai-ide-era-1eec</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72791jbc1k7b7ltm8aj2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72791jbc1k7b7ltm8aj2.png" alt="Image description" width="800" height="842"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;&lt;strong&gt;Executive Summary&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;The software development landscape is undergoing a profound&lt;br&gt;
transformation, driven by the integration of Artificial Intelligence&lt;br&gt;
(AI) into Integrated Development Environments (IDEs). This shift is not&lt;br&gt;
about replacing human developers but rather augmenting their&lt;br&gt;
capabilities, automating repetitive tasks, and accelerating workflows.&lt;/p&gt;

&lt;p&gt;In this new era, developers require a blended skill set, encompassing&lt;br&gt;
new technical competencies such as prompt engineering and understanding&lt;br&gt;
AI/Machine Learning (ML) limitations, alongside a reinforcement of&lt;br&gt;
human-centric attributes like critical thinking, adaptability, and&lt;br&gt;
ethical awareness. The developer's role is evolving from pure coding&lt;br&gt;
toward orchestration, architecture, and strategic problem-solving.&lt;br&gt;
Embracing these evolving skills is no longer merely an advantage but a&lt;br&gt;
necessity for maintaining individual career relevance and driving&lt;br&gt;
organizational innovation.&lt;/p&gt;

&lt;h2&gt;&lt;strong&gt;1. Introduction: The Changing Landscape of Software Development&lt;/strong&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;strong&gt;1.1. The Evolution of IDEs and the Rise of AI Integration&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xeyb8u7w8pzj2mnuukd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xeyb8u7w8pzj2mnuukd.png" alt="Image description" width="800" height="816"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Integrated Development Environments (IDEs) have undergone a significant&lt;br&gt;
evolution, from basic text editors to sophisticated systems offering&lt;br&gt;
advanced features like syntax highlighting, intelligent code&lt;br&gt;
suggestions, and version control integration. The recent integration of&lt;br&gt;
AI marks a pivotal milestone, transforming how developers approach&lt;br&gt;
coding by predicting code, suggesting edits, and streamlining the&lt;br&gt;
development process. AI-powered development tools are software&lt;br&gt;
applications that integrate artificial intelligence to assist developers&lt;br&gt;
throughout the software development lifecycle. These tools harness the&lt;br&gt;
power of machine learning, a subset of AI, to learn from existing code&lt;br&gt;
patterns, intelligently suggest improvements or solutions, and optimize&lt;br&gt;
code structure.&lt;/p&gt;

&lt;p&gt;A significant consequence of this evolution is the democratization of&lt;br&gt;
software development and the rise of the "citizen developer." As AI&lt;br&gt;
tools become powerful enough to automate front-end, back-end, and&lt;br&gt;
database work with "no coding required," they empower non-technical&lt;br&gt;
users to build and customize applications without relying on&lt;br&gt;
traditional developers. AI-powered low-code/no-code (LCNC) platforms&lt;br&gt;
are widening who can participate in software creation, and a surge of&lt;br&gt;
citizen developers on these platforms would blur the line between&lt;br&gt;
traditional developer roles and business users. Consequently,&lt;br&gt;
professional developers may increasingly focus on complex integrations,&lt;br&gt;
custom AI model development, and governance of LCNC platforms, rather&lt;br&gt;
than solely building applications in a conventional manner.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;1.2. Transformative Impact of AI on Developer Workflows and Productivity&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;AI is acting as a powerful collaborator, boosting productivity,&lt;br&gt;
streamlining workflows, and driving innovation. It automates repetitive&lt;br&gt;
tasks such as code snippet generation, debugging, and optimization. This&lt;br&gt;
automation frees developers to focus on higher-value tasks that demand&lt;br&gt;
human creativity and judgment, such as defining product vision, setting&lt;br&gt;
strategy, concept development, and feature prioritization. According to&lt;br&gt;
a McKinsey report, features like automated code validation and bug&lt;br&gt;
detection can enhance developer productivity by up to 30%.&lt;/p&gt;

&lt;p&gt;This shift is not merely about doing tasks faster; it's a fundamental&lt;br&gt;
redefinition of a developer's core responsibilities. Rather than&lt;br&gt;
diminishing human expertise, AI amplifies it, enabling developers to&lt;br&gt;
concentrate on creative problem-solving, architectural design, and&lt;br&gt;
strategic decision-making. In this new era, the developer is not just a&lt;br&gt;
coder; they are a collaborator with AI, an architect of experiences, and&lt;br&gt;
increasingly an AI orchestrator, managing pipelines, models, data, and&lt;br&gt;
business logic. This implies a need for broader systems thinking and&lt;br&gt;
less intensive focus on detailed coding for routine tasks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffi2es75pdexjv3k0skdl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffi2es75pdexjv3k0skdl.png" alt="Image description" width="800" height="271"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;&lt;strong&gt;2. AI-Powered IDEs: Core Functionalities and Benefits&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7h7pp42j9vuqg948d161.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7h7pp42j9vuqg948d161.png" alt="Image description" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;2.1. Automated Code Generation, Completion, and Refactoring&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;AI-powered IDEs, such as GitHub Copilot, Cursor, and IntelliCode,&lt;br&gt;
provide real-time code suggestions, auto-completions, and even generate&lt;br&gt;
entire functions or multi-line code blocks. They leverage AI models&lt;br&gt;
trained on billions of lines of open-source code, offering context-aware&lt;br&gt;
suggestions and predicting subsequent code patterns. AI-assisted code&lt;br&gt;
refactoring ensures clean, efficient, and well-structured code,&lt;br&gt;
improving readability and maintainability. "Natural Language to Code"&lt;br&gt;
features allow developers to describe functionality in plain English,&lt;br&gt;
generating complete, context-aware code snippets.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;2.2. Advanced Debugging, Testing, and Optimization&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;AI detects errors, bugs, and inefficiencies before code execution,&lt;br&gt;
significantly reducing debugging time. AI-powered debuggers analyze&lt;br&gt;
runtime behavior, detect irregularities, and pinpoint problematic code&lt;br&gt;
sections in seconds. Automated test case generation, unit, integration,&lt;br&gt;
and regression testing are streamlined by AI, ensuring high-quality&lt;br&gt;
software and reducing manual effort. AI also analyzes and suggests&lt;br&gt;
performance improvements, making code faster and more readable.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;2.3. Real-Time Collaboration and Project Management Features&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;AI IDEs can facilitate real-time code collaboration for teams, allowing&lt;br&gt;
multiple developers to work simultaneously on the same codebase. AI&lt;br&gt;
assists in project management tasks, market analysis, and feedback&lt;br&gt;
analysis, freeing up time for strategic activities. AI-powered search&lt;br&gt;
helps developers quickly locate functions, files, and dependencies&lt;br&gt;
within large projects.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;2.4. Table: Key AI IDE Features and Developer Benefits&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;The following table summarizes the most impactful features of AI-powered&lt;br&gt;
IDEs and their direct benefits to developers. This table provides a&lt;br&gt;
quick overview, making it easy for readers to grasp the practical&lt;br&gt;
applications of these tools and serving as a foundation for discussing&lt;br&gt;
the necessary skills. For example, if AI automates debugging, the skill&lt;br&gt;
shifts from manually finding bugs to understanding &lt;em&gt;why&lt;/em&gt; AI flagged an&lt;br&gt;
issue or &lt;em&gt;how&lt;/em&gt; to refine its suggestions.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Feature Category&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Specific Feature (Examples)&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Benefit to Developers&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code Generation/Completion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real-time code suggestions&lt;/td&gt;
&lt;td&gt;Accelerates coding process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Natural Language to Code&lt;/td&gt;
&lt;td&gt;Minimizes manual coding, increases efficiency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Context-aware auto-completion&lt;/td&gt;
&lt;td&gt;Improves workflow speed and accuracy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Debugging/Error Detection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AI-driven error detection&lt;/td&gt;
&lt;td&gt;Significantly reduces debugging time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Explanations of code sections&lt;/td&gt;
&lt;td&gt;Enhances understanding of codebase&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Testing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Automated unit test generation&lt;/td&gt;
&lt;td&gt;Ensures software quality, reduces manual effort&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Automated regression testing&lt;/td&gt;
&lt;td&gt;Maintains code quality with updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Optimization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code refactoring&lt;/td&gt;
&lt;td&gt;Ensures clean, efficient, readable, and maintainable code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Performance optimization suggestions&lt;/td&gt;
&lt;td&gt;Improves code speed and efficiency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Collaboration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real-time code collaboration&lt;/td&gt;
&lt;td&gt;Ideal for remote and distributed teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Automated code reviews&lt;/td&gt;
&lt;td&gt;Accelerates PR turnaround, ensures standards&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Project Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AI-powered code navigation and search&lt;/td&gt;
&lt;td&gt;Quickly locates functions, files, dependencies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Decision-making support&lt;/td&gt;
&lt;td&gt;Provides rapid insights into development options&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real-time vulnerability detection&lt;/td&gt;
&lt;td&gt;Reduces security risks before code goes live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Scanning AI-generated code for vulnerabilities&lt;/td&gt;
&lt;td&gt;Ensures safety and compliance&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;&lt;strong&gt;3. Core Technical Skills for the AI IDE Era&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcnw902zlvly0etyffykt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcnw902zlvly0etyffykt.png" alt="Image description" width="800" height="230"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;3.1. Mastering Prompt Engineering&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Prompt engineering is emerging as a core skill, involving the creation&lt;br&gt;
of effective prompts to guide AI language models in generating accurate,&lt;br&gt;
relevant, and context-aware responses. This requires the ability to&lt;br&gt;
manage conversational context, optimize token usage, and incorporate&lt;br&gt;
detailed context to effectively guide AI models. Developers need to&lt;br&gt;
understand techniques such as persona-driven prompting, iterative&lt;br&gt;
prompting, and few-shot prompting. The ability to evaluate AI-generated&lt;br&gt;
responses and continuously refine prompts is crucial for improving&lt;br&gt;
accuracy and relevance.&lt;/p&gt;
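&lt;p&gt;The techniques above can be sketched as a small prompt-assembly helper. Everything below is illustrative: the persona text, the example pairs, and the function names are assumptions for demonstration, not any particular vendor's API.&lt;/p&gt;

```python
# Illustrative sketch of persona-driven, few-shot prompting: the helper,
# persona text, and example pairs are assumptions for demonstration.

def build_prompt(persona, examples, task):
    """Compose a prompt from a persona, few-shot examples, and a task."""
    lines = [f"You are {persona}."]
    for sample_input, sample_output in examples:   # few-shot examples
        lines.append(f"Input: {sample_input}")
        lines.append(f"Output: {sample_output}")
    lines.append(f"Input: {task}")                 # the actual request
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_prompt(
    persona="a senior Python reviewer who explains trade-offs briefly",
    examples=[("sort pairs by their second field",
               "sorted(pairs, key=lambda p: p[1])")],
    task="deduplicate a list while preserving order",
)
```

&lt;p&gt;Iterative prompting then amounts to editing the persona or the examples, re-running, and evaluating each response against the intended behavior.&lt;/p&gt;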

&lt;p&gt;This shift represents a fundamental cognitive transition: from mastering&lt;br&gt;
syntax to orchestrating intent. Traditional programming emphasizes&lt;br&gt;
precise syntax and detailed algorithmic implementation. However, with AI&lt;br&gt;
capable of generating code from natural language, the developer's focus&lt;br&gt;
shifts. Prompt engineering requires "guiding AI language models to&lt;br&gt;
produce accurate, relevant, and context-aware responses" and "refining&lt;br&gt;
prompt iterations". This implies that the primary interface becomes&lt;br&gt;
natural language, and the skill lies in &lt;em&gt;articulating intent&lt;/em&gt; and&lt;br&gt;
&lt;em&gt;refining output&lt;/em&gt; rather than solely writing line-by-line code. This&lt;br&gt;
demands a deeper understanding of the problem domain and the&lt;br&gt;
capabilities and limitations of AI, beyond mere syntax. The developer is&lt;br&gt;
no longer just a "coder" but an "AI orchestrator", who defines&lt;br&gt;
&lt;em&gt;what&lt;/em&gt; (the intent) and refines &lt;em&gt;how&lt;/em&gt; (the AI's output) through&lt;br&gt;
prompts.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;3.2. Understanding Foundational AI/Machine Learning Principles&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;A solid grounding in fundamental AI concepts such as machine learning,&lt;br&gt;
deep learning, and neural networks is essential. Developers need to&lt;br&gt;
grasp the intricacies of AI model operations, including how data is&lt;br&gt;
structured and processed for model training, optimizing data pipelines,&lt;br&gt;
and managing conversational states in chatbot applications. Crucially,&lt;br&gt;
developers must comprehend AI's limitations, such as its dependence on&lt;br&gt;
data quality, lack of common sense, contextual understanding, and&lt;br&gt;
difficulty with ambiguity. Knowledge of Python and its AI/ML libraries&lt;br&gt;
(TensorFlow, PyTorch, Hugging Face's Transformers) remains vital for&lt;br&gt;
algorithm development and model customization.&lt;/p&gt;

&lt;p&gt;A significant challenge is the "black box" problem and the need for&lt;br&gt;
explainability. AI models often produce results without a clear&lt;br&gt;
explanation of the underlying logic. This lack of transparency reduces&lt;br&gt;
trust, especially in sensitive fields like healthcare or law.&lt;br&gt;
Additionally, AI may lack a deep understanding of the broader context in&lt;br&gt;
which a software project operates, potentially overlooking specific&lt;br&gt;
business goals or features that don't align with the product's&lt;br&gt;
strategic vision. Therefore, developers cannot blindly trust&lt;br&gt;
AI-generated code or solutions. The skill is not just about &lt;em&gt;using&lt;/em&gt; AI,&lt;br&gt;
but &lt;em&gt;interrogating&lt;/em&gt; it. Developers need to develop skills in&lt;br&gt;
understanding model explainability, even if they are not building models&lt;br&gt;
from scratch. This includes asking the AI to explain its reasoning and&lt;br&gt;
understanding how to debug not just code, but &lt;em&gt;model behavior&lt;/em&gt;, which is&lt;br&gt;
a distinct challenge from traditional debugging. This capability is&lt;br&gt;
crucial for maintaining quality, security, and ethical standards.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;3.3. Enhanced Code Quality Assurance and Security&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Developers must critically evaluate AI-generated code for functional&lt;br&gt;
correctness, logical soundness, edge cases, and adherence to&lt;br&gt;
requirements. Static and dynamic code analysis skills are crucial for&lt;br&gt;
identifying syntax errors, coding standard violations, security&lt;br&gt;
vulnerabilities, and runtime issues. Human oversight remains essential&lt;br&gt;
to catch issues that automated tools might miss, such as duplicated&lt;br&gt;
code, "code smells," and subtle security vulnerabilities.&lt;br&gt;
Understanding common vulnerabilities (e.g., SQL injection, hardcoded&lt;br&gt;
credentials, XSS) and how AI can propagate them is critical.&lt;/p&gt;
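&lt;p&gt;As a toy illustration of the kind of pattern check a reviewer might automate, the sketch below flags two vulnerability classes that AI suggestions can introduce. The rule names and regexes are illustrative and no substitute for real static-analysis tooling:&lt;/p&gt;

```python
import re

# Toy static check: flags a couple of patterns (hardcoded credentials,
# string-built SQL) that AI-generated code can introduce. The rules are
# illustrative, not a real SAST engine.
RULES = {
    "hardcoded_credential": re.compile(r"(password|api_key|secret)\s*=\s*['\"]"),
    "string_built_sql": re.compile(r"execute\(\s*f?['\"]\s*SELECT.+\{"),
}

def scan(source):
    """Return the names of every rule that matches the source text."""
    return [name for name, rx in RULES.items() if rx.search(source)]

snippet = 'api_key = "sk-123"\ncursor.execute(f"SELECT * FROM users WHERE id={uid}")'
findings = scan(snippet)
```

&lt;p&gt;Both issues above would pass a superficial "does it run" check, which is exactly why disciplined review matters.&lt;/p&gt;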

&lt;p&gt;A paradox arises between efficiency and caution. AI promises to&lt;br&gt;
"accelerate coding workflows" and "faster PR turnaround". However,&lt;br&gt;
AI-generated code is not infallible; developers must carefully review&lt;br&gt;
suggestions, ensuring the code remains secure, efficient, and aligned&lt;br&gt;
with project-specific needs. AI-generated code can present a "polished&lt;br&gt;
facade" yet function incorrectly, a failure mode that attentive human&lt;br&gt;
review is needed to catch. The apparent speed and correctness of AI-generated code&lt;br&gt;
can create a false sense of security, potentially leading to reduced&lt;br&gt;
human oversight. This paradox means developers must actively resist the&lt;br&gt;
temptation to over-rely on AI for correctness and, instead, must&lt;br&gt;
intensify rigorous code review, security analysis, and testing. The&lt;br&gt;
skill is not just &lt;em&gt;how&lt;/em&gt; to review code, but the &lt;em&gt;discipline to&lt;br&gt;
scrutinize&lt;/em&gt; despite AI assistance, recognizing that AI's efficiency&lt;br&gt;
benefits must be balanced with human caution to prevent subtle yet&lt;br&gt;
critical errors or vulnerabilities from slipping into production.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;3.4. System Design, Architecture, and Integration&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Developers will increasingly act as "AI orchestrators," managing&lt;br&gt;
pipelines, models, data, and business logic. Systems thinking and&lt;br&gt;
architectural design skills are crucial for building scalable,&lt;br&gt;
intelligent systems, including microservices that can learn and adapt&lt;br&gt;
over time. Integrating AI capabilities into existing applications and&lt;br&gt;
infrastructure, including cloud-native development and&lt;br&gt;
Infrastructure-as-Code (IaC) tools, is paramount. Understanding how to&lt;br&gt;
make API requests and integrate AI functionalities into web applications&lt;br&gt;
is a key practical skill.&lt;/p&gt;
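&lt;p&gt;A minimal sketch of that request/response plumbing follows. The payload shape ("model", "messages", nested "choices") mirrors a common chat-completion convention but is an assumption, not a specific vendor's API; the response here is faked so no network call is made:&lt;/p&gt;

```python
import json

# Hedged sketch of wiring an AI completion call into an application.
# The body and response shapes are assumed conventions, not a real API.

def build_request(model, user_message):
    """Serialize a chat-style request body for an HTTP POST."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })

def extract_reply(response_body):
    """Pull the assistant text out of a decoded JSON response."""
    data = json.loads(response_body)
    return data["choices"][0]["message"]["content"]

body = build_request("example-model", "Summarize this changelog.")
fake_response = '{"choices": [{"message": {"content": "Done."}}]}'
reply = extract_reply(fake_response)
```

&lt;p&gt;In production code the interesting work sits around these two functions: authentication, timeouts, and error handling for the HTTP call itself.&lt;/p&gt;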

&lt;h3&gt;&lt;strong&gt;3.5. AI-Assisted Debugging and Performance Optimization Proficiency&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Leveraging AI tools for troubleshooting involves understanding how to&lt;br&gt;
identify symptoms, use structured debugging approaches, and apply&lt;br&gt;
multi-language support tools. Specific skills include model training&lt;br&gt;
debugging (e.g., TensorFlow Debugger), identifying bottlenecks, and&lt;br&gt;
applying profiling techniques to enhance execution speed and efficiency.&lt;br&gt;
Developers must understand how to ask AI assistants to explain code&lt;br&gt;
sections, answer programming questions, and assist with performance&lt;br&gt;
analysis and debugging.&lt;/p&gt;
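&lt;p&gt;The measure-first workflow an AI assistant typically recommends can be sketched with the standard library alone: profile to locate the hot spot, then compare the candidate fix with timeit. The two functions are deliberately trivial stand-ins:&lt;/p&gt;

```python
import cProfile
import io
import pstats
import timeit

def slow_concat(n):
    out = ""
    for i in range(n):
        out = out + str(i)          # repeated string concatenation
    return out

def fast_concat(n):
    return "".join(str(i) for i in range(n))   # single-pass alternative

# Step 1: profile to see where time is actually spent.
profiler = cProfile.Profile()
profiler.enable()
slow_concat(2000)
profiler.disable()
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(3)
report = stream.getvalue()          # human-readable hot-spot summary

# Step 2: time the original against the candidate fix.
slow_t = timeit.timeit(lambda: slow_concat(2000), number=20)
fast_t = timeit.timeit(lambda: fast_concat(2000), number=20)
```

&lt;p&gt;The same loop applies to AI-suggested optimizations: accept the rewrite only after the profiler and timings agree it helps.&lt;/p&gt;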

&lt;h3&gt;&lt;strong&gt;3.6. Foundational Programming and API Interaction&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Proficiency in core programming languages like Python, Java, C++, and&lt;br&gt;
JavaScript remains essential, as AI tools augment rather than replace&lt;br&gt;
them. Python is particularly popular due to its rich AI/ML libraries.&lt;br&gt;
Mastering practical Python, including data structures (lists,&lt;br&gt;
dictionaries), organizing code with functions and files, and handling&lt;br&gt;
data formats (CSV, JSON), is crucial. Understanding API interaction,&lt;br&gt;
including secure authentication, optimizing performance with concurrent&lt;br&gt;
requests, and designing robust error handling, is critical for&lt;br&gt;
integrating AI functionalities.&lt;/p&gt;
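&lt;p&gt;Concurrent requests with robust error handling can be sketched as follows; fetch() simulates a flaky remote call (failing once for even inputs) rather than performing real HTTP, so the retry path is exercised deterministically:&lt;/p&gt;

```python
import concurrent.futures

# Sketch: fan out several (simulated) API calls with a thread pool plus a
# bounded-retry wrapper. fetch() is a stand-in for a real HTTP call.
attempts = {}

def fetch(item):
    attempts[item] = attempts.get(item, 0) + 1
    if item % 2 == 0 and attempts[item] == 1:
        raise ConnectionError(f"transient failure for {item}")
    return item * item

def fetch_with_retry(item, max_tries=3):
    for attempt in range(max_tries):
        try:
            return fetch(item)
        except ConnectionError:
            if attempt == max_tries - 1:
                raise               # out of retries: surface the error

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch_with_retry, range(6)))
```

&lt;p&gt;A real client would add backoff between retries and distinguish retryable errors (timeouts, 5xx) from permanent ones (4xx).&lt;/p&gt;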

&lt;p&gt;Amidst the hype of AI, the enduring value of "the basics" remains.&lt;br&gt;
Even the most advanced AI system becomes useless if it cannot be&lt;br&gt;
integrated into a functional application, which requires reliable code,&lt;br&gt;
proper database connections, and well-structured APIs. AI works best&lt;br&gt;
when the developer already has strong programming fundamentals and&lt;br&gt;
simply wants to skip memorizing technology-specific APIs. There's a risk that&lt;br&gt;
developers might de-emphasize foundational programming skills, assuming&lt;br&gt;
AI will handle everything. However, evidence suggests AI &lt;em&gt;augments&lt;/em&gt;, not&lt;br&gt;
&lt;em&gt;replaces&lt;/em&gt;, the need for core programming competence. Developers still&lt;br&gt;
need to understand underlying logic, debug AI-generated code, and&lt;br&gt;
integrate it into larger systems. "The basics" (data structures,&lt;br&gt;
algorithms, modularity, API interaction) become even more critical for&lt;br&gt;
evaluating AI output and building robust applications around&lt;br&gt;
AI-generated components. This implies that while the &lt;em&gt;volume&lt;/em&gt; of manual&lt;br&gt;
coding might decrease, the &lt;em&gt;quality&lt;/em&gt; and &lt;em&gt;understanding&lt;/em&gt; of foundational&lt;br&gt;
programming principles become paramount for effective AI collaboration.&lt;/p&gt;

&lt;h2&gt;&lt;strong&gt;4. Essential Human-Centric Skills for the Augmented Developer&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwi8lztv1n0mbx8zvbqh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwi8lztv1n0mbx8zvbqh.png" alt="Image description" width="800" height="216"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;4.1. Critical Thinking and Problem-Solving&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;The ability to evaluate information, question assumptions, and solve&lt;br&gt;
complex problems remains invaluable as AI handles routine tasks.&lt;br&gt;
Developers must guard against over-reliance on AI outputs, which can&lt;br&gt;
diminish independent problem-solving abilities and lead to a superficial&lt;br&gt;
understanding of coding principles. This includes checking for logical&lt;br&gt;
correctness, edge cases, unintended consequences in AI-generated code,&lt;br&gt;
and challenging AI's suggestions.&lt;/p&gt;

&lt;p&gt;A concerning risk is "cognitive atrophy" and the need for intentional&lt;br&gt;
engagement. Studies have shown that reliance on AI outputs can diminish&lt;br&gt;
an individual's cognitive ability, leading to a "potential erosion of&lt;br&gt;
essential analytical skills over time". Developers might become overly&lt;br&gt;
dependent on AI-generated suggestions, leading to skill degradation over&lt;br&gt;
time. This is a profound long-term implication. If developers passively&lt;br&gt;
accept AI suggestions, their critical thinking "muscles" (logic,&lt;br&gt;
creativity, experimentation) can atrophy. The skill is not just &lt;em&gt;having&lt;/em&gt;&lt;br&gt;
critical thinking, but &lt;em&gt;actively exercising it&lt;/em&gt; even when AI provides a&lt;br&gt;
solution. This requires a conscious effort to "review AI-generated&lt;br&gt;
code, ask hard questions, and strive to understand the underlying&lt;br&gt;
logic". It shifts the focus from efficiency at all costs to a balanced&lt;br&gt;
workflow that preserves human cognitive capabilities.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;4.2. Adaptability and Continuous Learning&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Adaptability is imperative given the rapid pace of technological change,&lt;br&gt;
with AI tools, automation, and orchestration improving almost daily.&lt;br&gt;
Cultivating a mindset of continuous learning, self-directed learning,&lt;br&gt;
and resourcefulness over rote memorization is crucial for staying&lt;br&gt;
relevant. This involves diving into diverse skill areas, leveraging&lt;br&gt;
learning opportunities, embracing failure, and being open to new&lt;br&gt;
perspectives.&lt;/p&gt;

&lt;p&gt;The rapid pace of technological change demands a "learning agility&lt;br&gt;
imperative." Technology evolves at breakneck speed, and adapting&lt;br&gt;
quickly is crucial. The AI landscape is rapidly evolving, and to keep&lt;br&gt;
up, developers need to be comfortable teaching themselves new frameworks&lt;br&gt;
and libraries. Flexible learning approaches and just-in-time learning&lt;br&gt;
are necessary. It's not merely about &lt;em&gt;learning&lt;/em&gt; new things; the pace of&lt;br&gt;
change demands the &lt;em&gt;speed and efficiency&lt;/em&gt; with which one can learn and&lt;br&gt;
integrate new knowledge. This "learning agility" means developers must&lt;br&gt;
be proficient at identifying what to learn, seeking out resources (e.g.,&lt;br&gt;
ChatGPT, Stack Overflow, documentation), and applying that knowledge&lt;br&gt;
quickly. This is a meta-skill that underpins all other technical and&lt;br&gt;
human skills in the AI era, as specific tools and techniques will&lt;br&gt;
constantly evolve.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;4.3. Collaboration and Communication&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Teamwork and effective collaboration are essential in an AI-augmented&lt;br&gt;
environment, especially with real-time code collaboration features in&lt;br&gt;
IDEs. Developers will increasingly collaborate with AI as a "coworker"&lt;br&gt;
or "on-demand expert". Cross-functional collaboration with data&lt;br&gt;
science, operations, and security teams will become more prevalent.&lt;br&gt;
Strong communication skills, including rhetoric and emotional&lt;br&gt;
intelligence, are crucial for influencing others and working effectively&lt;br&gt;
in a tech environment.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;4.4. Ethical AI Development and Responsible Practices&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;Developers must actively participate in discussions around responsible&lt;br&gt;
AI, especially as regulations tighten globally. Key ethical&lt;br&gt;
considerations include fairness and bias, transparency, privacy, human&lt;br&gt;
safety, and environmental responsibility. Skills in bias mitigation&lt;br&gt;
techniques, data anonymization, auditable ML pipelines, and secure data&lt;br&gt;
handling are now part of the developer's toolkit. Understanding the&lt;br&gt;
"black box" problem and striving for explainability in AI systems is&lt;br&gt;
crucial for building trust. Human oversight is indispensable to ensure&lt;br&gt;
AI systems align with human values, laws, and company policies.&lt;/p&gt;
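&lt;p&gt;One such technique, pseudonymizing identifier fields with salted one-way hashes, can be illustrated in a few lines. The field names and salt below are assumptions, and this alone is not a complete privacy solution (it does not address quasi-identifiers or re-identification risk):&lt;/p&gt;

```python
import hashlib

# Illustrative anonymization pass: replace direct identifiers with salted
# one-way hashes so records can still be joined without exposing raw
# values. PII_FIELDS and the salt are assumptions for the sketch.
PII_FIELDS = {"email", "full_name"}

def pseudonymize(record, salt):
    """Return a copy of record with PII fields replaced by hashes."""
    out = dict(record)
    for field in PII_FIELDS:
        if field in out:
            digest = hashlib.sha256((salt + out[field]).encode()).hexdigest()
            out[field] = digest[:16]    # truncated for readability
    return out

user = {"full_name": "Ada Lovelace", "email": "ada@example.com", "plan": "pro"}
safe = pseudonymize(user, salt="per-dataset-salt")
```

&lt;p&gt;Because the hash is deterministic for a given salt, the same person maps to the same pseudonym across records, preserving joinability for analysis.&lt;/p&gt;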

&lt;p&gt;The developer's role expands beyond merely writing functional code to&lt;br&gt;
becoming a "socio-technical guardian." With the immense power of AI&lt;br&gt;
comes the great human responsibility to ensure these technologies are&lt;br&gt;
developed and used ethically. Concerns include bias, discrimination, and&lt;br&gt;
misuse. The developer's role extends to being an "ethical code&lt;br&gt;
curator, ensuring transparency and safety". This means understanding&lt;br&gt;
the potential societal impact of AI models, proactively addressing&lt;br&gt;
biases in training data, ensuring data privacy, and designing&lt;br&gt;
transparent and accountable systems. This is not just about avoiding&lt;br&gt;
legal repercussions; it's about building trustworthy AI that aligns&lt;br&gt;
with human values. This implies the need for developers to engage with&lt;br&gt;
ethical frameworks, policy discussions, and diverse perspectives,&lt;br&gt;
transforming them into socio-technical experts.&lt;/p&gt;

&lt;h2&gt;&lt;strong&gt;5. Navigating Challenges and Future Prospects&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr4zihfvbzc8metrorj7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr4zihfvbzc8metrorj7.png" alt="Image description" width="800" height="219"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;5.1. Addressing Over-Reliance on AI and Skill Degradation&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;The risk of developers becoming overly dependent on AI-generated&lt;br&gt;
suggestions, leading to skill degradation in critical thinking,&lt;br&gt;
problem-solving, and debugging, is a significant concern. Mitigation&lt;br&gt;
strategies include maintaining a balanced workflow, regularly reviewing&lt;br&gt;
AI-generated code, and actively striving to understand the underlying&lt;br&gt;
logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;5.2. Mitigating Data Privacy and Security Risks&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Sharing proprietary code with external AI services can pose risks of&lt;br&gt;
intellectual property exposure. AI-powered tools can introduce security&lt;br&gt;
risks if trained on insecure code patterns or vulnerabilities.&lt;br&gt;
Strategies include using private instances of AI systems, establishing&lt;br&gt;
organizational agreements to prevent data use in future AI training&lt;br&gt;
models, and regularly scanning AI-generated code for vulnerabilities.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;5.3. Evolving Role: From Coder to AI Orchestrator and Architect&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The developer's mindset must evolve to prioritize automation, AI&lt;br&gt;
awareness, and user-centricity. New job roles blending traditional&lt;br&gt;
programming with oversight of AI-driven processes are emerging, such as&lt;br&gt;
AI model trainers and AI system controllers. Developers will&lt;br&gt;
increasingly manage pipelines, models, data, and business logic,&lt;br&gt;
focusing on user trust and compliance.&lt;/p&gt;

&lt;p&gt;The concept of "human-in-the-loop" is a strategic imperative. This&lt;br&gt;
refers to a partnership between machines and humans, where humans can&lt;br&gt;
harness AI's problem-solving abilities while maintaining oversight.&lt;br&gt;
Humans must monitor AI performance and can intervene when necessary,&lt;br&gt;
overriding AI decisions or providing alternative solutions. Human&lt;br&gt;
oversight is indispensable to ensure AI systems operate as expected and&lt;br&gt;
make decisions aligned with human values, laws, and company policies.&lt;br&gt;
This implies that developers need skills in monitoring AI systems,&lt;br&gt;
understanding when and how to intervene, and establishing feedback loops&lt;br&gt;
to continuously improve AI performance and align it with human values&lt;br&gt;
and project goals. It reinforces that AI is a tool, and human&lt;br&gt;
intelligence remains the ultimate decision-maker.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion: The Augmented Developer of Tomorrow&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The AI IDE era is fundamentally redefining the developer's skill set,&lt;br&gt;
moving beyond syntax mastery to encompass prompt engineering, AI/ML&lt;br&gt;
literacy, enhanced quality assurance, and systems-level thinking.&lt;br&gt;
Human-centric skills -- critical thinking, adaptability, collaboration,&lt;br&gt;
and ethical responsibility -- are not diminished but amplified, becoming&lt;br&gt;
differentiating factors in an increasingly automated landscape.&lt;/p&gt;

&lt;p&gt;Developers who embrace AI as a partner, commit to continuous learning,&lt;br&gt;
and uphold responsible practices will lead innovation and thrive in this&lt;br&gt;
evolving ecosystem. The future belongs to the augmented developer,&lt;br&gt;
capable of seamlessly blending human creativity with machine efficiency.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>development</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Building a Multi-Agent AI with LangGraph: A Comprehensive Guide</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Wed, 19 Feb 2025 16:54:11 +0000</pubDate>
      <link>https://dev.to/hulk-pham/building-a-multi-agent-ai-with-langgraph-a-comprehensive-guide-57nj</link>
      <guid>https://dev.to/hulk-pham/building-a-multi-agent-ai-with-langgraph-a-comprehensive-guide-57nj</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In the rapidly evolving world of conversational AI, designing agents that can handle complex workflows and interactions is more important than ever. LangGraph, an extension of LangChain, provides a graph-based approach to create structured and dynamic AI workflows. This guide will walk you through building an AI agent with LangGraph and highlight the &lt;a href="https://github.com/hulk-pham/LangGraph-AI-Agent" rel="noopener noreferrer"&gt;LangGraph-AI-Agent&lt;/a&gt; repository by hulk-pham—a project that demonstrates advanced multi-agent conversational systems, dynamic workflow orchestration, custom agent behaviors, and robust state management.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is LangGraph?
&lt;/h2&gt;

&lt;p&gt;LangGraph is an innovative framework that leverages directed graphs to model AI workflows. Unlike traditional sequential or decision-tree-based logic, LangGraph allows you to define nodes (representing tasks or actions) and edges (representing the flow of information) for more flexible and scalable AI applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Graph-based Execution:&lt;/strong&gt; Define workflows as nodes and edges.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel Execution:&lt;/strong&gt; Run multiple tasks simultaneously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Management:&lt;/strong&gt; Maintain context and handle conversation history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling:&lt;/strong&gt; Gracefully manage and recover from failures.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Exploring the LangGraph-AI-Agent Repository
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://github.com/hulk-pham/LangGraph-AI-Agent" rel="noopener noreferrer"&gt;LangGraph-AI-Agent&lt;/a&gt; repository is a practical implementation that showcases how to build multi-agent conversational workflows using LangGraph. Here’s a quick overview of what the repository offers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Agent Conversations:&lt;/strong&gt; The project supports interactions between multiple AI agents, each designed for specific tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Workflow Orchestration:&lt;/strong&gt; Easily adapt and extend conversation flows as needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Agent Behaviors:&lt;/strong&gt; Define specialized behaviors for each agent to handle diverse queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Management:&lt;/strong&gt; Keep track of conversation context across interactions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Repository Structure:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── agents/         # Agent definitions
├── graphs/         # Workflow graphs
├── utils/          # Helper functions
├── main.py         # Entry point for running the agent
└── requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Getting Started: Installation &amp;amp; Setup
&lt;/h2&gt;

&lt;p&gt;Follow these steps to set up the project locally:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Clone the Repository:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/hulk-pham/LangGraph-AI-Agent.git
&lt;span class="nb"&gt;cd &lt;/span&gt;LangGraph-AI-Agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create a Virtual Environment:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3.12 &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install Dependencies:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Set Environment Variables:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-api-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;:/usr/local/mysql/bin
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Run the Agent:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 src/ai_core/main.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Creating a Simple AI Agent with LangGraph
&lt;/h2&gt;

&lt;p&gt;If you want to build your own agent from scratch or extend the existing implementation, here’s a basic outline of the process:&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Import Required Libraries
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph.nodes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMNode&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Define the AI Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create the Graph
&lt;/h3&gt;

&lt;p&gt;Define a simple workflow where the agent processes user input and generates a response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Define the processing function
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the graph
&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Add the node to the graph
&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LLMNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;process_query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query_processor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Set the entry point for the graph
&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_entry_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query_processor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Running the Agent
&lt;/h3&gt;

&lt;p&gt;Compile and run your graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Compile the graph
&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Run the agent with a user query
&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is LangGraph?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Enhancing Your AI Agent
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Adding Memory
&lt;/h3&gt;

&lt;p&gt;Leverage LangGraph’s state management to maintain context across interactions. This allows your agent to store conversation history and adapt responses accordingly.&lt;/p&gt;
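
&lt;p&gt;As a minimal, framework-agnostic sketch of this pattern (the node name, state fields, and reply logic below are illustrative assumptions, not the repository's API), a node can fold each turn into a &lt;code&gt;history&lt;/code&gt; field and return the updated state:&lt;/p&gt;

```python
# Illustrative sketch: conversation memory kept in graph state.
# In a real LangGraph app, "history" would be a field in the state
# schema and would be passed to the LLM as context on each turn.

def respond(state):
    """A node that answers using the accumulated conversation history."""
    user_msg = state["query"]
    n_user = sum(1 for role, _ in state["history"] if role == "user") + 1
    reply = f"You have sent {n_user} message(s)."
    return {
        "history": state["history"] + [("user", user_msg), ("ai", reply)],
        "response": reply,
    }

# Two turns of conversation, threading the state through each call
state = {"history": [], "query": "Hi", "response": ""}
state.update(respond(state))
state["query"] = "How are you?"
state.update(respond(state))

print(state["response"])      # You have sent 2 message(s).
print(len(state["history"]))  # 4
```

&lt;p&gt;This threading is exactly what LangGraph automates: each node returns a partial state update, and the framework merges it into the shared state before the next node runs.&lt;/p&gt;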

&lt;h3&gt;
  
  
  Building Multi-Step Workflows
&lt;/h3&gt;

&lt;p&gt;Extend your graph by adding more nodes for tasks such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fetching external data via APIs.&lt;/li&gt;
&lt;li&gt;Performing calculations or database queries.&lt;/li&gt;
&lt;li&gt;Handling complex decision-making processes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Customizing Agent Behaviors
&lt;/h3&gt;

&lt;p&gt;Modify or create new agents with specialized functions to suit different parts of your workflow, enabling a modular and scalable design.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;LangGraph offers a powerful and flexible framework for building advanced AI agents. Whether you're starting from scratch or building upon existing projects like the LangGraph-AI-Agent repository, you now have a robust foundation for designing conversational workflows that are both dynamic and scalable.&lt;/p&gt;

&lt;p&gt;Start experimenting with LangGraph today, and explore the endless possibilities of multi-agent conversational systems!&lt;/p&gt;





&lt;h2&gt;
  
  
  Hire me?
&lt;/h2&gt;

&lt;p&gt;Contact me at &lt;a href="https://www.linkedin.com/in/hulk-pham"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>langchain</category>
      <category>langgraph</category>
      <category>python</category>
    </item>
    <item>
      <title>The Complete Machine Learning Pipeline: From Data to Deployment</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Wed, 19 Feb 2025 16:46:44 +0000</pubDate>
      <link>https://dev.to/hulk-pham/the-complete-machine-learning-pipeline-from-data-to-deployment-24p2</link>
      <guid>https://dev.to/hulk-pham/the-complete-machine-learning-pipeline-from-data-to-deployment-24p2</guid>
      <description>&lt;p&gt;Machine Learning (ML) is revolutionizing industries by enabling automated decision-making, predictive analytics, and intelligent automation. However, building a successful ML model isn’t just about training an algorithm—it requires a structured pipeline that takes data from raw collection to real-world deployment. &lt;/p&gt;

&lt;p&gt;In this blog, we’ll walk through the &lt;strong&gt;end-to-end Machine Learning pipeline&lt;/strong&gt;, covering each stage and its significance.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;1. Data Collection&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The foundation of any ML project is &lt;strong&gt;data&lt;/strong&gt;. Without high-quality data, even the most sophisticated model will fail. Data can be gathered from various sources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Databases&lt;/strong&gt;: SQL, NoSQL, cloud storage solutions like AWS S3, Google Cloud Storage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;APIs&lt;/strong&gt;: Twitter API, Google Maps API, financial market data from Alpha Vantage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Files&lt;/strong&gt;: CSV, JSON, Excel, Parquet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Scraping&lt;/strong&gt;: BeautifulSoup, Scrapy for extracting information from websites.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IoT Devices &amp;amp; Sensors&lt;/strong&gt;: Smart devices, industrial sensors, fitness trackers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Public Datasets&lt;/strong&gt;: Kaggle, UCI Machine Learning Repository, Google Dataset Search.&lt;/li&gt;
&lt;/ul&gt;
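
&lt;p&gt;As a small illustration of the file-based route, here is a hedged pandas sketch (the column names and rows are invented; a real project would call &lt;code&gt;pd.read_csv&lt;/code&gt; on a file on disk or on a downloaded API response):&lt;/p&gt;

```python
import io
import pandas as pd

# Simulate a CSV data source with an in-memory buffer; in practice this
# would be pd.read_csv("data.csv") or a DataFrame built from an API call.
csv_source = io.StringIO("user_id,age,country\n1,34,US\n2,29,VN\n3,41,DE\n")
df = pd.read_csv(csv_source)

print(df.shape)           # (3, 3)
print(list(df.columns))   # ['user_id', 'age', 'country']
```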

&lt;h3&gt;
  
  
  Key Considerations:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Ensure data is &lt;strong&gt;relevant&lt;/strong&gt; to the problem.&lt;/li&gt;
&lt;li&gt;Maintain &lt;strong&gt;data integrity&lt;/strong&gt; and avoid bias.&lt;/li&gt;
&lt;li&gt;Store data securely following privacy regulations (GDPR, HIPAA).&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;2. Data Preprocessing &amp;amp; Cleaning&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Raw data is often messy and requires cleaning before feeding it into an ML model. Common preprocessing steps include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Handling missing values&lt;/strong&gt;: Imputation (mean, median, mode), deletion, or using predictive models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Removing duplicates and outliers&lt;/strong&gt;: Use statistical methods like Z-score or IQR.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standardizing data formats&lt;/strong&gt;: Converting dates to a standard format, normalizing text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encoding categorical variables&lt;/strong&gt;: One-hot encoding, label encoding.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature scaling&lt;/strong&gt;: Normalization (MinMaxScaler) or Standardization (StandardScaler).&lt;/li&gt;
&lt;/ul&gt;
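
&lt;p&gt;A few of these steps can be sketched with pandas and scikit-learn; the toy dataset below is invented for illustration:&lt;/p&gt;

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy dataset with a missing value and a categorical column (illustrative)
df = pd.DataFrame({
    "age": [25.0, 30.0, None, 45.0],
    "city": ["Hanoi", "Tokyo", "Hanoi", "Paris"],
})

# Handle missing values: impute with the column median
df["age"] = df["age"].fillna(df["age"].median())

# Encode the categorical variable: one-hot encoding
df = pd.get_dummies(df, columns=["city"])

# Feature scaling: normalize "age" into the 0..1 range
df["age"] = MinMaxScaler().fit_transform(df[["age"]]).ravel()

print(sorted(df.columns))
print(df["age"].min(), df["age"].max())  # 0.0 1.0
```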

&lt;h3&gt;
  
  
  Popular Tools:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pandas &amp;amp; NumPy&lt;/strong&gt;: Data manipulation and numerical computation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenCV&lt;/strong&gt;: Image processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NLTK &amp;amp; spaCy&lt;/strong&gt;: Natural language preprocessing.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;3. Feature Engineering&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Feature engineering is the process of transforming raw data into meaningful features that improve model performance. Techniques include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Feature selection&lt;/strong&gt;: Choosing relevant variables using methods like Recursive Feature Elimination (RFE).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature transformation&lt;/strong&gt;: Applying logarithmic transformations, polynomial features, and binning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature extraction&lt;/strong&gt;: Techniques like Principal Component Analysis (PCA), TF-IDF for text data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature creation&lt;/strong&gt;: Aggregating data, domain-specific transformations, lag features for time series.&lt;/li&gt;
&lt;/ul&gt;
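
&lt;p&gt;As one hedged sketch of feature extraction, PCA can compress correlated raw features into a few components. The synthetic data below is constructed to have rank 2, so two components capture nearly all of the variance:&lt;/p&gt;

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 100 samples, 5 correlated features. The last three
# columns are linear combinations of the first two, so the data has
# rank 2 and two components recover almost all the variance.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.hstack([base, base @ rng.normal(size=(2, 3))])

# Extract 2 principal components from the 5 raw features
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (100, 2)
print(round(sum(pca.explained_variance_ratio_), 3))  # ~1.0
```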

&lt;h3&gt;
  
  
  Popular Tools:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scikit-learn&lt;/strong&gt;: Feature selection and transformation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Featuretools&lt;/strong&gt;: Automated feature engineering.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TensorFlow Transform&lt;/strong&gt;: Feature preprocessing for deep learning models.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;4. Data Splitting&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Before training a model, the dataset needs to be split into different subsets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Training Set&lt;/strong&gt;: Used to train the model (~70-80% of the data).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validation Set&lt;/strong&gt;: Used for tuning hyperparameters (~10-15%).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test Set&lt;/strong&gt;: Used for final evaluation (~10-15%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Stratified sampling&lt;/strong&gt; is recommended for classification problems to maintain class distribution.&lt;/p&gt;

&lt;h3&gt;
  
  
  Common Methods:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;train_test_split&lt;/code&gt; from Scikit-learn.&lt;/li&gt;
&lt;li&gt;Cross-validation techniques like k-fold cross-validation.&lt;/li&gt;
&lt;/ul&gt;
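
&lt;p&gt;The split above can be sketched with scikit-learn's &lt;code&gt;train_test_split&lt;/code&gt;; the sizes and two-step stratified split shown here are one common recipe, not a fixed rule:&lt;/p&gt;

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic classification data: 1000 samples, imbalanced labels (4:1)
X = np.arange(1000).reshape(-1, 1)
y = np.array([0] * 800 + [1] * 200)

# First split off a test set of 150 samples (15%), stratified by label
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=150, stratify=y, random_state=42
)

# Then carve a validation set of 150 samples out of the remainder
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=150, stratify=y_rest, random_state=42
)

print(len(X_train), len(X_val), len(X_test))  # 700 150 150
print(list(y_test).count(1))                  # 30 (class ratio preserved)
```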




&lt;h2&gt;
  
  
  &lt;strong&gt;5. Model Selection&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Choosing the right algorithm depends on the type of ML problem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Regression&lt;/strong&gt;: Linear Regression, Decision Trees, XGBoost, LightGBM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Classification&lt;/strong&gt;: Logistic Regression, Random Forest, SVM, Deep Learning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clustering&lt;/strong&gt;: K-Means, DBSCAN, Hierarchical Clustering.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deep Learning&lt;/strong&gt;: CNNs for images, RNNs for sequential data, Transformers for NLP.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Frameworks:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TensorFlow &amp;amp; PyTorch&lt;/strong&gt;: Deep learning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scikit-learn&lt;/strong&gt;: Traditional ML models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XGBoost &amp;amp; LightGBM&lt;/strong&gt;: Gradient boosting models.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;6. Model Training&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Once an algorithm is selected, it is trained using the dataset:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Batch training&lt;/strong&gt; vs. &lt;strong&gt;Mini-batch gradient descent&lt;/strong&gt; for deep learning models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Early stopping&lt;/strong&gt; to prevent overfitting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regularization techniques&lt;/strong&gt;: L1 (Lasso), L2 (Ridge), Dropout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transfer learning&lt;/strong&gt; for pre-trained deep learning models.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key Considerations:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Monitor loss functions and convergence.&lt;/li&gt;
&lt;li&gt;Utilize GPU acceleration for deep learning.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;7. Model Evaluation&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A trained model is only useful if it performs well on unseen data. Performance is measured using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Regression Metrics&lt;/strong&gt;: RMSE, MAE, R².&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Classification Metrics&lt;/strong&gt;: Accuracy, Precision, Recall, F1-score, AUC-ROC.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clustering Metrics&lt;/strong&gt;: Silhouette Score, Davies-Bouldin Index.&lt;/li&gt;
&lt;/ul&gt;
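
&lt;p&gt;These classification metrics can be computed directly with scikit-learn's &lt;code&gt;metrics&lt;/code&gt; module; the predictions below are invented for illustration:&lt;/p&gt;

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Invented predictions from a binary classifier (illustrative)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# 3 true positives, 1 false positive, 1 false negative, 3 true negatives
print(accuracy_score(y_true, y_pred))   # 0.75
print(precision_score(y_true, y_pred))  # 0.75 (3 of 4 predicted positives)
print(recall_score(y_true, y_pred))     # 0.75 (3 of 4 actual positives)
print(f1_score(y_true, y_pred))         # 0.75
```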

&lt;h3&gt;
  
  
  Tools:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scikit-learn's &lt;code&gt;metrics&lt;/code&gt; module&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TensorFlow Model Analysis&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;8. Hyperparameter Tuning&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Fine-tuning model parameters can significantly improve performance. Methods include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Grid Search&lt;/strong&gt;: Exhaustive search over a parameter grid.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Random Search&lt;/strong&gt;: Randomly selecting hyperparameters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bayesian Optimization&lt;/strong&gt;: Probabilistic search using Gaussian Processes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Genetic Algorithms&lt;/strong&gt;: Evolution-based tuning.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Tools:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Optuna, Hyperopt, GridSearchCV&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
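
&lt;p&gt;A minimal &lt;code&gt;GridSearchCV&lt;/code&gt; run looks like the sketch below (the model and grid values are illustrative, not recommendations):&lt;/p&gt;

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Exhaustive search over a small 2x2 grid with 3-fold cross-validation
param_grid = {"n_estimators": [10, 50], "max_depth": [2, 4]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```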




&lt;h2&gt;
  
  
  &lt;strong&gt;9. Model Deployment&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Once the model is optimized, it needs to be deployed for real-world use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API-based deployment&lt;/strong&gt;: Using &lt;strong&gt;Flask&lt;/strong&gt; or &lt;strong&gt;FastAPI&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud deployment&lt;/strong&gt;: AWS SageMaker, GCP AI Platform, Azure ML.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edge AI&lt;/strong&gt;: Deploying models on IoT devices using TensorFlow Lite, ONNX.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Containerization&lt;/strong&gt;: Docker, Kubernetes for scalable deployment.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;10. Monitoring &amp;amp; Maintenance&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;ML models require continuous monitoring to remain effective:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Detecting data drift&lt;/strong&gt;: Concept drift, covariate shift.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logging model predictions&lt;/strong&gt; for auditing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retraining with new data&lt;/strong&gt; periodically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling and optimizing for performance&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
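
&lt;p&gt;Covariate shift on a single feature can be screened with a simple statistic; the sketch below (synthetic data, hand-picked alert threshold) flags a drifted mean via a z-score:&lt;/p&gt;

```python
import numpy as np

# Illustrative covariate-shift check: compare a feature's training
# distribution against recent production data with a z-score on the mean.
rng = np.random.default_rng(1)
train_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)
live_feature = rng.normal(loc=0.3, scale=1.0, size=2_000)  # drifted mean

def mean_drift_z(train, live):
    """How many standard errors the live mean sits from the training mean."""
    se = train.std(ddof=1) / np.sqrt(len(live))
    return abs(live.mean() - train.mean()) / se

z = mean_drift_z(train_feature, live_feature)
drifted = z > 3.0  # hand-picked alert threshold (an assumption)

print(round(z, 1), bool(drifted))
```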

&lt;h3&gt;
  
  
  Tools:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MLflow&lt;/strong&gt;: Experiment tracking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evidently AI&lt;/strong&gt;: Model monitoring.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;11. Feedback &amp;amp; Continuous Improvement&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The ML pipeline is an &lt;strong&gt;iterative process&lt;/strong&gt;. New data, changing user behavior, and industry trends mean models must evolve over time. A strong feedback loop allows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retraining with fresh data&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adjusting hyperparameters&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploying improved versions&lt;/strong&gt; of the model.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Final Thoughts&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Building a &lt;strong&gt;Machine Learning Pipeline&lt;/strong&gt; is a systematic approach to developing, deploying, and maintaining ML models efficiently. By following this structured workflow, businesses can scale AI applications, improve accuracy, and ensure reliable, real-time predictions.&lt;/p&gt;

&lt;p&gt;🔹 Do you have an ML project in mind? Share your thoughts in the comments! 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  Hire me?
&lt;/h2&gt;

&lt;p&gt;Contact me at &lt;a href="https://www.linkedin.com/in/hulk-pham"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Getting Started with TensorFlow and Keras</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Wed, 19 Feb 2025 16:29:10 +0000</pubDate>
      <link>https://dev.to/hulk-pham/getting-started-with-tensorflow-and-keras-265b</link>
      <guid>https://dev.to/hulk-pham/getting-started-with-tensorflow-and-keras-265b</guid>
      <description>&lt;p&gt;Machine learning is one of the most exciting fields in modern technology, and TensorFlow and Keras are two of the most powerful tools for building AI models. Whether you're a beginner or an experienced developer, learning TensorFlow and Keras can open doors to new possibilities in deep learning. In this blog, we will walk through the basics of setting up TensorFlow and Keras, building your first neural network, and training a simple model.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are TensorFlow and Keras?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TensorFlow&lt;/strong&gt; is an open-source machine learning framework developed by Google. It provides a flexible ecosystem for building and deploying AI models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keras&lt;/strong&gt; is a high-level neural network API that runs on top of TensorFlow, making it easier to build and train models with minimal code.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Comparing TensorFlow, Keras, and PyTorch
&lt;/h2&gt;

&lt;p&gt;While TensorFlow and Keras are widely used in deep learning, PyTorch is another popular framework developed by Facebook. Here’s a quick comparison:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;TensorFlow &amp;amp; Keras&lt;/th&gt;
&lt;th&gt;PyTorch&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ease of Use&lt;/td&gt;
&lt;td&gt;Keras is beginner-friendly with simple APIs&lt;/td&gt;
&lt;td&gt;PyTorch offers dynamic computation graphs for flexibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;Optimized for large-scale deployments&lt;/td&gt;
&lt;td&gt;Preferred for research and experimentation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Community Support&lt;/td&gt;
&lt;td&gt;Strong industry and academic adoption&lt;/td&gt;
&lt;td&gt;Growing rapidly in the research community&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Debugging&lt;/td&gt;
&lt;td&gt;TensorFlow 2.0+ has better debugging tools&lt;/td&gt;
&lt;td&gt;PyTorch offers intuitive debugging with Pythonic code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;td&gt;TensorFlow supports production deployment with TensorFlow Serving and TFLite&lt;/td&gt;
&lt;td&gt;PyTorch has TorchScript but is less mature for deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you’re looking for easy-to-use tools for quick prototyping, &lt;strong&gt;Keras&lt;/strong&gt; is a great choice. If you need fine-grained control and dynamic computation graphs, &lt;strong&gt;PyTorch&lt;/strong&gt; is a better option.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing TensorFlow and Keras
&lt;/h2&gt;

&lt;p&gt;Before we start, ensure you have Python installed (recent TensorFlow releases require Python 3.9+). You can install TensorFlow using pip:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;tensorflow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To check if TensorFlow is installed correctly, run the following in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__version__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you see a version number, you’re ready to go!&lt;/p&gt;

&lt;h2&gt;
  
  
  Example 1: Building Your First Neural Network
&lt;/h2&gt;

&lt;p&gt;Let's create a simple neural network using Keras. We'll use the &lt;strong&gt;MNIST dataset&lt;/strong&gt;, which consists of handwritten digits, and build a model to classify them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Import Necessary Libraries
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;keras&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Load and Preprocess Data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Load dataset
&lt;/span&gt;&lt;span class="n"&gt;mnist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mnist&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mnist&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Normalize pixel values to be between 0 and 1
&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x_train&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;255.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x_test&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;255.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Define the Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Flatten&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_shape&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;  &lt;span class="c1"&gt;# Input layer
&lt;/span&gt;    &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;relu&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# Hidden layer
&lt;/span&gt;    &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;softmax&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Output layer
&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
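&lt;p&gt;As a quick sanity check, you can count this network's trainable parameters by hand; &lt;code&gt;model.summary()&lt;/code&gt; should report the same totals. A plain-Python sketch of the arithmetic:&lt;/p&gt;

```python
# Each Dense layer has (inputs * units) weights plus `units` biases.
def dense_params(n_inputs, n_units):
    return n_inputs * n_units + n_units

flatten_out = 28 * 28                    # Flatten: 784 values, no parameters
hidden = dense_params(flatten_out, 128)  # 784*128 + 128 = 100,480
output = dense_params(128, 10)           # 128*10  + 10  =   1,290
total = hidden + output

print(total)  # 101770
```

&lt;p&gt;If &lt;code&gt;model.summary()&lt;/code&gt; shows a different total, the layer shapes are not what you intended.&lt;/p&gt;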



&lt;h3&gt;
  
  
  Step 4: Compile the Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;adam&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sparse_categorical_crossentropy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;accuracy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
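&lt;p&gt;The &lt;code&gt;sparse_categorical_crossentropy&lt;/code&gt; loss compares an integer label (e.g. &lt;code&gt;7&lt;/code&gt;, not a one-hot vector — that's what "sparse" means) with the 10-way softmax output; the per-sample loss is the negative log of the probability assigned to the true class. A plain-Python sketch with illustrative numbers:&lt;/p&gt;

```python
import math

# Per-sample sparse categorical cross-entropy: -log(p[true_class]).
def sparse_categorical_crossentropy(probs, true_class):
    return -math.log(probs[true_class])

# Hypothetical softmax output for one image whose true label is 2:
probs = [0.01, 0.04, 0.90, 0.01, 0.01, 0.01, 0.005, 0.005, 0.005, 0.005]
loss = sparse_categorical_crossentropy(probs, 2)
print(round(loss, 4))  # 0.1054 — a confident, correct prediction gives a low loss
```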



&lt;h3&gt;
  
  
  Step 5: Train the Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 6: Evaluate the Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;test_loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_acc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Test accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;test_acc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Making Predictions
&lt;/h2&gt;

&lt;p&gt;Once trained, you can use the model to make predictions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;  &lt;span class="c1"&gt;# Predicted digit for first test image
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
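&lt;p&gt;Each row of &lt;code&gt;predictions&lt;/code&gt; is the softmax layer's output: 10 probabilities that sum to 1, one per digit, and &lt;code&gt;np.argmax&lt;/code&gt; picks the most probable one. A plain-Python sketch of what softmax and argmax compute (the scores here are illustrative, not actual model output):&lt;/p&gt;

```python
import math

# Softmax turns 10 raw scores into 10 probabilities that sum to one.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [0.5, 1.2, 4.0, 0.1, 0.3, 0.2, 0.9, 2.0, 0.4, 0.6]
probs = softmax(scores)

print(round(sum(probs), 6))     # 1.0 — probabilities sum to one
print(probs.index(max(probs)))  # 2 — the predicted digit (argmax)
```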



&lt;h2&gt;
  
  
  Example 2: Sentiment Analysis with TensorFlow and Keras
&lt;/h2&gt;

&lt;p&gt;Sentiment analysis is a common application of natural language processing (NLP) used to determine the sentiment behind a given text. With TensorFlow and Keras, we can easily build a sentiment analysis model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Load the IMDB Dataset
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;IMDB dataset&lt;/strong&gt; is a collection of 50,000 movie reviews labeled as positive or negative. It is commonly used for binary sentiment classification tasks. You can read more about it &lt;a href="https://ai.stanford.edu/~amaas/data/sentiment/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tensorflow.keras.datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;imdb&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tensorflow.keras.preprocessing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sequence&lt;/span&gt;

&lt;span class="c1"&gt;# Load dataset
&lt;/span&gt;&lt;span class="n"&gt;max_features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;  &lt;span class="c1"&gt;# Vocabulary size
&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;  &lt;span class="c1"&gt;# Maximum length of sequences
&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;imdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_words&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Pad sequences to ensure uniform length
&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pad_sequences&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;x_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pad_sequences&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
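&lt;p&gt;&lt;code&gt;pad_sequences&lt;/code&gt; makes every review exactly &lt;code&gt;maxlen&lt;/code&gt; tokens long: shorter reviews are zero-padded and longer ones truncated, both at the front by default (&lt;code&gt;padding='pre'&lt;/code&gt;, &lt;code&gt;truncating='pre'&lt;/code&gt;). A plain-Python sketch of that default behavior:&lt;/p&gt;

```python
# Sketch of pad_sequences(..., maxlen=maxlen) with Keras defaults:
# zeros are added at the front, and overly long sequences keep
# their last maxlen tokens.
def pad_sequence(seq, maxlen):
    if len(seq) >= maxlen:
        return seq[len(seq) - maxlen:]       # keep the last maxlen tokens
    return [0] * (maxlen - len(seq)) + seq   # left-pad with zeros

print(pad_sequence([5, 8, 13], 5))           # [0, 0, 5, 8, 13]
print(pad_sequence([1, 2, 3, 4, 5, 6], 4))   # [3, 4, 5, 6]
```

&lt;p&gt;Uniform length matters because the LSTM is trained on fixed-shape batches.&lt;/p&gt;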



&lt;h3&gt;
  
  
  Step 2: Build the Sentiment Analysis Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_features&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LSTM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;recurrent_dropout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sigmoid&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Compile and Train the Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;binary_crossentropy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;adam&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;accuracy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;validation_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
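&lt;p&gt;The &lt;code&gt;binary_crossentropy&lt;/code&gt; loss scores the single sigmoid output &lt;em&gt;p&lt;/em&gt; against the 0/1 label &lt;em&gt;y&lt;/em&gt; as &lt;em&gt;-(y log p + (1 - y) log(1 - p))&lt;/em&gt;. A plain-Python sketch with illustrative predictions, not actual model output:&lt;/p&gt;

```python
import math

# Per-sample binary cross-entropy for sigmoid output p and label y (0 or 1).
def binary_crossentropy(y, p):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(round(binary_crossentropy(1, 0.9), 4))  # 0.1054 — confident and right
print(round(binary_crossentropy(1, 0.1), 4))  # 2.3026 — confident and wrong
```

&lt;p&gt;Confidently wrong predictions are penalized far more heavily than cautious ones, which is what pushes the model toward calibrated probabilities.&lt;/p&gt;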



&lt;h3&gt;
  
  
  Step 4: Evaluate and Predict
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;test_loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_acc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Test accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;test_acc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Making a prediction
&lt;/span&gt;&lt;span class="n"&gt;sample_review&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_review&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Positive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Negative&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Congratulations! You have successfully built and trained your first neural network using TensorFlow and Keras. This is just the beginning—there's a lot more to explore, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and advanced deep learning techniques.&lt;/p&gt;

&lt;p&gt;Additionally, we explored sentiment analysis, a powerful application of deep learning in NLP. Try experimenting with different datasets and models to improve your understanding.&lt;/p&gt;

&lt;p&gt;If you're interested in diving deeper, check out the &lt;a href="https://www.tensorflow.org/" rel="noopener noreferrer"&gt;official TensorFlow documentation&lt;/a&gt; and experiment with different datasets and architectures. Happy coding!&lt;/p&gt;

&lt;h2&gt;
  
  
  Hire me?
&lt;/h2&gt;

&lt;p&gt;Contact me at &lt;a href="//www.linkedin.com/in/hulk-pham"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>tensorflow</category>
      <category>keras</category>
      <category>deeplearning</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>AWS Compute - Part 5: How to choose compute service</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Fri, 01 Nov 2024 14:51:57 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-compute-part-5-how-to-choose-compute-service-1cmk</link>
      <guid>https://dev.to/hulk-pham/aws-compute-part-5-how-to-choose-compute-service-1cmk</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key Compute Options&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon EC2 offers granular control, various instance types, and flexibility for managing infrastructure&lt;/li&gt;
&lt;li&gt;Containers provide lightweight, portable, and consistent application deployment across environments&lt;/li&gt;
&lt;li&gt;Serverless computing abstracts infrastructure management, allowing focus on code and rapid development&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Choose Each Option&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use EC2 for compute-intensive applications, long-running stateful apps, or when full OS control is needed&lt;/li&gt;
&lt;li&gt;Choose containers for compute-intensive workloads, large monolithic applications, or when quick scaling is required&lt;/li&gt;
&lt;li&gt;Opt for serverless (Lambda) for short-lived applications (under 15 minutes), event-driven architectures, or when automatic scaling is desired&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Considerations&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EC2 is versatile and suitable for a wide range of applications, offering various pricing models&lt;/li&gt;
&lt;li&gt;Containers are ideal for microservices architecture but may not be optimal for apps with complex persistent storage or networking requirements&lt;/li&gt;
&lt;li&gt;Serverless is best for small, simple applications that integrate multiple AWS services and don't require long execution times&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;I. Advantages of each compute type&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Amazon EC2&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon EC2 offers granular control for managing your infrastructure. It gives you the choice of over 500 instance types, with the latest processors, storage, OSs, and networking. An EC2 instance is a virtualized server running in the cloud. Whatever options you customize for a physical server can also be customized for EC2 instances. Amazon EC2 also offers instances that are optimized for certain performance needs or workload functions. Therefore, your applications can start on an instance built to accommodate that workload type.&lt;/p&gt;

&lt;p&gt;One advantage of using virtual servers is that you can build and deploy an instance in minutes. You can spin up an instance, test your application, and then delete the instance when you are done. Instances also offer you the flexibility to increase or decrease resources as your workload demands change, without affecting your application.&lt;/p&gt;

&lt;p&gt;When you choose to use EC2 instances, here is a list of benefits that you gain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;With Amazon EC2, you can quickly build and start a new server: You don't need to rack the server, run cable, and update hardware drivers as you would do with a traditional server.&lt;/li&gt;
&lt;li&gt;You can scale capacity as needed, both up and down. This means that if you need more memory, processing, or storage, you can add it.&lt;/li&gt;
&lt;li&gt;Instances offer at least 99.99% (four nines) of availability. For more information on AWS and availability, see &lt;a href="https://aws.amazon.com/compute/sla/" rel="noopener noreferrer"&gt;Amazon Compute Service Level Agreement&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Amazon EC2 offers instances that are optimized for specific types of workloads, including memory optimized, compute optimized, storage optimized, accelerated computing, and general purpose.&lt;/li&gt;
&lt;li&gt;Various instance types are available with different pricing options, so you can choose the best option to fit your business requirements. These options include On-Demand Instances, Reserved Instances, and Spot Instances.&lt;/li&gt;
&lt;li&gt;Amazon EC2 gives you complete control over the instance, down to the root level. You can manage the instance as you would manage a physical server.&lt;/li&gt;
&lt;li&gt;You can use instances for long-running applications, especially those with state information and long-running computation cycles.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Containers&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Containers provide a standard way to package your application's code, configurations, and dependencies into a single object. Containers share an OS installed on the server and run as resource-isolated processes, for quick, reliable, and consistent deployments, regardless of environment.&lt;/p&gt;

&lt;p&gt;Containers are lightweight, efficient, fast, and highly portable. Because the container holds all the files it needs to run, you can ensure a consistent performance from your application, regardless of the underlying components. This encapsulated application approach also caters for rapid deployment, patching, and scaling when needed. Because you don't need to worry about OS patches or security enhancements that might affect your applications, containers lend themselves nicely toward improved and accelerated development cycles.&lt;/p&gt;

&lt;p&gt;Here are some of the key features of the AWS container services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The application is packaged so that you control the application and all associated resources, such as policies, security, and deployment.&lt;/li&gt;
&lt;li&gt;Containers are portable and can be moved to different OS or hardware platforms, and through different environments such as development, testing/staging, pre-production, and production.&lt;/li&gt;
&lt;li&gt;There are no time-out limits when running. This is useful for applications that run longer than 15 minutes or that need to initiate instantly when called.&lt;/li&gt;
&lt;li&gt;Containers run without the startup latency of Lambda or Amazon EC2.&lt;/li&gt;
&lt;li&gt;Containers have no size limit. They can be as large or as small as you need them to be.&lt;/li&gt;
&lt;li&gt;Containers are useful when taking a large traditional application and breaking it down into small parts, or microservices, to make the application more scalable and resilient.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Serverless&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;One of the major benefits of cloud computing is its ability to abstract (hide) the infrastructure layer. Therefore, you don't need to manually manage the underlying physical hardware. In a serverless environment, the abstraction is one layer higher. Not only is the physical infrastructure abstracted, but the instances and the operating systems on which AWS Lambda is running, are also abstracted. With this higher level abstraction in place, you can focus on the code for your applications without spending time building, maintaining, and patching the underlying infrastructure, hosts, and operating systems.&lt;/p&gt;

&lt;p&gt;With serverless applications, there are never instances, OSs, or servers to manage. AWS handles everything required to run and scale your application.&lt;/p&gt;

&lt;p&gt;By building serverless applications, your developers can focus on the code that makes your business unique.&lt;/p&gt;

&lt;p&gt;If you are considering putting your workload into a serverless environment but aren't sure, the following six categories describe reasons to choose a serverless solution for your application.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fast development&lt;/strong&gt;&lt;br&gt;
You might need to develop applications quickly and might not have the time to build and maintain the underlying infrastructure.&lt;/p&gt;

&lt;p&gt;Using a serverless solution, you and your developers can focus on building and refining your applications without spending time managing and maintaining servers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pay for value&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You only pay for the time that your application runs. This model helps keep costs down so that you aren't paying for time when your application is idle.&lt;/p&gt;

&lt;p&gt;The AWS Lambda free tier includes one million free requests per month and 400,000 GB-seconds of compute time per month.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Short-lived applications&lt;/strong&gt;&lt;br&gt;
Lambda is a suitable choice for any short-lived application that can finish running in under 15 minutes.&lt;/p&gt;

&lt;p&gt;If an application needs to run longer than 15 minutes, it's no longer cost effective to use Lambda. Instead, consider other solutions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Event-driven applications&lt;/strong&gt;&lt;br&gt;
You might need event-initiated, or event-driven, stateless applications that need quick response times.&lt;/p&gt;

&lt;p&gt;An event-driven architecture uses events to initiate actions and communication between decoupled services. An event is a change in state, a user request, or an update, such as an item being placed in a shopping cart in an ecommerce website. When an event occurs, the information is published for other services to consume it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Automatic scaling&lt;/strong&gt;&lt;br&gt;
If you do not want, or need, to managing resource scaling, a serverless architecture is something to consider.&lt;/p&gt;

&lt;p&gt;When you use Lambda, the service is responsible for all the resources required to run your application. If your application suddenly needs more resources, Lambda adjusts your resource consumption up or down to maintain consistent application performance during peak utilization and off-hour timeframes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Redundancy and resilience&lt;/strong&gt;&lt;br&gt;
The AWS Global Infrastructure is built around AWS Regions and Availability Zones. Regions provide multiple physically separated and isolated Availability Zones, which are connected with low-latency, high-throughput, and highly redundant networking.&lt;/p&gt;

&lt;p&gt;Lambda runs your function in multiple Availability Zones to ensure that it is available to process events in case of a service interruption in a single zone.&lt;/p&gt;

&lt;p&gt;Lambda also provides additional resilience features such as versioning, reserved resources, retries, and the previously mentioned automatic scaling capability.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
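For illustration, the event-driven pattern described above can be sketched as a small, stateless Lambda handler. This is a minimal sketch assuming a hypothetical "item added to cart" event shape; the field names here are made up for the example, not AWS-defined.

```python
import json

# Hypothetical event fields: "detail" and "item" are assumptions for this
# sketch, not part of any AWS-defined event schema.
def lambda_handler(event, context):
    """Stateless handler: reacts to a cart event and returns quickly."""
    item = event.get("detail", {}).get("item", "unknown")
    # In a real function you might publish to SNS/SQS or write to DynamoDB here.
    return {
        "statusCode": 200,
        "body": json.dumps({"processed_item": item}),
    }
```

Because the handler keeps no state between invocations, Lambda can run any number of copies in parallel as events arrive.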

&lt;h2&gt;
  
  
  II. &lt;strong&gt;Choosing a Compute Option for your Workload&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Amazon EC2 considerations&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;With Amazon EC2, you have complete control of your computing resources. Amazon EC2 reduces the time required to obtain and start new server instances to minutes. Therefore, you can quickly scale capacity up or down as your computing requirements change.&lt;/p&gt;

&lt;p&gt;Because EC2 instances are virtualized servers in the cloud, they lend themselves to support a large variety of applications. Anything you can run on a physical server can be run on Amazon EC2. Whether you need a small instance type or a robust instance type with multiple processors and extra memory, Amazon EC2 is a versatile platform for all of your application needs.&lt;/p&gt;

&lt;p&gt;Here are some application characteristics that run best on Amazon EC2.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;When you have compute-intensive or memory-intensive applications, consider the following:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;You can select a specific instance type based on the requirements of the application or software that you plan to run on your instance.

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amazon EC2 provides each instance with a consistent and predictable amount of CPU capacity,&lt;/strong&gt; regardless of its underlying hardware.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Amazon EC2 gives you the &lt;strong&gt;choice of OSs and software packages.&lt;/strong&gt; You can select a configuration of memory, CPU, instance storage, and a boot partition size that is optimal for your choice of OS and application.&lt;/li&gt;

&lt;li&gt;You have &lt;strong&gt;access to the underlying files&lt;/strong&gt; of the instance to customize or update as needed.&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Amazon EC2 works with other AWS services to provide a complete solution&lt;/strong&gt; for computing, query processing, and storage across a wide range of applications.

&lt;ul&gt;
&lt;li&gt;From a performance perspective, &lt;strong&gt;you can use instances for long-running, stateful applications.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You can determine the type and size of your storage,&lt;/strong&gt; whether to use block or file storage.&lt;/li&gt;
&lt;li&gt;Amazon EC2 can support workloads where &lt;strong&gt;complex networking, security, and storage&lt;/strong&gt; are required, in addition to applications and workloads with heavy computational requirements.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;EC2 instance types are &lt;strong&gt;optimized for use cases&lt;/strong&gt; such as compute optimized, storage optimized, general purpose, or accelerated computing.&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Amazon EC2 is versatile and reliable&lt;/strong&gt;. You can run a simple website or you can run a vast, complex, custom-built application, and when necessary, replacement instances can be rapidly and predictably deployed.&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Pricing is pay for what you use.&lt;/strong&gt; There are also different types of pricing models so you can select the type of instance and the pricing model that fits your budget.&lt;/li&gt;

&lt;/ul&gt;
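To make the pay-for-what-you-use point concrete, here is a rough cost sketch. The hourly rate is an assumed example figure; real On-Demand prices vary by instance type and Region, so check the EC2 pricing page for actual numbers.

```python
# Assumed example rate for a small instance, USD per hour (not a real quote).
HOURLY_RATE = 0.0416

def monthly_cost(hours_running, hourly_rate=HOURLY_RATE):
    """Pay for what you use: cost scales with the hours the instance runs."""
    return round(hours_running * hourly_rate, 2)
```

Running an instance around the clock for a 730-hour month costs roughly 730 times the hourly rate, while an instance you stop outside business hours accrues far less.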

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Considerations for containers&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Some application characteristics lend themselves to run optimally in a containerized environment. Here are some factors to consider to see whether containers are appropriate for your application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to consider containers&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For compute-intensive workloads&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Applications that are compute intensive run better in a container environment. If you have a small application that runs in under 15 minutes but is compute intensive, consider using a container. Lambda is not the best fit for heavily compute-intensive code.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;For large monolithic applications&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;These are appropriate candidates to move to containers. &lt;strong&gt;Large monoliths that have many parts&lt;/strong&gt; are very suitable applications to consider moving to containers.&lt;/li&gt;
&lt;li&gt;You can break apart applications and run them as independent components, called &lt;strong&gt;microservices, using containers to isolate processes&lt;/strong&gt;.

&lt;ul&gt;
&lt;li&gt;By using microservices, you can break large applications into smaller, self-contained pieces.&lt;/li&gt;
&lt;li&gt;Segmenting a larger application means that updates can be applied on only specific parts. Because each part of the larger application resides in its own container, an update that might have affected files used by a different piece of the application is isolated to its own container.&lt;/li&gt;
&lt;li&gt;With containers, you can do &lt;strong&gt;frequent updates&lt;/strong&gt; by pushing out new containers, without the concern that one set of updates might break another part of the application. If you detect any issues, you have the flexibility to undo the change quickly.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;When you need to scale quickly&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Containers can be built and taken down quickly, which means fast application deployment and scaling.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;When you need to&lt;/strong&gt; &lt;strong&gt;move your large application to the cloud without altering the code&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;With containers, you can package entire applications and move them to the cloud without the need to make any code changes.&lt;/li&gt;
&lt;li&gt;Your application can be as large as you need it to be and can run as long as you require it to run.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When not to use containers&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;When applications need persistent data storage&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Containers can absolutely support persistent storage; however, it tends to be easier to containerize applications that don't require it. Persistent storage requirements increase security and storage complexity and also make the containers less portable. If the container is moved, the storage needs to be reconfigured and secured. Applications that have no state information and don't require complex persistent storage are better candidates for a container solution than applications with complex storage needs.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;When applications have complex networking, routing, or security requirements&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Containers are portable and can be moved through different environments (testing, staging, production). If the application requires a complex configuration for networking, routing, storage, and so on, the containers are much more challenging to move.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
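As a toy illustration of the microservices point above (updating one container's image without touching the others, and rolling back quickly if something breaks), consider this sketch. The service names and image tags are made up; real deployments would go through a container orchestrator such as Amazon ECS or EKS.

```python
# Each microservice ships in its own container image, so updating one
# cannot break the files of another.
services = {"cart": "cart:v1", "search": "search:v1", "checkout": "checkout:v1"}

def deploy(deployment, name, new_tag):
    """Return a new deployment map with only one service's image changed."""
    updated = dict(deployment)
    updated[name] = new_tag
    return updated

def rollback(deployment, previous, name):
    """Undo a bad update for a single service without touching the rest."""
    reverted = dict(deployment)
    reverted[name] = previous[name]
    return reverted
```

Updating `cart` to `cart:v2` leaves `search` and `checkout` untouched, and rolling back restores only the affected service.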

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Considerations for serverless applications&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Consider the characteristics of applications that lend themselves to run optimally in a serverless environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are you building, testing, and deploying applications frequently and &lt;strong&gt;want to focus only on your code and not on infrastructure?&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Are your applications &lt;strong&gt;less compute intensive&lt;/strong&gt;?&lt;/li&gt;
&lt;li&gt;Are the applications that you are running or building &lt;strong&gt;small, simple, or modular&lt;/strong&gt;?

&lt;ul&gt;
&lt;li&gt;Simple applications, such as chatbots or to-do lists that people can use to modify a list of things that they need to do, are good choices to move to serverless.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Will you be using multiple AWS services&lt;/strong&gt; &lt;strong&gt;where one service might need to call another service?&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;For example, if someone uploads a file to Amazon Simple Storage Service (Amazon S3), will you then need to invoke other workflows to log the update or convert the file to HTML? Serverless is a very appropriate fit when you need one action to invoke other workflows within AWS.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Do your applications finish quickly&lt;/strong&gt;?

&lt;ul&gt;
&lt;li&gt;Serverless is most suitable for &lt;strong&gt;applications that don't run longer than 15 minutes&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Large, long-running workloads are expensive to run on serverless and not an optimal fit for this compute type.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
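The S3-upload example above can be sketched as a handler that pulls the bucket and key out of each event record, so follow-up workflows (logging, file conversion) know what to process. The record shape below follows the general structure of S3 event notifications, but treat the exact fields as assumptions and verify against the AWS documentation for your runtime.

```python
# Sketch of a Lambda function invoked by an S3 upload event.
def handle_s3_upload(event):
    """Extract (bucket, key) from each record for downstream workflows."""
    uploads = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        uploads.append((s3.get("bucket", {}).get("name"),
                        s3.get("object", {}).get("key")))
    return uploads
```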

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Putting it all together&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;John is confident that he can now distinguish between the AWS compute types and explain which types use Amazon EC2 instances and which run serverlessly. Sofia wants to make sure he grasps the nuances of the services. When she asks him how he would summarize the three options, he shows her his notes, which read:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon EC2 provides &lt;strong&gt;virtual server instances&lt;/strong&gt; in the cloud.&lt;/li&gt;
&lt;li&gt;Amazon ECS, Amazon EKS: Container management services that can run containers on either &lt;strong&gt;customer-managed Amazon EC2 instances&lt;/strong&gt; OR as an &lt;strong&gt;AWS-managed serverless offering&lt;/strong&gt; running containers on AWS Fargate.&lt;/li&gt;
&lt;li&gt;AWS Lambda is &lt;strong&gt;serverless compute&lt;/strong&gt; for running stateless code in response to triggers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;John took a moment to draw an image depicting the abstraction of the compute services from instance-based compute to the serverless solutions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgtei8fv1zv8sahe7om7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgtei8fv1zv8sahe7om7.png" alt="compare description" width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>ec2</category>
      <category>lambda</category>
    </item>
    <item>
      <title>AWS Cloud Essential Series</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Thu, 31 Oct 2024 09:35:24 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-cloud-essential-series-1ggm</link>
      <guid>https://dev.to/hulk-pham/aws-cloud-essential-series-1ggm</guid>
      <description>&lt;p&gt;Welcome to our comprehensive guide on AWS essentials! In this blog series, we'll explore the fundamental building blocks of Amazon Web Services (AWS), the leading cloud computing platform. Whether you're a beginner looking to start your cloud journey or an IT professional aiming to expand your knowledge, this series will provide you with valuable insights into key AWS services and concepts.&lt;/p&gt;

&lt;p&gt;Our study notes cover a wide range of topics, from basic cloud concepts to advanced services, organized into the following categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Basic Concepts:&lt;/strong&gt; We'll start with an introduction to cloud computing, AWS global infrastructure, and the shared responsibility model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identity and Access Management:&lt;/strong&gt; Learn about securing your AWS resources with IAM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Computing:&lt;/strong&gt; Dive into various compute options, including virtual machines, containerization, serverless computing, and auto-scaling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Networking:&lt;/strong&gt; Understand the fundamentals of AWS networking with Virtual Private Cloud (VPC).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage:&lt;/strong&gt; Explore different storage types, file systems, and object storage solutions offered by AWS.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; Get to know AWS's diverse database offerings, including DynamoDB and purpose-built databases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring:&lt;/strong&gt; Learn how to keep track of your AWS resources with CloudWatch.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each topic is covered in-depth, providing you with the knowledge you need to confidently work with AWS services. Let's embark on this exciting journey through the world of AWS!&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;List post details&lt;/strong&gt;
&lt;/h2&gt;




&lt;p&gt;&lt;strong&gt;Basic Concepts:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/cloud-computing-job-role-and-caf-2g53"&gt;AWS Cloud Computing, Job Role and CAF&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-global-infrastructure-28d3"&gt;AWS Global Infrastructure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-shared-responsibility-model-202p"&gt;AWS Shared Responsibility Model&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Identity and Access Management:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-identity-and-access-management-21j"&gt;AWS Identity and Access Management&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Computing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-compute-part-1-virtual-machines-vms-2i9o"&gt;AWS Compute - Part 1: Virtual Machines (VMs)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-compute-part-2-containerization-34pb"&gt;AWS Compute - Part 2: Containerization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-compute-part-3-serverless-lambda-and-fargate-2lj0"&gt;AWS Compute - Part 3: Serverless, Lambda and Fargate&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-compute-part-4-load-balancer-and-autoscaling-2c41"&gt;AWS Compute - Part 4: Load Balancer and Autoscaling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-compute-part-5-how-to-choose-compute-service-1cmk"&gt;AWS Compute - Part 5: How to choose compute service&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Networking:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-networking-vpc-1dl9"&gt;AWS Networking - VPC&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Storage:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-storage-part-1-storage-types-g97"&gt;AWS Storage - Part 1: Storage Types&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-storage-part-2-efs-and-fsx-1bmf"&gt;AWS Storage - Part 2: EFS and FSx&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-storage-part-3-instance-store-and-ebs-3be"&gt;AWS Storage - Part 3: Instance Store and EBS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-storage-part-4-amazon-s3-1kki"&gt;AWS Storage - Part 4: Amazon S3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Database:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-database-part-1-aws-rds-1a33"&gt;AWS Database - Part 1: AWS RDS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-database-part-2-dynamodb-391e"&gt;AWS Database - Part 2: DynamoDB&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-database-part-3-purpose-built-databases-4bb4"&gt;AWS Database - Part 3: Purpose-Built Databases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Monitoring:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/hulk-pham/aws-monitoring-part-1-aws-cloudwatch-3amh"&gt;AWS Monitoring - Part 1: AWS CloudWatch&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This study guide is inspired by and references content from AWS Skill Builder. We acknowledge AWS as the original source of much of this information. The guide is intended for educational purposes only and is not an official AWS product.&lt;/p&gt;

&lt;p&gt;If you have any concerns regarding copyright or the use of AWS-related content, please contact us at &lt;a href="mailto:tanhunghue@gmail.com"&gt;tanhunghue@gmail.com&lt;/a&gt;. We are committed to respecting intellectual property rights and will promptly address any issues.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>techlead</category>
      <category>webdev</category>
    </item>
    <item>
      <title>AWS Compute - Part 4: Load Balancer and Autoscaling</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Thu, 31 Oct 2024 09:21:09 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-compute-part-4-load-balancer-and-autoscaling-2c41</link>
      <guid>https://dev.to/hulk-pham/aws-compute-part-4-load-balancer-and-autoscaling-2c41</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;High Availability and Load Balancing&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High availability is crucial for systems, often expressed as a percentage of uptime or number of nines&lt;/li&gt;
&lt;li&gt;Elastic Load Balancing (ELB) distributes incoming traffic across multiple targets, improving availability and scalability&lt;/li&gt;
&lt;li&gt;ELB offers three types: Application Load Balancer (ALB), Network Load Balancer (NLB), and Gateway Load Balancer (GLB), each suited for different use cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Amazon EC2 Auto Scaling&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EC2 Auto Scaling automatically adds or removes EC2 instances based on defined policies, ensuring optimal performance and cost-efficiency&lt;/li&gt;
&lt;li&gt;Auto Scaling groups define where resources are deployed, specifying VPC, subnets, and instance purchase options&lt;/li&gt;
&lt;li&gt;Launch templates or configurations specify the resources to be scaled, including AMI, instance type, and security groups&lt;/li&gt;
&lt;li&gt;Scaling policies determine when to add or remove instances, using CloudWatch metrics and alarms to trigger actions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  I. High Availability
&lt;/h2&gt;

&lt;p&gt;The availability of a system is typically expressed as a percentage of uptime in a given year or as a number of nines. The following table lists availability percentages with the corresponding downtime per year and the notation in nines.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Availability (%)&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Downtime (per year)&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;90% (one nine of availability)&lt;/td&gt;
&lt;td&gt;36.53 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99% (two nines of availability)&lt;/td&gt;
&lt;td&gt;3.65 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.9% (three nines of availability)&lt;/td&gt;
&lt;td&gt;8.77 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.95% (three and a half nines of availability)&lt;/td&gt;
&lt;td&gt;4.38 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.99% (four nines of availability)&lt;/td&gt;
&lt;td&gt;52.60 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.995% (four and a half nines of availability)&lt;/td&gt;
&lt;td&gt;26.30 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.999% (five nines of availability)&lt;/td&gt;
&lt;td&gt;5.26 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;To increase availability, you need redundancy. This typically means more infrastructure—more data centers, more servers, more databases, and more replication of data. You can imagine that adding more of this infrastructure means a higher cost. Customers want the application to always be available, but you need to draw a line where adding redundancy is no longer viable in terms of revenue.&lt;/p&gt;
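The downtime figures in the table follow directly from the availability percentage. A quick sketch of the arithmetic, using a 365.25-day year (which matches the "one nine" row of 36.53 days):

```python
# Minutes in a 365.25-day year: 365.25 * 24 * 60 = 525,960
MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes(availability_pct):
    """Yearly downtime implied by an availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR
```

Five nines (99.999%) allows only about 5.26 minutes of downtime per year, while two nines (99%) allows about 3.65 days.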

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Why improve application availability?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In the current application, one EC2 instance hosts the application. The photos are served from Amazon S3, and the structured data is stored in Amazon DynamoDB. That single EC2 instance is a single point of failure for the application.&lt;/p&gt;

&lt;p&gt;Even if the database and Amazon S3 are highly available, customers have no way to connect if the single instance becomes unavailable. One way to solve this single point of failure issue is to add one more server in a second Availability Zone.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Adding a second Availability Zone&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The physical location of a server is important. In addition to potential software issues at the operating system (OS) or application level, you must also consider hardware issues. They might be in the physical server, the rack, the data center, or even the Availability Zone hosting the virtual machine. To remedy the physical location issue, you can deploy a second EC2 instance in a second Availability Zone. This second instance might also solve issues with the OS and the application.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxium9etqguc4gvfj510.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxium9etqguc4gvfj510.png" alt="VPC_subnets description" width="800" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, when there is more than one instance, it brings new challenges, such as the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Replication process –&lt;/strong&gt; The first challenge with multiple EC2 instances is that you need to create a process to replicate the configuration files, software patches, and application across instances. The best method is to automate where you can.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Customer redirection –&lt;/strong&gt; The second challenge is how to notify the clients—the computers sending requests to your server—about the different servers. You can use various tools here. The most common is using a Domain Name System (DNS) where the client uses one record that points to the IP address of all available servers.&lt;/p&gt;

&lt;p&gt;However, this method isn't always used because of propagation – the time frame it takes for DNS changes to be updated across the Internet.&lt;/p&gt;

&lt;p&gt;Another option is to use a load balancer, which takes care of health checks and distributing the load across each server. Situated between the client and the server, a load balancer avoids propagation time issues. You will learn more about load balancers in the next section.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Types of high availability –&lt;/strong&gt; The last challenge to address when there is more than one server is the type of availability you need: active-passive or active-active.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. High availability categories&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Active-passive systems&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With an active-passive system, only one of the two instances is available at a time. One advantage of this method is that for stateful applications (where data about the client’s session is stored on the server), there won’t be any issues. This is because the customers are always sent to the server where their session is stored.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Active-active systems&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A disadvantage of an active-passive system is scalability. This is where an active-active system shines. With both servers available, the second server can take some load for the application, and the entire system can take more load. However, if the application is stateful, there would be an issue if the customer’s session isn’t available on both servers. Stateless applications work better for active-active systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;II. Elastic Load Balancing&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The Elastic Load Balancing (ELB) service can distribute incoming application traffic across EC2 instances, containers, IP addresses, and Lambda functions.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Load balancers&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Load balancing refers to the process of distributing tasks across a set of resources. In the case of the Employee Directory application, the resources are EC2 instances that host the application, and the tasks are the requests being sent. You can use a load balancer to distribute the requests across all the servers hosting the application.&lt;/p&gt;

&lt;p&gt;To do this, the load balancer needs to take all the traffic and redirect it to the backend servers based on an algorithm. The most popular algorithm is round robin, which sends the traffic to each server one after the other.&lt;/p&gt;
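The round-robin algorithm described above is simple to sketch: each incoming request is handed to the next server in turn, cycling back to the first once the list is exhausted.

```python
import itertools

def round_robin(servers):
    """Return a dispatcher that assigns requests to servers in turn."""
    pool = itertools.cycle(servers)
    def pick(_request):
        return next(pool)
    return pick
```

With three servers, four consecutive requests go to servers 1, 2, 3, and then 1 again.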

&lt;p&gt;A typical request for an application starts from a client's browser. The request is sent to a load balancer. Then, it’s sent to one of the EC2 instances that hosts the application. The return traffic goes back through the load balancer and back to the client's browser.&lt;/p&gt;

&lt;p&gt;Although it is possible to install your own software load balancing solution on EC2 instances, AWS provides the ELB service for you.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsfk07j2ybc1qmdes1s99.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsfk07j2ybc1qmdes1s99.png" alt="ELB_arch description" width="800" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. ELB features&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The ELB service provides a major advantage over using your own solution to do load balancing. Mainly, you don’t need to manage or operate ELB. It can distribute incoming application traffic across EC2 instances, containers, IP addresses, and Lambda functions. Other key features include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid mode –&lt;/strong&gt; Because ELB can load balance to IP addresses, it can work in a hybrid mode, which means it also load balances to on-premises servers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High availability –&lt;/strong&gt; ELB is highly available. The only thing you must ensure is that the load balancer's targets are deployed across multiple Availability Zones.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability –&lt;/strong&gt; In terms of scalability, ELB automatically scales to meet the demand of the incoming traffic. It handles the incoming traffic and sends it to your backend application.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Health checks&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Monitoring is an important part of load balancers because they should route traffic to only healthy EC2 instances. That’s why ELB supports two types of health checks as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Establishing a connection to a backend EC2 instance using TCP and marking the instance as available if the connection is successful.&lt;/li&gt;
&lt;li&gt;Making an HTTP or HTTPS request to a webpage that you specify and validating that an HTTP response code is returned.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Taking time to define an appropriate health check is critical. Verifying only that an application's port is open doesn't mean that the application is working, and a call to the application's home page isn't necessarily the right check either.&lt;/p&gt;

&lt;p&gt;For example, the Employee Directory application depends on a database and Amazon S3. The health check should validate all of these elements. One way to do that is to create a monitoring webpage, such as &lt;strong&gt;/monitor&lt;/strong&gt;. It makes a call to the database to ensure that it can connect and get data, and it makes a call to Amazon S3. Then, you point the health check on the load balancer to the &lt;strong&gt;/monitor&lt;/strong&gt; page.&lt;/p&gt;
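A minimal sketch of such a deep health check: the page reports healthy only if every dependency responds. The database and Amazon S3 probe functions are hypothetical stand-ins for real connectivity tests.

```python
def monitor(checks):
    """Run every dependency check; healthy (200) only if all pass, else 503."""
    results = {name: check() for name, check in checks.items()}
    status = 200 if all(results.values()) else 503
    return status, results
```

If the database probe fails, the endpoint returns 503 and the load balancer stops sending traffic to that instance, even though the web server process itself is still up.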

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj14lnpth8x04isget7sj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj14lnpth8x04isget7sj.png" alt="health_check description" width="800" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After determining the availability of a new EC2 instance, the load balancer starts sending traffic to it. If ELB determines that an EC2 instance is no longer working, it stops sending traffic to it and informs Amazon EC2 Auto Scaling. It is the responsibility of Amazon EC2 Auto Scaling to remove that instance from the group and replace it with a new EC2 instance. Traffic is only sent to the new instance if it passes the health check.&lt;/p&gt;

&lt;p&gt;If Amazon EC2 Auto Scaling has a scaling policy that calls for a scale down action, it informs ELB that the EC2 instance will be terminated. ELB can prevent Amazon EC2 Auto Scaling from terminating an EC2 instance until all connections to the instance end. It also prevents any new connections. This feature is called connection draining. We will learn more about Amazon EC2 Auto Scaling in the next lesson.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. ELB components&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The ELB service is made up of three main components: rules, listeners, and target groups.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc7rkqfmaxsermu0jiz6s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc7rkqfmaxsermu0jiz6s.png" alt="ELB_service_updated description" width="800" height="577"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To associate a target group to a listener, you must use a rule. Rules are made up of two conditions. The first condition is the source IP address of the client. The second condition decides which target group to send the traffic to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Listener&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The client connects to the listener. This is often called the client side. To define a listener, a port must be provided in addition to the protocol, depending on the load balancer type. There can be many listeners for a single load balancer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Target group&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The backend servers, or server side, are defined in one or more target groups. This is where you define the type of backend you want to direct traffic to, such as EC2 instances, Lambda functions, or IP addresses. Also, a health check must be defined for each target group.&lt;/p&gt;
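&lt;p&gt;Putting the three components together, a load balancer configuration might be sketched as plain data (the field names here are illustrative, not the actual AWS API shapes):&lt;/p&gt;

```python
# Illustrative only: plain dictionaries shaped loosely like ELB concepts,
# not actual AWS API calls. Names and fields are assumptions.
target_group = {
    "Name": "employee-directory-targets",
    "TargetType": "instance",          # instance | ip | lambda
    "HealthCheck": {"Protocol": "HTTP", "Path": "/health", "Port": 80},
    "Targets": ["i-0abc", "i-0def"],
}

listener = {
    "Protocol": "HTTPS",               # client side: protocol + port
    "Port": 443,
    "Rules": [
        {   # a rule ties conditions to an action (which target group)
            "Priority": 1,
            "Conditions": [{"Field": "path-pattern", "Values": ["/upload"]}],
            "Action": {"Type": "forward", "TargetGroup": "upload-targets"},
        },
        {   # default rule: everything else
            "Priority": "default",
            "Conditions": [],
            "Action": {"Type": "forward",
                       "TargetGroup": target_group["Name"]},
        },
    ],
}
```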

&lt;h3&gt;
  
  
  &lt;strong&gt;5. Types of load balancers&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We will cover three types of load balancers: Application Load Balancer (ALB), Network Load Balancer (NLB), and Gateway Load Balancer (GLB).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A. Application Load Balancer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For our Employee Directory application, we are using an Application Load Balancer. An Application Load Balancer functions at Layer 7 of the Open Systems Interconnection (OSI) model. It is ideal for load balancing HTTP and HTTPS traffic. After the load balancer receives a request, it evaluates the listener rules in priority order to determine which rule to apply. It then routes traffic to targets based on the request content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Primary features of an Application Load Balancer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Routes traffic based on request data&lt;/p&gt;

&lt;p&gt;An Application Load Balancer makes routing decisions based on the HTTP and HTTPS protocol. For example, the ALB could use the URL path (/upload) and host, HTTP headers and method, or the source IP address of the client. This facilitates granular routing to target groups.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sends responses directly to the client&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An Application Load Balancer can reply directly to the client with a fixed response, such as a custom HTML page. It can also send a redirect to the client. This is useful when you must redirect to a specific website or redirect a request from HTTP to HTTPS. It removes that work from your backend servers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Uses TLS offloading&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An Application Load Balancer understands HTTPS traffic. To pass HTTPS traffic through an Application Load Balancer, an SSL certificate is provided in one of the following ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Importing a certificate by way of IAM or ACM services&lt;/li&gt;
&lt;li&gt;Creating a certificate for free using ACM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures that the traffic between the client and Application Load Balancer is encrypted.&lt;/p&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Authenticates users&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An Application Load Balancer can authenticate users before they can pass through the load balancer. The Application Load Balancer uses the OpenID Connect (OIDC) protocol and integrates with other AWS services to support protocols, such as the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SAML&lt;/li&gt;
&lt;li&gt;Lightweight Directory Access Protocol (LDAP)&lt;/li&gt;
&lt;li&gt;Microsoft Active Directory&lt;/li&gt;
&lt;li&gt;Others&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Secures traffic&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To control which traffic can reach the load balancer, you configure a security group that specifies the supported IP address ranges.&lt;/p&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Supports sticky sessions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If requests must be sent to the same backend server because the application is stateful, use the sticky session feature. This feature uses an HTTP cookie to remember which server to send the traffic to across connections.&lt;/p&gt;


&lt;/li&gt;

&lt;/ul&gt;
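&lt;p&gt;The routing behavior above can be illustrated with a minimal sketch of priority-ordered rule evaluation (a simplification; the rule and request shapes are assumptions):&lt;/p&gt;

```python
# Hedged sketch of how an ALB evaluates listener rules in priority order
# and routes on request content; rule shapes are illustrative, not the API.
def evaluate(rules, request):
    """Return the action of the first matching rule (lowest priority number wins)."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        cond = rule.get("path_prefix")
        if cond is None or request["path"].startswith(cond):
            return rule["action"]
    return {"type": "fixed-response", "status": 404}

rules = [
    {"priority": 1, "path_prefix": "/upload",
     "action": {"type": "forward", "target_group": "upload-tg"}},
    {"priority": 2, "path_prefix": "/",        # catch-all: redirect HTTP to HTTPS
     "action": {"type": "redirect", "to": "https"}},
]

assert evaluate(rules, {"path": "/upload/cat.png"})["target_group"] == "upload-tg"
assert evaluate(rules, {"path": "/home"})["type"] == "redirect"
```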

&lt;p&gt;&lt;strong&gt;B. Network Load Balancer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A Network Load Balancer is ideal for load balancing TCP and UDP traffic. It functions at Layer 4 of the OSI model, routing connections to targets in the target group based on IP protocol data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Primary features of a Network Load Balancer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sticky sessions&lt;/strong&gt;
Routes requests from the same client to the same target.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low latency&lt;/strong&gt;
Offers low latency for latency-sensitive applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Source IP address&lt;/strong&gt;
Preserves the client-side source IP address.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static IP support&lt;/strong&gt;
Automatically provides a static IP address per Availability Zone (subnet).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Elastic IP address support&lt;/strong&gt;
Lets users assign a custom, fixed IP address per Availability Zone (subnet).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DNS failover&lt;/strong&gt;
Uses Amazon Route 53 to direct traffic to load balancer nodes in other zones.&lt;/li&gt;
&lt;/ul&gt;
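&lt;p&gt;One way to picture the sticky, source-IP-preserving behavior is hashing the client IP to pick a target (an assumption for illustration, not NLB’s actual flow-hashing algorithm):&lt;/p&gt;

```python
# Sketch (an assumption, not NLB's real algorithm): hash the client's
# source IP so the same client consistently lands on the same target.
import hashlib

targets = ["i-0aaa", "i-0bbb", "i-0ccc"]

def pick_target(source_ip):
    digest = hashlib.sha256(source_ip.encode()).hexdigest()
    return targets[int(digest, 16) % len(targets)]

# the same client always maps to the same target (sticky)
a = pick_target("203.0.113.10")
b = pick_target("203.0.113.10")
assert a == b
```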

&lt;p&gt;&lt;strong&gt;C. Gateway Load Balancer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A Gateway Load Balancer helps you to deploy, scale, and manage your third-party appliances, such as firewalls, intrusion detection and prevention systems, and deep packet inspection systems. It provides a gateway for distributing traffic across multiple virtual appliances while scaling them up and down based on demand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Primary features of a Gateway Load Balancer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High availability&lt;/strong&gt;
Ensures high availability and reliability by routing traffic through healthy virtual appliances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring&lt;/strong&gt;
Can be monitored using CloudWatch metrics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streamlined deployments&lt;/strong&gt;
Can deploy a new virtual appliance by selecting it in the AWS Marketplace.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Private connectivity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Connects internet gateways, virtual private clouds (VPCs), and other network resources over a private network.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;6. Selecting between ELB types&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;You can select between the ELB service types by determining which feature is required for your application. The following table presents a list of some of the major features of load balancers. For a complete list, see "Elastic Load Balancing features" in the Resources section at the end of this lesson.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Feature&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;ALB&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;NLB&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;GLB&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Load Balancer Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Layer 7&lt;/td&gt;
&lt;td&gt;Layer 4&lt;/td&gt;
&lt;td&gt;Layer 3 gateway and Layer 4 load balancing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Target Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IP, instance, Lambda&lt;/td&gt;
&lt;td&gt;IP, instance, ALB&lt;/td&gt;
&lt;td&gt;IP, instance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Protocol Listeners&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;HTTP, HTTPS&lt;/td&gt;
&lt;td&gt;TCP, UDP, TLS&lt;/td&gt;
&lt;td&gt;IP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Static IP and Elastic IP Address&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Preserve Source IP Address&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fixed Response&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;User Authentication&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
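&lt;p&gt;As a quick mnemonic, the table can be restated as a small helper that picks a load balancer type from the feature you need (illustrative only; real selection should consult the full feature list):&lt;/p&gt;

```python
# Illustrative restatement of the comparison table above.
def choose_load_balancer(needs):
    if "user_authentication" in needs or "fixed_response" in needs:
        return "ALB"   # Layer 7, HTTP/HTTPS features
    if "static_ip" in needs or "udp" in needs:
        return "NLB"   # Layer 4, static/Elastic IP support
    if "virtual_appliances" in needs:
        return "GLB"   # Layer 3 gateway for third-party appliances
    return "ALB"       # sensible default for HTTP/HTTPS workloads

assert choose_load_balancer({"static_ip"}) == "NLB"
assert choose_load_balancer({"fixed_response"}) == "ALB"
assert choose_load_balancer({"virtual_appliances"}) == "GLB"
```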

&lt;h2&gt;
  
  
  &lt;strong&gt;III. Amazon EC2 Auto Scaling&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Amazon EC2 Auto Scaling helps you maintain application availability. You can automatically add or remove EC2 instances using scaling policies that you define.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Capacity issues&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;You can improve availability and reachability by adding one more server. However, the entire system can again become unavailable if there is a capacity issue. This section looks at load issues for both active-passive systems and active-active systems. These issues are addressed through scaling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A. Vertical Scaling&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Increase the instance size.&lt;/strong&gt; If too many requests are sent to a single active-passive system, the active server will become unavailable and, hopefully, fail over to the passive server. But this doesn’t solve anything, because the passive server is the same size and will be overwhelmed just as quickly.&lt;/p&gt;

&lt;p&gt;With active-passive systems, you need vertical scaling. This means increasing the size of the server. With EC2 instances, you select either a larger type or a different instance type. This can be done only while the instance is in a stopped state. In this scenario, the following steps occur:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stop the passive instance.&lt;/strong&gt; This doesn’t impact the application because it’s not taking any traffic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Change the instance size or type,&lt;/strong&gt; and then start the instance again.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shift the traffic&lt;/strong&gt; to the passive instance, turning it active.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stop, change the size, and start&lt;/strong&gt; the previous active instance because both instances should match.&lt;/li&gt;
&lt;/ol&gt;
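&lt;p&gt;The four steps can be sketched against toy in-memory "instances" (no AWS calls; purely illustrative):&lt;/p&gt;

```python
# Toy model of the vertical-scaling procedure above; instance IDs and
# types are made up, and no real AWS operations are performed.
def resize(instance, new_type):
    assert instance["state"] == "stopped", "resize requires a stopped instance"
    instance["type"] = new_type

active = {"id": "i-active", "type": "m5.large", "state": "running", "traffic": True}
passive = {"id": "i-passive", "type": "m5.large", "state": "running", "traffic": False}

# 1. Stop the passive instance (no traffic impact).
passive["state"] = "stopped"
# 2. Change its size, then start it again.
resize(passive, "m5.xlarge")
passive["state"] = "running"
# 3. Shift traffic to it, turning it active.
active["traffic"], passive["traffic"] = False, True
# 4. Stop, resize, and start the previous active instance so both match.
active["state"] = "stopped"
resize(active, "m5.xlarge")
active["state"] = "running"

assert active["type"] == passive["type"] == "m5.xlarge"
```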

&lt;p&gt;When the number of requests decreases, you must repeat the same operations in reverse to scale back down. Even though there aren’t that many steps involved, it’s actually a lot of manual work. Another disadvantage is that a server can only scale vertically up to a certain limit. When that limit is reached, the only option is to create another active-passive system and split the requests and functionalities across them. This can require massive application rewriting.&lt;/p&gt;

&lt;p&gt;This is where the active-active system can help. When there are too many requests, you can scale this system horizontally by adding more servers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;B. Horizontal Scaling&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add additional instances.&lt;/strong&gt; As mentioned, for the application to work in an active-active system, it’s already created as stateless, not storing any client sessions on the server. This means that having two or four servers wouldn’t require any application changes. It would only be a matter of creating more instances when required and shutting them down when traffic decreases. The Amazon EC2 Auto Scaling service can take care of that task by automatically creating and removing EC2 instances based on metrics from Amazon CloudWatch. We will learn more about this service in this lesson.&lt;/p&gt;

&lt;p&gt;You can see that there are many more advantages to using an active-active system in comparison with an active-passive system. Modifying your application to become stateless provides scalability.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Traditional scaling compared to auto scaling&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;With a traditional approach to scaling, you buy and provision enough servers to handle traffic at its peak. However, this means at nighttime, for example, you might have more capacity than traffic, which means you’re wasting money. Turning off your servers at night or at times when the traffic is lower only saves on electricity.&lt;/p&gt;

&lt;p&gt;The cloud works differently with a pay-as-you-go model. You must turn off the unused services, especially EC2 instances you pay for on-demand. You can manually add and remove servers at a predicted time. But with unusual spikes in traffic, this solution leads to a waste of resources with over-provisioning or a loss of customers because of under-provisioning.&lt;/p&gt;

&lt;p&gt;The need here is for a tool that automatically adds and removes EC2 instances according to conditions you define. That’s exactly what the Amazon EC2 Auto Scaling service does.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Amazon EC2 Auto Scaling features&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The Amazon EC2 Auto Scaling service adds and removes capacity to keep a steady and predictable performance at the lowest possible cost. By adjusting the capacity to exactly what your application uses, you only pay for what your application needs. This means Amazon EC2 Auto Scaling helps scale your infrastructure and ensure high availability.&lt;/p&gt;

&lt;p&gt;Scaling features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic scaling&lt;/strong&gt;
Automatically scales in and out based on demand.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled scaling&lt;/strong&gt;
Scales based on user-defined schedules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fleet management&lt;/strong&gt;
Automatically replaces unhealthy EC2 instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictive scaling&lt;/strong&gt;
Uses machine learning (ML) to help schedule the optimum number of EC2 instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Purchase options&lt;/strong&gt;
Includes multiple purchase models, instance types, and Availability Zones.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon EC2 availability&lt;/strong&gt;
Comes with the Amazon EC2 service.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. ELB with Amazon EC2 Auto Scaling&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Additionally, the ELB service integrates seamlessly with Amazon EC2 Auto Scaling. As soon as a new EC2 instance is added to or removed from the Amazon EC2 Auto Scaling group, ELB is notified. However, before ELB can send traffic to a new EC2 instance, it needs to validate that the application running on the EC2 instance is available.&lt;/p&gt;

&lt;p&gt;This validation is done by way of the ELB health checks feature you learned about in the previous lesson.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;IV. Configure Amazon EC2 Auto Scaling components&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;There are three main components of Amazon EC2 Auto Scaling. Each of these components addresses one main question as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Launch template or configuration:&lt;/strong&gt; Which resources should be automatically scaled?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon EC2 Auto Scaling groups:&lt;/strong&gt; Where should the resources be deployed?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling policies:&lt;/strong&gt; When should the resources be added or removed?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Launch templates and configurations&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Multiple parameters are required to create EC2 instances—Amazon Machine Image (AMI) ID, instance type, security group, additional Amazon EBS volumes, and more. All this information is also required by Amazon EC2 Auto Scaling to create the EC2 instance on your behalf when there is a need to scale. This information is stored in a launch template.&lt;/p&gt;

&lt;p&gt;You can use a launch template to manually launch an EC2 instance or for use with Amazon EC2 Auto Scaling. It also supports versioning, which can be used for quickly rolling back if there's an issue or a need to specify a default version of the template. This way, while iterating on a new version, other users can continue launching EC2 instances using the default version until you make the necessary changes.&lt;/p&gt;

&lt;p&gt;A launch template specifies instance configuration information, such as the ID of the AMI, instance type, and security groups. You can have multiple versions of a launch template with a subset of the full parameters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8tnixxmny5lil7ogjj25.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8tnixxmny5lil7ogjj25.png" alt="launch_template_2 description" width="800" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can create a launch template in one of three ways as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use an existing EC2 instance. All the settings are already defined.&lt;/li&gt;
&lt;li&gt;Create one from an already existing template or a previous version of a launch template.&lt;/li&gt;
&lt;li&gt;Create a template from scratch. These parameters will need to be defined: AMI ID, instance type, key pair, security group, storage, and resource tags.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another way to define what Amazon EC2 Auto Scaling needs to scale is by using a launch configuration. It’s similar to the launch template, but you cannot use a previously created launch configuration as a template. You cannot create a launch configuration from an existing Amazon EC2 instance. For these reasons, and to ensure that you get the latest features from Amazon EC2, AWS recommends you use a launch template instead of a launch configuration.&lt;/p&gt;
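&lt;p&gt;The versioning and default-version behavior described above can be sketched as a small in-memory store (illustrative, not the EC2 API; AMI IDs and instance types are made up):&lt;/p&gt;

```python
# Toy launch-template store with versioning and a default version,
# mimicking (not implementing) the behavior described above.
class LaunchTemplateStore:
    def __init__(self):
        self.versions = []      # version N is self.versions[N - 1]
        self.default = None

    def add_version(self, params):
        self.versions.append(params)
        return len(self.versions)   # new version number

    def set_default(self, number):
        self.default = number

    def launch_params(self, version=None):
        number = version or self.default
        return self.versions[number - 1]

store = LaunchTemplateStore()
v1 = store.add_version({"ami": "ami-111", "instance_type": "t3.micro"})
store.set_default(v1)
store.add_version({"ami": "ami-222", "instance_type": "t3.small"})  # draft v2
# Other users keep launching from the default version until v2 is promoted:
assert store.launch_params()["ami"] == "ami-111"
```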

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Amazon EC2 Auto Scaling groups&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The next component Amazon EC2 Auto Scaling needs is an Amazon EC2 Auto Scaling group. An Auto Scaling group helps you define where Amazon EC2 Auto Scaling deploys your resources. This is where you specify the Amazon Virtual Private Cloud (Amazon VPC) and subnets the EC2 instance should be launched in. Amazon EC2 Auto Scaling takes care of creating the EC2 instances across the subnets, so select at least two subnets that are across different Availability Zones.&lt;/p&gt;

&lt;p&gt;With Auto Scaling groups, you can specify the type of purchase for the EC2 instances. You can use On-Demand Instances or Spot Instances. You can also use a combination of the two, which means you can take advantage of Spot Instances with minimal administrative overhead.&lt;/p&gt;

&lt;p&gt;To specify how many instances Amazon EC2 Auto Scaling should launch, you have three capacity settings to configure for the group size.&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmew3fhpl3vui3oaz27z8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmew3fhpl3vui3oaz27z8.png" alt="auto_scaling_group description" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minimum capacity&lt;/strong&gt;&lt;br&gt;
This is the minimum number of instances running in your Auto Scaling group, even if the threshold for lowering the number of instances is reached.&lt;br&gt;
When Amazon EC2 Auto Scaling removes EC2 instances because the traffic is minimal, it keeps removing EC2 instances until it reaches a minimum capacity. &lt;br&gt;
When reaching that limit, even if Amazon EC2 Auto Scaling is instructed to remove an instance, it does not. This ensures that the minimum is kept.&lt;br&gt;
&lt;strong&gt;Note:&lt;/strong&gt; Depending on your application, using a minimum of two is recommended to ensure high availability. However, you know best the bare minimum number of EC2 instances your application requires at all times.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Desired capacity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The desired capacity is the number of EC2 instances that Amazon EC2 Auto Scaling creates at the time the group is created. This number must be greater than or equal to the minimum capacity and less than or equal to the maximum capacity.&lt;/p&gt;

&lt;p&gt;If that number decreases, Amazon EC2 Auto Scaling removes the oldest instance by default. If that number increases, Amazon EC2 Auto Scaling creates new instances using the launch template.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Maximum capacity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the maximum number of instances running in your Auto Scaling group, even if the threshold for adding new instances is reached.&lt;/p&gt;

&lt;p&gt;When traffic keeps growing, Amazon EC2 Auto Scaling keeps adding EC2 instances. This means the cost for your application will also keep growing. That’s why you must set a maximum amount to ensure it doesn’t go above your budget.&lt;/p&gt;
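&lt;p&gt;The relationship between the three capacity settings boils down to a clamp (illustrative):&lt;/p&gt;

```python
# Desired capacity is always clamped between minimum and maximum capacity.
def clamp_capacity(requested, minimum, maximum):
    return max(minimum, min(maximum, requested))

assert clamp_capacity(0, 2, 10) == 2    # never below minimum
assert clamp_capacity(14, 2, 10) == 10  # never above maximum (budget cap)
assert clamp_capacity(5, 2, 10) == 5    # otherwise, as requested
```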

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Scaling policies&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;By default, an Auto Scaling group will be kept to its initial desired capacity. While it’s possible to manually change the desired capacity, you can also use scaling policies.&lt;/p&gt;

&lt;p&gt;In the Monitoring lesson, you learned about CloudWatch metrics and alarms. You use metrics to keep information about different attributes of your EC2 instance, such as the CPU percentage. You use alarms to specify an action when a threshold is reached. Metrics and alarms are what scaling policies use to know when to act. For example, you can set up an alarm that states when the CPU utilization is above 70 percent across the entire fleet of EC2 instances. It will then invoke a scaling policy to add an EC2 instance.&lt;/p&gt;

&lt;p&gt;Three types of scaling policies are available: simple, step, and target tracking scaling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simple Scaling Policy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With a simple scaling policy, you can do exactly what’s described in this module. You use a CloudWatch alarm and specify what to do when it is invoked. This can include adding or removing a number of EC2 instances or specifying a number of instances to set the desired capacity to. You can specify a percentage of the group instead of using a number of EC2 instances, which makes the group grow or shrink more quickly.&lt;/p&gt;

&lt;p&gt;After the scaling policy is invoked, it enters a cooldown period before taking any other action. This is important because it takes time for the EC2 instances to start, and the CloudWatch alarm might still be invoked while the EC2 instance is booting. For example, you might decide to add an EC2 instance if the CPU utilization across all instances is above 65 percent. You don’t want to add more instances until that new EC2 instance is accepting traffic. However, what if the CPU utilization is now above 85 percent across the Auto Scaling group?&lt;/p&gt;

&lt;p&gt;Adding one instance might not be the right move. Instead, you might want to add another step in your scaling policy. Unfortunately, a simple scaling policy can’t help with that. This is where a step scaling policy helps.&lt;/p&gt;
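&lt;p&gt;The cooldown gate in a simple scaling policy can be sketched as follows (a simplified model of the behavior described above; the 300-second value is illustrative):&lt;/p&gt;

```python
# After a scaling action, ignore further alarms until the cooldown passes.
# Times are in seconds; the cooldown value here is illustrative.
def should_scale(alarm_active, now, last_action_time, cooldown=300):
    if not alarm_active:
        return False
    return (now - last_action_time) >= cooldown

assert should_scale(True, now=400, last_action_time=0) is True
assert should_scale(True, now=200, last_action_time=0) is False  # still cooling down
assert should_scale(False, now=400, last_action_time=0) is False
```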

&lt;p&gt;&lt;strong&gt;Step Scaling Policy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Step scaling policies respond to additional alarms even when a scaling activity or health check replacement is in progress. Similar to the previous example, you might decide to add two more instances when CPU utilization is at 85 percent and four more instances when it’s at 95 percent.&lt;/p&gt;
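&lt;p&gt;The step thresholds from the example map naturally to a small function (the 65/85/95 values come from the text; the structure is illustrative, not the Auto Scaling API):&lt;/p&gt;

```python
# Step scaling sketch: larger breaches add more instances in one action.
def instances_to_add(cpu_percent):
    if cpu_percent >= 95:
        return 4
    if cpu_percent >= 85:
        return 2
    if cpu_percent >= 65:
        return 1
    return 0

assert instances_to_add(70) == 1
assert instances_to_add(90) == 2
assert instances_to_add(97) == 4
```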

&lt;p&gt;Deciding when to add and remove instances based on CloudWatch alarms might seem like a difficult task. This is why the third type of scaling policy exists—target tracking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Target Tracking Scaling Policy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your application scales based on average CPU utilization, average network utilization (in or out), or request count, then this scaling policy type is the one to use. All you need to provide is the target value to track, and it automatically creates the required CloudWatch alarms.&lt;/p&gt;
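&lt;p&gt;A common way to reason about target tracking is proportional scaling: grow or shrink capacity so the metric returns to its target value (an approximation for intuition, not the service’s exact algorithm):&lt;/p&gt;

```python
import math

# Proportional-scaling approximation of target tracking: if the metric is
# 1.5x its target, roughly 1.5x the capacity is needed.
def target_tracking_capacity(current_capacity, metric_value, target_value):
    return math.ceil(current_capacity * metric_value / target_value)

# 4 instances at 75% average CPU, targeting 50%:
assert target_tracking_capacity(4, 75, 50) == 6
```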

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>loadbalancer</category>
      <category>autoscaling</category>
    </item>
    <item>
      <title>AWS Monitoring - Part 1: AWS CloudWatch</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Thu, 31 Oct 2024 08:45:33 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-monitoring-part-1-aws-cloudwatch-3amh</link>
      <guid>https://dev.to/hulk-pham/aws-monitoring-part-1-aws-cloudwatch-3amh</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introduction to Monitoring&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring collects and analyzes data about operational health and usage of resources, helping to answer questions about system performance and issues&lt;/li&gt;
&lt;li&gt;Metrics are individual data points created by resources, which become statistics when collected and analyzed over time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits of Monitoring&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enables proactive response to operational issues, improves performance and reliability, and helps recognize security threats&lt;/li&gt;
&lt;li&gt;Facilitates data-driven decisions and creates cost-effective solutions by optimizing resource usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Amazon CloudWatch&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CloudWatch is a centralized monitoring and observability service that collects resource data and provides actionable insights&lt;/li&gt;
&lt;li&gt;It offers features like anomaly detection, alarms, log analysis, and automated actions&lt;/li&gt;
&lt;li&gt;CloudWatch Logs allows for centralized storage and analysis of log files from various AWS services and applications&lt;/li&gt;
&lt;li&gt;CloudWatch alarms can be set up to automatically initiate actions based on sustained state changes of metrics, helping prevent and troubleshoot issues&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
I. Monitoring Introduction
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Purpose of monitoring&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;When operating a website like the employee directory application on AWS, you might have questions like the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How many people are visiting my site day to day?&lt;/li&gt;
&lt;li&gt;How can I track the number of visitors over time?&lt;/li&gt;
&lt;li&gt;How will I know if the website is having performance or availability issues?&lt;/li&gt;
&lt;li&gt;What happens if my Amazon Elastic Compute Cloud (Amazon EC2) instance runs out of capacity?&lt;/li&gt;
&lt;li&gt;Will I be alerted if my website goes down?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need a way to collect and analyze data about the operational health and usage of your resources. The act of collecting, analyzing, and using data to make decisions or answer questions about your IT resources and systems is called monitoring.&lt;/p&gt;

&lt;p&gt;Monitoring provides a near real-time pulse on your system and helps answer the previous questions. You can use the data you collect to watch for operational issues caused by events like overuse of resources, application flaws, resource misconfiguration, or security-related events. Think of the data collected through monitoring as outputs of the system, or metrics.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Use metrics to solve problems&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The AWS resources that host your solutions create various forms of data that you might be interested in collecting. Each individual data point that a resource creates is a metric. Metrics that are collected and analyzed over time become statistics, such as average CPU utilization over time showing a spike.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvy0o5u9uwbjv7x4x4ljg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvy0o5u9uwbjv7x4x4ljg.png" alt="CPUUtilization description" width="800" height="537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One way to evaluate the health of an EC2 instance is through CPU utilization. Generally speaking, if an EC2 instance has a high CPU utilization, it can mean a flood of requests. Or it can reflect a process that has encountered an error and is consuming too much of the CPU. When analyzing CPU utilization, take a process that exceeds a specific threshold for an unusual length of time. Use that abnormal event as a cue to either manually or automatically resolve the issue through actions like scaling the instance.&lt;/p&gt;
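&lt;p&gt;The idea of acting only on a sustained breach, rather than a single noisy datapoint, can be sketched as follows (illustrative; CloudWatch alarms use a similar notion of consecutive breaching datapoints):&lt;/p&gt;

```python
# Flag only when N consecutive datapoints breach the threshold,
# so a single noisy spike does not trigger action.
def sustained_breach(datapoints, threshold, consecutive=3):
    run = 0
    for value in datapoints:
        run = run + 1 if value > threshold else 0
        if run == consecutive:
            return True
    return False

cpu = [40, 92, 95, 91, 50]        # a 3-datapoint spike above 90%
assert sustained_breach(cpu, 90) is True
assert sustained_breach([40, 92, 50, 95, 50], 90) is False  # isolated spikes
```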

&lt;p&gt;CPU utilization is one example of a metric. Other examples of metrics that EC2 instances have are network utilization, disk performance, memory utilization, and the logs created by the applications running on top of Amazon EC2.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Types of metrics&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Different resources in AWS create different types of metrics. The following are examples of metrics associated with different resources.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Amazon Simple Storage Service (Amazon S3) metrics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Size of objects stored in a bucket&lt;/li&gt;
&lt;li&gt;Number of objects stored in a bucket&lt;/li&gt;
&lt;li&gt;Number of HTTP requests made to a bucket&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Amazon Relational Database Service (Amazon RDS) metrics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database connections&lt;/li&gt;
&lt;li&gt;CPU utilization of an instance&lt;/li&gt;
&lt;li&gt;Disk space consumption&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Amazon EC2 metrics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU utilization&lt;/li&gt;
&lt;li&gt;Network utilization&lt;/li&gt;
&lt;li&gt;Disk performance&lt;/li&gt;
&lt;li&gt;Status checks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Monitoring benefits&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Monitoring gives you visibility into your resources, but the question now is, "Why is that important?" This section describes some of the benefits of monitoring.&lt;/p&gt;


&lt;p&gt;&lt;strong&gt;Respond proactively&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Respond to operational issues proactively before your end users are aware of them.&lt;/strong&gt; Waiting for end users to let you know when your application is experiencing an outage is a bad practice. Through monitoring, you can keep tabs on metrics like error response rate and request latency. Over time, the metrics help signal when an outage is going to occur. You can automatically or manually perform actions to prevent the outage from happening and fix the problem before your end users are aware of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Improve performance and reliability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitoring can improve the performance and reliability of your resources.&lt;/strong&gt; Monitoring the various resources that comprise your application provides you with a full picture of how your solution behaves as a system. Monitoring, if done well, can illuminate bottlenecks and inefficient architectures. This helps you drive performance and improve reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recognize security threats and events&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;By monitoring, you can recognize security threats and events.&lt;/strong&gt; When you monitor resources, events, and systems over time, you create what is called a baseline. A baseline defines normal activity. Using a baseline, you can spot anomalies like unusual traffic spikes or unusual IP addresses accessing your resources. When an anomaly occurs, an alert can be sent out or an action can be taken to investigate the event.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Make data-driven decisions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitoring helps you make data-driven decisions for your business.&lt;/strong&gt; Monitoring keeps an eye on IT operational health and drives business decisions. For example, suppose you launched a new feature for your cat photo app and now you want to know if it’s being used. You can collect application-level metrics and view the number of users who use the new feature. With your findings, you can decide whether to invest more time into improving the new feature.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create cost-effective solutions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Through monitoring, you can create more cost-effective solutions.&lt;/strong&gt; You can view resources that are underused and rightsize your resources to your usage. This helps you optimize cost and make sure you aren’t spending more money than necessary.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;II. Amazon CloudWatch&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Visibility using CloudWatch&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;AWS resources create data that you can monitor through metrics, logs, network traffic, events, and more. This data comes from components that are distributed in nature. This can lead to difficulty in collecting the data you need if you don’t have a centralized place to review it all. AWS has taken care of centralizing the data collection for you with a service called CloudWatch.&lt;/p&gt;

&lt;p&gt;CloudWatch is a monitoring and observability service that collects your resource data and provides actionable insights into your applications. With CloudWatch, you can respond to system-wide performance changes, optimize resource usage, and get a unified view of operational health.&lt;/p&gt;

&lt;p&gt;You can use CloudWatch to do the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Detect anomalous behavior in your environments.&lt;/li&gt;
&lt;li&gt;Set alarms to alert you when something is not right.&lt;/li&gt;
&lt;li&gt;Visualize logs and metrics with the AWS Management Console.&lt;/li&gt;
&lt;li&gt;Take automated actions like scaling.&lt;/li&gt;
&lt;li&gt;Troubleshoot issues.&lt;/li&gt;
&lt;li&gt;Discover insights to keep your applications healthy.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. How CloudWatch works&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;With CloudWatch, all you need to get started is an AWS account. It is a managed service that you can use for monitoring without managing the underlying infrastructure.&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyukdtfig3dq4zx8axu1w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyukdtfig3dq4zx8axu1w.png" alt="How_CW_works2 description" width="800" height="218"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Collect&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Collect metrics and logs from your resources, applications, and services that run on AWS or on-premises servers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitor&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Visualize applications and infrastructure with dashboards. Troubleshoot with correlated logs and metrics, and set alerts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Act&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Automate responses to operational changes with CloudWatch events and auto scaling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Analyze&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Analyze your metrics with up to 1-second granularity and up to 15 months of data retention, and perform real-time analysis with CloudWatch metric math.&lt;/p&gt;

&lt;p&gt;The employee directory application is built with various AWS services working together as building blocks. Monitoring the individual services independently can be challenging. Fortunately, CloudWatch acts as a centralized place where metrics are gathered and analyzed.&lt;/p&gt;

&lt;p&gt;Many AWS services automatically send metrics to CloudWatch for free at a rate of 1 data point per metric per 5-minute interval. This is called basic monitoring, and it gives you visibility into your systems without any extra cost. For many applications, basic monitoring is adequate.&lt;/p&gt;

&lt;p&gt;For applications running on EC2 instances, you can get more granularity by posting metrics every minute instead of every 5 minutes by using detailed monitoring. Detailed monitoring incurs a fee. For more information about pricing, see "Amazon CloudWatch Pricing" in the Resources section at the end of this lesson.&lt;/p&gt;
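&lt;p&gt;The difference in granularity is easy to quantify. Here is a minimal sketch (plain Python, no AWS calls) comparing how many data points per metric CloudWatch stores per day under basic versus detailed monitoring:&lt;/p&gt;

```python
# Data points per metric collected over one day at each monitoring level.
# Basic monitoring: one data point every 5 minutes; detailed: every minute.
SECONDS_PER_DAY = 24 * 60 * 60

def datapoints_per_day(period_seconds):
    """Number of data points CloudWatch stores per metric per day
    at the given collection period."""
    return SECONDS_PER_DAY // period_seconds

basic = datapoints_per_day(300)    # 5-minute period
detailed = datapoints_per_day(60)  # 1-minute period
print(basic, detailed)  # 288 1440
```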

&lt;h3&gt;
  
  
  &lt;strong&gt;3. CloudWatch concepts&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Metrics are the fundamental concept in CloudWatch. A metric represents a time-ordered set of data points that are published to CloudWatch. Think of a metric as a variable to monitor and the data points as representing the values of that variable over time. Every metric data point must be associated with a timestamp.&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcj4n74h0t5w7sklyot9w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcj4n74h0t5w7sklyot9w.png" alt="cloudwatch_metrics description" width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Metric&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Metrics are data about the performance of your systems.&lt;/p&gt;

&lt;p&gt;For example, the CPU usage of a particular EC2 instance is one metric provided by Amazon EC2.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Timestamp&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each metric data point must be associated with a timestamp. If you do not provide a timestamp, CloudWatch creates one for you based on the time the data point was received.&lt;/p&gt;

&lt;p&gt;AWS services that send data to CloudWatch attach dimensions to each metric. A dimension is a name and value pair that is part of the metric’s identity. You can use dimensions to filter the results that CloudWatch returns. For example, many Amazon EC2 metrics publish &lt;strong&gt;InstanceId&lt;/strong&gt; as a dimension name and the actual instance ID as the value for that dimension.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13hyc51bj8jns5lpkkxa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13hyc51bj8jns5lpkkxa.png" alt="cloudwatch_view_metrics_2 description" width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By default, many AWS services provide metrics at no charge for resources such as EC2 instances, Amazon Elastic Block Store (Amazon EBS) volumes, and Amazon RDS database (DB) instances. For a charge, you can activate features such as detailed monitoring or publishing your own application metrics on resources such as your EC2 instances.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Custom metrics&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Suppose you have an application, and you want to record the number of page views your website gets. How would you record this metric with CloudWatch? First, it's an application-level metric. That means it’s not something the EC2 instance would post to CloudWatch by default. This is where custom metrics come in. With custom metrics, you can publish your own metrics to CloudWatch.&lt;/p&gt;

&lt;p&gt;If you want to gain more granular visibility, you can use high-resolution custom metrics, which make it possible for you to collect custom metrics down to a 1-second resolution. This means you can send 1 data point per second per custom metric.&lt;/p&gt;

&lt;p&gt;Some examples of custom metrics include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Webpage load times&lt;/li&gt;
&lt;li&gt;Request error rates&lt;/li&gt;
&lt;li&gt;Number of processes or threads on your instance&lt;/li&gt;
&lt;li&gt;Amount of work performed by your application&lt;/li&gt;
&lt;/ul&gt;
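&lt;p&gt;As a sketch of how publishing such a metric looks, the function below builds the parameters for the CloudWatch &lt;code&gt;PutMetricData&lt;/code&gt; API (with boto3, you would pass them to &lt;code&gt;put_metric_data&lt;/code&gt;). The namespace, metric, and dimension names are hypothetical, and &lt;code&gt;StorageResolution=1&lt;/code&gt; marks it as a high-resolution custom metric:&lt;/p&gt;

```python
from datetime import datetime, timezone

def page_view_metric(count, page):
    """Build a PutMetricData request body for a hypothetical PageViews
    custom metric. With boto3 you would pass this as
    boto3.client("cloudwatch").put_metric_data(**params)."""
    return {
        "Namespace": "MyApp",  # custom namespace (hypothetical)
        "MetricData": [{
            "MetricName": "PageViews",
            "Dimensions": [{"Name": "Page", "Value": page}],
            "Timestamp": datetime.now(timezone.utc),
            "Value": float(count),
            "Unit": "Count",
            # StorageResolution=1 requests 1-second (high-resolution) storage
            "StorageResolution": 1,
        }],
    }

params = page_view_metric(42, "/home")
print(params["MetricData"][0]["MetricName"])  # PageViews
```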

&lt;h3&gt;
  
  
  &lt;strong&gt;5. CloudWatch dashboards&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Once you provision your AWS resources and they are sending metrics to CloudWatch, you can visualize and review that data using CloudWatch dashboards. Dashboards are customizable home pages that you can configure to visualize one or more metrics through widgets, such as graphs or text.&lt;/p&gt;

&lt;p&gt;You can build many custom dashboards, each one focusing on a distinct view of your environment. You can even pull data from different AWS Regions into a single dashboard to create a global view of your architecture. The following screenshot shows an example of a dashboard with metrics from Amazon EC2 and Amazon EBS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7dbeb541bdywa7i5gk26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7dbeb541bdywa7i5gk26.png" alt="cloudwatch_dash description" width="752" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;CloudWatch aggregates statistics according to the period of time that you specify when creating your graph or requesting your metrics. You can also choose whether your metric widgets display live data. Live data is data published within the last minute that has not been fully aggregated.&lt;/p&gt;

&lt;p&gt;You are not bound to using CloudWatch exclusively for all your visualization needs. You can use external or custom tools to ingest and analyze CloudWatch metrics using the GetMetricData API.&lt;/p&gt;
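&lt;p&gt;For example, an external tool might build a &lt;code&gt;GetMetricData&lt;/code&gt; request like the following sketch (the parameter shapes follow the CloudWatch API; the instance ID is a placeholder). With boto3, the dictionary would be passed to &lt;code&gt;get_metric_data&lt;/code&gt;:&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

def cpu_query(instance_id):
    """Build a GetMetricData request for an instance's average CPU
    utilization over the last hour, aggregated in 5-minute periods.
    The InstanceId dimension narrows the metric to one instance."""
    now = datetime.now(timezone.utc)
    return {
        "StartTime": now - timedelta(hours=1),
        "EndTime": now,
        "MetricDataQueries": [{
            "Id": "cpu",  # identifier for this query's result series
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": "CPUUtilization",
                    "Dimensions": [
                        {"Name": "InstanceId", "Value": instance_id},
                    ],
                },
                "Period": 300,     # 5-minute aggregation window
                "Stat": "Average",
            },
        }],
    }

query = cpu_query("i-0123456789abcdef0")  # placeholder instance ID
```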

&lt;p&gt;As far as security is concerned, with AWS Identity and Access Management (IAM) policies, you control who has access to view or manage your CloudWatch dashboards.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;6. Amazon CloudWatch Logs&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;CloudWatch Logs is a centralized place to store and analyze logs. With this service, you can monitor, store, and access your log files from applications running on EC2 instances, AWS Lambda functions, and other sources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi94tsmsia89mceztmf2j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi94tsmsia89mceztmf2j.png" alt="cloudwatch_log_streams description" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With CloudWatch Logs, you can query and filter your log data. For example, suppose you’re investigating an application logic error. You know that when this error occurs, the application logs the stack trace, so you can query your logs in CloudWatch Logs to find it. You can also set up metric filters on logs, which turn log data into numerical CloudWatch metrics that you can graph and use on your dashboards.&lt;/p&gt;

&lt;p&gt;Some services, like Lambda, are set up to send log data to CloudWatch Logs with minimal effort. With Lambda, all you need to do is give the Lambda function the correct IAM permissions to post logs to CloudWatch Logs. Other services require more configuration. For example, to send your application logs from an EC2 instance into CloudWatch Logs, you need to install and configure the CloudWatch Logs agent on the EC2 instance. With the CloudWatch Logs agent, EC2 instances can automatically send log data to CloudWatch Logs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CloudWatch Logs terminology&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Log data sent to CloudWatch Logs can come from different sources, so it’s important you understand how they’re organized.&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gipyyv0hbr2orvv8dph.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gipyyv0hbr2orvv8dph.png" alt="CloudWatch_Logs_Terms description" width="800" height="505"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log event&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A log event is a record of activity recorded by the application or resource being monitored. It has a timestamp and an event message.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log stream&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Log events are grouped into log streams, which are sequences of log events that all belong to the same resource being monitored.&lt;/p&gt;

&lt;p&gt;For example, logs for an EC2 instance are grouped together into a log stream that you can filter or query for insights.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log group&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A log group is composed of log streams that all share the same retention and permissions settings.&lt;/p&gt;

&lt;p&gt;For example, suppose you have multiple EC2 instances hosting your application and you send application log data to CloudWatch Logs. You can group the log streams from each instance into one log group.&lt;/p&gt;
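&lt;p&gt;The terminology maps directly onto the CloudWatch Logs &lt;code&gt;PutLogEvents&lt;/code&gt; API: each event carries a timestamp and a message, the stream identifies the resource, and the group collects the streams. A minimal sketch with hypothetical group and stream names:&lt;/p&gt;

```python
from datetime import datetime, timezone

def log_event(message):
    """A single log event in the shape PutLogEvents expects:
    a timestamp in milliseconds since the epoch, plus the message."""
    ms = int(datetime.now(timezone.utc).timestamp() * 1000)
    return {"timestamp": ms, "message": message}

# Events from one instance belong to one log stream; the streams from
# all instances of the application share one log group. With boto3 this
# dictionary would be passed to
# boto3.client("logs").put_log_events(**put_request).
put_request = {
    "logGroupName": "/my-app/application-logs",   # hypothetical group
    "logStreamName": "i-0123456789abcdef0",       # one stream per instance
    "logEvents": [
        log_event("GET /employees 200"),
        log_event("GET /employees/42 500"),
    ],
}
```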

&lt;h3&gt;
  
  
  &lt;strong&gt;7. CloudWatch alarms&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;You can create CloudWatch alarms to automatically initiate actions based on sustained state changes of your metrics. You configure when alarms are invoked and the action that is performed.&lt;/p&gt;

&lt;p&gt;First, you must decide which metric you want to set up an alarm for, and then you define the threshold that will invoke the alarm. Next, you define the threshold's time period. For example, suppose you want to set up an alarm for an EC2 instance to invoke when the CPU utilization goes over a threshold of 80 percent. You also must specify the time period the CPU utilization is over the threshold.&lt;/p&gt;

&lt;p&gt;You don’t want to invoke an alarm based on short, temporary spikes in the CPU. You only want to invoke an alarm if the CPU is elevated for a sustained amount of time. For example, if CPU utilization exceeds 80 percent for 5 minutes or longer, there might be a resource issue. To set up an alarm, you need to choose the metric, the threshold, and the time period.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgesihmqosukq4swon9l2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgesihmqosukq4swon9l2.png" alt="CloudWatch_Alarms description" width="800" height="824"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An alarm can be invoked when it transitions from one state to another. After an alarm is invoked, it can initiate an action. Actions can be an Amazon EC2 action, an automatic scaling action, or a notification sent to Amazon Simple Notification Service (Amazon SNS).&lt;/p&gt;

&lt;p&gt;States of an alarm:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OK: The metric is within the defined threshold. Everything appears to be operating like normal.&lt;/li&gt;
&lt;li&gt;ALARM: The metric is outside the defined threshold. This might be an operational issue.&lt;/li&gt;
&lt;li&gt;INSUFFICIENT_DATA: The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.&lt;/li&gt;
&lt;/ul&gt;
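&lt;p&gt;Putting the earlier example together, the following sketch builds the parameters for a CloudWatch &lt;code&gt;PutMetricAlarm&lt;/code&gt; call that fires when average CPU utilization stays above 80 percent for 5 minutes and notifies an Amazon SNS topic. The alarm name is hypothetical, and the instance ID and topic ARN are placeholders:&lt;/p&gt;

```python
def cpu_alarm(instance_id, topic_arn):
    """Build a PutMetricAlarm request: average CPU above 80% for
    5 consecutive 1-minute periods moves the alarm to ALARM state.
    With boto3: boto3.client("cloudwatch").put_metric_alarm(**params)."""
    return {
        "AlarmName": "high-cpu",  # hypothetical alarm name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 60,              # evaluate 1-minute data points...
        "EvaluationPeriods": 5,    # ...over 5 periods = 5 minutes sustained
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic notified on ALARM
    }

params = cpu_alarm("i-0123456789abcdef0",
                   "arn:aws:sns:us-east-1:123456789012:ops-alerts")
```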

&lt;h3&gt;
  
  
  &lt;strong&gt;8. Prevent and troubleshoot issues with CloudWatch alarms&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;CloudWatch Logs uses metric filters to turn the log data into metrics that you can graph or set an alarm on. The following timeline indicates the order of the steps to complete when setting up an alarm. It also provides an example using our employee directory application.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Set up a metric filter&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For the employee directory application, suppose you set up a metric filter for HTTP 500 error response codes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Define an alarm&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Then, you define which metric alarm state should be invoked based on the threshold. With this example, the alarm state is invoked if HTTP 500 error responses are sustained for a specified period of time.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Define an action&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Next, you define an action that you want to take place when the alarm is invoked. Here, it makes sense to send an email or text alert to you so you can start troubleshooting the website. Hopefully, you can fix it before it becomes a bigger issue.&lt;/p&gt;

&lt;p&gt;After the alarm is set up, you know that if the error happens again, you will be notified promptly.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
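&lt;p&gt;The first step, the metric filter, can be sketched as parameters for the CloudWatch Logs &lt;code&gt;PutMetricFilter&lt;/code&gt; API. The filter pattern, metric name, and namespace below are illustrative, not taken from the application:&lt;/p&gt;

```python
def http_500_filter(log_group):
    """Build a PutMetricFilter request that turns log lines containing
    ' 500 ' into a countable CloudWatch metric; an alarm can then watch
    that metric. With boto3:
    boto3.client("logs").put_metric_filter(**params)."""
    return {
        "logGroupName": log_group,
        "filterName": "http-500-errors",   # hypothetical filter name
        "filterPattern": '" 500 "',        # match lines containing " 500 "
        "metricTransformations": [{
            "metricName": "Http500Count",            # hypothetical
            "metricNamespace": "EmployeeDirectory",  # hypothetical
            "metricValue": "1",   # each matching log line adds 1
            "defaultValue": 0.0,  # emit 0 when nothing matches
        }],
    }

params = http_500_filter("/my-app/application-logs")
```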

&lt;p&gt;You can set up different alarms for different reasons to help you prevent or troubleshoot operational issues. In the scenario just described, the alarm invokes an Amazon SNS notification that goes to a person who looks into the issue manually.&lt;/p&gt;

&lt;p&gt;Another option is to have alarms invoke actions that automatically remediate technical issues. For example, you can set up an alarm to reboot an EC2 instance or to scale services up or down. You can even set up an alarm to invoke an Amazon SNS notification that invokes a Lambda function. The Lambda function then calls any AWS API to manage your resources and troubleshoot operational issues. By using AWS services together like this, you can respond to events more quickly.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>cloudwatch</category>
    </item>
    <item>
      <title>AWS Database - Part 3: Purpose-Built Databases</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Thu, 31 Oct 2024 08:26:37 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-database-part-3-purpose-built-databases-4bb4</link>
      <guid>https://dev.to/hulk-pham/aws-database-part-3-purpose-built-databases-4bb4</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Purpose-Built Databases Overview&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS offers a range of purpose-built databases to support diverse data models and application needs, moving away from the one-size-fits-all approach of relational databases&lt;/li&gt;
&lt;li&gt;Purpose-built databases allow developers to choose the best database for specific problems, enabling the building of highly scalable, distributed applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key AWS Database Services&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon DynamoDB: A fully managed NoSQL database for high-scale and serverless applications&lt;/li&gt;
&lt;li&gt;Amazon ElastiCache: A fully managed, in-memory caching solution supporting Redis and Memcached&lt;/li&gt;
&lt;li&gt;Amazon DocumentDB: A fully managed document database with MongoDB compatibility&lt;/li&gt;
&lt;li&gt;Amazon Neptune: A fully managed graph database for highly connected data&lt;/li&gt;
&lt;li&gt;Amazon Timestream: A serverless time series database for IoT and operational applications&lt;/li&gt;
&lt;li&gt;Amazon QLDB: A ledger database providing a verifiable history of data changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Database Selection Guide&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS provides a variety of database options tailored to specific use cases, from relational databases for traditional applications to specialized databases for graph data, time series, and ledger applications&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;I. Purpose-built databases for all application needs&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We covered Amazon RDS and relational databases in the previous lesson, and for a long time, relational databases were the default option. They were widely used in nearly all applications. A relational database is like a multi-tool. It can do many things, but it is not perfectly suited to any one particular task. It might not always be the best choice for your business needs.&lt;/p&gt;

&lt;p&gt;The one-size-fits-all approach of using a relational database for everything no longer works. Over the past few decades, there has been a shift in the database landscape, and this shift has led to the rise of purpose-built databases. Developers can consider the needs of their application and choose a database that will fit those needs.&lt;/p&gt;

&lt;p&gt;AWS offers a broad and deep portfolio of purpose-built databases that support diverse data models. Customers can use them to build data-driven, highly scalable, distributed applications. You can pick the best database to solve a specific problem and break away from restrictive commercial databases. You can focus on building applications that meet the needs of your organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Amazon DynamoDB&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;DynamoDB is a fully managed NoSQL database that provides fast, consistent performance at any scale. It has a flexible billing model, tight integration with infrastructure as code (IaC), and a hands-off operational model. DynamoDB has become the database of choice for two categories of applications: high-scale applications and serverless applications. Although DynamoDB is the database of choice for high-scale and serverless applications, it can work for nearly all online transaction processing (OLTP) application workloads. We will explore DynamoDB more in the next lesson.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Amazon ElastiCache&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;ElastiCache is a fully managed, in-memory caching solution. It provides support for two open-source, in-memory cache engines: Redis and Memcached. You aren’t responsible for instance failovers, backups and restores, or software upgrades.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Amazon MemoryDB for Redis&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;MemoryDB is a Redis-compatible, durable, in-memory database service that delivers ultra-fast performance. With MemoryDB, you can achieve microsecond read latency, single-digit millisecond write latency, high throughput, and Multi-AZ durability for modern applications, like those built with microservices architectures. You can use MemoryDB as a fully managed, primary database to build high-performance applications. You do not need to separately manage a cache, durable database, or the required underlying infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Amazon DocumentDB (with MongoDB compatibility)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon DocumentDB is a fully managed document database from AWS. A document database is a type of NoSQL database you can use to store and query rich documents in your application. These types of databases work well for the following use cases: content management systems, profile management, and web and mobile applications. Amazon DocumentDB has API compatibility with MongoDB. This means you can use popular open-source libraries to interact with Amazon DocumentDB, or you can migrate existing databases to Amazon DocumentDB with minimal hassle.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;5. Amazon Keyspaces (for Apache Cassandra)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon Keyspaces is a scalable, highly available, and managed Apache Cassandra compatible database service. Apache Cassandra is a popular option for high-scale applications that need top-tier performance. Amazon Keyspaces is a good option for high-volume applications with straightforward access patterns. With Amazon Keyspaces, you can run your Cassandra workloads on AWS using the same Cassandra Query Language (CQL) code, Apache 2.0 licensed drivers, and tools that you use today.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;6. Amazon Neptune&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Neptune is a fully managed graph database offered by AWS. A graph database is a good choice for highly connected data with a rich variety of relationships. Companies often use graph databases for recommendation engines, fraud detection, and knowledge graphs.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;7. Amazon Timestream&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Timestream is a fast, scalable, and serverless time series database service for Internet of Things (IoT) and operational applications. It makes it easy to store and analyze trillions of events per day up to 1,000 times faster and for as little as one-tenth of the cost of relational databases. Time series data is a sequence of data points recorded over a time interval. It is used for measuring events that change over time, such as stock prices over time or temperature measurements over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;8. Amazon Quantum Ledger Database (Amazon QLDB)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;With traditional databases, you can overwrite or delete data, so developers use techniques, such as audit tables and audit trails to help track data lineage. These approaches can be difficult to scale and put the burden of ensuring that all data is recorded on the application developer. Amazon QLDB is a purpose-built ledger database that provides a complete and cryptographically verifiable history of all changes made to your application data.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;II. Choosing the Right AWS Database Service&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;As we learned in the previous lessons, AWS has a variety of database options for different use cases. The following table provides a quick look at the AWS database portfolio.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;AWS Service(s)&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Database Type&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Use Cases&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon RDS, Aurora, Amazon Redshift&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Relational&lt;/td&gt;
&lt;td&gt;Traditional applications, ERP, CRM, ecommerce&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DynamoDB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Key-value&lt;/td&gt;
&lt;td&gt;High-traffic web applications, ecommerce systems, gaming applications&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon ElastiCache for Memcached, Amazon ElastiCache for Redis&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;In-memory&lt;/td&gt;
&lt;td&gt;Caching, session management, gaming leaderboards, geospatial applications&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon DocumentDB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Document&lt;/td&gt;
&lt;td&gt;Content management, catalogs, user profiles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon Keyspaces&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Wide column&lt;/td&gt;
&lt;td&gt;High-scale industrial applications for equipment maintenance, fleet management, route optimization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Neptune&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Graph&lt;/td&gt;
&lt;td&gt;Fraud detection, social networking, recommendation engines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Timestream&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Time series&lt;/td&gt;
&lt;td&gt;IoT applications, DevOps, industrial telemetry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon QLDB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ledger&lt;/td&gt;
&lt;td&gt;Systems of record, supply chain, registrations, banking transactions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>database</category>
    </item>
    <item>
      <title>AWS Database - Part 2: DynamoDB</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Thu, 31 Oct 2024 08:21:19 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-database-part-2-dynamodb-391e</link>
      <guid>https://dev.to/hulk-pham/aws-database-part-2-dynamodb-391e</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Overview and Core Components&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DynamoDB is a fully managed NoSQL database service offering fast performance and seamless scalability&lt;/li&gt;
&lt;li&gt;Core components include tables (collections of data), items (groups of attributes), and attributes (fundamental data elements)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features and Capabilities&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allows creation of database tables that can store and retrieve any amount of data and handle any level of request traffic&lt;/li&gt;
&lt;li&gt;Enables scaling of table throughput capacity without downtime or performance degradation&lt;/li&gt;
&lt;li&gt;Automatically replicates data across multiple Availability Zones for high availability and durability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use Cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ideal for applications requiring high scalability, OLTP workloads, and mission-critical systems needing high availability&lt;/li&gt;
&lt;li&gt;Suitable for various scenarios, including software applications, media metadata stores, gaming platforms, and retail experiences&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Security Features&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provides full encryption at rest using AWS Key Management Service&lt;/li&gt;
&lt;li&gt;Offers fine-grained access control through IAM roles and policy conditions&lt;/li&gt;
&lt;li&gt;Enables monitoring of operations and key usage through AWS CloudTrail&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;I. DynamoDB overview&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. With DynamoDB, you can offload the administrative burdens of operating and scaling a distributed database. You don't need to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling.&lt;/p&gt;

&lt;p&gt;With DynamoDB, you can do the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create database tables that can store and retrieve any amount of data and serve any level of request traffic.&lt;/li&gt;
&lt;li&gt;Scale up or scale down your tables' throughput capacity without downtime or performance degradation.&lt;/li&gt;
&lt;li&gt;Monitor resource usage and performance metrics using the AWS Management Console.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DynamoDB automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements. It does this while maintaining consistent, fast performance. All your data is stored on SSDs and is automatically replicated across multiple Availability Zones in a Region, providing built-in high availability and data durability.&lt;/p&gt;
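
&lt;p&gt;As a minimal sketch of how little setup this requires, the following builds the parameters for creating such a table with boto3. The table and key names are illustrative, and actually creating the table assumes AWS credentials are configured:&lt;/p&gt;

```python
# Parameters for a DynamoDB CreateTable call. Pass this dict to
# boto3.client("dynamodb").create_table(**table_params) with valid AWS
# credentials; the table and attribute names here are illustrative.
table_params = {
    "TableName": "Person",
    "AttributeDefinitions": [
        {"AttributeName": "PersonID", "AttributeType": "N"},  # N = number
    ],
    "KeySchema": [
        {"AttributeName": "PersonID", "KeyType": "HASH"},  # partition key
    ],
    # On-demand capacity: DynamoDB scales with request traffic,
    # with no throughput to provision up front.
    "BillingMode": "PAY_PER_REQUEST",
}
```

&lt;p&gt;With on-demand billing there is no throughput to manage; provisioned mode instead sets read and write capacity, which can be scaled without downtime as described above.&lt;/p&gt;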

&lt;h2&gt;
  
  
  &lt;strong&gt;II. DynamoDB core components&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In DynamoDB, tables, items, and attributes are the core components that you work with. A table is a collection of items, and each item is a collection of attributes. DynamoDB uses primary keys to uniquely identify each item in a table and secondary indexes to provide more querying flexibility.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwrjlgjfkxalm4k3cbdx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwrjlgjfkxalm4k3cbdx.png" alt="Dynamo_DB_Components_2 description" width="800" height="504"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Table&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Similar to other database systems, DynamoDB stores data in tables. A table is a collection of data. For example, you can have a table called Person that you can use to store personal contact information about friends, family, or anyone else of interest. You can also have a Cars table to store information about vehicles that people drive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Item&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each table contains zero or more items. An item is a group of attributes that is uniquely identifiable among all the other items. In a Person table, each item represents a person. In a Cars table, each item represents one vehicle. Items in DynamoDB are similar in many ways to rows, records, or tuples in other database systems. In DynamoDB, there is no limit to the number of items you can store in a table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Attribute&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each item is composed of one or more attributes. An attribute is a fundamental data element, something that does not need to be broken down any further. For example, an item in a Person table might contain attributes called PersonID, LastName, FirstName, and so on. In a Department table, an item might have attributes such as DepartmentID, Name, Manager, and so on. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.&lt;/p&gt;
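
&lt;p&gt;To make the table, item, and attribute hierarchy concrete, here is one hypothetical Person item in DynamoDB's typed attribute-value format (the names and values are illustrative):&lt;/p&gt;

```python
# One item from a hypothetical Person table. Each attribute pairs a name
# with a typed value: S = string, N = number (numbers travel as strings).
person_item = {
    "PersonID": {"N": "101"},  # primary key attribute
    "LastName": {"S": "Smith"},
    "FirstName": {"S": "Fred"},
    "PhoneNumber": {"S": "555-4321"},
}

# An attribute is the smallest unit of data: a name plus a typed value.
last_name = person_item["LastName"]["S"]
```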

&lt;h2&gt;
  
  
  &lt;strong&gt;III. DynamoDB use cases&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;DynamoDB is a fully managed service that handles the operations work. You can offload the administrative burdens of operating and scaling distributed databases to AWS.&lt;/p&gt;

&lt;p&gt;You might want to consider using DynamoDB in the following circumstances:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You are experiencing scalability problems with other traditional database systems.&lt;/li&gt;
&lt;li&gt;You are actively engaged in developing an application or service.&lt;/li&gt;
&lt;li&gt;You are working with an OLTP workload.&lt;/li&gt;
&lt;li&gt;You are deploying a mission-critical application that must be highly available at all times without manual intervention.&lt;/li&gt;
&lt;li&gt;You require a high level of data durability, regardless of your backup-and-restore strategy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DynamoDB is used in a wide range of workloads because of its simplicity, from low-scale operations to ultrahigh-scale operations, such as those demanded by Amazon.com.&lt;/p&gt;

&lt;p&gt;The following four categories illustrate potential use cases:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Develop software applications&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Build internet-scale applications supporting user-content metadata and caches that require high concurrency and connections for millions of users and millions of requests per second.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create media metadata stores&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Scale throughput and concurrency for analysis of media and entertainment workloads, such as real-time video streaming and interactive content. Deliver lower latency with multi-Region replication across Regions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scale gaming platforms&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Focus on driving innovation with no operational overhead. Build out your game platform with player data, session history, and leaderboards for millions of concurrent users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deliver seamless retail experiences&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use design patterns for deploying shopping carts, workflow engines, inventory tracking, and customer profiles. DynamoDB supports high-traffic, extreme-scaled events and can handle millions of queries per second.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;IV. DynamoDB security&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;DynamoDB provides a number of security features to consider as you develop and implement your own security policies. They include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DynamoDB provides a highly durable storage infrastructure designed for mission-critical and primary data storage. Data is redundantly stored on multiple devices across multiple facilities in a DynamoDB Region.&lt;/li&gt;
&lt;li&gt;All user data stored in DynamoDB is fully encrypted at rest. DynamoDB encryption at rest provides enhanced security by encrypting all your data at rest using encryption keys stored in AWS Key Management Service (AWS KMS).&lt;/li&gt;
&lt;li&gt;IAM administrators control who can be authenticated and authorized to use DynamoDB resources. You can use IAM to manage access permissions and implement security policies.&lt;/li&gt;
&lt;li&gt;As a managed service, DynamoDB is protected by the AWS global network security procedures.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use AWS CloudTrail to monitor AWS managed key usage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you are using an AWS managed key for encryption at rest, usage of the key is recorded in AWS CloudTrail. CloudTrail can tell you who made the request, the services used, actions performed, parameters for the action, and response elements returned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use IAM roles to authenticate access to DynamoDB&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For users, applications, and other AWS services to access DynamoDB, they must include valid AWS credentials in their AWS API requests. Use IAM roles to obtain temporary access keys.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use IAM policy conditions for fine-grained access control&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you grant permissions in DynamoDB, you can specify conditions that determine how a permissions policy takes effect. Implementing least privilege is key in reducing security risk and the impact that can result from errors or malicious intent.&lt;/p&gt;
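
&lt;p&gt;As one example of such a condition, the following policy document uses the dynamodb:LeadingKeys condition key so that callers can read only items whose partition key matches their own identity. The account ID and table ARN are placeholders:&lt;/p&gt;

```python
import json

# An IAM policy using the dynamodb:LeadingKeys condition key to scope
# reads to the caller's own items. Account ID and ARN are placeholders.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Person",
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Only items whose partition key equals the caller's
                    # federated identity ID are visible.
                    "dynamodb:LeadingKeys": [
                        "${cognito-identity.amazonaws.com:sub}"
                    ]
                }
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```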

&lt;p&gt;&lt;strong&gt;Monitor DynamoDB operations using CloudTrail&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When activity occurs in DynamoDB, that activity is recorded in a CloudTrail event. For an ongoing record of events in DynamoDB and in your AWS account, create a trail to deliver log files to an Amazon Simple Storage Service (Amazon S3) bucket.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>dynamodb</category>
    </item>
    <item>
      <title>AWS Database - Part 1: AWS RDS</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Wed, 30 Oct 2024 14:36:17 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-database-part-1-aws-rds-1a33</link>
      <guid>https://dev.to/hulk-pham/aws-database-part-1-aws-rds-1a33</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Relational Databases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Relational databases organize data into tables with relationships between them, using SQL for complex queries and data management&lt;/li&gt;
&lt;li&gt;Benefits include reduced data redundancy, familiarity among professionals, and adherence to ACID principles&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Amazon RDS Overview&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon RDS is a managed database service that supports various engines including commercial, open-source, and cloud-native options&lt;/li&gt;
&lt;li&gt;It allows users to focus on application development rather than database management tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Amazon RDS Features&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Offers different storage types (General Purpose SSD, Provisioned IOPS SSD, and Magnetic) to tailor performance and cost&lt;/li&gt;
&lt;li&gt;Provides automated backups with point-in-time recovery and manual snapshots for longer retention periods&lt;/li&gt;
&lt;li&gt;Supports Multi-AZ deployment for high availability, with automatic failover to a standby database in case of issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Security and Management&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Implements security measures including IAM policies, security groups, encryption at rest, and SSL/TLS connections&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;I. Relational databases&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A relational database organizes data into tables. Data in one table can link to data in other tables to create relationships—hence, the relational part of the name.&lt;/p&gt;

&lt;p&gt;A table stores data in rows and columns. A row, often called a record, contains all information about a specific entry. Columns describe attributes of an entry. The following image is an example of three tables in a relational database.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Relational database management system&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;With a relational database management system (RDBMS), you can create, update, and administer a relational database. Some common examples of RDBMSs include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MySQL&lt;/li&gt;
&lt;li&gt;PostgreSQL&lt;/li&gt;
&lt;li&gt;Oracle&lt;/li&gt;
&lt;li&gt;Microsoft SQL Server&lt;/li&gt;
&lt;li&gt;Amazon Aurora&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You communicate with an RDBMS by using structured query language (SQL) queries, similar to the following example:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT * FROM table_name;&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Relational database benefits&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Relational databases offer the following benefits:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complex SQL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With SQL, you can join multiple tables so you can better understand relationships between your data.&lt;/p&gt;
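
&lt;p&gt;The kind of join described above can be sketched with Python's built-in sqlite3 module (the tables and rows are illustrative):&lt;/p&gt;

```python
import sqlite3

# Two related tables: each car row references its owner in person.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE car (car_id INTEGER PRIMARY KEY, owner_id INTEGER, model TEXT);
    INSERT INTO person VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO car VALUES (10, 1, 'Civic'), (11, 2, 'Wagon');
""")

# A JOIN surfaces the relationship between the tables in one query.
rows = conn.execute("""
    SELECT person.name, car.model
    FROM person
    JOIN car ON car.owner_id = person.person_id
    ORDER BY person.name
""").fetchall()
print(rows)  # [('Ana', 'Civic'), ('Ben', 'Wagon')]
```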

&lt;p&gt;&lt;strong&gt;Reduced redundancy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can store data in one table and reference it from other tables instead of saving the same data in different places.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Familiarity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because relational databases have been a popular choice since the 1970s, technical professionals often have familiarity and experience with them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accuracy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Relational databases ensure that your data has high integrity and adheres to the atomicity, consistency, isolation, and durability (ACID) principle.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Relational database use cases&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Much of the world runs on relational databases. In fact, they’re at the core of many mission-critical applications, some of which you might use in your day-to-day life.&lt;/p&gt;

&lt;p&gt;The following are two common use-case categories for relational databases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Applications that have a fixed schema&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These applications have a fixed schema that doesn't change often. An example is a lift-and-shift application, which moves an app from on premises to the cloud with little or no modification.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Applications that need persistent storage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These applications need persistent storage and follow the ACID principle, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enterprise resource planning (ERP) applications&lt;/li&gt;
&lt;li&gt;Customer relationship management (CRM) applications&lt;/li&gt;
&lt;li&gt;Commerce and financial applications&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Choose between unmanaged and managed databases&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If you want to trade your on-premises database for a relational database on AWS, you first need to select how you want to run it—managed or unmanaged. The distinction between managed and unmanaged services is similar to the shared responsibility model, which distinguishes between AWS security responsibilities and the customer’s security responsibilities. Likewise, managed compared to unmanaged can be understood as a trade-off between convenience and control.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unmanaged databases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you operate a relational database on premises, you are responsible for all aspects of operation. This includes data center security and electricity, host machines management, database management, query optimization, and customer data management. You are responsible for absolutely everything, which means you have control over absolutely everything.&lt;/p&gt;

&lt;p&gt;Now, suppose you want to shift some of the work to AWS by running your relational database on Amazon Elastic Compute Cloud (Amazon EC2). If you host a database on Amazon EC2, AWS implements and maintains the physical infrastructure and hardware and installs the EC2 instance operating system (OS). However, you are still responsible for managing the EC2 instance, managing the database on that host, optimizing queries, and managing customer data.&lt;/p&gt;

&lt;p&gt;This is called an unmanaged database option. In this option, AWS is responsible for and has control over the hardware and underlying infrastructure. You are responsible for and have control over management of the host and database.&lt;/p&gt;

&lt;p&gt;You are responsible for everything in a database hosted on-premises. AWS takes on more of that responsibility in databases hosted in Amazon EC2.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz60kmwzm5c8wrohwl8y1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz60kmwzm5c8wrohwl8y1.png" alt="OnPrem_vs_EC2 description" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Managed databases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To shift more of the work to AWS, you can use a managed database service. These services provide the setup of both the EC2 instance and the database, and they provide systems for high availability, scalability, patching, and backups. However, in this model, you’re still responsible for database tuning, query optimization, and ensuring that your customer data is secure. This option provides the ultimate convenience but the least amount of control compared to the two previous options.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwaosde0lmyyn8r28ut5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwaosde0lmyyn8r28ut5n.png" alt="Managed_DB description" width="572" height="650"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;II. Amazon RDS overview&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Amazon RDS is a managed database service customers can use to create and manage relational databases in the cloud without the operational burden of traditional database management. For example, imagine you sell healthcare equipment, and your goal is to be the number-one seller on the West Coast of the United States. Building a database doesn’t directly help you achieve that goal. However, having a database is a necessary component to achieving that goal.&lt;/p&gt;

&lt;p&gt;With Amazon RDS, you can offload some of the unrelated work of creating and managing a database. You can focus on the tasks that differentiate your application, instead of focusing on infrastructure-related tasks, like provisioning, patching, scaling, and restoring.&lt;/p&gt;

&lt;p&gt;Amazon RDS supports most of the popular RDBMSs, ranging from commercial options to open-source options and even a specific AWS option. Supported Amazon RDS engines include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Commercial:&lt;/strong&gt; Oracle, SQL Server&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open source:&lt;/strong&gt; MySQL, PostgreSQL, MariaDB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud native:&lt;/strong&gt; Aurora&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqyb2tztarlbn4o9cxqd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqyb2tztarlbn4o9cxqd.png" alt="Image description" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;III. Database instances&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Just like the databases you build and manage yourself, Amazon RDS is built from compute and storage. The compute portion is called the database (DB) instance, which runs the DB engine. Depending on the engine selected, the instance will have different supported features and configurations. A DB instance can contain multiple databases with the same engine, and each DB can contain multiple tables.&lt;/p&gt;

&lt;p&gt;Underneath the DB instance is an EC2 instance. However, this instance is managed through the Amazon RDS console instead of the Amazon EC2 console. When you create your DB instance, you choose the instance type and size. The DB instance class you choose affects how much processing power and memory it has.&lt;/p&gt;

&lt;p&gt;The following image summarizes the various instance classes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp02wjeros8zycwv0u38w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp02wjeros8zycwv0u38w.png" alt="RDS1 description" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;IV. Storage on Amazon RDS&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The storage portion of DB instances for Amazon RDS uses Amazon Elastic Block Store (Amazon EBS) volumes for database and log storage. This includes MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server.&lt;/p&gt;

&lt;p&gt;When using Aurora, data is stored in cluster volumes, which are single, virtual volumes that use solid-state drives (SSDs). A cluster volume contains copies of your data across three Availability Zones in a single AWS Region. For nonpersistent, temporary files, Aurora uses local storage.&lt;/p&gt;

&lt;p&gt;Amazon RDS provides three storage types: General Purpose SSD (also called gp2 and gp3), Provisioned IOPS SSD (also called io1), and Magnetic (also called standard). They differ in performance characteristics and price, which means you can tailor your storage performance and cost to the needs of your database workload.&lt;/p&gt;

&lt;p&gt;The following image summarizes the three storage types.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr6oac45anfdm6usi6fof.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr6oac45anfdm6usi6fof.png" alt="RDS2 description" width="800" height="531"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;V. Amazon RDS in an Amazon Virtual Private Cloud&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When you create a DB instance, you select the Amazon Virtual Private Cloud (Amazon VPC) your databases will live in. Then, you select the subnets that will be designated for your DB. This is called a DB subnet group, and it has at least two Availability Zones in its Region. The subnets in a DB subnet group should be private, so they don’t have a route to the internet gateway. This ensures that your DB instance, and the data inside it, can be reached only by the application backend.&lt;/p&gt;

&lt;p&gt;Access to the DB instance can be restricted further by using network access control lists (network ACLs) and security groups. With these firewalls, you can control, at a granular level, the type of traffic you want to provide access into your database.&lt;/p&gt;

&lt;p&gt;Using these controls provides layers of security for your infrastructure. It reinforces that only the backend instances have access to the database.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;VI. Backup data&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;You don’t want to lose your data. To take regular backups of your Amazon RDS instance, you can use automated backups or manual snapshots.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Automated Backups&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Automated backups are turned on by default. This backs up your entire DB instance (not just individual databases on the instance) and your transaction logs. When you create your DB instance, you set a backup window, which is the period of time during which automatic backups occur. Typically, you want to set the window during a time when your database experiences little activity, because taking a backup can cause increased latency and downtime.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retaining backups:&lt;/strong&gt; Automated backups are retained between 0 and 35 days. You might ask yourself, “Why set automated backups for 0 days?” The 0 days setting stops automated backups from happening. If you set it to 0, it will also delete all existing automated backups, which is not ideal. The benefit of automated backups is that you can perform point-in-time recovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Point-in-time recovery:&lt;/strong&gt; This creates a new DB instance using data restored from a specific point in time. This restoration method provides more granularity by restoring the full backup and rolling back transactions up to the specified time range.&lt;/p&gt;
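
&lt;p&gt;As a sketch, a point-in-time restore with boto3 takes parameters like the following. The instance identifiers and timestamp are placeholders:&lt;/p&gt;

```python
from datetime import datetime, timezone

# Parameters for a point-in-time restore. Pass this dict to
# boto3.client("rds").restore_db_instance_to_point_in_time(**restore_params)
# with valid credentials; identifiers and the timestamp are placeholders.
restore_params = {
    "SourceDBInstanceIdentifier": "orders-db",
    # The restore always creates a NEW DB instance:
    "TargetDBInstanceIdentifier": "orders-db-restored",
    "RestoreTime": datetime(2024, 10, 29, 23, 59, tzinfo=timezone.utc),
    # Alternative: set "UseLatestRestorableTime": True and omit RestoreTime.
}
```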

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcki7jjichxyngnx5av6m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcki7jjichxyngnx5av6m.png" alt="bk description" width="800" height="274"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Manual Snapshots&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If you want to keep your automated backups longer than 35 days, use manual snapshots. Manual snapshots are similar to taking Amazon EBS snapshots, except you manage them in the Amazon RDS console. These are backups that you can initiate at any time. They exist until you delete them. For example, to meet a compliance requirement that mandates you to keep database backups for a year, you need to use manual snapshots. If you restore data from a manual snapshot, it creates a new DB instance using the data from the snapshot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dm67qvi2nzo0g5oxex2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dm67qvi2nzo0g5oxex2.png" alt="Manual_Snapshots description" width="800" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choosing a backup option&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It is advisable to deploy both backup options. Automated backups are beneficial for point-in-time recovery. With manual snapshots, you can retain backups for longer than 35 days.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;VII. Redundancy with Amazon RDS Multi-AZ&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In an Amazon RDS Multi-AZ deployment, Amazon RDS creates a redundant copy of your database in another Availability Zone. You end up with two copies of your database—a primary copy in a subnet in one Availability Zone and a standby copy in a subnet in a second Availability Zone.&lt;/p&gt;

&lt;p&gt;The primary copy of your database provides access to your data so that applications can query and display the information. The data in the primary copy is synchronously replicated to the standby copy. The standby copy is not considered an active database, and it does not get queried by applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzgiwrjpsjxjnrkz72yxo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzgiwrjpsjxjnrkz72yxo.png" alt="az description" width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To improve availability, Amazon RDS Multi-AZ ensures that you have two copies of your database running and that one of them is in the primary role. If an availability issue arises, such as the primary database losing connectivity, Amazon RDS initiates an automatic failover.&lt;/p&gt;

&lt;p&gt;When you create a DB instance, a Domain Name System (DNS) name is provided. AWS uses that DNS name to fail over to the standby database. In an automatic failover, the standby database is promoted to the primary role, and queries are redirected to the new primary database.&lt;/p&gt;

&lt;p&gt;To help ensure that you don't lose the Multi-AZ configuration, a new standby database can be created in one of two ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Demote the previous primary to standby if it's still up and running.&lt;/li&gt;
&lt;li&gt;Stand up a new standby DB instance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can select multiple subnets for an Amazon RDS database because of the Multi-AZ configuration. You will want to ensure that you have subnets in different Availability Zones for your primary and standby copies.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;VIII. Amazon RDS security&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When it comes to security in Amazon RDS, you have control over managing access to your Amazon RDS resources, such as your databases on a DB instance. How you manage access will depend on the tasks you or other users need to perform in Amazon RDS. Network ACLs and security groups help users dictate the flow of traffic. If you want to restrict the actions and resources others can access, you can use AWS Identity and Access Management (IAM) policies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IAM&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use IAM policies to assign permissions that determine who can manage Amazon RDS resources. For example, you can use IAM to determine who can create, describe, modify, and delete DB instances, tag resources, or modify security groups.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security groups&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use security groups to control which IP addresses or Amazon EC2 instances can connect to your databases on a DB instance. When you first create a DB instance, all database access is prevented except through rules specified by an associated security group.&lt;/p&gt;
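
&lt;p&gt;A common pattern, sketched below, is to allow the database port only from the application tier's security group rather than from an IP range. The security group IDs are placeholders, and MySQL's port is used for illustration:&lt;/p&gt;

```python
# Ingress rule for the DB instance's security group. Pass this dict to
# boto3.client("ec2").authorize_security_group_ingress(**ingress_params)
# with valid credentials; the group IDs are placeholders.
ingress_params = {
    "GroupId": "sg-0dbexample000000001",  # attached to the DB instance
    "IpPermissions": [
        {
            "IpProtocol": "tcp",
            "FromPort": 3306,  # MySQL/MariaDB default port
            "ToPort": 3306,
            # Reference the app tier's group instead of a CIDR block, so
            # only backend instances can reach the database.
            "UserIdGroupPairs": [{"GroupId": "sg-0appexample00000001"}],
        }
    ],
}
```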

&lt;p&gt;&lt;strong&gt;Amazon RDS encryption&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use Amazon RDS encryption to secure your DB instances and snapshots at rest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSL or TLS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use Secure Sockets Layer (SSL) or Transport Layer Security (TLS) connections with DB instances running the MySQL, MariaDB, PostgreSQL, Oracle, or SQL Server database engines.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>database</category>
      <category>rds</category>
    </item>
    <item>
      <title>AWS Storage - Part 4: Amazon S3</title>
      <dc:creator>Hulk Pham</dc:creator>
      <pubDate>Wed, 30 Oct 2024 14:06:00 +0000</pubDate>
      <link>https://dev.to/hulk-pham/aws-storage-part-4-amazon-s3-1kki</link>
      <guid>https://dev.to/hulk-pham/aws-storage-part-4-amazon-s3-1kki</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Amazon S3 Overview&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon S3 is an object storage service that stores data in a flat structure, allowing retrieval from anywhere on the web&lt;/li&gt;
&lt;li&gt;Objects are stored in containers called buckets, each with a unique name across all AWS accounts in a partition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;S3 Use Cases and Security&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Common use cases include backup and storage, media hosting, software delivery, data lakes, and static websites&lt;/li&gt;
&lt;li&gt;S3 resources are private by default, with security managed through IAM policies, bucket policies, and encryption&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;S3 Features&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple storage classes are available to optimize costs based on data access patterns&lt;/li&gt;
&lt;li&gt;Versioning allows preservation of multiple versions of an object in the same bucket&lt;/li&gt;
&lt;li&gt;Lifecycle configurations automate the transition between storage classes and object expiration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Comparison with Other AWS Storage Services&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 is ideal for static content, backups, and data analytics, offering object storage with pay-for-use pricing and multi-AZ replication&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;I. Amazon S3&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Unlike Amazon EBS, Amazon Simple Storage Service (Amazon S3) is a standalone storage solution that isn’t tied to compute. With Amazon S3, you can retrieve your data from anywhere on the web. If you have used an online storage service to back up the data from your local machine, you most likely have used a service similar to Amazon S3. The big difference between those online storage services and Amazon S3 is the storage type.&lt;/p&gt;

&lt;p&gt;Amazon S3 is an object storage service. Object storage stores data in a flat structure. An object is a file combined with metadata. You can store as many of these objects as you want. All the characteristics of object storage are also characteristics of Amazon S3.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Amazon S3 concepts&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49jzh83bf3dhbaudcpf4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49jzh83bf3dhbaudcpf4.png" alt="S3 description" width="800" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Amazon S3, you store your objects in containers called buckets. You can’t upload an object, not even a single photo, to Amazon S3 without creating a bucket first. When you store an object in a bucket, the combination of a bucket name, key, and version ID uniquely identifies the object.&lt;/p&gt;

&lt;p&gt;When you create a bucket, you specify, at the very minimum, two details: the bucket name and the AWS Region that you want the bucket to reside in.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Amazon S3 bucket names&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon S3 supports global buckets. Therefore, each bucket name must be unique across all AWS accounts in all AWS Regions within a partition. A partition is a grouping of Regions, of which AWS currently has three: Standard Regions, China Regions, and AWS GovCloud (US). When naming a bucket, choose a name that is relevant to you or your business, and avoid using AWS or Amazon in the name.&lt;/p&gt;

&lt;p&gt;The following are some examples of the rules that apply for naming buckets in Amazon S3. For a full list of rules, see the link in the resources section.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bucket names must be between 3 (min) and 63 (max) characters long.&lt;/li&gt;
&lt;li&gt;Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).&lt;/li&gt;
&lt;li&gt;Bucket names must begin and end with a letter or number.&lt;/li&gt;
&lt;li&gt;Buckets must not be formatted as an IP address.&lt;/li&gt;
&lt;li&gt;A bucket name cannot be used by another AWS account in the same partition until the bucket is deleted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your application automatically creates buckets, choose a bucket naming scheme that is unlikely to cause naming conflicts, and make sure the application can fall back to a different name if its first choice is not available.&lt;/p&gt;
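&lt;p&gt;The naming rules above can be checked locally before calling the API. The sketch below is a minimal validator covering only the rules listed here (the full rule set in the AWS documentation is longer), so treat it as illustrative rather than exhaustive.&lt;/p&gt;

```python
import re

def is_valid_bucket_name(name):
    """Check a bucket name against the subset of S3 naming rules above."""
    if len(name) not in range(3, 64):      # 3 (min) to 63 (max) characters
        return False
    # Only lowercase letters, digits, dots, and hyphens, and the name must
    # begin and end with a letter or number.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
        return False
    # Must not be formatted as an IP address, e.g. 192.168.5.4.
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False
    return True
```

&lt;p&gt;For example, &lt;em&gt;my-bucket-2024&lt;/em&gt; passes, while &lt;em&gt;Docs&lt;/em&gt; (uppercase), &lt;em&gt;ab&lt;/em&gt; (too short), and &lt;em&gt;192.168.5.4&lt;/em&gt; (IP-formatted) are rejected.&lt;/p&gt;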

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Object key names&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The object key (key name) uniquely identifies the object in an Amazon S3 bucket. When you create an object, you specify the key name. As described earlier, the Amazon S3 model is a flat structure, meaning there is no hierarchy of subbuckets or subfolders. However, the Amazon S3 console does support the concept of folders. By using key name prefixes and delimiters, you can imply a logical hierarchy.&lt;/p&gt;

&lt;p&gt;For example, suppose your bucket called &lt;em&gt;testbucket&lt;/em&gt; has two objects with the following object keys: &lt;em&gt;2022-03-01/AmazonS3.html&lt;/em&gt; and &lt;em&gt;2022-03-01/Cats.jpg&lt;/em&gt;. The console uses the key name prefix, &lt;em&gt;2022-03-01&lt;/em&gt;, and delimiter (&lt;em&gt;/&lt;/em&gt;) to present a folder structure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmuqi8me6os28rgff0o05.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmuqi8me6os28rgff0o05.png" alt="path description" width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Amazon S3 supports buckets and objects, and there is no hierarchy. However, by using prefixes and delimiters in an object key name, the Amazon S3 console and the AWS SDKs are able to infer hierarchy and introduce the concept of folders.&lt;/p&gt;
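&lt;p&gt;The folder illusion is easy to reproduce outside of AWS. The sketch below groups a flat list of object keys the way the console does, collapsing keys that share a prefix up to the delimiter into one "common prefix"; the function name and key list are made up for this example.&lt;/p&gt;

```python
def list_common_prefixes(keys, prefix="", delimiter="/"):
    """Mimic how the S3 console infers folders from a flat key list:
    keys sharing a prefix up to the delimiter become one 'folder'."""
    folders, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to (and including) the first delimiter is
            # presented as a folder, i.e. a common prefix.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(folders), objects

# The two keys from the testbucket example collapse into one folder.
keys = ["2022-03-01/AmazonS3.html", "2022-03-01/Cats.jpg", "readme.txt"]
```

&lt;p&gt;Calling &lt;em&gt;list_common_prefixes(keys)&lt;/em&gt; with these keys yields the folder &lt;em&gt;2022-03-01/&lt;/em&gt; plus the top-level object &lt;em&gt;readme.txt&lt;/em&gt;, even though the bucket itself remains flat.&lt;/p&gt;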

&lt;h2&gt;
  
  
  &lt;strong&gt;II. Amazon S3 use cases&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Amazon S3 is a widely used storage service, with far more use cases than can be listed here. The following six categories are among the most common.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backup and storage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Amazon S3 is a natural place to back up files because it is highly redundant. As mentioned in the last lesson, AWS stores your EBS snapshots in Amazon S3 to take advantage of its high availability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Media hosting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because you can store unlimited objects, and each individual object can be up to 5 TB, Amazon S3 is an ideal location to host video, photo, and music uploads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Software delivery&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can use Amazon S3 to host your software applications that customers can download.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data lakes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Amazon S3 is an optimal foundation for a data lake because of its virtually unlimited scalability. You can increase storage from gigabytes to petabytes of content, paying only for what you use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Static websites&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can configure your S3 bucket to host a static website of HTML, CSS, and client-side scripts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Static content&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because of the limitless scaling, the support for large files, and the fact that you can access any object over the web at any time, Amazon S3 is the perfect place to store static content.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;III. Security in Amazon S3&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Everything in Amazon S3 is private by default. This means that all Amazon S3 resources, such as buckets and objects, can only be viewed by the user or AWS account that created them.&lt;/p&gt;

&lt;p&gt;If you decide that you want everyone on the internet to see your photos, you can choose to make your buckets and objects public. A public resource means that everyone on the internet can see it. Most of the time, you don’t want your permissions to be all or nothing. Typically, you want to be more granular about the way that you provide access to your resources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusb84jthfnfqmebjkqsd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusb84jthfnfqmebjkqsd.png" alt="S3_Bucket_Access description" width="800" height="321"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To be more specific about who can do what with your Amazon S3 resources, Amazon S3 provides several security management features: IAM policies, S3 bucket policies, and encryption to develop and implement your own security policies.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Amazon S3 and IAM policies&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Previously, you learned about creating and using AWS Identity and Access Management (IAM) policies. Now you can apply that knowledge to Amazon S3. When IAM policies are attached to your resources (buckets and objects) or IAM users, groups, and roles, the policies define which actions they can perform. Access policies that you attach to your resources are referred to as &lt;em&gt;resource-based policies&lt;/em&gt;, and access policies attached to users in your account are called &lt;em&gt;user policies&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcigmh0imtcl4oki6novr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcigmh0imtcl4oki6novr.png" alt="IAM_policies description" width="800" height="306"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You should use IAM policies for private buckets in the following two scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have many buckets with different permission requirements. Instead of defining many different S3 bucket policies, you can use IAM policies.&lt;/li&gt;
&lt;li&gt;You want all policies to be in a centralized location. By using IAM policies, you can manage all policy information in one location.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Amazon S3 bucket policies&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Like IAM policies, S3 bucket policies are defined in a JSON format. Unlike IAM policies, which are attached to resources and users, S3 bucket policies can only be attached to S3 buckets. The policy that is placed on the bucket applies to every object in that bucket. S3 bucket policies specify what actions are allowed or denied on the bucket.&lt;/p&gt;

&lt;p&gt;You should use S3 bucket policies in the following scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need a simple way to do cross-account access to Amazon S3, without using IAM roles.&lt;/li&gt;
&lt;li&gt;Your IAM policies bump up against the defined size limit. S3 bucket policies have a larger size limit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For examples of bucket policies, see &lt;a href="https://docs.aws.amazon.com/en_us/AmazonS3/latest/userguide/example-bucket-policies.html" rel="noopener noreferrer"&gt;Bucket Policy Examples&lt;/a&gt; in the Amazon S3 documentation.&lt;/p&gt;
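&lt;p&gt;As a hedged illustration of the JSON format, the sketch below builds a hypothetical bucket policy granting a second AWS account read-only access to the objects in the earlier &lt;em&gt;testbucket&lt;/em&gt; example; the account ID and statement ID are placeholders, not values from this article.&lt;/p&gt;

```python
import json

# Hypothetical policy: allow a second account (placeholder ID) to read
# every object in 'testbucket'. The policy applies to the whole bucket,
# which is why the Resource ARN ends with /*.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "CrossAccountRead",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::testbucket/*",
    }],
}

# Bucket policies are submitted to S3 as a JSON string.
policy_json = json.dumps(bucket_policy)
```

&lt;p&gt;The serialized &lt;em&gt;policy_json&lt;/em&gt; string is the form a bucket policy takes when you attach it to a bucket.&lt;/p&gt;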

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Amazon S3 encryption&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon S3 supports encryption in transit (as data travels to and from Amazon S3) and at rest. To protect data, Amazon S3 automatically encrypts all objects on upload and applies server-side encryption with S3-managed keys as the base level of encryption for every bucket in Amazon S3 at no additional cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;IV. Amazon S3 storage classes&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When you upload an object to Amazon S3 and you don’t specify the storage class, you upload it to the default storage class, often referred to as standard storage. In previous lessons, you learned about the default Amazon S3 standard storage class.&lt;/p&gt;

&lt;p&gt;Amazon S3 storage classes let you change your storage tier when your data characteristics change. For example, if you are accessing your old photos infrequently, you might want to change the storage class for the photos to save costs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Storage Class&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Standard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;This is considered general-purpose storage for cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Intelligent-Tiering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;This tier is useful if your data has unknown or changing access patterns. S3 Intelligent-Tiering stores objects in three tiers: a frequent access tier, an infrequent access tier, and an archive instant access tier. Amazon S3 monitors access patterns of your data and automatically moves your data to the most cost-effective storage tier based on frequency of access.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Standard-Infrequent Access (S3 Standard-IA)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;This tier is for data that is accessed less frequently but requires rapid access when needed. S3 Standard-IA offers the high durability, high throughput, and low latency of S3 Standard, with a low per-GB storage price and per-GB retrieval fee. This storage tier is ideal if you want to store long-term backups, disaster recovery files, and so on.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 One Zone-Infrequent Access (S3 One Zone-IA)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unlike other S3 storage classes that store data in a minimum of three Availability Zones, S3 One Zone-IA stores data in a single Availability Zone, which makes it less expensive than S3 Standard-IA. S3 One Zone-IA is ideal for customers who want a lower-cost option for infrequently accessed data, but do not require the availability and resilience of S3 Standard or S3 Standard-IA. It's a good choice for storing secondary backup copies of on-premises data or easily recreatable data.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Glacier Instant Retrieval&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Use S3 Glacier Instant Retrieval for archiving data that is rarely accessed and requires millisecond retrieval. Data stored in this storage class offers a cost savings of up to 68 percent compared to the S3 Standard-IA storage class, with the same latency and throughput performance.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Glacier Flexible Retrieval&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;S3 Glacier Flexible Retrieval offers low-cost storage for archived data that is accessed 1–2 times per year. With S3 Glacier Flexible Retrieval, your data can be accessed in as little as 1–5 minutes using an expedited retrieval. You can also request free bulk retrievals in up to 5–12 hours. It is an ideal solution for backup, disaster recovery, offsite data storage needs, and for when some data occasionally must be retrieved in minutes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Glacier Deep Archive&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;S3 Glacier Deep Archive is the lowest-cost Amazon S3 storage class. It supports long-term retention and digital preservation for data that might be accessed once or twice a year. Data stored in the S3 Glacier Deep Archive storage class has a default retrieval time of 12 hours. It is designed for customers that retain data sets for 7–10 years or longer, to meet regulatory compliance requirements. Examples include those in highly regulated industries, such as the financial services, healthcare, and public sectors.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 on Outposts&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Amazon S3 on Outposts delivers object storage to your on-premises AWS Outposts environment using S3 APIs and features. For workloads that must satisfy local data residency requirements or need to keep data close to on-premises applications for performance reasons, S3 on Outposts is the ideal option.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;V. Amazon S3 versioning&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;As described earlier, Amazon S3 identifies objects in part by using the object name. For example, when you upload an employee photo to Amazon S3, you might name the object &lt;em&gt;employee.jpg&lt;/em&gt; and store it in a bucket called employees. Without Amazon S3 versioning, every time you upload an object called employee.jpg to the employees bucket, it will overwrite the original object.&lt;/p&gt;

&lt;p&gt;This can be an issue for several reasons, including the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Common names:&lt;/strong&gt; The employee.jpg object name is a common name for an employee photo object. You or someone else who has access to the bucket might not have intended to overwrite it; but once it's overwritten, the original object can't be accessed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version preservation:&lt;/strong&gt; You might want to preserve different versions of employee.jpg. Without versioning, if you wanted to create a new version of employee.jpg, you would need to upload the object and choose a different name for it. Having several objects all with slight differences in naming variations can cause confusion and clutter in S3 buckets.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To counteract these issues, you can use Amazon S3 versioning. Versioning keeps multiple versions of a single object in the same bucket. This preserves old versions of an object without using different names, which helps with object recovery from accidental deletions, accidental overwrites, or application failures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3oqpw82twpwv206hg5ox.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3oqpw82twpwv206hg5ox.png" alt="S3_versioning description" width="800" height="643"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you enable versioning for a bucket, Amazon S3 automatically generates a unique version ID for the object. In one bucket, for example, you can have two objects with the same key but different version IDs, such as employeephoto.jpg (version 111111) and employeephoto.jpg (version 121212).&lt;/p&gt;

&lt;p&gt;By using versioning-enabled buckets, you can recover objects from accidental deletion or overwrite. The following are examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deleting an object does not remove the object permanently. Instead, Amazon S3 puts a marker on the object that shows that you tried to delete it. If you want to restore the object, you can remove the marker and the object is reinstated.&lt;/li&gt;
&lt;li&gt;If you overwrite an object, it results in a new object version in the bucket. You still have access to previous versions of the object.&lt;/li&gt;
&lt;/ul&gt;
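&lt;p&gt;The delete-marker behavior above can be modeled in a few lines. The class below is a toy in-memory model, not an S3 client: every put creates a new version, a delete only appends a removable marker, and removing the marker reinstates the previous version.&lt;/p&gt;

```python
import itertools

class VersionedBucket:
    """Toy model of a versioning-enabled bucket."""
    def __init__(self):
        self._versions = {}           # key: list of (version_id, value)
        self._ids = itertools.count(1)

    def put(self, key, value):
        # Every upload creates a new version instead of overwriting.
        vid = str(next(self._ids))
        self._versions.setdefault(key, []).append((vid, value))
        return vid

    def delete(self, key):
        # Deleting places a marker; the data itself is preserved.
        return self.put(key, None)

    def get(self, key):
        versions = self._versions.get(key)
        if not versions or versions[-1][1] is None:
            return None               # no object, or latest is a delete marker
        return versions[-1][1]

    def remove_delete_marker(self, key):
        versions = self._versions.get(key, [])
        if versions and versions[-1][1] is None:
            versions.pop()            # reinstates the previous version
```

&lt;p&gt;Uploading &lt;em&gt;employee.jpg&lt;/em&gt; twice, deleting it, and then removing the delete marker brings the second version back, mirroring the recovery behavior described above.&lt;/p&gt;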

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Versioning states&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Buckets can be in one of three states. The versioning state applies to all objects in the bucket. Storage costs are incurred for all objects in your bucket, including all versions. To reduce your Amazon S3 bill, you might want to delete previous versions of your objects when they are no longer needed.&lt;/p&gt;

&lt;p&gt;The three states are described below.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unversioned (default)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Neither new nor existing objects in the bucket have a version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Versioning-enabled&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Versioning is enabled for all objects in the bucket. After you version-enable a bucket, it can never return to an unversioned state. However, you can suspend versioning on that bucket.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Versioning-suspended&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Versioning is suspended for new objects. All new objects in the bucket will not have a version. However, all existing objects keep their object versions.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;VI. Managing your storage lifecycle&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If you keep manually changing your objects, such as your employee photos, from storage tier to storage tier, you might want to automate the process by configuring their Amazon S3 lifecycle. When you define a lifecycle configuration for an object or group of objects, you can choose to automate between two types of actions: transition and expiration.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Transition actions&lt;/strong&gt; define when objects should transition to another storage class.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expiration actions&lt;/strong&gt; define when objects expire and should be permanently deleted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, you might transition objects to S3 Standard-IA storage class 30 days after you create them. Or you might archive objects to the S3 Glacier Deep Archive storage class 1 year after creating them.&lt;/p&gt;
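&lt;p&gt;A lifecycle configuration combining both action types might look like the following sketch. The rule transitions objects under a hypothetical &lt;em&gt;logs/&lt;/em&gt; prefix to S3 Standard-IA after 30 days, to S3 Glacier Deep Archive after 365 days, and expires them after 10 years; the rule ID and prefix are assumptions for this example, not values from the article.&lt;/p&gt;

```python
# Hypothetical lifecycle configuration in the JSON shape that the S3
# lifecycle API accepts. Transition actions move objects between storage
# classes; the expiration action permanently deletes them.
lifecycle_configuration = {
    "Rules": [{
        "ID": "archive-then-expire",      # placeholder rule name
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},    # applies only to this prefix
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
        ],
        "Expiration": {"Days": 3650},     # delete after roughly 10 years
    }]
}
```

&lt;p&gt;Attached to a bucket, a configuration like this automates the tier changes that would otherwise be done by hand.&lt;/p&gt;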

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fthmjh1xkx47zomx9a6yr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fthmjh1xkx47zomx9a6yr.png" alt="Storage_Lifecycle description" width="800" height="187"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The following use cases are good candidates for the use of lifecycle configuration rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Periodic logs:&lt;/strong&gt; If you upload periodic logs to a bucket, your application might need them for a week or a month. After that, you might want to delete them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data that changes in access frequency:&lt;/strong&gt; Some documents are frequently accessed for a limited period of time. After that, they are infrequently accessed. At some point, you might not need real-time access to them. But your organization or regulations might require you to archive them for a specific period. After that, you can delete them.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;VII. Choosing the Right Storage Service&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Amazon EC2 instance store&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Instance store is ephemeral block storage. This is preconfigured storage that exists on the same physical server that hosts the EC2 instance and cannot be detached from Amazon EC2. You can think of it as a built-in drive for your EC2 instance.&lt;/p&gt;

&lt;p&gt;Instance store is generally well suited for temporary storage of information that is constantly changing, such as buffers, caches, and scratch data. It is not meant for data that is persistent or long lasting. If you need persistent long-term block storage that can be detached from Amazon EC2 and provide you more management flexibility, such as increasing volume size or creating snapshots, you should use Amazon EBS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkz1vh2d5at3e59h2czym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkz1vh2d5at3e59h2czym.png" alt="Instance_Store description" width="446" height="752"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Amazon EBS&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon EBS is meant for data that changes frequently and must persist through instance stops, terminations, or hardware failures. Amazon EBS has two types of volumes: SSD-backed volumes and HDD-backed volumes.&lt;/p&gt;

&lt;p&gt;The performance of SSD-backed volumes depends on IOPS and is ideal for transactional workloads, such as databases and boot volumes.&lt;/p&gt;

&lt;p&gt;The performance of HDD-backed volumes depends on megabytes per second (MBps) and is ideal for throughput-intensive workloads, such as big data, data warehouses, log processing, and sequential data I/O.&lt;/p&gt;

&lt;p&gt;Here are a few important features of Amazon EBS that you need to know when comparing it to other services.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It is block storage.&lt;/li&gt;
&lt;li&gt;You pay for what you provision (you have to provision storage in advance).&lt;/li&gt;
&lt;li&gt;EBS volumes are replicated across multiple servers in a single Availability Zone.&lt;/li&gt;
&lt;li&gt;Most EBS volumes can only be attached to a single EC2 instance at a time.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Amazon S3&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If your data doesn’t change often, Amazon S3 might be a cost-effective and scalable storage solution for you. Amazon S3 is ideal for storing static web content and media, backups and archiving, and data for analytics. It can also host entire static websites with custom domain names.&lt;/p&gt;

&lt;p&gt;Here are a few important features of Amazon S3 to know about when comparing it to other services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It is object storage.&lt;/li&gt;
&lt;li&gt;You pay for what you use (you don’t have to provision storage in advance).&lt;/li&gt;
&lt;li&gt;Amazon S3 replicates your objects across multiple Availability Zones in a Region.&lt;/li&gt;
&lt;li&gt;Amazon S3 is not storage attached to compute.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Amazon EFS&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon EFS provides highly optimized file storage for a broad range of workloads and applications. It is the only cloud-native shared file system with fully automatic lifecycle management. Amazon EFS file systems can automatically scale from gigabytes to petabytes of data without needing to provision storage. Tens, hundreds, or even thousands of compute instances can access an Amazon EFS file system at the same time.&lt;/p&gt;

&lt;p&gt;Amazon EFS Standard storage classes are ideal for workloads that require the highest levels of durability and availability. EFS One Zone storage classes are ideal for workloads such as development, build, and staging environments.&lt;/p&gt;

&lt;p&gt;Here are a few important features of Amazon EFS to know about when comparing it to other services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It is file storage.&lt;/li&gt;
&lt;li&gt;Amazon EFS is elastic, automatically scaling up or down as you add or remove files, and you pay only for what you use.&lt;/li&gt;
&lt;li&gt;Amazon EFS is highly available and designed to be highly durable. All files and directories are redundantly stored within and across multiple Availability Zones.&lt;/li&gt;
&lt;li&gt;Amazon EFS offers native lifecycle management of your files and a range of storage classes to choose from.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;5. Amazon FSx&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Amazon FSx provides native compatibility with third-party file systems. You can choose from NetApp ONTAP, OpenZFS, Windows File Server, and Lustre. With Amazon FSx, you don't need to worry about managing file servers and storage, because Amazon FSx automates time-consuming administrative tasks such as hardware provisioning, software configuration, patching, and backups. This frees you up to focus on your applications, end users, and business.&lt;/p&gt;

&lt;p&gt;Amazon FSx file systems offer feature sets, performance profiles, and data management capabilities that support a wide variety of use cases and workloads. Examples include machine learning, analytics, high performance computing (HPC) applications, and media and entertainment.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;File System&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon FSx for NETAPP ONTAP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully managed shared storage built on NetApp's popular ONTAP file system&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon FSx for OpenZFS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully managed shared storage built on the popular OpenZFS file system&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon FSx for Windows File Server&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully managed shared storage built on Windows Server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon FSx for Lustre&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully managed shared storage built on the world's most popular high-performance file system&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>s3</category>
    </item>
  </channel>
</rss>
