1. What is Docker, and Why is it Used?
Docker is an open-source containerization platform that allows developers to package applications and their dependencies into isolated environments called containers. These containers ensure that applications run consistently across different environments.
Real-Life Example:
Imagine you're developing a MERN stack web app. It works fine on your laptop, but when your teammate runs it, they get "version mismatch" errors.
With Docker, you create a consistent environment across all machines, preventing such issues.
Why Use Docker?
Docker is beneficial when you need:
- Portability: works on any OS without compatibility issues
- Consistency: eliminates "It works on my machine" problems
- Lightweight: uses fewer system resources than virtual machines
- Scalability: quickly scale applications with minimal overhead
2. Main Components of Docker
1. Docker Daemon (dockerd)
- The background process that manages Docker containers
- Listens for API requests and handles images, networks, and volumes
2. Docker CLI (Command-Line Interface)
- A tool to interact with the Docker Daemon
- Common commands:
docker ps # List running containers
docker run # Start a new container
docker stop # Stop a running container
3. Docker Images
- A read-only template containing the application, libraries, and dependencies
- Immutable: once built, images don't change
- Used to create containers
4. Docker Containers
- A running instance of a Docker image
- Isolated from the host system but can interact if needed (e.g., exposing ports)
5. Docker Hub
- A cloud-based registry where Docker images are stored and shared
6. Docker Volumes
- Used for persistent data storage outside of containers
Illustration of Docker components (image not included).
3. How is Docker Different from Virtual Machines?
Example:
You're testing a React.js + Express.js app. Instead of running a full Ubuntu VM (which consumes high RAM & CPU), you start a lightweight container in seconds:
docker run -d -p 3000:3000 node:16
Unlike a VM, which takes minutes to boot, a container starts instantly.
Docker vs. Virtual Machines
Feature | Docker (Containers) | Virtual Machines (VMs) |
---|---|---|
Boot Time | Seconds | Minutes |
Size | MBs | GBs |
Performance | Near-native speed | Slower due to hypervisor overhead |
Isolation | Process-level isolation | Full OS-level isolation |
Resource Efficiency | Shares OS kernel, lightweight | Requires full OS, resource-intensive |
docker run vs. docker start vs. docker exec
docker run: create and start a new container from an image
docker start: restart a stopped container
docker exec: run a command inside a running container
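For example (the container name "web" is illustrative):
docker run -d --name web nginx   # create and start a new container from the nginx image
docker stop web                  # stop the running container
docker start web                 # restart the same container, keeping its filesystem
docker exec -it web sh           # open an interactive shell inside the running container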
4. Popular and Useful Docker Commands
Here are some of the most commonly used Docker commands:
Container Management
# List all running containers
docker ps
# List all containers (including stopped ones)
docker ps -a
# Start a stopped container
docker start <container_id>
# Stop a running container
docker stop <container_id>
# Remove a container
docker rm <container_id>
Image Management
# List all available images
docker images
# Pull an image from Docker Hub
docker pull <image_name>
# Remove an image
docker rmi <image_name>
Build and Run Containers
# Build a Docker image from a Dockerfile
docker build -t <image_name> .
# Run a container from an image
docker run -d -p 8080:80 <image_name>
Volume Management
# List all Docker volumes
docker volume ls
# Create a new volume
docker volume create <volume_name>
# Remove a volume
docker volume rm <volume_name>
Docker Compose: docker-compose.yml
What is docker-compose.yml?
The docker-compose.yml file is used to define and run multi-container Docker applications. With Docker Compose, you can manage and orchestrate multiple services, including databases, backend APIs, and front-end applications, all in a single file.
It allows you to define services, networks, and volumes, making it easier to deploy and manage applications that require multiple services working together.
Why is docker-compose.yml useful?
1. Simplifies Multi-Container Management: instead of managing each container manually, Docker Compose allows you to define all services (frontend, backend, database, etc.) in one configuration file and launch them with a single command.
2. Networking and Dependency Management: Docker Compose automatically creates a network for your containers, allowing them to communicate with each other. Services can be referenced by their service name, which means the backend can talk to the database without needing an IP address.
3. One Command to Start Everything: instead of running individual containers with complex docker run commands, Docker Compose lets you define the services and their dependencies in a YAML file and run everything with docker-compose up.
4. Simplified Development Environment: with Docker Compose, developers can easily replicate the production environment locally, using the same configuration for services like databases, backends, and frontends. It allows seamless integration and testing, as you don't have to manually set up each service.
5. Environment Variable Management: you can manage environment variables for each service within the docker-compose.yml file, making it easier to configure your application for different environments (development, testing, production).
Example of docker-compose.yml for a Web Application
Let's walk through an example where we have three services:
- Frontend: A React app running on port 3000.
- Backend: A Node.js API running on port 5000.
- Database: A MongoDB instance to store data.
version: '3.8'
services:
  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    volumes:
      - ./frontend:/app
    depends_on:
      - backend
  backend:
    build: ./backend
    ports:
      - "5000:5000"
    environment:
      - NODE_ENV=development
    depends_on:
      - database
  database:
    image: mongo
    volumes:
      - mongo-data:/data/db
    ports:
      - "27017:27017"
volumes:
  mongo-data:
Database Migrations
- Explain how you would design and manage a database schema using Sequelize, including the process of setting up migrations, handling model relationships, optimizing for performance, and managing database changes in a collaborative team environment.
Database Migration with Sequelize
Purpose
Database migrations allow you to safely update and manage your database schema over time. They help track changes to the schema in a version-controlled manner, making it easy to collaborate in teams.
Setting Up Migrations
- Initialize Sequelize with sequelize-cli to generate migration files.
- Migration files contain two primary methods:
  - up: for applying changes (e.g., create tables, add columns).
  - down: for rolling back changes (undoing the applied changes).
Handling Schema Changes
- Creating Migrations: when you need to add, modify, or delete database schema (e.g., tables, columns), you create a new migration file.
- Applying Migrations: use the command npx sequelize-cli db:migrate to apply migrations to the database.
- Rolling Back Migrations: use npx sequelize-cli db:migrate:undo to undo the last applied migration.
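A minimal migration sketch showing the up/down pair (the file name, table, and columns are illustrative):
// migrations/20240101000000-create-users.js (illustrative file name)
'use strict';
module.exports = {
  async up(queryInterface, Sequelize) {
    // Apply the change: create the Users table
    await queryInterface.createTable('Users', {
      id: { type: Sequelize.INTEGER, autoIncrement: true, primaryKey: true },
      email: { type: Sequelize.STRING, allowNull: false, unique: true },
      createdAt: { type: Sequelize.DATE, allowNull: false },
      updatedAt: { type: Sequelize.DATE, allowNull: false }
    });
  },
  async down(queryInterface) {
    // Roll back: drop the table again
    await queryInterface.dropTable('Users');
  }
};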
Model Relationships
- Define associations (e.g., one-to-many, many-to-many) within your models using Sequelize methods: hasMany, belongsTo, belongsToMany, etc.
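A minimal sketch of defining associations (model names and keys are illustrative):
// One-to-many: a user has many posts
User.hasMany(Post, { foreignKey: 'userId' });
Post.belongsTo(User, { foreignKey: 'userId' });
// Many-to-many: students and courses linked through a join table
Student.belongsToMany(Course, { through: 'Enrollments' });
Course.belongsToMany(Student, { through: 'Enrollments' });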
Collaborative Workflow
- Migrations should be version-controlled using Git.
- Each team member works with migrations, and when schema changes are required, new migrations are created and applied across all environments (development, staging, production).
GitHub Actions
Steps to Deploy on AWS EC2
1. Launch EC2 Instance
2. Add Secret Variables in GitHub
- Go to GitHub Repo Settings → Secrets and Variables → Actions → Add Secret
3. Connect to EC2 Instance
Install Docker on AWS EC2
sudo apt-get update
sudo apt-get install docker.io -y
sudo systemctl start docker
sudo chmod 666 /var/run/docker.sock
sudo systemctl enable docker
docker --version
docker ps
4. Create Two Runners on the Same EC2 Instance
- In React App → Actions → Runner → New Self-Hosted Runner
- Copy the download commands and run them in the EC2 instance terminal
- Install it as a service to keep it running in the background
sudo ./svc.sh install
sudo ./svc.sh start
- Do the same for the Node.js Runner
5. Create a Dockerfile for Node.js (Backend)
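For example, a minimal Dockerfile sketch (the base image, port, and entry file are assumptions):
FROM node:16-alpine
WORKDIR /app
# Install dependencies first so Docker can cache this layer
COPY package*.json ./
RUN npm ci --only=production
# Copy the rest of the source code
COPY . .
EXPOSE 5000
CMD ["node", "index.js"]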
6. Create a GitHub Actions Workflow
Create a .github/workflows/cicd.yml file.
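A minimal workflow sketch using the self-hosted runners; the secret names (DOCKERHUB_USERNAME, DOCKERHUB_TOKEN) and the image name are assumptions:
name: CICD
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v3
      - name: Build and push Docker image
        run: |
          echo "${{ secrets.DOCKERHUB_TOKEN }}" | docker login -u "${{ secrets.DOCKERHUB_USERNAME }}" --password-stdin
          docker build -t ${{ secrets.DOCKERHUB_USERNAME }}/backend:latest .
          docker push ${{ secrets.DOCKERHUB_USERNAME }}/backend:latest
  deploy:
    runs-on: self-hosted
    needs: build
    steps:
      - name: Run the latest image
        run: |
          docker pull ${{ secrets.DOCKERHUB_USERNAME }}/backend:latest
          docker rm -f backend || true
          docker run -d --name backend -p 5000:5000 ${{ secrets.DOCKERHUB_USERNAME }}/backend:latest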
7. Push Docker Images to DockerHub
8. Add Inbound/Outbound Rules on EC2 Instance
9. Access the Node.js Application
- Use EC2_PUBLIC_IP:PORT to access your application
Deploying React App
- Create a Dockerfile for React
- Follow the same process as above
What is GitHub Actions, and how does it work?
GitHub Actions is a CI/CD automation tool that allows you to define workflows in YAML to build, test, and deploy applications directly from GitHub repositories.
How do you trigger a GitHub Actions workflow?
Workflows can be triggered by events such as push, pull_request, schedule, workflow_dispatch, and repository_dispatch.
What are the key components of a GitHub Actions workflow?
Key components include:
- Workflows (YAML files in .github/workflows/)
- Jobs (independent execution units in a workflow)
- Steps (commands executed in a job)
- Actions (reusable units of functionality)
- Runners (machines that execute jobs)
What is the difference between jobs, steps, and actions?
- Jobs: Run in parallel or sequentially within a workflow.
- Steps: Individual tasks executed within a job.
- Actions: Pre-built reusable components within steps.
How do you use environment variables and secrets in GitHub Actions?
- Define environment variables using env:
env:
  NODE_ENV: production
- Store sensitive values in secrets:
env:
  API_KEY: ${{ secrets.API_KEY }}
What are self-hosted runners, and when should you use them?
Self-hosted runners are custom machines used to execute workflows instead of GitHub's hosted runners. Use them for private repositories, custom hardware, or specific dependencies.
How do you cache dependencies in GitHub Actions?
Use actions/cache@v3 to cache dependencies and speed up builds:
- uses: actions/cache@v3
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: npm-${{ runner.os }}
How do you create a reusable workflow in GitHub Actions?
Define a workflow with on: workflow_call and call it from another workflow:
on: workflow_call
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Reusable workflow"
How do you set up a CI/CD pipeline using GitHub Actions?
Define a workflow that includes jobs for building, testing, and deploying:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Building..."
  test:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Testing..."
  deploy:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - run: echo "Deploying..."
What is the difference between workflow_dispatch, workflow_run, and schedule triggers?
- workflow_dispatch: Manual trigger via GitHub UI/API.
- workflow_run: Triggered when another workflow finishes.
- schedule: Runs workflows at specific times using cron syntax.
How do you debug a failing GitHub Actions workflow?
- Check logs in GitHub Actions UI.
- Use set -x in bash scripts for verbose output.
- Add continue-on-error: true to isolate issues.
How do you run a GitHub Actions workflow locally?
Use act, a tool that simulates GitHub Actions on your local machine:
act
How do you optimize and speed up GitHub Actions workflows?
- Use caching (actions/cache@v3).
- Run jobs in parallel when possible.
- Use matrix builds for different environments.
- Limit workflow execution to necessary branches.
How do you manage permissions and security in GitHub Actions?
- Use the least-privilege principle for tokens (GITHUB_TOKEN).
- Restrict secrets exposure to trusted workflows.
- Use branch protection rules to limit workflow execution.
WebSockets & Multi-Backend Systems
Why Do Backends Need to Talk to Each Other?
In a typical client-server architecture, communication happens between the browser (client) and the backend server. However, as applications grow, keeping everything on a single server exposed to the internet becomes inefficient and unscalable.
When designing a multi-backend system, you need to consider:
- If there are multiple services, how should they communicate when an event occurs?
- Should it be an immediate HTTP call?
- Should the event be sent to a queue?
- Should the services communicate via WebSockets?
- Should you use a Pub-Sub mechanism?
These decisions impact performance, scalability, and reliability.
Multi-Backend Communication - Final Interview Script
Question: "How do you handle communication between multiple backend services?"
Your Answer:
"When designing multi-backend systems, we have four main communication patterns, each serving different use cases.
1. HTTP/REST - Synchronous Communication
This is direct API calls between services. For example, when a user places an order, the User Service calls Order Service, which then calls Payment Service immediately.
Use case: When you need immediate response and strong consistency, like user authentication or payment validation.
Pros: Simple to implement, immediate feedback, strong consistency
Cons: Tight coupling, if one service fails, whole chain breaks
2. Message Queues - Asynchronous 1:1
Here we use message brokers like RabbitMQ or Amazon SQS. Messages are placed in queues and consumers pick them up when ready. It's point-to-point communication - only one consumer gets each message.
Use case: Task distribution, background job processing, load balancing
Example: Multiple payment workers processing payment requests from a queue
Pros: Loose coupling, fault tolerance, load balancing
Cons: Eventual consistency, more complex error handling
3. Pub-Sub - Event Broadcasting 1:N
Publishers send events to topics, and multiple subscribers listen to the same topic. Same message goes to all subscribers.
Use case: Event-driven architecture where multiple services need to react to same event
Example: When order is created, Inventory Service updates stock, Email Service sends confirmation, Analytics tracks metrics - all from same event
Pros: Highly decoupled, easy to add new features, scalable
Cons: Message ordering challenges, duplicate handling needed
4. WebSockets - Real-time Communication
Persistent bidirectional connections for real-time communication.
Use case: Chat applications, live updates, gaming
Pros: Real-time, low latency
Cons: Resource intensive, connection management complexity
Key Difference - Queue vs Pub-Sub:
Both have the same components: Publisher/Producer, Broker, and Consumer/Subscriber. The difference is in message delivery:
- Queue: 1:1 - Messages compete, only one consumer gets each message
- Pub-Sub: 1:N - Same message broadcasted to all subscribers
Real Example - E-commerce System:
I would use a hybrid approach:
- User places order - HTTP call for immediate validation
- Order processing - Pub-Sub event 'ORDER_CREATED' to notify multiple services
- Background tasks - Queue for heavy processing like report generation
Technology Stack:
- Apache Kafka - Can work as both queue and pub-sub
- RabbitMQ - For reliable message queuing
- Redis Pub-Sub - For simple event broadcasting
- Amazon SQS/SNS - For managed cloud solutions
Decision Framework:
Choose HTTP when: Need immediate response, strong consistency, simple flows
Choose Queues when: Task distribution, load balancing, background processing
Choose Pub-Sub when: Multiple services need same event, event-driven architecture
Choose WebSockets when: Real-time bidirectional communication needed
Production Considerations:
- Error Handling: Circuit breakers, dead letter queues, retry mechanisms
- Monitoring: Queue depths, processing times, error rates
- Scalability: Horizontal scaling of consumers, proper partitioning
The key is choosing the right pattern for each specific use case rather than using one approach everywhere."
If Asked Follow-up Questions:
"What about data consistency?"
"For strong consistency, use HTTP calls. For eventual consistency, use async patterns with proper error handling and compensation transactions."
"How do you handle failures?"
"Circuit breakers for HTTP, dead letter queues for messages, retry mechanisms with exponential backoff, and proper monitoring."
"Which technology would you choose?"
"Kafka for high throughput and both queue/pub-sub needs, RabbitMQ for complex routing, SQS for simple cloud solutions."
Example: Payment Processing System
Let's consider a payment application. When a transaction occurs:
- The database update should happen immediately (synchronous).
- The notification (email/SMS) can be pushed to a queue (asynchronous).
Why not handle everything in the primary backend?
- If the email service is down, should the user be forced to wait after completing the transaction? No!
- Instead, we push the notification event to a queue.
- Even if the notification service is down, the queue retains the event and sends notifications once the service is back.
- This is why message queues (e.g., RabbitMQ, Kafka, AWS SQS) are better than HTTP for such tasks.
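This queue pattern can be sketched with Node.js and the amqplib RabbitMQ client (the broker URL, queue name, and event shape are assumptions):
const amqp = require('amqplib');

// Producer: the payment backend pushes a notification event to the queue
async function publishNotification(event) {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();
  await ch.assertQueue('notifications', { durable: true }); // queue survives broker restarts
  ch.sendToQueue('notifications', Buffer.from(JSON.stringify(event)), { persistent: true });
  await ch.close();
  await conn.close();
}

// Consumer: the notification service processes events whenever it is up
async function consumeNotifications() {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();
  await ch.assertQueue('notifications', { durable: true });
  ch.consume('notifications', (msg) => {
    const event = JSON.parse(msg.content.toString());
    // ...send the email/SMS here...
    ch.ack(msg); // acknowledge only after successful processing
  });
}
If the consumer is down, messages simply wait in the durable queue until it comes back, which is exactly the retention behavior described above.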
Types of Communication
1. Synchronous Communication
   - The system waits for a response from the other system.
   - Examples: HTTP requests, WebSockets (in some cases).
2. Asynchronous Communication
   - The system does not wait for a response.
   - Examples: Message queues, Pub-Sub services.
Why WebSockets?
WebSockets provide persistent, full-duplex communication over a single TCP connection.
Limitations of HTTP:
- In HTTP, the server cannot push events to the client on its own.
- The client (browser) can request, and the server can respond, but the server cannot initiate communication with the client.
WebSockets vs. HTTP for Real-Time Applications
Example: Stock Market Trading System
- Stock buying & selling generates millions of requests per second.
- If you use HTTP, each new connection requires a three-way handshake, and every request carries its own headers, adding latency and overhead.
- With WebSockets, the handshake happens only once, and then the server and client can continuously exchange data.
Alternative: Polling
If you still want to use HTTP for real-time updates, an alternative approach is polling.
- However, polling creates unnecessary load on the server by making frequent requests.
- WebSockets are a more efficient solution for real-time updates.
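A minimal server-push sketch with the Node.js ws package (the port, payload shape, and one-second interval are assumptions):
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (ws) => {
  ws.on('message', (data) => {
    console.log('received:', data.toString());
  });

  // The server can now push data without a client request,
  // e.g., a (fake) stock price update every second
  const timer = setInterval(() => {
    ws.send(JSON.stringify({ ticker: 'ACME', price: (Math.random() * 100).toFixed(2) }));
  }, 1000);

  ws.on('close', () => clearInterval(timer));
});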
Some Basic Questions
Basic
What is Node.js?
Node.js is a runtime environment for executing JavaScript on the server side. It is not a framework or a language. A runtime is responsible for memory management and converting high-level code into machine code.
Examples:
- Java: JVM (Runtime) → Spring (Framework)
- Python: CPython (Runtime) → Django (Framework)
- JavaScript: Node.js (Runtime) → Express.js (Framework)
With Node.js, JavaScript can run outside the browser as well.
Runtime vs Frameworks
- Runtime: Focuses on executing code, handling memory, and managing I/O.
- Framework: Provides structured tools and libraries to simplify development.
What happens when you enter a URL in the browser and hit enter?
DNS Lookup
The browser checks if it already knows the IP address for www.example.com.
If not, it contacts a DNS (Domain Name System) server to get the IP address (e.g., 192.168.1.1).
Establishing Connection
The browser initiates a TCP connection with the web server using a three-way handshake.
If the website uses HTTPS, a TLS handshake happens to encrypt the communication.
Sending HTTP Request
The browser sends an HTTP request to the server:
GET / HTTP/1.1
Host: www.example.com
Server Processing
The web server processes the request and may:
Fetch data from a database
Generate a response (HTML, JSON, etc.)
Receiving the Response
The server sends an HTTP response back to the browser:
HTTP/1.1 200 OK
Content-Type: text/html
Rendering the Page
The browser processes the HTML, CSS, and JavaScript and displays the webpage.
Difference Between Monolithic and Microservices Architecture
Monolithic Architecture
- All components (UI, DB, Auth, etc.) are tightly coupled.
- Single application handles everything.
Microservices Architecture
- Divided into small, independent services.
- Each service handles a specific function (Auth, Payments, etc.).
Pros:
- Scalable
- Services can use different tech stacks
Cons:
- More complex to manage
- Requires API communication
HTTP Status Codes
- 200 OK
- 201 Created
- 400 Bad Request
- 401 Unauthorized
- 402 Payment Required
- 404 Not Found
- 405 Method Not Allowed
- 500 Internal Server Error
What is CORS?
CORS stands for Cross-Origin Resource Sharing, a security feature built into browsers.
It blocks requests from one origin (domain, protocol, or port) to another origin unless the server explicitly allows them.
For example: your frontend is hosted at frontend.com and your backend at backend.com.
The browser treats these as different origins and blocks the request unless it is explicitly allowed.
Why does this happen?
CORS errors are triggered by the Same-Origin Policy, which prevents malicious websites from making unauthorized API calls using your credentials.
The browser isn't blocking the request; it's blocking the response, for security reasons.
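A minimal sketch of explicitly allowing a cross-origin frontend from an Express backend using the cors middleware (the origin URL is an assumption):
const express = require('express');
const cors = require('cors');

const app = express();

// Explicitly allow the known frontend origin to read responses
app.use(cors({ origin: 'https://frontend.com' }));

app.get('/api/data', (req, res) => res.json({ ok: true }));

app.listen(5000);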
REST vs GraphQL
REST API:
"REST (Representational State Transfer) is an architectural style where data is fetched using multiple endpoints, and each request returns a fixed structure of data."
GraphQL:
"GraphQL is a query language for APIs that allows clients to request only the data they need, reducing overfetching and underfetching."
Key Point:
- REST APIs have multiple endpoints (/users, /orders), while GraphQL has a single endpoint (/graphql).
- GraphQL provides more flexibility by allowing clients to request exactly what they need in a single query.
- REST APIs return predefined responses and sometimes require multiple requests.
- If performance and flexibility are key concerns, GraphQL is a better choice.
How Do You Design an API for a Large-Scale System?
- Use Microservices: Separate services (Auth, Payments, etc.).
- Load Balancers: Distribute traffic efficiently.
- Caching: Use Redis for frequently accessed data.
- Pagination: Send data in chunks.
- Rate Limiting: Prevent API abuse.
What is Pagination? How to Implement It?
Pagination breaks large datasets into smaller parts.
Implementation:
- Use limit and offset in database queries. Example:
SELECT * FROM users LIMIT 10 OFFSET 20;
- Use cursor-based pagination for better performance, as sketched below.
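A cursor-based version of the same query (table and column names are illustrative); instead of skipping rows with OFFSET, the client sends the last id it saw and the database seeks straight to it via the index:
-- Fetch the next page after the last id from the previous page
SELECT * FROM users
WHERE id > 20      -- cursor: the last id the client received
ORDER BY id
LIMIT 10;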
How Do You Handle File Uploads?
- Single file upload: use multipart/form-data with Express.js & Multer.
- Large file handling: use chunked uploads.
- Storage options: store files on AWS S3, Google Cloud Storage, or a database.
- Server-side upload: the file is uploaded to your backend server first, and then the server sends it to S3 or Cloudinary.
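A minimal single-file upload sketch with Express.js and Multer (the route, field name, and destination folder are assumptions):
const express = require('express');
const multer = require('multer');

const app = express();
const upload = multer({ dest: 'uploads/' }); // uploaded files land in ./uploads

// 'file' must match the field name in the multipart/form-data request
app.post('/upload', upload.single('file'), (req, res) => {
  res.json({ filename: req.file.originalname, size: req.file.size });
});

app.listen(5000);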
JWT - Final Interview Answer Script
Question: "What is JWT? How does it work?"
Your Complete Answer:
"JWT stands for JSON Web Token. It's a stateless authentication mechanism where user information is encoded in a token that can be verified without storing session data on the server.
JWT Structure - 3 Parts:
JWT has three parts separated by dots:
header.payload.signature
1. Header: Contains metadata about the token
{
"alg": "HS256", // Algorithm used
"typ": "JWT" // Token type
}
2. Payload: Contains user information and claims
{
"userId": 123,
"role": "admin",
"exp": 1640995200 // Expiry timestamp
}
3. Signature: Ensures token integrity and authenticity
- Created by signing the header + payload with a secret key
- Used to verify token hasn't been tampered with
How JWT Authentication Works:
Step 1 - User Login:
- User sends credentials to server
- Server validates credentials
- If valid, server creates JWT token
Step 2 - Token Creation:
- Server creates header and payload
- Server generates signature using secret key:
HMAC-SHA256(header.payload, secretKey)
- All three parts are combined:
header.payload.signature
Step 3 - Token Usage:
- Server sends token to client
- Client stores token (localStorage or cookie)
- Client sends token in Authorization header for API requests
Step 4 - Token Verification:
- Server receives token with request
- Server splits token into three parts
- Server recreates signature using same secret key
- If signatures match, token is valid
- Server extracts user info from payload
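A minimal sketch of steps 2 and 4 using the jsonwebtoken package (the payload and expiry are assumptions; the secret must come from configuration):
const jwt = require('jsonwebtoken');

const SECRET = process.env.JWT_SECRET; // never hard-code the secret

// Step 2: create the token at login
const token = jwt.sign({ userId: 123, role: 'admin' }, SECRET, { expiresIn: '15m' });

// Step 4: verify the token on each request
try {
  const payload = jwt.verify(token, SECRET); // throws if tampered with or expired
  console.log(payload.userId, payload.role);
} catch (err) {
  // Reject the request: invalid or expired token
}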
Key Benefits:
Stateless: No need to store session data on server
Scalable: Works across multiple servers
Self-contained: All user info is in the token
Cross-domain: Can work across different domains
Security Considerations:
Secret Key: Never expose the secret key used for signing
Expiry: Always set short expiry times (15-30 minutes)
HTTPS: Always use HTTPS to prevent token interception
Storage: Be careful about XSS if storing in localStorage
Real-world Example:
When user logs into an e-commerce site:
- User enters username/password
- Server validates and creates JWT with user ID, role, expiry
- Client stores JWT and sends it with every API call
- Server verifies JWT and processes request
- When token expires, user needs to login again or refresh token
JWT vs Sessions:
JWT:
- Stateless (no server storage)
- Better for APIs and microservices
- Self-contained
Sessions:
- Stateful (server stores session data)
- Better for traditional web apps
- More secure (data on server)
The choice depends on your architecture - use JWT for REST APIs and distributed systems, sessions for traditional web applications."
If Asked Follow-up Questions:
"How do you handle token expiry?"
"Use refresh tokens. Short-lived access tokens (15 mins) with longer-lived refresh tokens (7 days). When access token expires, use refresh token to get new access token."
"What if someone steals the JWT?"
"That's why we use short expiry times, HTTPS only, and httpOnly cookies when possible. Also implement token blacklisting for logout."
"Can JWT be modified?"
"If someone modifies the payload, the signature won't match because they don't have the secret key. Server will reject the token."
"Where do you store JWT on client?"
"For web apps: httpOnly cookies for security, or localStorage for convenience but with XSS risk. For mobile: secure storage."
Question: "Explain Cookies, Sessions, Tokens, and Local Storage for authentication."
Your Answer:
"These are four different ways to handle user authentication and data storage. Let me explain each:
1. COOKIES - Automatic Browser Storage
What it is:
Cookies are small pieces of data that the server sends to the browser, and the browser automatically sends them back with every request.
How it works:
- Server creates cookie and sends to browser
- Browser stores it automatically
- Browser includes cookie in every HTTP request to that domain
- Server reads cookie data from request
Authentication use:
User logs in → Server creates cookie: authId=abc123 → Browser stores it →
Every request includes: Cookie: authId=abc123 → Server validates cookie
Example: When you login to Facebook, server sets cookie with session ID. Now every page you visit automatically sends this cookie.
2. SESSIONS - Server-Side Storage
What it is:
Session is user data stored on the server, identified by a session ID that's typically stored in a cookie.
How it works:
- User logs in โ Server creates session data in memory/database
- Server generates unique session ID
- Session ID is sent to browser via cookie
- Browser sends session ID back with requests
- Server looks up session data using this ID
Authentication flow:
Login → Server creates: sessions[abc123] = {userId: 456, role: 'admin'} →
Cookie: sessionId=abc123 → Server uses ID to fetch user data
Example: Traditional web applications where user data is stored on server for security.
3. TOKENS (JWT) - Self-Contained Authentication
What it is:
A token is an encoded string containing user information that can be verified without storing anything on the server.
How JWT works:
- Contains 3 parts: Header.Payload.Signature
- Payload has user info (userId, role, expiry)
- Signature ensures token hasn't been tampered with
- Server can verify token without database lookup
Authentication flow:
Login → Server creates JWT token with user info → Client stores token →
Client sends: Authorization: Bearer <token> → Server verifies signature
Example: REST APIs where each request includes JWT token in Authorization header.
4. LOCAL STORAGE - Browser Client Storage
What it is:
Browser's built-in storage that persists data locally, accessible via JavaScript.
How it works:
- JavaScript can store/retrieve data: localStorage.setItem('token', 'abc123')
- Data persists even after browser closes
- Available to JavaScript on same domain
- 5-10MB storage capacity
Authentication use:
Login → Store token: localStorage.setItem('authToken', token) →
API calls → Get token: localStorage.getItem('authToken') →
Send manually: headers: { Authorization: 'Bearer ' + token }
Example: Single Page Applications (SPAs) where JavaScript manages authentication.
Key Differences Summary:
Storage Location:
- Cookies: Browser (managed automatically)
- Sessions: Server-side (secure)
- Tokens: Client-side (self-contained)
- Local Storage: Browser (manual JavaScript)
Security:
- Cookies: Can be HttpOnly (XSS safe), but CSRF risk
- Sessions: Most secure (data on server)
- Tokens: Stateless but vulnerable if stolen
- Local Storage: Vulnerable to XSS attacks
Usage:
- Cookies: Automatic with every request
- Sessions: Server looks up data using session ID
- Tokens: Manual inclusion in headers
- Local Storage: Manual JavaScript handling
When to Use What:
Use Cookies + Sessions when:
- Traditional web applications
- Maximum security needed
- Server-side rendering
- Simple user flows
Use Tokens (JWT) when:
- REST APIs
- Mobile applications
- Microservices architecture
- Need stateless authentication
Use Local Storage when:
- Single Page Applications (SPAs)
- Need persistent client-side data
- Want manual control over auth flow
- Client-side JavaScript frameworks
Intermediate
What is full text search?
Full-text search indexes the words inside text fields so documents can be searched by keywords and ranked by relevance, instead of scanning every row with LIKE queries. Examples: MySQL FULLTEXT indexes, PostgreSQL tsvector, Elasticsearch.
What is a Serverless vs. Serverful Backend?
A serverful backend means you manage the entire server, while a serverless backend means you don't have to manage servers; your code runs only when needed on cloud platforms like AWS Lambda.
Example: Imagine you are building a food delivery app like Zomato or Uber Eats.
If you use a serverful backend:
You set up an Express.js server on AWS EC2.
The server is always running, handling all API requests like fetching restaurants, placing orders, and tracking deliveries.
You pay for the server 24/7, even when there are no active users.
If you use a serverless backend:
You use AWS Lambda functions to handle API requests.
When a user places an order, the function runs only for that request and then shuts down.
You only pay for execution time, making it cost-effective.
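A minimal AWS Lambda handler sketch in Node.js for the order endpoint (the API Gateway proxy event shape and response fields are assumptions):
// Runs only when invoked (e.g., via API Gateway), then shuts down
exports.handler = async (event) => {
  const order = JSON.parse(event.body); // assumes the API Gateway proxy event shape
  // ...validate and store the order...
  return {
    statusCode: 201,
    body: JSON.stringify({ status: 'placed' }),
  };
};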
Can you explain single-threaded vs. multi-threaded processing?
Single-threaded programs execute one task at a time, while multi-threaded programs can execute multiple tasks in parallel. However, single-threaded systems can still be asynchronous using event loops, as in Node.js. If I were building a CPU-intensive app like a video editor, I'd go with multi-threading; but for an API server handling multiple users, I'd use a single-threaded, asynchronous model like Node.js to handle requests efficiently.
Web Server Request Handling: Full Interview Deep Dive
Understand how web servers handle various types of requests, what part of the system gets triggered, and why CPU, disk, and memory are used in different ways.
Case 1: Static File Request (e.g., GET /index.html)
Architecture:
Client → Web Server (Nginx, Apache) → Disk
Step | Description | CPU Used? | Why |
---|---|---|---|
1 | TCP Connection Establishment | ✅ | OS uses CPU threads to handle the new socket connection |
2 | TLS Handshake (if HTTPS) | ✅✅ | Public-key crypto (RSA/ECC), key exchange: very CPU intensive |
3 | HTTP Request Parsing | ✅ | Server reads headers, URL, method |
4 | Check In-Memory Cache | ⚠️ Sometimes | If file is cached, skip disk I/O (saves time and CPU) |
5 | Disk I/O → Read File | ⚠️ + I/O | Slowest part if uncached (mechanical disk = even slower) |
6 | Build HTTP Response | ✅ | Add headers, content-type, status, etc. |
7 | Send Response (TCP Send) | ✅ | Network stack and syscalls involve CPU |
Conclusion:
- Mostly I/O bound, but CPU handles parsing & networking
- With HTTPS, CPU spikes due to encryption
Case 2: Dynamic Request (Backend involved)
e.g., GET /profile?id=10
Architecture:
Client → Web Server → Backend Server → DB
Step | Description | CPU Used? | Why |
---|---|---|---|
1 | TCP + TLS Handshake | ✅✅ | Same as static case |
2 | Request Parsing | ✅ | Headers, query params |
3 | Reverse Proxy to Backend | ✅ | Web server forwards via IPC/port |
4 | Backend App Logic | ✅✅ | Routing, auth, business logic (CPU heavy) |
5 | Database Query | ⚠️ CPU + I/O | Reads/writes involve disk and DB engine CPU |
6 | Response Generation (HTML/JSON) | ✅✅ | Templating or serialization is CPU-bound |
7 | Send Response → Client | ✅ | Network transmission |
Conclusion:
- This is both CPU + I/O bound
- More cores help in scaling
- Backend does the heavy lifting, web server is just the router
Case 3: Cached Response
Architecture:
Client → Web Server → Cache (Redis/Memcached/internal) → Client
Step | Description | CPU Used? | Why |
---|---|---|---|
1 | TCP + HTTP Parsing | ✅ | Normal |
2 | Cache Lookup (Memory) | ⚠️ | Fast RAM lookup, nearly no disk or backend call |
3 | Response Ready → Send | ✅ | Minimal CPU for sending back |
Conclusion:
- Fastest flow among all
- Skips backend & disk I/O → highly efficient
- Caching = performance booster
Case 4: Reverse Proxy (Static + Dynamic Mix)
Architecture:
Client → Nginx (Reverse Proxy) → Static OR Backend
Step | Description | CPU Used? | Why |
---|---|---|---|
1 | Request to Nginx | ✅ | Parses incoming request |
2 | Nginx Checks Routes | ✅ | Matches URI patterns |
3 | Serve Static (if matched) | ⚠️ | Disk read if not cached |
4 | Else Proxy to Backend | ✅ | Same as Case 2 |
5 | Send Response Back | ✅ | Nginx acts as gateway |
Conclusion:
- Nginx = Traffic Manager
- Smart separation between static and dynamic content
- Efficient request routing saves resources
Case 5: HTTPS (TLS) Request
Step | Description | CPU Used? | Why |
---|---|---|---|
1 | TCP Connection | ✅ | Basic connection setup |
2 | TLS Handshake | ✅✅✅ | Expensive: cert validation, RSA/AES/ECC operations |
3 | HTTP Parsing | ✅ | After TLS tunnel established |
Conclusion:
- TLS is CPU-heavy
- TLS Offloading to Cloudflare or Load Balancer is often used
Case 6: API Request (POST JSON)
Architecture:
Client → Web Server/API Gateway → Backend → DB
Step | Description | CPU Used? | Why |
---|---|---|---|
1 | Receive POST | ✅ | TCP + header parsing |
2 | JSON Body Parsing | ✅✅ | Deserialization consumes CPU |
3 | Business Logic | ✅✅ | Auth, validation, core logic |
4 | DB Query | ⚠️ | DB fetch/update |
5 | Build JSON Response | ✅✅ | JSON.stringify() or equivalent |
6 | Send Response | ✅ | Network syscall |
Conclusion:
- APIs (especially large JSON) are CPU-bound
- Parsing/serializing JSON = CPU cycles
- Use optimized libraries (like fast-json-stringify, etc.)
Case 7: File Upload / Download
Architecture:
Client → Web Server → Disk / Object Store (e.g., S3)
Step | Description | CPU Used? | Why |
---|---|---|---|
1 | TCP + Parse | ✅ | Start request |
2 | Read File Chunks (Upload) | ✅ + I/O | Buffered I/O reads |
3 | Write to Disk/S3 | ⚠️ | Disk or network-based I/O |
4 | Send Acknowledgement | ✅ | Final response |
Conclusion:
- I/O-bound process, CPU handles chunking and buffering
- Network & Disk performance matter a lot here
HTTP/2 and HTTP/3 Support in Web Servers
What is HTTP?
- HTTP (HyperText Transfer Protocol) is an application-layer protocol used for communication between clients (like browsers) and web servers.
- Versions: HTTP/1.1 → HTTP/2 → HTTP/3
Why HTTP/2 and HTTP/3?
- To improve latency, reduce page load times, and utilize modern internet features like multiplexing, better compression, and faster handshake.
HTTP/1.1 Limitations (Why Upgrade?)
- Head-of-line (HOL) blocking: one slow resource blocks others.
- Multiple TCP connections needed → overhead.
- No compression of headers.
- High latency in handshake and transfer.
HTTP/2 Features
1. Multiplexing
- Multiple streams (requests/responses) over a single TCP connection.
- No need for multiple TCP connections.
Browser
  req1 ─┐
  req2 ─┼─► Server
  req3 ─┘
  (one TCP connection)
2. Binary Framing
- All messages (headers, data) are encoded in binary format instead of plain text, which is faster and more compact.
3. Header Compression (HPACK)
- HTTP headers are compressed to save bandwidth.
4. Server Push (Optional)
- Server can "push" resources (CSS/JS/fonts) before the client even asks.
- Useful in predictable page loads.
Client → Server: GET /index.html
Server → Client: /index.html + /style.css + /app.js (pushed without asking)
HTTP/3: What Changed Again?
Uses the QUIC protocol instead of TCP
QUIC = Quick UDP Internet Connections (built by Google)
Why QUIC?
TCP has these problems:
Slow connection setup (3-way handshake).
Head-of-Line blocking at the TCP level.
Connection loss resets everything.
Web Server vs Application Server - Deep Dive
1. What is a Web Server?
Primary Role:
A web server handles static content such as:
- HTML
- CSS
- JavaScript
- Images (JPG, PNG, etc.)
It serves files directly from disk to the client browser.
Think of a Web Server like a waiter: it brings pre-cooked food (static files) to your table.
Features of a Web Server
Feature | Description |
---|---|
Static File Serving | Serves .html, .css, .js, and images directly from the file system. |
SSL/TLS Termination | Handles HTTPS encryption/decryption (SSL certificates). |
Caching | Stores frequently requested files in memory to improve speed. |
Load Balancing | Distributes incoming requests across multiple App Servers. |
Popular Web Servers
- Apache HTTPD (older but reliable)
- Nginx (very fast, efficient)
- Caddy (auto HTTPS with Let's Encrypt)
2. What is an Application Server?
Primary Role:
An Application Server handles dynamic content. It:
- Executes backend code
- Fetches data from databases
- Performs business logic
Think of an Application Server as a chef: it cooks fresh food (generates dynamic content) based on your order (request).
Features of an App Server
Feature | Description |
---|---|
Code Execution | Runs backend code (e.g. Express, Django, Spring Boot) |
DB Connectivity | Connects to databases like MySQL, MongoDB, PostgreSQL |
Session Management | Maintains user session, login state, etc. |
Transactions | Ensures atomic DB operations (commit or rollback) |
Common Examples
Language | Application Servers |
---|---|
Node.js | Express.js, NestJS |
Java | Tomcat, Jetty, WildFly |
Python | Django, Flask, FastAPI |
PHP | Laravel, Symfony |
3. How They Work Together
Client (Browser / Mobile App)
  ↓
Web Server (Nginx / Apache)
  ↓
Static Route? → Serve static file directly
  ↓
Dynamic Route? → Forward to App Server
  ↓
App Server (Express / Django)
  ↓
DB, Business Logic Execution
  ↓
Response sent back via Web Server
  ↓
Client receives result
Why do we separate static and dynamic content handling?
Performance: Static files (e.g., images, JS) can be cached and served quickly by a web server like Nginx.
Scalability: Separating allows static content to be offloaded from the heavier app server.
Security: Keeps the app logic isolated; static servers don't need access to databases or internal logic.
Simplicity: Web servers are optimized for speed and concurrency, while app servers are optimized for logic and computation.
- Can a single server act as both web and application server?
Yes, especially in small-scale setups.
Node.js Express, Django, and Spring Boot can serve both static and dynamic content.
However, in production, it's a best practice to separate them:
Nginx (web server) handles routing, SSL, compression.
App server handles dynamic requests.
Technical
How does Nginx improve performance with caching and load balancing?
- Caching: stores frequent responses (e.g., HTML pages, JSON APIs) in memory, reducing load on backend app servers and databases.
- Load Balancing: distributes incoming traffic across multiple app servers. Methods: Round Robin, Least Connections, IP Hash. Ensures high availability and scalability.
- Extra features: connection pooling, GZIP compression, SSL offloading.
What happens when an HTTPS request reaches Nginx?
- TLS Handshake: Nginx decrypts the request using the SSL certificate, ensuring data confidentiality and authenticity.
- Routing: Nginx uses server_name and location blocks to match the request.
- Proxying (if configured): passes the decrypted request to a backend app server over HTTP (or internal HTTPS).
- Response: Nginx sends the encrypted response back to the client.
You can also use Nginx as a reverse proxy + SSL terminator.
What Is a Presigned URL?
A presigned URL is a special type of temporary, secure link that allows someone to access a specific resource, like a file in cloud storage, without logging in or having permanent credentials.
It gives permission to perform actions like:
- Uploading a file
- Downloading a file
- Deleting a file
... for a limited time.
This is especially useful when you:
- Want users to upload or download files without giving them full access to your server or cloud.
- Need secure sharing without managing login systems or API keys.
How It Works (Behind the Scenes)
Let's break down the upload process using a YouTube-like example:
Step 1: Client Requests a Presigned URL
When a user wants to upload a video, the client (e.g., browser or mobile app) sends a request to YouTube's backend asking for a presigned URL.
Step 2: Server Generates Presigned URL
The backend (YouTube server) generates a secure, short-lived URL using:
- The file path (Key)
- HTTP method (PUT for upload)
- Expiry time (e.g., 5 minutes)
- A cryptographic signature created using AWS credentials
Step 3: URL Is Sent to Client
The server returns the presigned URL to the user's device.
Step 4: Client Uploads File Directly to Cloud
The client uploads the video directly to S3 using the URL, bypassing the application server entirely.
Step 5: S3 Validates & Stores the File
S3 checks the URL's validity:
- Is the signature correct?
- Has the URL expired?
If valid, the upload is accepted and stored. The backend can then be notified to process or catalog the file.
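A minimal sketch of step 2 using the AWS SDK for JavaScript v3 (the bucket name, key, region, and expiry are assumptions):
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

const s3 = new S3Client({ region: 'us-east-1' }); // assumed region

async function createUploadUrl() {
  const command = new PutObjectCommand({
    Bucket: 'my-video-bucket', // assumed bucket name
    Key: 'uploads/video.mp4',  // assumed object key
  });
  // URL is valid for 5 minutes; the client PUTs the file straight to S3
  return getSignedUrl(s3, command, { expiresIn: 300 });
}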
What's Inside a Presigned URL?
A presigned URL contains:
- The target resource (bucket + file path)
- The action allowed (PUT, GET, DELETE)
- Expiry timestamp
- A secure signature (HMAC with access key)
This ensures that only authorized, time-bound operations are allowed.
Why Use Presigned URLs Instead of Traditional Uploads?
Traditional Upload | Presigned URL |
---|---|
File flows through backend | File uploads directly to cloud |
Backend must handle large files | Backend just creates the URL |
Slower and expensive | Fast and scalable |
Higher server load | Offloaded to cloud (e.g., S3) |
Exposes infrastructure to risks | Link auto-expires, more secure |
Presigned URLs are:
- Faster
- Cheaper
- More secure
- Easier to scale
AJAX: Asynchronous JavaScript and XML
What is AJAX?
AJAX is a technique used in web development to send and receive data from a server asynchronously without reloading the entire web page.
AJAX allows partial page updates, making web apps fast and interactive.
Full Form:
Asynchronous
JavaScript
And
XML (Originally XML, now mostly JSON is used)
Real-World Example:
Google Search Suggestions:
When you type in Google's search bar, suggestions appear immediately without reloading the page. This is powered by AJAX.
Technologies Involved:
Technology | Role |
---|---|
HTML/CSS | Structure & styling |
JavaScript | Logic and events |
XMLHttpRequest / fetch() | Send/receive data to/from the server |
JSON/XML | Data format used for communication |
DOM | Update the web page dynamically |
How AJAX Works (Step-by-Step):
- User interacts with the web page (e.g., clicks a button).
- JavaScript sends a request to the server (in background).
- Server processes the request and sends data back.
- JavaScript receives the data and updates the web page (without reload).
Example Code (Using fetch API):
// Send AJAX request to server
fetch('/api/user')
.then(response => response.json())
.then(data => {
// Update page dynamically
document.getElementById('username').innerText = data.name;
});
Database Partitioning vs Sharding
Introduction
As data grows exponentially in modern systems, managing and querying large datasets efficiently becomes critical. Two common approaches to handle large-scale databases are:
- Partitioning: Dividing data within a single database.
- Sharding: Distributing data across multiple databases or servers.
Both techniques improve performance, scalability, and maintainability, but they serve different purposes and operate at different levels of system architecture.
1. What is Partitioning?
Definition:
Partitioning is the process of dividing a single large table or index into smaller, manageable pieces called partitions.
These partitions are still part of the same logical table and are managed by the same database engine.
Types of Partitioning:
Type | Description | Use Case |
---|---|---|
Range | Data split by value range in a column | Time-based data (logs, sales) |
List | Data split by discrete column values | Country/region/user-type |
Hash | Data distributed via a hash function | Even load distribution |
Composite | Combines two types (e.g., Range + Hash) | Multi-dimensional datasets |
Horizontal vs Vertical Partitioning:
Type | Description | Use Case |
---|---|---|
Horizontal | Split rows across partitions | Logs, user records, transactions |
Vertical | Split columns across tables | Sensitive vs non-sensitive data |
Benefits
- Faster queries (due to partition pruning)
- Easier maintenance (backup/drop/archive)
- Scalability within a single database
Drawbacks
- Added schema complexity
- Not all DBs support all partition types
- Uneven data can cause data skew
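A minimal range-partitioning sketch in PostgreSQL syntax (the table, columns, and date ranges are illustrative):
-- Parent table, partitioned by sale date
CREATE TABLE sales (
  id        BIGINT,
  sale_date DATE NOT NULL,
  amount    NUMERIC
) PARTITION BY RANGE (sale_date);

-- One partition per year; queries filtered to 2024 only scan sales_2024
CREATE TABLE sales_2024 PARTITION OF sales
  FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');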
2. What is Sharding?
Definition:
Sharding is the process of splitting a dataset across multiple physical databases or servers, each called a shard.
Each shard holds a subset of the entire data and can be queried independently.
Types of Sharding:
Type | Description | Use Case |
---|---|---|
Horizontal | Different rows in each shard | Large user base split by user_id |
Vertical | Different tables or services per shard | Microservices with separate schemas |
Geo-Sharding | Based on geography or region | Global apps (e.g., Asia, EU users) |
Example:
Shard | Data Range |
---|---|
Shard 1 | user_id 1-10 million |
Shard 2 | user_id 10M-20 million |
Shard 3 | user_id 20M-30 million |
Tools That Support Sharding:
- MongoDB (built-in)
- Vitess (MySQL)
- Citus (PostgreSQL)
- Cassandra (sharded by design)
- ElasticSearch (auto-sharding)
Benefits
- True horizontal scaling
- Improved availability & fault isolation
- Handles very large datasets across regions
Drawbacks
- Complex to implement and maintain
- Cross-shard joins are difficult
- Requires careful shard key design
- Complex backup & consistency management
Partitioning vs Sharding: Comparison Table
Feature | Partitioning | Sharding |
---|---|---|
Scope | Inside one database | Across multiple databases/servers |
Managed By | Database Engine | Application or Shard Middleware |
Logical Unit | Table partition | Database/shard |
Cross-Partition Joins | Supported | Difficult or unsupported |
Scalability | Limited to DB machine | Horizontally scalable |
Use Case | Structured, large tables | Global-scale systems (Facebook, etc.) |
Summary
- Partitioning is suitable for scaling within a single database and improving query performance for large tables.
- Sharding is ideal for massive-scale, distributed systems that require true horizontal scaling and fault tolerance.
Use the right strategy based on your system's architecture, data volume, and scalability requirements.
Difference Between Observability and Monitoring
Aspect | Monitoring | Observability |
---|---|---|
Definition | Collecting predefined metrics to track system health | Understanding the internal state of a system by analyzing outputs |
Goal | Detect known issues and alert when something breaks | Investigate and diagnose unknown or complex issues |
Approach | Reactive: predefined checks and dashboards | Proactive: enables asking new questions and exploring behavior |
Focus | Known problems | Unknown unknowns |
Components | Metrics, alerts, dashboards | Metrics + logs + traces (the 3 pillars of observability) |
Tools | Prometheus, Nagios, Zabbix | OpenTelemetry, Grafana, Jaeger, Honeycomb |
Use case | Alert when CPU > 90% | Understand why latency is increasing randomly |
Analogy | Thermometer shows temperature (monitoring) | Doctor uses symptoms + scans + history to diagnose (observability) |
Example
Monitoring:
- You set a rule: "Alert me if memory usage goes above 90%".
- You get notified when it does.
Observability:
- Your app slows down.
- You don't know why.
- You dive into metrics, traces, and logs, and see a DB call is slow due to network latency.
- You find a misconfigured load balancer in a specific region.
Key Takeaway:
Monitoring is a subset of Observability.
Observability is about having enough data and tooling to answer any question about your system, even if you didn't anticipate the issue in advance.
What is OpenTelemetry?
OpenTelemetry is a vendor-neutral, open-source observability framework by the CNCF that provides standardized tools to collect, process, and export telemetry data (specifically metrics, logs, and traces) from applications and infrastructure.
It consists of:
- SDKs for instrumentation, and
- A collector component that receives telemetry data, processes it (like batching or sampling), and exports it to observability backends like New Relic, Prometheus, Jaeger, or any OTLP-compatible platform.
Why OpenTelemetry is Powerful
What makes OpenTelemetry powerful is that it decouples telemetry generation from storage or visualization.
You write once using OTel SDKs and can export to any backend without being locked into a vendor.
Real-World Example
In my previous project at Janitri, I used OpenTelemetry SDKs in the backend to instrument REST APIs and used the OpenTelemetry Collector to forward metrics to Prometheus.
Logs and traces were optionally integrated via extensions.
In a New Relic Setup
This same SDK can send data directly to New Relic via the OTLP exporter, giving you full-stack visibility with no vendor-specific lock-in.
Conclusion
That's the beauty of OpenTelemetry:
- It's interoperable
- It's future-proof
- It aligns deeply with New Relic's support for open standards
What is Prometheus?
Prometheus is an open-source, time-series database and monitoring system originally developed by SoundCloud and now part of the CNCF (Cloud Native Computing Foundation).
It is designed to collect and store metrics from systems and applications using a pull-based model.
How Prometheus Works
- Prometheus scrapes data from exposed endpoints (typically /metrics).
- It stores this data in its local time-series database (TSDB).
- Querying is done using its powerful query language called PromQL.
- It supports rule-based alerting using its built-in component called Alertmanager.
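A minimal sketch of the pull model: an Express app exposing a /metrics endpoint with the prom-client package for Prometheus to scrape (the port and metric names are assumptions):
const express = require('express');
const client = require('prom-client');

client.collectDefaultMetrics(); // CPU, memory, event-loop metrics

const httpRequests = new client.Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
});

const app = express();
app.use((req, res, next) => { httpRequests.inc(); next(); });

// Prometheus pulls from this endpoint on its scrape interval
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', client.register.contentType);
  res.end(await client.register.metrics());
});

app.listen(9100);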
Key Characteristics
Feature | Description |
---|---|
Pull-Based Model | Prometheus pulls metrics data from targets, instead of targets pushing data |
Metric-Focused | Only handles metrics (no support for logs or traces) |
PromQL | A flexible and powerful query language |
No Built-in Clustering | Does not support native clustering or long-term storage out of the box |
Extensibility | Can be extended using projects like Thanos or Cortex for high availability |
Real-World Example (Janitri Project)
In my project at Janitri, I used Prometheus alongside OpenTelemetry to collect real-time metrics related to API performance.
I visualized this data using Grafana, which gave immediate insights, although the setup required some effort and configuration.
Why Prometheus with OpenTelemetry?
OpenTelemetry is a telemetry generation and export framework, not a full observability stack.
It collects metrics, logs, and traces from applications using SDKs and exports them to a backend.
Prometheus is one such backend, specialized in metrics.
Integration Flow
- I used OpenTelemetry SDKs to instrument my application.
- Then I used the OpenTelemetry Collector to expose metrics in Prometheus format via the /metrics receiver.
- Prometheus scraped this data, stored it, and allowed me to:
  - Query it using PromQL
  - Set up alerts via Alertmanager
Conclusion
Prometheus completed what OpenTelemetry started:
- OTel was the producer
- Prometheus was the consumer, storage, and query engine
This architecture was:
- Modular
- Flexible
- Future-proof
If needed, I could easily swap Prometheus with any OTLP-compatible backend (e.g., New Relic) without changing instrumentation code; in New Relic's case, I can just add an OTLP exporter to forward all telemetry to its platform.
That's the power of combining OpenTelemetry with open, pluggable tools like Prometheus.
Full Observability Stack using OpenTelemetry
This architecture illustrates how telemetry flows from instrumented code all the way to dashboards using tools like OpenTelemetry, Prometheus, Loki, Jaeger, and Grafana.
1. Instrumentation Layer (Your Code)
Add OpenTelemetry SDKs to generate telemetry (metrics, logs, traces).
You can use:
- Auto-instrumentation agents (e.g., for Node.js, Python, Java)
- Manual instrumentation (tracer.startSpan(), meter.record(), etc.)
2. Collector Layer
The OpenTelemetry Collector is the heart of the pipeline:
- Receives data via receivers
- Processes data (optional) via processors
- Sends data to exporters (e.g., Prometheus, Jaeger)
You can run the Collector as:
- Agent: runs locally on each host (lightweight)
- Gateway: centralized telemetry router (common in prod)
3. Backend Layer
These are the specialized storage tools for each data type:
Data Type | Tool | Purpose |
---|---|---|
Metrics | Prometheus | Monitoring, alerting, dashboards |
Logs | Loki | Log aggregation & searchable logs |
Traces | Jaeger/Tempo | Distributed tracing & request flow |
These tools store and index the telemetry so that Grafana (or New Relic) can query them.
4. Visualization Layer (Grafana)
- Grafana connects to:
  - Prometheus (for metrics)
  - Loki (for logs)
  - Jaeger/Tempo (for traces)
- Unified dashboards for all observability pillars
- Create alerts (e.g., CPU > 80%, error rate > 5%)
- Supports full correlation: logs, traces, and metrics from one screen
Key Interview Lines You Can Drop
"The OpenTelemetry Collector acts as a hub where all telemetry (metrics, logs, traces) is routed, transformed, and exported."
"Grafana sits on top as the visual UI, but the data lifeblood flows from instrumented apps through OpenTelemetry."
"In a real production setup, this model gives me flexibility: swap out Prometheus with New Relic just by changing the exporter."
Git Merge vs Rebase vs Squash - Complete Guide
The Problem
You have a feature branch with commits A, B, and C. Meanwhile, commits D and E have been added to the main branch. What do you do now?
main: 1---2---D---E
\
feature: A---B---C
Option 1: Git Merge
What happens:
git checkout main
git merge feature-branch
Result:
main: 1---2---D---E---M
\ /
feature: A---B---C
Simple Explanation:
- A merge commit (M) is created
- The history of both branches is preserved
- The graph forms a "knot"-like structure
When to use:
- When you want the complete history
- When team collaboration needs transparency
- When you want to track the feature branch's detailed development
Option 2: Git Rebase
What happens:
git checkout feature-branch
git rebase main
Result:
main: 1---2---D---E---A'---B'---C'
Simple Explanation:
- Moves the feature branch commits onto the tip of main
- Gives a clean, linear history
- Original commits A,B,C become A',B',C' (new commit IDs)
When to use:
- When you want a clean, linear history
- To avoid complex merge conflicts
- Preferred in professional projects
Option 3: Squash Commits
What happens:
git checkout main
git merge --squash feature-branch
git commit -m "Add complete feature X"
Result:
main: 1---2---D---E---S
Simple Explanation:
- Combines all feature commits (A+B+C) into one single commit (S)
- Main shows just one clean commit
- The detail of individual commits is lost on main
When to use:
- When you want a clean history on main
- When you don't want feature-development details on main
- A popular approach on GitHub/GitLab
Real World Scenarios
Scenario 1: Small Personal Project
Use: Simple merge
- History complexity doesn't matter
- Quick and easy
Scenario 2: Professional Team Project
Use: Rebase + Fast-forward merge
- Clean linear history
- Easy to track changes
- Professional appearance
Scenario 3: Open Source Project
Use: Squash commits
- Main branch stays clean
- Contributors' detailed work is preserved on the feature branch
- Easy to review and rollback
Commands Summary
Merge:
git checkout main
git merge feature-branch
Rebase:
git checkout feature-branch
git rebase main
git checkout main
git merge feature-branch # Fast-forward merge
Squash:
git checkout main
git merge --squash feature-branch
git commit -m "Descriptive message"