Python's flexibility makes it a go-to choice for scientific computing, but performance optimization becomes essential when dealing with large-scale numerical computations. I've spent countless hours optimizing scientific code, and these eight techniques consistently deliver the most significant performance improvements while maintaining code clarity.
## NumPy Vectorization: The Foundation of Fast Computing
When I first started scientific programming, I wrote loops everywhere. The transformation to vectorized operations changed everything. NumPy vectorization eliminates explicit loops by applying operations to entire arrays simultaneously, leveraging optimized C implementations and SIMD instructions.
```python
import numpy as np
import time

def demonstrate_vectorization():
    # Generate large dataset
    size = 1000000
    data = np.random.rand(size)

    # Slow approach with explicit loop
    start = time.time()
    result_slow = np.zeros(size)
    for i in range(size):
        result_slow[i] = np.sin(data[i]) * np.cos(data[i])
    loop_time = time.time() - start

    # Fast vectorized approach
    start = time.time()
    result_fast = np.sin(data) * np.cos(data)
    vector_time = time.time() - start

    print(f"Loop time: {loop_time:.4f}s")
    print(f"Vectorized time: {vector_time:.4f}s")
    print(f"Speedup: {loop_time / vector_time:.1f}x")

    # Verify results are identical
    print(f"Results match: {np.allclose(result_slow, result_fast)}")

# Advanced vectorization with broadcasting
def compute_distance_matrix(points):
    """Compute pairwise distances efficiently"""
    # points shape: (n_points, n_dimensions)
    # Broadcasting creates (n, 1, d) - (1, n, d) = (n, n, d)
    diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
    distances = np.sqrt(np.sum(diff**2, axis=2))
    return distances

# Example usage
points = np.random.rand(1000, 3)
distances = compute_distance_matrix(points)
print(f"Distance matrix shape: {distances.shape}")
```
The performance difference is staggering. Vectorized operations typically run 10-100 times faster than equivalent Python loops because they push the computation down to optimized C code that can utilize modern CPU features.
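One caveat with the broadcasting approach above: it materializes an (n, n, d) intermediate array, which grows quickly with the number of points. As a minimal sketch, the same pairwise distances can come from scipy.spatial.distance.cdist, which computes them in C without that intermediate (the array size here is just illustrative):

```python
import numpy as np
from scipy.spatial.distance import cdist

points = np.random.rand(1000, 3)

# cdist builds the full pairwise Euclidean distance matrix directly,
# without allocating the (n, n, d) broadcast intermediate
distances = cdist(points, points)
print(distances.shape)  # (1000, 1000)
```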
## Just-In-Time Compilation with Numba
Numba transforms Python functions into optimized machine code at runtime. The @njit decorator (shorthand for @jit(nopython=True)) provides near-C performance without changing Python syntax, making it perfect for numerical algorithms that can't be easily vectorized.
```python
from numba import njit, prange
import numpy as np
import time

@njit
def mandelbrot_iteration(c, max_iter=100):
    """Compute Mandelbrot set membership (JIT compiled)"""
    z = 0j  # start from a complex zero so Numba infers a single type for z
    for i in range(max_iter):
        if abs(z) > 2:
            return i
        z = z*z + c
    return max_iter

@njit(parallel=True)
def generate_mandelbrot(width, height, x_min, x_max, y_min, y_max, max_iter=100):
    """Generate Mandelbrot set using parallel JIT compilation"""
    result = np.zeros((height, width))
    for i in prange(height):      # only the outer loop is parallelized
        for j in range(width):
            # Map pixel to complex plane
            real = x_min + (x_max - x_min) * j / width
            imag = y_min + (y_max - y_min) * i / height
            c = complex(real, imag)
            result[i, j] = mandelbrot_iteration(c, max_iter)
    return result

# JIT-compiled histogram helper
@njit
def optimized_histogram(data, bins):
    """Fast histogram computation with JIT"""
    hist = np.zeros(bins)
    data_min = data.min()
    bin_width = (data.max() - data_min) / bins  # compute extremes once, not per element
    for value in data:
        bin_idx = int((value - data_min) / bin_width)
        if bin_idx >= bins:
            bin_idx = bins - 1
        hist[bin_idx] += 1
    return hist

# Performance comparison
def compare_mandelbrot_performance():
    width, height = 800, 600
    bounds = (-2.5, 1.0, -1.5, 1.5)  # x_min, x_max, y_min, y_max

    # Time JIT compiled version
    start = time.time()
    mandelbrot_fast = generate_mandelbrot(width, height, *bounds)
    jit_time = time.time() - start

    print(f"JIT compiled Mandelbrot: {jit_time:.4f}s")
    return mandelbrot_fast
```
Numba excels at mathematical computations, nested loops, and algorithms that resist vectorization. The parallel=True parameter automatically distributes work across CPU cores for additional speedup.
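One practical caveat worth showing in a small sketch: the first call to a @njit function includes compilation time, so benchmarks should warm the function up before timing (the function and array size below are arbitrary examples):

```python
import time
import numpy as np
from numba import njit

@njit
def sum_of_squares(x):
    total = 0.0
    for v in x:
        total += v * v
    return total

data = np.random.rand(1000000)

start = time.time()
sum_of_squares(data)   # first call: compiles the function, then runs it
print(f"First call (includes compilation): {time.time() - start:.4f}s")

start = time.time()
sum_of_squares(data)   # second call: reuses the cached machine code
print(f"Second call: {time.time() - start:.4f}s")
```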
## Memory Layout Optimization
Memory access patterns significantly impact performance. Modern CPUs have complex cache hierarchies, and optimizing data layout to work with these caches can provide substantial speedups.
```python
import numpy as np
import time

def demonstrate_memory_layout():
    # Create test matrices
    size = 1000
    matrix_c = np.random.rand(size, size)   # C-contiguous (row-major)
    matrix_f = np.asfortranarray(matrix_c)  # Fortran-contiguous (column-major)

    print(f"C-contiguous flags: {matrix_c.flags}")
    print(f"F-contiguous flags: {matrix_f.flags}")

    # Row-wise operations (favor C-contiguous)
    start = time.time()
    for i in range(size):
        row_sum = np.sum(matrix_c[i, :])
    c_row_time = time.time() - start

    start = time.time()
    for i in range(size):
        row_sum = np.sum(matrix_f[i, :])
    f_row_time = time.time() - start

    # Column-wise operations (favor F-contiguous)
    start = time.time()
    for j in range(size):
        col_sum = np.sum(matrix_c[:, j])
    c_col_time = time.time() - start

    start = time.time()
    for j in range(size):
        col_sum = np.sum(matrix_f[:, j])
    f_col_time = time.time() - start

    print(f"Row operations - C: {c_row_time:.4f}s, F: {f_row_time:.4f}s")
    print(f"Column operations - C: {c_col_time:.4f}s, F: {f_col_time:.4f}s")

# Cache-friendly matrix multiplication
def cache_friendly_matmul(A, B, block_size=64):
    """Block-wise matrix multiplication for better cache usage"""
    n, m, p = A.shape[0], A.shape[1], B.shape[1]
    C = np.zeros((n, p))

    for i0 in range(0, n, block_size):
        for j0 in range(0, p, block_size):
            for k0 in range(0, m, block_size):
                # Define block boundaries
                i1 = min(i0 + block_size, n)
                j1 = min(j0 + block_size, p)
                k1 = min(k0 + block_size, m)
                # Multiply blocks
                C[i0:i1, j0:j1] += A[i0:i1, k0:k1] @ B[k0:k1, j0:j1]
    return C

# Memory pre-allocation for better performance
class PerformanceBuffer:
    """Reusable buffer to avoid memory allocation overhead"""

    def __init__(self, max_size):
        self.buffer = np.empty(max_size)
        self.max_size = max_size

    def get_view(self, size):
        if size > self.max_size:
            raise ValueError(f"Requested size {size} exceeds buffer size {self.max_size}")
        return self.buffer[:size]

# Usage example
buffer = PerformanceBuffer(1000000)
working_array = buffer.get_view(50000)
# Use working_array for computations without allocation cost
```
Understanding memory layout helps optimize algorithms for modern hardware. C-contiguous arrays perform better for row-wise operations, while Fortran-contiguous arrays excel at column-wise access patterns.
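When an algorithm has to sweep the columns of a large C-ordered array many times, it can pay to convert the layout once up front. A minimal sketch of that pattern, assuming the one-time copy is amortized over many passes:

```python
import numpy as np

matrix_c = np.random.rand(2000, 2000)    # C-contiguous by default

# One-time copy into column-major layout before repeated column access
matrix_f = np.asfortranarray(matrix_c)

# Column sweeps now walk contiguous memory
col_norms = np.empty(matrix_f.shape[1])
for j in range(matrix_f.shape[1]):
    col_norms[j] = np.linalg.norm(matrix_f[:, j])
```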
## Sparse Matrix Operations
Many scientific problems involve matrices with mostly zero values. Sparse matrix representations dramatically reduce memory usage and computation time by storing only non-zero elements.
```python
import scipy.sparse as sp
import numpy as np
import time

def sparse_matrix_efficiency():
    """Demonstrate sparse matrix advantages"""
    # Create large sparse matrix (finite difference operator)
    n = 10000
    main_diag = -2 * np.ones(n)
    off_diag = np.ones(n-1)

    # Create sparse tridiagonal matrix
    sparse_matrix = sp.diags([off_diag, main_diag, off_diag],
                             [-1, 0, 1],
                             shape=(n, n),
                             format='csr')

    # Memory comparison
    dense_matrix = sparse_matrix.toarray()
    sparse_memory = (sparse_matrix.data.nbytes + sparse_matrix.indices.nbytes
                     + sparse_matrix.indptr.nbytes)
    dense_memory = dense_matrix.nbytes

    print(f"Sparse matrix memory: {sparse_memory / 1024**2:.2f} MB")
    print(f"Dense matrix memory: {dense_memory / 1024**2:.2f} MB")
    print(f"Memory reduction: {dense_memory / sparse_memory:.1f}x")

    # Performance comparison for matrix-vector multiplication
    vector = np.random.rand(n)

    start = time.time()
    result_sparse = sparse_matrix @ vector
    sparse_time = time.time() - start

    start = time.time()
    result_dense = dense_matrix @ vector
    dense_time = time.time() - start

    print(f"Sparse multiplication: {sparse_time:.6f}s")
    print(f"Dense multiplication: {dense_time:.6f}s")
    print(f"Speedup: {dense_time / sparse_time:.1f}x")

# Sparse linear system solving
def solve_sparse_system():
    """Efficient sparse linear system solution"""
    from scipy.sparse.linalg import spsolve

    # Create sparse coefficient matrix (Poisson equation discretization)
    n = 1000
    A = sp.diags([-1, 4, -1], [-1, 0, 1], shape=(n, n), format='csc')
    b = np.random.rand(n)

    # Solve sparse system
    start = time.time()
    x_sparse = spsolve(A, b)
    sparse_solve_time = time.time() - start

    # Compare with dense solve
    A_dense = A.toarray()
    start = time.time()
    x_dense = np.linalg.solve(A_dense, b)
    dense_solve_time = time.time() - start

    print(f"Sparse solve time: {sparse_solve_time:.6f}s")
    print(f"Dense solve time: {dense_solve_time:.6f}s")
    print(f"Solution error: {np.linalg.norm(x_sparse - x_dense):.2e}")

# Custom sparse matrix operations
def create_sparse_graph_laplacian(adjacency_matrix):
    """Create graph Laplacian from adjacency matrix"""
    degree_matrix = sp.diags(np.array(adjacency_matrix.sum(axis=1)).flatten())
    laplacian = degree_matrix - adjacency_matrix
    return laplacian

# Generate example graph
n_nodes = 5000
density = 0.001
adjacency = sp.random(n_nodes, n_nodes, density=density, format='csr')
adjacency = (adjacency + adjacency.T) / 2  # Make symmetric
laplacian = create_sparse_graph_laplacian(adjacency)
```
Sparse matrices are essential for large-scale scientific computing. They enable solving systems with millions of unknowns that would be impossible with dense representations due to memory constraints.
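At that scale even a sparse direct factorization can become too expensive, and iterative Krylov solvers are the usual next step because they only need matrix-vector products. A minimal sketch using scipy.sparse.linalg.cg on the same kind of tridiagonal operator (the problem size and default tolerances are arbitrary choices):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# Symmetric positive definite tridiagonal system, as in the example above
n = 100000
A = sp.diags([-1, 4, -1], [-1, 0, 1], shape=(n, n), format='csr')
b = np.random.rand(n)

# Conjugate gradient never factors A; it only applies A to vectors
x, info = cg(A, b)
print("converged" if info == 0 else f"cg returned info={info}")
print(f"Residual norm: {np.linalg.norm(A @ x - b):.2e}")
```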
## Parallel Processing Strategies
Modern computers have multiple CPU cores, and parallel processing can dramatically accelerate computations. Python offers several approaches for parallel execution, each suited to different problem types.
```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import numpy as np
import time
from numba import prange, jit

def cpu_intensive_task(data_chunk):
    """Simulate CPU-intensive scientific computation"""
    result = np.zeros_like(data_chunk)
    for i, value in enumerate(data_chunk):
        # Complex mathematical operation
        result[i] = np.sum([np.sin(value * k) * np.cos(value / k)
                            for k in range(1, 100)])
    return result

def parallel_processing_comparison():
    """Compare different parallel processing approaches"""
    # Generate test data
    data_size = 10000
    data = np.random.rand(data_size)
    n_processes = mp.cpu_count()
    chunk_size = data_size // n_processes

    # Serial processing
    start = time.time()
    result_serial = cpu_intensive_task(data)
    serial_time = time.time() - start

    # Multiprocessing with a process pool
    start = time.time()
    with ProcessPoolExecutor(max_workers=n_processes) as executor:
        chunks = [data[i:i+chunk_size] for i in range(0, data_size, chunk_size)]
        results = list(executor.map(cpu_intensive_task, chunks))
        result_parallel = np.concatenate(results)
    parallel_time = time.time() - start

    print(f"Serial time: {serial_time:.4f}s")
    print(f"Parallel time: {parallel_time:.4f}s")
    print(f"Speedup: {serial_time / parallel_time:.1f}x")
    print(f"Efficiency: {(serial_time / parallel_time) / n_processes:.1%}")

@jit(nopython=True, parallel=True)
def parallel_matrix_operations(matrix):
    """Numba parallel matrix processing"""
    rows, cols = matrix.shape
    result = np.zeros_like(matrix)
    for i in prange(rows):
        for j in range(cols):
            # Complex operation on each element
            result[i, j] = np.sqrt(matrix[i, j]**2 +
                                   np.sin(matrix[i, j]) +
                                   np.cos(matrix[i, j]))
    return result

# Parallel Monte Carlo simulation
@jit(nopython=True)
def monte_carlo_pi_chunk(n_samples):
    """Estimate π using Monte Carlo method"""
    count = 0
    for _ in range(n_samples):
        x = np.random.random()
        y = np.random.random()
        if x*x + y*y <= 1.0:
            count += 1
    return count

def parallel_monte_carlo_pi(total_samples, n_processes=None):
    """Parallel Monte Carlo π estimation"""
    if n_processes is None:
        n_processes = mp.cpu_count()
    samples_per_process = total_samples // n_processes

    with ProcessPoolExecutor(max_workers=n_processes) as executor:
        futures = [executor.submit(monte_carlo_pi_chunk, samples_per_process)
                   for _ in range(n_processes)]
        total_hits = sum(future.result() for future in futures)

    # Divide by the number of samples actually drawn (handles rounding from //)
    actual_samples = samples_per_process * n_processes
    pi_estimate = 4.0 * total_hits / actual_samples
    return pi_estimate

# Usage example
if __name__ == "__main__":
    pi_estimate = parallel_monte_carlo_pi(10000000)
    print(f"π estimate: {pi_estimate:.6f}")
    print(f"Error: {abs(pi_estimate - np.pi):.6f}")
```
The key to effective parallelization is matching the approach to the problem structure. CPU-bound tasks benefit from multiprocessing, while I/O-bound operations often work better with threading.
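Threads are the lighter-weight option for I/O-bound work because the GIL is released while waiting on disk or the network. A minimal sketch of that side, assuming a batch of data files to read (the file names here are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def load_measurement_file(path):
    """I/O-bound task: read one data file from disk."""
    return Path(path).read_bytes()

# Hypothetical file names purely for illustration
file_paths = [f"run_{i:03d}.dat" for i in range(32)]

# Threads overlap the waiting time of many reads; the GIL is released during I/O
with ThreadPoolExecutor(max_workers=8) as executor:
    raw_files = list(executor.map(load_measurement_file, file_paths))

print(f"Loaded {len(raw_files)} files")
```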
## BLAS Integration and Linear Algebra Optimization
NumPy automatically uses optimized BLAS (Basic Linear Algebra Subprograms) libraries when available. These implementations provide hardware-specific optimizations for fundamental mathematical operations that form the backbone of scientific computing.
```python
import numpy as np
import time

def demonstrate_blas_performance():
    """Show BLAS optimization benefits"""
    # Check which BLAS library NumPy is using
    print("NumPy configuration:")
    np.show_config()

    # Large matrix operations that benefit from BLAS
    size = 2000
    A = np.random.rand(size, size)
    B = np.random.rand(size, size)

    # Matrix multiplication (uses GEMM from BLAS)
    start = time.time()
    C = A @ B
    matmul_time = time.time() - start

    # Eigenvalue decomposition (uses LAPACK)
    start = time.time()
    eigenvals, eigenvecs = np.linalg.eigh(A @ A.T)
    eigen_time = time.time() - start

    print(f"Matrix multiplication ({size}x{size}): {matmul_time:.4f}s")
    print(f"Eigenvalue decomposition: {eigen_time:.4f}s")

# Optimized linear algebra operations
def solve_linear_systems_efficiently():
    """Demonstrate efficient linear system solutions"""
    n = 1000

    # Symmetric positive definite system (use Cholesky)
    A_spd = np.random.rand(n, n)
    A_spd = A_spd @ A_spd.T  # Make SPD
    b = np.random.rand(n)

    # Cholesky decomposition for SPD matrices
    start = time.time()
    L = np.linalg.cholesky(A_spd)
    y = np.linalg.solve(L, b)
    x_chol = np.linalg.solve(L.T, y)
    chol_time = time.time() - start

    # General LU decomposition
    start = time.time()
    x_lu = np.linalg.solve(A_spd, b)
    lu_time = time.time() - start

    print(f"Cholesky solve: {chol_time:.6f}s")
    print(f"LU solve: {lu_time:.6f}s")
    print(f"Speedup: {lu_time / chol_time:.1f}x")
    print(f"Solution error: {np.linalg.norm(x_chol - x_lu):.2e}")

# Matrix operations with different data types
def optimize_data_types():
    """Demonstrate performance impact of data types"""
    size = 1000

    # Different precision levels
    A_f64 = np.random.rand(size, size).astype(np.float64)
    A_f32 = A_f64.astype(np.float32)

    # Float64 operations
    start = time.time()
    result_f64 = A_f64 @ A_f64.T
    f64_time = time.time() - start

    # Float32 operations (often faster on modern hardware)
    start = time.time()
    result_f32 = A_f32 @ A_f32.T
    f32_time = time.time() - start

    print(f"Float64 computation: {f64_time:.6f}s")
    print(f"Float32 computation: {f32_time:.6f}s")
    print(f"Speedup: {f64_time / f32_time:.1f}x")
    print(f"Memory usage ratio: {A_f64.nbytes / A_f32.nbytes:.1f}x")

# Advanced BLAS operations
def advanced_blas_operations():
    """Use advanced BLAS routines directly"""
    from scipy.linalg import blas

    # Direct BLAS calls for maximum performance
    m, n, k = 1000, 800, 1200
    A = np.random.rand(m, k)
    B = np.random.rand(k, n)

    # Use GEMM directly (General Matrix Multiply); dgemm returns the product
    start = time.time()
    C = blas.dgemm(alpha=1.0, a=A, b=B)
    blas_time = time.time() - start

    # Compare with NumPy
    start = time.time()
    C_numpy = A @ B
    numpy_time = time.time() - start

    print(f"Direct BLAS: {blas_time:.6f}s")
    print(f"NumPy: {numpy_time:.6f}s")
    print(f"Results match: {np.allclose(C, C_numpy)}")
```
BLAS integration provides the foundation for high-performance scientific computing. Understanding which operations benefit from optimized libraries helps guide algorithm design decisions.
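One concrete way to act on that: scipy.linalg.solve accepts an assume_a hint that declares the matrix structure, so SciPy dispatches to the matching LAPACK routine instead of a general LU factorization. A short sketch, assuming a symmetric positive definite system like the Cholesky example above:

```python
import numpy as np
from scipy.linalg import solve

n = 1000
A = np.random.rand(n, n)
A = A @ A.T + n * np.eye(n)   # well-conditioned symmetric positive definite test matrix
b = np.random.rand(n)

# 'pos' tells SciPy the matrix is SPD, so a Cholesky-based LAPACK path is used
x_pos = solve(A, b, assume_a='pos')

# The default treats A as a general matrix (LU factorization)
x_gen = solve(A, b)

print(f"Max difference between solutions: {np.max(np.abs(x_pos - x_gen)):.2e}")
```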
## Memory Mapping for Large Datasets
Memory mapping allows processing datasets larger than available RAM by loading portions on-demand. This technique is crucial for analyzing massive scientific datasets without memory constraints.
```python
import numpy as np
import h5py
import time
from pathlib import Path

def create_large_dataset():
    """Create a large HDF5 dataset for demonstration"""
    filename = "large_scientific_data.h5"
    n_samples = 1000000
    n_features = 100

    with h5py.File(filename, 'w') as f:
        # Create dataset with chunking and compression
        dataset = f.create_dataset('data',
                                   shape=(n_samples, n_features),
                                   dtype=np.float32,
                                   chunks=True,
                                   compression='gzip')

        # Fill with synthetic scientific data
        for i in range(0, n_samples, 10000):
            end_idx = min(i + 10000, n_samples)
            chunk_data = np.random.randn(end_idx - i, n_features).astype(np.float32)
            dataset[i:end_idx] = chunk_data

        # Add metadata
        dataset.attrs['description'] = 'Synthetic scientific measurement data'
        dataset.attrs['units'] = 'arbitrary units'
        dataset.attrs['sampling_rate'] = 1000.0

    return filename

def memory_mapped_analysis(filename):
    """Analyze large dataset using memory mapping"""
    with h5py.File(filename, 'r') as f:
        data = f['data']
        n_samples, n_features = data.shape

        print(f"Dataset shape: {data.shape}")
        print(f"Dataset size: {data.nbytes / 1024**3:.2f} GB")

        # Compute statistics without loading entire dataset
        chunk_size = 10000
        running_sum = np.zeros(n_features)
        running_sum_sq = np.zeros(n_features)

        for i in range(0, n_samples, chunk_size):
            end_idx = min(i + chunk_size, n_samples)
            chunk = data[i:end_idx]
            running_sum += np.sum(chunk, axis=0)
            running_sum_sq += np.sum(chunk**2, axis=0)

        # Calculate mean and standard deviation
        mean = running_sum / n_samples
        variance = (running_sum_sq / n_samples) - mean**2
        std = np.sqrt(variance)

        print(f"Mean values (first 5 features): {mean[:5]}")
        print(f"Std values (first 5 features): {std[:5]}")

# NumPy memory mapping
def numpy_memmap_example():
    """Demonstrate NumPy memory mapping"""
    # Create large array on disk
    filename = "large_array.dat"
    shape = (100000, 1000)

    # Create memory-mapped array
    mmap_array = np.memmap(filename, dtype='float32', mode='w+', shape=shape)

    # Fill with data (only loads necessary pages)
    for i in range(0, shape[0], 1000):
        end_idx = min(i + 1000, shape[0])
        mmap_array[i:end_idx] = np.random.randn(end_idx - i, shape[1])

    # Perform operations on memory-mapped array
    start = time.time()
    column_means = np.mean(mmap_array, axis=0)
    operation_time = time.time() - start

    print(f"Memory-mapped operation time: {operation_time:.4f}s")
    print(f"Column means shape: {column_means.shape}")

    # Clean up
    del mmap_array
    Path(filename).unlink(missing_ok=True)

# Streaming data processing
class StreamingProcessor:
    """Process large datasets in streaming fashion"""

    def __init__(self, chunk_size=10000):
        self.chunk_size = chunk_size
        self.processed_samples = 0

    def process_chunk(self, chunk):
        """Process a single chunk of data"""
        # Example: compute chunk statistics
        chunk_stats = {
            'mean': np.mean(chunk, axis=0),
            'std': np.std(chunk, axis=0),
            'min': np.min(chunk, axis=0),
            'max': np.max(chunk, axis=0)
        }
        return chunk_stats

    def process_file(self, filename):
        """Process entire file in chunks"""
        results = []
        with h5py.File(filename, 'r') as f:
            data = f['data']
            n_samples = data.shape[0]

            for i in range(0, n_samples, self.chunk_size):
                end_idx = min(i + self.chunk_size, n_samples)
                chunk = data[i:end_idx]
                chunk_result = self.process_chunk(chunk)
                results.append(chunk_result)
                self.processed_samples += chunk.shape[0]

                if i % (self.chunk_size * 10) == 0:
                    print(f"Processed {self.processed_samples}/{n_samples} samples")
        return results

# Usage example
def demonstrate_memory_mapping():
    # Create test dataset
    filename = create_large_dataset()

    # Analyze using memory mapping
    memory_mapped_analysis(filename)

    # Process in streaming fashion
    processor = StreamingProcessor(chunk_size=50000)
    results = processor.process_file(filename)
    print(f"Processed {len(results)} chunks")

    # Clean up
    Path(filename).unlink(missing_ok=True)
```
Memory mapping transforms how we handle large-scale scientific data. Instead of being limited by RAM, we can process datasets of arbitrary size with predictable memory usage.
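The same idea applies to existing .npy files: np.load can open them as a memory map so that only the slices you touch are read from disk. A small sketch, using a throwaway file name chosen just for illustration:

```python
import numpy as np

# Save an array to disk once (stand-in for a real dataset file)
np.save("measurements.npy", np.random.rand(100000, 50).astype(np.float32))

# mmap_mode='r' maps the file instead of loading it; pages are read on access
data = np.load("measurements.npy", mmap_mode="r")

# Only the rows touched here are actually pulled from disk
subset_mean = data[:1000].mean(axis=0)
print(subset_mean.shape)
```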
## Profile-Guided Optimization
Profiling identifies computational bottlenecks and guides optimization efforts toward the most impactful improvements. Python provides several excellent profiling tools for scientific code analysis.
```python
import cProfile
import pstats
import numpy as np
from functools import wraps

def profile_function(func):
    """Decorator for profiling individual functions"""
    @wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        profiler.enable()
        result = func(*args, **kwargs)
        profiler.disable()

        stats = pstats.Stats(profiler)
        stats.sort_stats('cumulative')
        stats.print_stats(10)  # Top 10 functions
        return result
    return wrapper

# Scientific computation example for profiling
@profile_function
def complex_scientific_computation(n=1000):
    """Example scientific computation with performance bottlenecks"""
    # Bottleneck 1: Inefficient matrix creation
    def create_matrix_slow(size):
        matrix = np.zeros((size, size))
        for i in range(size):
            for j in range(size):
                matrix[i, j] = np.sin(i) * np.cos(j)
        return matrix

    # Bottleneck 2: Redundant computations
    def redundant_computation(matrix):
        result = np.zeros_like(matrix)
        for i in range(matrix.shape[0]):
            for j in range(matrix.shape[1]):
                result[i, j] = np.exp(matrix[i, j]) + np.log(abs(matrix[i, j]) + 1)
        return result

    # Bottleneck 3: Memory allocation in loop
    def memory_inefficient_operation(data):
        results = []
        for i in range(len(data)):
            temp = np.array([data[i] * k for k in range(100)])
            results.append(np.sum(temp))
        return np.array(results)

    # Execute bottlenecked operations
    matrix = create_matrix_slow(n)
    processed = redundant_computation(matrix)
    final_result = memory_inefficient_operation(processed.flatten()[:1000])
    return final_result

# Optimized version for comparison
@profile_function
def optimized_scientific_computation(n=1000):
    """Optimized version of the same computation"""
    # Optimization 1: Vectorized matrix creation
    i_vals = np.arange(n)[:, np.newaxis]
    j_vals = np.arange(n)[np.newaxis, :]
    matrix = np.sin(i_vals) * np.cos(j_vals)

    # Optimization 2: Vectorized computation
    processed = np.exp(matrix) + np.log(np.abs(matrix) + 1)

    # Optimization 3: Pre-allocated arrays and vectorization
    data = processed.flatten()[:1000]
    k_vals = np.arange(100)
    temp_matrix = data[:, np.newaxis] * k_vals[np.newaxis, :]
    final_result = np.sum(temp_matrix, axis=1)
    return final_result

# Memory profiling
def memory_profiling_example():
    """Demonstrate memory usage profiling"""
    import tracemalloc

    # Start tracing
    tracemalloc.start()

    # Memory-intensive operations
    large_arrays = []
    for i in range(10):
        array = np.random.rand(1000, 1000)
        transformed = np.fft.fft2(array)
        large_arrays.append(transformed)

    # Get memory statistics
    current, peak = tracemalloc.get_traced_memory()
    print(f"Current memory usage: {current / 1024**2:.2f} MB")
    print(f"Peak memory usage: {peak / 1024**2:.2f} MB")
    tracemalloc.stop()
```