# PYTHON 101: INTRODUCTION TO PYTHON FOR DATA ANALYTICS

Python is a versatile and powerful programming language, widely used in data analytics due to its simplicity and the vast ecosystem of libraries tailored for data processing. In this guide, we'll cover the essentials you need to get started with Python for data analytics, including variables, data types, control structures, functions, and an introduction to NumPy, a fundamental library for numerical computing.

## INTRODUCTION TO PYTHON FOR DATA ANALYTICS CONCEPTS

### Variables

Variables are containers for storing data values. In Python, you don’t need to declare the type of a variable explicitly, as it is inferred based on the value you assign.

CODE:

```
age = 30
name = "John"
```

### Data Types

Python has various built-in data types:

Integers: Whole numbers (10, -5)

Floats: Decimal numbers (3.14, -2.5)

Strings: Text data ("Hello", "123")

Booleans: True or False values (True, False)

Lists: Ordered, mutable collections of items ([1, 2, 3])

Dictionaries: Key-value pairs ({"name": "John", "age": 30})

CODE:

```
x = 10
print(type(x))
```
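As a small sketch of the types listed above, the snippet below calls type() on one value of each kind. The names `samples` and `type_names` are only illustrative.

```python
# One example value for each built-in type listed above
samples = [10, 3.14, "Hello", True, [1, 2, 3], {"name": "John", "age": 30}]

# type(value).__name__ gives the short name of each value's type
type_names = [type(value).__name__ for value in samples]
print(type_names)  # ['int', 'float', 'str', 'bool', 'list', 'dict']
```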

### Lists vs. Tuples

Lists are mutable, meaning you can modify their elements after creation.

CODE:

```
my_list = [1, 2, 3]
my_list[0] = 4
print(my_list) # The Output is [4, 2, 3]
```

Tuples are immutable, meaning once they are created, their values cannot be changed.

CODE:

```
my_tuple = (1, 2, 3)
# my_tuple[0] = 4
print(my_tuple) # The Output is (1, 2, 3)
```

```
# my_tuple[0] = 4 would raise the following error
TypeError Traceback (most recent call last)
Cell In[12], line 2
1 my_tuple = (1, 2, 3)
----> 2 my_tuple[0] = 4
3 print(my_tuple) # The Output is (1, 2, 3)
TypeError: 'tuple' object does not support item assignment
```
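If you want to see the error without stopping your script, one option is to catch it with try/except. The sketch below also shows a common workaround: building a new tuple instead of modifying the existing one. The names `message` and `new_tuple` are illustrative.

```python
my_tuple = (1, 2, 3)

try:
    my_tuple[0] = 4  # Tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print(message)

# To "change" a tuple, build a new one from the pieces you want to keep
new_tuple = (4,) + my_tuple[1:]
print(new_tuple)  # (4, 2, 3)
print(my_tuple)   # (1, 2, 3), the original is untouched
```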

### Comparison Operators

Comparison operators allow you to compare values:

==: Equal to

!=: Not equal to

>: Greater than

<: Less than

CODE:

```
x = 5
y = 10
print(x > y) # The Output Is False
```
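For comparison, here is a short sketch that applies all four operators to the same pair of values. The dictionary `results` is just an illustrative way to collect the outcomes.

```python
x = 5
y = 10

# Each comparison evaluates to a Boolean
results = {
    "==": x == y,  # False: 5 is not equal to 10
    "!=": x != y,  # True
    ">": x > y,    # False
    "<": x < y,    # True
}
print(results)
```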

### Logical Operators

Logical operators are used to combine conditional statements:

and: True if both conditions are true

or: True if at least one condition is true

not: Reverses the result (True becomes False)

CODE:

```
x = 5
y = 10
print(x < 10 and y > 5) # The Output Is True
```
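The same values can be used to try out or and not as well. This is a minimal sketch; the variable names `both`, `either`, and `negated` are illustrative.

```python
x = 5
y = 10

both = x < 10 and y > 5    # True: both conditions hold
either = x > 10 or y > 5   # True: the second condition holds
negated = not (x < 10)     # False: x < 10 is True, and not reverses it
print(both, either, negated)  # True True False
```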

### Membership Operators

Membership operators check if an item is present in a sequence (list, tuple, string):

in: True if the item is found

not in: True if the item is not found

CODE:

```
my_list = [1, 2, 3]
print(3 in my_list) # The Output Is True
```
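Membership tests behave slightly differently depending on the container. The sketch below, using illustrative names, shows not in on a list, a substring check on a string, and the fact that in on a dictionary looks at keys rather than values.

```python
my_list = [1, 2, 3]
missing = 5 not in my_list            # True: 5 is not in the list

# On strings, in checks for a substring
has_word = "data" in "data analytics"  # True

# On dictionaries, in checks the keys, not the values
person = {"name": "John", "age": 30}
has_key = "name" in person             # True
has_value = "John" in person           # False: "John" is a value, not a key

print(missing, has_word, has_key, has_value)  # True True True False
```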

### If-Else Statements

Conditional statements allow decision-making:

CODE:

```
x = 5
if x > 5:
    print("x is greater than 5")
else:
    print("x is less than or equal to 5") # The Output Is x is less than or equal to 5
```

### For Loops

Loops allow you to iterate over sequences:

CODE:

```
for i in range(5):
    print(i) # Prints 0, 1, 2, 3, 4, one per line
```
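In data work you will more often loop over a collection of values than over a range of numbers. Here is a minimal sketch that computes an average by hand; the list `scores` is made-up sample data.

```python
scores = [70, 80, 90]  # Hypothetical sample data

# Add up the values one at a time
total = 0
for score in scores:
    total += score

average = total / len(scores)
print(average)  # 80.0
```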

### Functions

Functions enable code reuse. You define a function using the def keyword, followed by the function name and its parameters in parentheses.

CODE:

```
def greet(name):
    return f"Hello, {name}!"

print(greet("John")) # The Output Is Hello, John!
```
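A function can also take several parameters, and a parameter can have a default value that is used when the caller leaves it out. The function `describe` below is a hypothetical example, not part of any library.

```python
def describe(values, label="values"):
    """Return a one-line summary of a list of numbers."""
    return f"{label}: min={min(values)}, max={max(values)}, count={len(values)}"

# Passing label explicitly overrides the default
summary = describe([3, 7, 5], label="scores")
print(summary)  # scores: min=3, max=7, count=3

# Leaving label out falls back to the default "values"
default_summary = describe([3, 7, 5])
print(default_summary)  # values: min=3, max=7, count=3
```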

## NUMPY

Python alone is powerful, but for large-scale data analytics and mathematical operations, NumPy is essential. NumPy introduces a high-performance, multi-dimensional array object known as ndarray, which is much more efficient for numerical computations than Python's built-in lists.

### NumPy Arrays vs. Python Lists

- Lists: Flexible, can store mixed data types, but are slower for numerical operations.

CODE:

```
my_list_1 = [11, 21, 31, 41]
```

- NumPy Arrays: Homogeneous (all elements are of the same type) and optimized for performance.

CODE:

```
import numpy as np
my_array = np.array([10, 20, 30, 40])
```

NumPy arrays are faster and more efficient because they use contiguous memory. Python lists store each element as an independent object in memory, whereas NumPy arrays store data in a block of memory, making it easier and faster to perform operations like matrix multiplication and element-wise arithmetic.

### Creating NumPy Arrays

You can create arrays in NumPy using various functions.

CODE:

```
import numpy as np
# Creating a simple array
arr = np.array([1, 2, 3, 4])
# Creating an array of zeros
zeros = np.zeros(5)
# Creating an array with a range of values
range_arr = np.arange(1, 10, 2)
```
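To see what these functions produce, you can inspect the results. The sketch below prints the contents along with the dtype and shape attributes that every NumPy array carries.

```python
import numpy as np

zeros = np.zeros(5)              # Five zeros, stored as floats by default
range_arr = np.arange(1, 10, 2)  # Start at 1, stop before 10, step by 2

print(zeros.tolist())      # [0.0, 0.0, 0.0, 0.0, 0.0]
print(range_arr.tolist())  # [1, 3, 5, 7, 9]
print(zeros.dtype)         # float64
print(range_arr.shape)     # (5,)
```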

### Operations with NumPy Arrays

NumPy allows you to perform element-wise operations on arrays, which is not as straightforward with Python lists.

CODE:

```
import numpy as np
arr = np.array([1, 2, 3, 4])
arr2 = arr * 2 # Element-wise multiplication
print(arr2) # The Output Is [2 4 6 8]
```
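To make the contrast with lists concrete, here is a short sketch. Multiplying a list by 2 repeats it rather than doubling each element, so with plain lists you need a loop or a list comprehension. The variable names are illustrative.

```python
import numpy as np

my_list = [1, 2, 3, 4]

repeated = my_list * 2                           # Repeats the list
doubled_list = [value * 2 for value in my_list]  # Doubles each element

arr = np.array(my_list)
doubled_arr = arr * 2                            # Doubles each element directly
summed = arr + np.array([10, 20, 30, 40])        # Adds arrays element by element

print(repeated)              # [1, 2, 3, 4, 1, 2, 3, 4]
print(doubled_list)          # [2, 4, 6, 8]
print(doubled_arr.tolist())  # [2, 4, 6, 8]
print(summed.tolist())       # [11, 22, 33, 44]
```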

### Memory Efficiency in NumPy

NumPy arrays consume less memory compared to lists because arrays store elements of the same data type, allowing for more compact storage. For instance, a Python list stores references to each item, while a NumPy array stores data directly in contiguous memory locations, making operations faster and more memory-efficient.
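You can get a rough feel for the difference with sys.getsizeof and the array's nbytes attribute. This is only an estimate under stated assumptions: it assumes CPython, it fixes the array dtype to int64 so the result does not depend on the platform, and it counts every integer object in the list even though Python may share some small integers between containers.

```python
import sys
import numpy as np

n = 1000
my_list = list(range(n))
my_array = np.arange(n, dtype=np.int64)

# The list object holds references; each integer is a separate object
list_bytes = sys.getsizeof(my_list) + sum(sys.getsizeof(item) for item in my_list)

# The array's data buffer: 1000 elements * 8 bytes each
array_bytes = my_array.nbytes

print(array_bytes)               # 8000
print(list_bytes > array_bytes)  # True on CPython
```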

### Converting Data Types in NumPy

NumPy makes it easy to convert data types for numerical computations.

CODE:

```
import numpy as np
arr = np.array([1.0, 2.0, 3.0])
arr_int = arr.astype(int) # Convert array to integers (returns a new array)
```
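One detail worth knowing: converting floats to integers with astype drops the fractional part instead of rounding. If you want rounding, one option is to call np.round first, as in this small sketch with made-up values.

```python
import numpy as np

arr = np.array([1.7, -1.7, 2.2])

truncated = arr.astype(int)          # Fractional part is dropped (toward zero)
rounded = np.round(arr).astype(int)  # Round to nearest first, then convert

print(truncated.tolist())  # [1, -1, 2]
print(rounded.tolist())    # [2, -2, 2]
```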

### Functions in Data Analytics Scripts

In addition to Python's built-in functions, you will often define custom functions for specific tasks like data cleaning, analysis, and transformation. When working with data analytics, functions help modularize your code and make it reusable across different datasets.

CODE:

```
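import numpy as np

def normalize_data(data):
    # Min-max scaling: the smallest value maps to 0 and the largest to 1
    max_value = np.max(data)
    min_value = np.min(data)
    return (data - min_value) / (max_value - min_value)

# Usage with NumPy array
data = np.array([10, 20, 30, 40, 50])
normalized_data = normalize_data(data)
print(normalized_data) # The Output Is [0.   0.25 0.5  0.75 1.  ]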
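One edge case to keep in mind: if every value in the data is the same, the maximum equals the minimum and the formula divides by zero. The sketch below shows one possible way to guard against that. The name `normalize_data_safe` is hypothetical, and returning all zeros for constant data is just one convention you might choose.

```python
import numpy as np

def normalize_data_safe(data):
    """Min-max scale data to [0, 1]; return zeros if all values are equal."""
    data = np.asarray(data, dtype=float)
    min_value = np.min(data)
    value_range = np.max(data) - min_value
    if value_range == 0:
        # Constant data: avoid dividing by zero
        return np.zeros_like(data)
    return (data - min_value) / value_range

scaled = normalize_data_safe([10, 20, 30, 40, 50])
constant = normalize_data_safe([7, 7, 7])
print(scaled.tolist())    # [0.0, 0.25, 0.5, 0.75, 1.0]
print(constant.tolist())  # [0.0, 0.0, 0.0]
```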

### Final Thoughts

Python, combined with libraries like NumPy, provides a solid foundation for data analytics. Understanding key concepts such as variables, data types, loops, and functions, alongside NumPy’s efficient array manipulation, prepares you to handle large datasets with ease. As you progress, you’ll unlock more sophisticated tools in Python’s data analytics ecosystem, including Pandas for data manipulation and Matplotlib for visualization.
