FatimaAlam1234

Posted on Feb 21 • Updated on Mar 6

Python Interview Questions

How do you copy an object in Python?

In Python, the assignment statement (= operator) does not copy objects. Instead, it binds the target variable name to the existing object. To create a copy of an object, we can use the copy module, which offers two kinds of copy:

Shallow Copy creates a new container object but does not copy the objects inside it. The new object holds the same values as the original; if any of those values is a reference to another object (such as a nested list), only the reference is copied, so both copies share that nested object.
Deep Copy copies all values recursively from source to target object, i.e. it even duplicates the objects referenced by the source object.

from copy import copy, deepcopy
list_1 = [1, 2, [3, 5], 4]
## shallow copy
list_2 = copy(list_1) 
list_2[3] = 7
list_2[2].append(6)
list_2    # output => [1, 2, [3, 5, 6], 7]
list_1    # output => [1, 2, [3, 5, 6], 4]
## deep copy
list_3 = deepcopy(list_1)
list_3[3] = 8
list_3[2].append(7)
list_3    # output => [1, 2, [3, 5, 6, 7], 8]
list_1    # output => [1, 2, [3, 5, 6], 4]

What is the difference between xrange and range in Python?

xrange() and range() are quite similar in terms of functionality. Both produce a sequence of integers; the only difference is that, in Python 2, range() returns a Python list, whereas xrange() returns an xrange object.

So how does that make a difference? Unlike range(), xrange() doesn't build a static list; it computes each value on demand. This kind of lazy evaluation is similar to what generators do with "yielding".

Lazy evaluation is crucial in applications where memory is a constraint. Creating a static list as in range() can lead to a MemoryError in such conditions, while xrange() needs only a small, fixed amount of memory regardless of the length of the sequence.

# Python 2 only: xrange and the print statement do not exist in Python 3
for i in xrange(10):         # numbers from 0 to 9
    print i                  # output => 0 1 2 3 4 5 6 7 8 9
for i in xrange(1, 10):      # numbers from 1 to 9
    print i                  # output => 1 2 3 4 5 6 7 8 9
for i in xrange(1, 10, 2):   # skip by two for next
    print i                  # output => 1 3 5 7 9

Note: xrange() was removed in Python 3. The Python 3 range() behaves like the Python 2 xrange(): it returns a lazy range object rather than a list, which is why xrange() was already the better choice in Python 2.
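For readers on Python 3, here is a minimal sketch of the same idea using range(). The variable names are made up for illustration, and the size comparison assumes CPython, where sys.getsizeof reports the size of the container itself.

```python
import sys

numbers = range(1, 10, 2)          # lazy: no list is built here
as_list = list(numbers)            # materialise the values explicitly

big_range = range(1_000_000)
big_list = list(big_range)

print(as_list)                     # [1, 3, 5, 7, 9]
print(len(numbers), numbers[2])    # 5 5
print(sys.getsizeof(big_range) < sys.getsizeof(big_list))  # True
```

A range object still supports len() and indexing, so it can replace a list in most loops without the memory cost.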

What is pickling and unpickling?

Python's standard library offers serialization out of the box. Serializing an object means transforming it into a format that can be stored, so that it can be deserialized later to obtain the original object. This is where the pickle module comes into play.

Pickling:

Pickling is the name of the serialization process in Python. Most Python objects can be serialized into a byte stream and written to a file (a few, such as open file handles and lambda functions, cannot). The pickled data is fairly compact and can be compressed further. Moreover, pickle keeps track of the objects it has already serialized, so shared references are stored only once, and newer Python versions can read pickles written by older ones.
The function used for the above process is pickle.dump().

Unpickling:

Unpickling is the complete inverse of pickling. It deserializes the byte stream to recreate the objects stored in the file and loads the object to memory.
The function used for the above process is pickle.load().

Note: Python has another, more primitive, serialization module called marshal, which exists primarily to support .pyc files in Python and differs significantly from pickle.
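A small round trip with pickle.dump() and pickle.load() might look like the sketch below. The record dictionary is invented for the example, and an in-memory io.BytesIO buffer stands in for a real file so the snippet leaves nothing on disk. Only unpickle data you trust, since loading a pickle can execute arbitrary code.

```python
import io
import pickle

record = {"name": "example", "scores": [90, 85], "nested": {"ok": True}}

buffer = io.BytesIO()            # stands in for a file opened with mode "wb"
pickle.dump(record, buffer)      # pickling: object -> byte stream

buffer.seek(0)                   # rewind, as if reopening the file for reading
restored = pickle.load(buffer)   # unpickling: byte stream -> object

print(restored == record)        # True: same contents
print(restored is record)        # False: a new object was created
```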

What are generators in Python?

Generators are functions that produce a sequence of items lazily, one at a time. They offer a simpler way to create iterators: instead of return, a generator function uses the yield keyword, and calling it returns a generator object that produces the values on demand.
Let's try and build a generator for Fibonacci numbers:

# generate Fibonacci numbers below n
def fib(n):
    p, q = 0, 1
    while p < n:
        yield p
        p, q = q, p + q

x = fib(10)  # create generator object

# iterate using the built-in next(), which works in both Python 2 and Python 3
next(x)  # output => 0
next(x)  # output => 1
next(x)  # output => 1
next(x)  # output => 2
next(x)  # output => 3
next(x)  # output => 5
next(x)  # output => 8
next(x)  # raises StopIteration: the generator is exhausted

# iterate using a loop
for i in fib(10):
    print(i)  # output => 0 1 1 2 3 5 8

What is PYTHONPATH in Python?

PYTHONPATH is an environment variable which you can set to add additional directories where Python will look for modules and packages. This is especially useful in maintaining Python libraries that you do not wish to install in the global default location.
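One way to see the effect is to start a child interpreter with PYTHONPATH set and look at its sys.path. This is only a sketch: the helper name child_sys_path and the temporary directory are made up for the demonstration, and it assumes the current interpreter can be relaunched through sys.executable.

```python
import os
import subprocess
import sys
import tempfile

def child_sys_path(extra_dir):
    """Return sys.path as seen by a child interpreter started with PYTHONPATH set."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

with tempfile.TemporaryDirectory() as library_dir:
    paths = child_sys_path(library_dir)
    found = library_dir in paths   # the extra directory should be on the search path

print(found)
```

Directories from PYTHONPATH are searched before the default site-packages location, so modules placed there can be imported without installing them globally.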

What is the use of help() and dir() functions?

help() function in Python is used to display the documentation of modules, classes, functions, keywords, etc. If no parameter is passed to the help() function, then an interactive help utility is launched on the console.
dir() function tries to return a valid list of attributes and methods of the object it is called upon. It behaves differently with different objects, as it aims to produce the most relevant data, rather than the complete information.

For Modules/Library objects, it returns a list of all attributes, contained in that module.
For Class Objects, it returns a list of all valid attributes and base attributes.
With no arguments passed, it returns a list of attributes in the current scope.
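The three cases above can be tried out with a short sketch like this one. The Animal class is a hypothetical example; help() is mentioned only in a comment because it prints to the console.

```python
import math

class Animal:
    """A tiny hypothetical class used only for this demonstration."""
    sound = "generic"

    def speak(self):
        return self.sound

module_attrs = dir(math)     # attributes defined in the math module
class_attrs = dir(Animal)    # own attributes plus those inherited from object
scope_names = dir()          # names defined in the current scope

print("sqrt" in module_attrs)                               # True
print("speak" in class_attrs, "__init__" in class_attrs)    # True True
print("Animal" in scope_names)                              # True

# help(math.sqrt) would print the documentation for math.sqrt to the console.
```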

How are arguments passed by value or by reference in python?

Pass by value: Copy of the actual object is passed. Changing the value of the copy of the object will not change the value of the original object.
Pass by reference: Reference to the actual object is passed. Changing the value of the new object will change the value of the original object.

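Python uses neither model in its pure form. Arguments are passed by object reference (also called "pass by assignment"): the function receives a reference to the same object the caller holds. Mutating that object inside the function is visible to the caller, but rebinding the parameter name to a new object is not.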

def appendNumber(arr):
    arr.append(4)  # mutates the caller's list in place

arr = [1, 2, 3]
print(arr)  # Output: => [1, 2, 3]
appendNumber(arr)
print(arr)  # Output: => [1, 2, 3, 4]
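It can help to contrast mutating an argument with rebinding it. In this sketch the function names mutate and rebind are made up for illustration; only the first one changes what the caller sees.

```python
def mutate(items):
    items.append(4)        # changes the object the caller also refers to

def rebind(items):
    items = [0, 0, 0]      # only the local name now points to a new list
    return items

numbers = [1, 2, 3]
mutate(numbers)
print(numbers)             # [1, 2, 3, 4]

replacement = rebind(numbers)
print(numbers)             # [1, 2, 3, 4]  (unchanged by the rebinding)
print(replacement)         # [0, 0, 0]
```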

What does *args and **kwargs mean?

*args

*args is a special syntax used in a function definition to accept a variable number of positional arguments.
The “*” collects any extra positional arguments into a tuple, and “args” is just the name used by convention. You can use any other.

def multiply(a, b, *argv):
    mul = a * b
    for num in argv:
        mul *= num
    return mul

print(multiply(1, 2, 3, 4, 5))  # output: 120

**kwargs

**kwargs is a special syntax used in a function definition to accept a variable number of keyword arguments.
Here, too, “kwargs” is used just by convention. You can use any other name.
A keyword argument is an argument that is passed to a function by name.
Inside the function, kwargs is a dictionary that maps each argument name to its value.

def tellArguments(**kwargs):
    for key, value in kwargs.items():
        print(key + ": " + value)

tellArguments(arg1="argument 1", arg2="argument 2", arg3="argument 3")

output:

arg1: argument 1
arg2: argument 2
arg3: argument 3
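Both forms can appear in the same signature. The sketch below uses a made-up function, describe, that simply returns what it received so the container types are easy to inspect.

```python
def describe(*args, **kwargs):
    """Return the collected arguments so their types can be inspected."""
    return args, kwargs

positional, keyword = describe(1, 2, 3, unit="cm", scale=2)

print(positional)   # (1, 2, 3)
print(keyword)      # {'unit': 'cm', 'scale': 2}
print(type(positional).__name__, type(keyword).__name__)   # tuple dict
```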

Which is faster, python list or Numpy arrays, and why?
A. NumPy arrays are faster than Python lists for numerical operations. NumPy is a library for working with arrays in Python, and it provides a number of functions for performing operations on arrays efficiently.

One reason why NumPy arrays are faster is the way they store data. A NumPy array keeps its elements in a contiguous block of memory, all of the same data type, and operations on whole arrays run in compiled C loops. A Python list, by contrast, stores references to separate Python objects, so a loop over it pays the interpreter's per-element overhead (type checks and dynamic dispatch) on every step.

What are python sets? Explain some of the properties of sets.
A. In Python, a set is an unordered collection of unique objects. Sets are often used to store a collection of distinct objects and to perform membership tests (i.e., to check if an object is in the set). Sets are defined using curly braces ({ and }) and a comma-separated list of values; an empty set must be created with set(), because {} creates an empty dictionary.

Here are some key properties of sets in Python:

Sets are unordered: Sets do not have a specific order, so you cannot index or slice them like you can with lists or tuples.
Set elements are unique: Sets only allow unique objects, so if you try to add a duplicate object to a set, the set is left unchanged.
Sets are mutable: You can add or remove elements from a set using the add and remove methods.
Sets are not indexed: Sets do not support indexing or slicing, so you cannot access individual elements of a set using an index.
Sets are not hashable: Sets are mutable, so they cannot be used as keys in dictionaries or as elements in other sets. If you need a set as a key or as an element of another set, use a frozenset (an immutable version of a set); a tuple of hashable items also works when you need an ordered group of values.
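The properties above can be checked with a few lines. This is a minimal sketch with made-up values; the last part shows that a plain set is rejected as a set element while a frozenset is accepted as a dictionary key.

```python
colors = {"red", "green", "red"}      # the duplicate "red" is dropped
print(len(colors))                    # 2
print("green" in colors)              # True (membership test)

colors.add("blue")
colors.remove("red")
print(sorted(colors))                 # ['blue', 'green']

empty = set()                         # {} would create a dict instead
groups = {frozenset({1, 2}): "pair"}  # a frozenset is hashable

try:
    {{1, 2}}                          # a plain set as an element is rejected
except TypeError as error:
    print("TypeError:", error)
```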

What are 2 mutable and 2 immutable data types in python?
A. 2 mutable data types are:
Dictionary
List
2 immutable data types are:
Tuple
String
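A quick sketch of what the difference looks like in practice (the variable names are invented for the example): the list and dictionary change in place, while item assignment on a tuple or string raises a TypeError.

```python
scores = [1, 2]
before = id(scores)
scores.append(3)                 # list changed in place
print(id(scores) == before)      # True: still the same object

settings = {"debug": False}
settings["debug"] = True         # dict changed in place

point = (1, 2)
name = "python"
for immutable in (point, name):
    try:
        immutable[0] = "x"       # item assignment is not supported
    except TypeError as error:
        print(type(immutable).__name__, "->", error)
```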

Q13. Why does NumPy have huge popularity in the field of data science?
NumPy has gained a lot of popularity in the data science community because it provides fast and efficient tools for working with large arrays and matrices of numerical data.
It uses optimized C and Fortran code behind the scenes to perform these operations, which makes them much faster than equivalent operations using Python’s built-in data structures.

NumPy provides a large number of functions for performing mathematical and statistical operations on arrays and matrices.
It allows you to work with large amounts of data efficiently. It provides tools for handling large datasets that would not fit in memory, such as functions for reading and writing data to disk and for loading only a portion of a dataset into memory at a time.
NumPy integrates well with other scientific computing libraries in Python, such as SciPy (Scientific Python) and pandas. This makes it easy to use NumPy with other libraries to perform more complex data science tasks.

Q14. Explain list comprehension and dict comprehension.
A. List comprehension and dict comprehension are both concise ways to create new lists or dictionaries from existing iterables.

# List comprehension for squares of even numbers
squares = [x**2 for x in range(10) if x % 2 == 0]
print(squares)
# Output: [0, 4, 16, 36, 64]
# Dictionary comprehension for squares of even numbers
squares_dict = {x: x**2 for x in range(10) if x % 2 == 0}
print(squares_dict)
# Output: {0: 0, 2: 4, 4: 16, 6: 36, 8: 64}

Q16. What is an ordered dictionary?
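A. An ordered dictionary, also known as an OrderedDict, is a subclass of the built-in Python dictionary class, found in the collections module, that maintains the order in which elements were added. Since Python 3.7, regular dictionaries also preserve insertion order, but OrderedDict still offers extras such as move_to_end() and order-sensitive equality comparisons.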
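Here is a short sketch of OrderedDict in use. The visits data is made up; move_to_end() and popitem(last=False) are the order-related methods that a plain dict does not offer in this form.

```python
from collections import OrderedDict

visits = OrderedDict()
visits["home"] = 1
visits["about"] = 2
visits["contact"] = 3
print(list(visits))                   # ['home', 'about', 'contact']

visits.move_to_end("home")            # push an existing key to the end
print(list(visits))                   # ['about', 'contact', 'home']

oldest = visits.popitem(last=False)   # remove from the front (FIFO)
print(oldest)                         # ('about', 2)

# Equality between two OrderedDicts takes order into account
same = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
print(same)                           # False
```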

Q19. What is the use of the ‘assert’ keyword in python?
A. In Python, the assert statement is used to test a condition. If the condition is True, then the program continues to execute. If the condition is False, then the program raises an AssertionError exception.

The assert statement is often used to check the internal consistency of a program. For example, you might use an assert statement to check that a list is sorted before performing a binary search on the list.

It’s important to note that the assert statement is meant for debugging and is not intended as a way to handle runtime errors; assertions are skipped entirely when Python runs with the -O option. In production code, you should validate inputs explicitly and use try and except blocks to handle exceptions that might be raised at runtime.
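The binary search example mentioned above could be sketched as follows. The binary_search function is a hypothetical helper built on the standard bisect module, and the assert only documents and checks the "input is sorted" assumption during development.

```python
from bisect import bisect_left

def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    # Internal consistency check: binary search only works on sorted input.
    assert all(a <= b for a, b in zip(sorted_items, sorted_items[1:])), \
        "binary_search requires a sorted list"
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1

print(binary_search([1, 3, 5, 7], 5))    # 2
print(binary_search([1, 3, 5, 7], 4))    # -1

try:
    binary_search([3, 1, 2], 1)          # unsorted input trips the assertion
except AssertionError as error:
    print("AssertionError:", error)
```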

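Q20. What are decorators in python?
A. A decorator is a function that takes another function as its argument and returns a new function, usually one that wraps the original to add behaviour before or after it runs. The @decorator syntax applies the decorator when the function is defined.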

# Example of a simple decorator
def my_decorator(func):
    def wrapper():
        print("Something is happening before the function is called.")
        func()  # Call the original function
        print("Something is happening after the function is called.")
    return wrapper

# Using the decorator
@my_decorator
def say_hello():
    print("Hello!")

# Calling the decorated function
say_hello()

Output

Something is happening before the function is called.
Hello!
Something is happening after the function is called.
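Real functions usually take arguments and return values, so a wrapper normally forwards both. The sketch below assumes a made-up decorator, log_calls, and a made-up call_log list; functools.wraps keeps the decorated function's name and docstring intact.

```python
import functools

call_log = []  # hypothetical place to record calls, used only for this demo

def log_calls(func):
    @functools.wraps(func)                # keep func's name and docstring
    def wrapper(*args, **kwargs):
        call_log.append((func.__name__, args, kwargs))
        return func(*args, **kwargs)      # forward arguments and return value
    return wrapper

@log_calls
def add(a, b=0):
    """Return the sum of a and b."""
    return a + b

print(add(2, b=3))     # 5
print(add.__name__)    # add
print(call_log)        # [('add', (2,), {'b': 3})]
```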

Q21. How to perform univariate analysis for numerical and categorical variables?
A. Univariate analysis is a statistical technique used to analyze and describe the characteristics of a single variable.

For numerical variables:

Calculate descriptive statistics such as the mean, median, mode, and standard deviation to summarize the distribution of the data.
Visualize the distribution of the data using plots such as histograms, boxplots, or density plots.
Check for outliers and anomalies in the data.
Check for normality in the data using statistical tests or visualizations such as a Q-Q plot.

For categorical variables:

Calculate the frequency or count of each category in the data.
Calculate the percentage or proportion of each category in the data.
Visualize the distribution of the data using plots such as bar plots or pie charts.
Check for imbalances or abnormalities in the distribution of the data.
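The summary steps for both kinds of variable can be sketched with the standard library alone. The sample data below is made up, and in practice a library such as pandas would typically be used; this only illustrates the calculations.

```python
import statistics
from collections import Counter

values = [2, 4, 4, 5, 7, 9]               # made-up numerical variable
numeric_summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "stdev": statistics.stdev(values),    # sample standard deviation
}

colors = ["red", "blue", "red", "green", "red", "blue"]  # made-up categorical variable
counts = Counter(colors)
proportions = {label: count / len(colors) for label, count in counts.items()}

print(numeric_summary["median"], numeric_summary["mode"])   # 4.5 4
print(counts.most_common())   # [('red', 3), ('blue', 2), ('green', 1)]
print(proportions["red"])     # 0.5
```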

What are the different ways in which we can find outliers in the data?
Outliers are data points that are significantly different from the majority of the data. Common ways to find them include:

Visual inspection: Outliers can often be identified by visually inspecting the data using plots such as histograms, scatterplots, or boxplots.
Z-score: The z-score of a data point is a measure of how many standard deviations it is from the mean. Data points with a z-score greater than a certain threshold (e.g., 3 or 4) can be considered outliers.
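The z-score rule can be sketched in a few lines. The function name zscore_outliers and the sample readings are made up, and the sketch uses the population standard deviation; with very small samples a threshold of 3 may never be reached, so the cutoff should be chosen with the sample size in mind.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds threshold."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)     # population standard deviation
    if spread == 0:
        return []                          # all values identical: no outliers
    return [v for v in values if abs(v - mean) / spread > threshold]

readings = [10, 12, 11, 13, 12, 11, 10, 12, 11, 13, 12, 11, 10, 12, 95]
print(zscore_outliers(readings))           # [95]
```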