Searching is the technique of selecting specific data from a collection of data based on some condition. It is the process of finding a particular item in a collection. It can also be described as:
"Given a list of values, a function that compares two values and a desired value, find the position of the desired value in the list."
In this post, we'll focus on two main search algorithms:
- Linear search- checks the items in sequence until the desired item is found
- binary search-requires a sorted sequence of list, and checks for the value in the middle of the list, repeatedly discarding the half of the list which contains values which are definitely either all larger or all smaller than the desired value
Linear or sequential search is the simplest solution. It simply iterates over the list, one item at a time, until the specific item is found or all items have been examined.
In Python, a target item can be found in a sequence using the in operator:
if key in theArray : print( "The key is in the array." ) else : print( "The key is not in the array.")
To determine if an item is in the array, the search begins with the value in the first element. If the first element does not contain the target value, the next element in sequential order is compared to the item. This process is repeated until the item is found. If the item is not in the array, we still have to iterate over the list comparing every item in the array to the target value. This is because it cannot be determined that the item is not in the sequence until the entire array has been traversed.
def Linear_Search(arr,target): #iterate to the end of the list for i in range(len(arr)): #match found, return true if arr[i]==target: return True return False
Searching A Sorted Sequence using Linear Search
If we know the list is ordered, we only have to check until we have found the element or an element greater than it
def ordered_seq_search(arr,target): # Start at position 0 pos= 0 # Target becomes true if item is in the list found = False # Stop marker stopped = False # go until end of list while pos< len(arr) and not found and not stopped: # If match if arr[pos] == target: found = True else: # Check if element is greater if arr[pos] > target: stopped = True # Otherwise move on else: pos = pos+1 return found
Finding the smallest item in a list
Instead of searching for a specific value in an unsorted sequence, suppose we wanted to search for the smallest value, which is equivalent to applying Python's min() function to the sequence.
A linear search is performed as before, but this time we must keep track of the smallest value found for each iteration through the loop as shown in the sample code:
def smallestItem(arr): # Assume the first item is the smallest value. smallest=arr #loop through the list and check for any smaller value for i in range(len(arr)): if arr[i]==smallest: smallest=arr[i] return smallest #return the smallest value
Binary Search follows a divide and conquer methodology. It is faster than linear search but requires that the array be sorted before the algorithm is executed.
How it works:
- We first check the MIDDLE element in the list.
- If it is the value we want, we can stop.
- If it is HIGHER than the value we want, we repeat the search process with the portion of the list BEFORE the middle element.
- If it is LOWER than the value we want, we repeat the search process with the portion of the list AFTER the middle element.
The binary search algorithm can be written either recursively or iteratively. Recursion is generally slower in Python because it requires the allocation of new stack frames.
Iterative Implementation of Binary Search
def Binary_Search(arr,target): # First and last index values left=0 right=len(arr)-1 while left<=right: mid=(left+right)//2 # Match found if arr[mid]==target: return True # Set new midpoints up or down depending on comparison else: if target>arr[mid]: #set up left = mid+1 else: #set down right = mid-1 return False
Recursive implementation of Binary Search
def rec_bsearch(arr,target): # Base Case! if len(arr) == 0: return False # Recursive Case else: mid = len(arr)//2 # If match found if arr[mid]==target: return True else: # Call again on second half if target<arr[mid]: return rec_bsearch(arr[:mid],target) # Or call on first half else: return rec_bsearch(arr[mid+1:],target)
Sequential (Linear) Search
The time complexity of linear search is O(n), meaning that the time taken to execute increases with the number of items in our input list.
Linear search is not often used in practice, because the same efficiency can be achieved by using inbuilt methods or existing operators, and it is not as fast or efficient as other search algorithms.
Linear search is a good fit for when we need to find the first occurrence of an item in an unsorted collection because unlike most other search algorithms, it does not require that a collection be sorted before searching begins.
Binary Search Algorithm
We can only pick one possibility per iteration, and the input size gets divided by two in each iteration. This makes the time complexity of binary search O(log n), which is more efficient than the linear search.
One drawback of binary search is that if there are multiple occurrences of an element in the array, it does not return the index of the first element, but rather the index of the element closest to the middle.
|Algorithm||Best case||Expected||Worst case|
|Binary search||O(1)||O(log N)||O( log N)|