The binary search is an excellent algorithm to find an item in a sorted list, being O(Log n) it is very efficient. This is an implementation in Python:
def binary_search(items, required_item):
low = 0
high = len(items) - 1
while low <= high:
index = (low + high) // 2
item = items[index]
if item == required_item:
return index
if item > required_item:
high = index - 1
else:
low = index + 1
Essentially it works like this:
- start with low set to the first item, high to the last, index to the middle
- if the item at the index is a match, we are done, return the index
- if the item is greater, halve the list by ignoring the item and all items after it
- otherwise, halve the list by ignoring the item and all items before it
- repeat until there are no more items to check
For example, consider this :
animals = ["cat", "dog", "elephant", "lion", "sparrow", "tiger", "wombat"]
print('index of tiger : ', binary_search(animals, "tiger") or "not found")
print('index of dog : ', binary_search(animals, "dog") or "not found")
print('index of turtle: ', binary_search(animals, "turtle") or "not found")
# outputs:
index of tiger : 5
index of dog : 1
index of turtle: not found
These are the steps involved in finding tiger:
- Our initial search range is between the very first and last items, hence low is 0 and high is 6. We calculate the middle to be 3, so we set the index accordingly.
Initial Step
0 cat <-- low 1 dog 2 elephant 3 lion <-- index 4 sparrow 5 tiger 6 wombat <-- high
- lion is less than tiger, so we now set low to index + 1, thus excluding lion and the items before it from our next search. We calculate the new middle to be 5, and set the index.
Next Step
0 cat 1 dog 2 elephant 3 lion 4 sparrow <-- low 5 tiger <-- index 6 wombat <-- high
- tiger is indeed what we are looking for, so we have finished our search.
That’s the binary search in a nutshell.