Sets in Python

The set type is a useful data structure: an unordered collection of unique, hashable objects, so it cannot contain duplicate values. It supports:

  • in
  • not in
  • for i in set
  • len()
  • all()
  • any()
  • enumerate()
  • min()
  • max()
  • sorted()
  • sum()
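
As a rough illustration of the operations listed above, the sketch below applies them to a small set of numbers. The `scores` data is made up for this example; note that `sum()`, `min()` and `max()` only make sense when the elements can be added or compared.

```python
# Hypothetical sample data, chosen only for illustration.
scores = {70, 85, 92, 85}  # the repeated 85 is stored only once

has_92 = 92 in scores                # True
missing_100 = 100 not in scores      # True
count = len(scores)                  # 3
lowest = min(scores)                 # 70
highest = max(scores)                # 92
ordered = sorted(scores)             # [70, 85, 92], always returned as a list
total = sum(scores)                  # 247
any_failing = any(s < 50 for s in scores)    # False
all_passing = all(s >= 50 for s in scores)   # True

# enumerate() works on any iterable; sorting first gives a stable order.
indexed = list(enumerate(ordered))   # [(0, 70), (1, 85), (2, 92)]

for s in scores:                     # iteration order is not guaranteed
    print(s)
```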

Python sets also support mathematical operations including:

  • Subset (x <= y) and proper subset (x < y)
  • Intersection (x & y)
  • Union (x | y)
  • Difference (x - y)
  • Equality (x == y)
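
A minimal sketch of these operators on two small, made-up sets. It also shows the difference between `<=` (subset) and `<` (proper subset), and the `^` operator for symmetric difference, which is not in the list above but is the operator form of `symmetric_difference()`.

```python
# Hypothetical sets, used only to demonstrate the operators.
evens = {2, 4, 6, 8}
small = {2, 4}

is_subset = small <= evens           # True: every element of small is in evens
is_proper_subset = small < evens     # True: subset and not equal
self_proper = evens < evens          # False: a set is never a proper subset of itself
self_subset = evens <= evens         # True

intersection = evens & {4, 5, 6}     # {4, 6}
union = small | {1}                  # {1, 2, 4}
difference = evens - small           # {6, 8}
symmetric = evens ^ {8, 10}          # {2, 4, 6, 10}: in exactly one of the two sets
equal = {2, 4} == {4, 2}             # True: order does not matter
```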

Sets have the following methods you can invoke:

  • add()
  • clear()
  • copy()
  • difference()
  • difference_update()
  • discard()
  • intersection()
  • intersection_update()
  • isdisjoint()
  • issubset()
  • issuperset()
  • pop()
  • remove()
  • symmetric_difference()
  • symmetric_difference_update()
  • union()
  • update()
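
The sketch below, using made-up sets `a` and `b`, illustrates one distinction worth knowing: methods such as `union()` return a new set, while the `*_update()` methods modify the set in place. Unlike the operators, the methods also accept any iterable, not only sets.

```python
# Hypothetical sets for illustration.
a = {1, 2, 3}
b = {3, 4}

# These return new sets and leave a unchanged.
merged = a.union(b)                   # {1, 2, 3, 4}
common = a.intersection(b)            # {3}
only_a = a.difference(b)              # {1, 2}
either = a.symmetric_difference(b)    # {1, 2, 4}
from_list = a.union([5, 6])           # methods accept any iterable

# These modify the set in place, so work on a copy.
working = a.copy()
working.difference_update(b)          # working is now {1, 2}
working.update([7])                   # working is now {1, 2, 7}
working.intersection_update({1, 7, 9})       # working is now {1, 7}
working.symmetric_difference_update({7, 8})  # working is now {1, 8}

# Tests that return booleans.
disjoint = a.isdisjoint({9, 10})      # True: no elements in common
is_sub = {1, 2}.issubset(a)           # True
is_super = a.issuperset({1})          # True

popped = working.pop()                # removes and returns an arbitrary element
remaining = len(working)              # 1
working.clear()                       # working is now empty
```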

You can create a set from a set literal:

countries = {"England", "Australia", "Ireland", "Singapore", "USA"}

Or using the constructor:

country_list = ["England", "Australia", "Ireland", "Singapore", "USA"]

countries = set(country_list)
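
Two details of construction are easy to trip over, sketched here with illustrative values: an empty set must be written `set()`, because `{}` creates an empty dict, and the constructor drops duplicates from whatever iterable it is given.

```python
empty = set()                 # an empty set
not_a_set = {}                # this is an empty dict, not a set

# Duplicates in the input are collapsed.
unique = set(["England", "Wales", "England"])   # {"England", "Wales"}

# Any iterable works; a string is iterated character by character.
letters = set("hello")        # {"h", "e", "l", "o"}

print(type(empty).__name__, type(not_a_set).__name__)
```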

An example of creating a set of duplicates found in a list:

chars = ['a', 'b', 'a', 'c', 'd', 'd', 'x', 'y', 'y', 'z']
duplicates = {c for c in chars if chars.count(c) > 1}
print(duplicates)

# outputs (element order may vary, because sets are unordered):

{'y', 'a', 'd'}
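
Calling `chars.count(c)` for every element rescans the list each time, which gets slow for long lists. One possible alternative, sketched here with `collections.Counter` from the standard library (not part of the original example), counts everything in a single pass.

```python
from collections import Counter

chars = ['a', 'b', 'a', 'c', 'd', 'd', 'x', 'y', 'y', 'z']

# Count every element once, then keep those seen more than once.
counts = Counter(chars)
duplicates = {c for c, n in counts.items() if n > 1}

print(sorted(duplicates))  # sorted() gives a predictable order for display
```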

Finally, a simple example of using some common methods:

countries = {"England", "Australia", "Ireland", "Singapore", "USA", "China",
             "Spain",  "Indonesia", "Malaysia", "Wales", "Italy"}

visited = {"England", "Australia", "Ireland", "Singapore", "Indonesia",
           "Malaysia", "Wales", "Italy"}

worked_in = {"England", "Australia", "Ireland", "Singapore"}

next_trip = {"USA"}

print('subset:       visited is a subset of countries: ', visited <= countries)
print('intersection: countries visited: ', countries & visited)
print('difference:   visited but not worked in: ', visited - worked_in)
print('union:        visited and next_trip: ', visited | next_trip)
print('equality:     visited all countries: ', visited == countries)

print('worked_in is empty: ', len(worked_in) == 0)


# add() silently ignores a duplicate; the element is not added a second time

# remove() raises a KeyError if "Wales" is not in the set

# discard() does not raise an exception if "Belgium" is not in the set

# let's visit the rest of the countries
visited.update(['USA', 'China', 'Spain'])
print('equality:     visited all countries: ', visited == countries)

# membership test
print("USA" in countries)
print("France" not in countries)

# iteration
for country in countries:
    print(country)
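
The comments in the example above describe how adding and removing single elements behaves. A small self-contained sketch of that behaviour follows; the `trips` set and the `second_remove_failed` flag are hypothetical names made up for this illustration.

```python
# Hypothetical set for illustration.
trips = {"England", "Wales"}

# add() on an element that is already present does nothing.
trips.add("England")
size_after_add = len(trips)       # still 2

# remove() succeeds when the element is present...
trips.remove("Wales")

# ...and raises KeyError when it is not.
try:
    trips.remove("Wales")
    second_remove_failed = False
except KeyError:
    second_remove_failed = True

# discard() never raises, whether or not the element is present.
trips.discard("Belgium")

print(trips)
```

Use `remove()` when a missing element indicates a bug you want to hear about, and `discard()` when absence is acceptable.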

For more information, see the official Python documentation on set types.
