Latest update Android YouTube

DSA | Basics of Python Developer

Python Data Structures: Core Concepts

Mastering Python's built-in data structures for efficient programming

1. Lists and Tuples

Lists (Mutable Sequences)

Basics

Lists are ordered, mutable collections that can hold heterogeneous data types.

# List creation
numbers = [1, 2, 3, 4]
mixed = [1, "two", 3.0, [4, 5]]

# Indexing and slicing
print(numbers[0])     # 1 (first element)
print(numbers[-1])    # 4 (last element)
print(numbers[1:3])   # [2, 3] (slice from index 1 to 2)

Common Methods

Method Description Example
append(x) Add item to end numbers.append(5)
extend(iterable) Add all items from iterable numbers.extend([6,7])
insert(i, x) Insert item at position numbers.insert(0, 0)
remove(x) Remove first occurrence numbers.remove(3)
pop([i]) Remove and return item last = numbers.pop()

List Operations

# Concatenation
combined = [1, 2] + [3, 4]  # [1, 2, 3, 4]

# Repetition
repeated = [0] * 5  # [0, 0, 0, 0, 0]

# Membership testing
print(2 in numbers)  # True

Advanced Operations

Shallow vs Deep Copying
import copy

original = [[1, 2], [3, 4]]

# Shallow copy (only copies references)
shallow = original.copy()
shallow[0][0] = 99  # Affects original!

# Deep copy (copies all nested objects)
deep = copy.deepcopy(original)
deep[0][0] = 100  # Doesn't affect original
Memory Allocation

Python lists are dynamic arrays that overallocate memory for growth:

import sys

lst = []
for i in range(10):
    print(f"Length: {len(lst)}, Size: {sys.getsizeof(lst)} bytes")
    lst.append(i)

Tuples (Immutable Sequences)

Basics

Tuples are immutable, ordered sequences typically used for heterogeneous data.

# Tuple creation
point = (10, 20)
colors = ("red", "green", "blue")

# Single-element tuple (note trailing comma)
single = (42,)

# Common gotcha - not a tuple
not_a_tuple = (42)  # Just an integer

Immutability and Use Cases

  • Dictionary keys (must be hashable/immutable)
  • Function arguments and return values
  • Data integrity (can't be modified accidentally)
  • Generally more memory efficient than lists

Advanced Concepts

Named Tuples
from collections import namedtuple

# Create a class-like structure
Point = namedtuple("Point", ["x", "y"])
p = Point(10, y=20)
print(p.x, p.y)  # 10 20
Tuple Unpacking
# Basic unpacking
x, y = (10, 20)

# Extended unpacking (Python 3+)
first, *rest = (1, 2, 3, 4)  # first=1, rest=[2, 3, 4]

# Swapping variables
a, b = b, a

Performance Note: Tuples can be more memory efficient than lists for small collections, but the difference is negligible for most use cases.

2. Sets and Dictionaries

Sets (Unique, Unordered Collections)

Basics

Sets are mutable collections of unique, hashable elements with fast membership testing.

# Set creation
primes = {2, 3, 5, 7}
empty_set = set()  # {} creates empty dict!

# Basic operations
primes.add(11)      # Add element
primes.remove(3)    # Remove element (raises KeyError if missing)
primes.discard(99)  # Remove if present (no error)

# Set operations
odds = {1, 3, 5, 7, 9}
print(primes | odds)   # Union
print(primes & odds)   # Intersection
print(primes - odds)   # Difference
print(primes ^ odds)   # Symmetric difference

Frozensets

Immutable version of sets (can be used as dictionary keys):

immutable = frozenset([1, 2, 3])

Advanced Usage

Set Comprehensions
squares = {x**2 for x in range(10)}
Performance

Sets provide O(1) average case membership testing:

# Much faster than checking lists for membership
if 999999 in huge_set:  # Fast
    print("Found")

if 999999 in huge_list:  # Slow (O(n))
    print("Found")
Use Cases
  • Removing duplicates from a sequence: list(set(duplicates))
  • Membership testing (faster than lists)
  • Mathematical set operations
  • Finding unique elements

Dictionaries (Key-Value Stores)

Basics

Dictionaries are mutable mappings from unique keys to values.

# Dictionary creation
person = {"name": "Alice", "age": 25}
empty_dict = {}

# Accessing values
print(person["name"])      # Alice
print(person.get("age"))   # 25
print(person.get("job", "unemployed"))  # Default if missing

# Adding/modifying
person["job"] = "Engineer"
person["age"] = 26

# Membership testing
print("name" in person)  # True

Dictionary Methods

Method Description Example
keys() View of dictionary keys person.keys()
values() View of dictionary values person.values()
items() View of (key, value) pairs person.items()
update() Merge with another dict person.update({"city": "NY"})
setdefault() Get value or set default person.setdefault("age", 30)

Advanced Topics

Dictionary Comprehensions
squares = {x: x**2 for x in range(5)}
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Default Dictionaries
from collections import defaultdict

# Automatically initialize missing keys
word_counts = defaultdict(int)
for word in text.split():
    word_counts[word] += 1
Ordered Dictionaries

As of Python 3.7+, regular dictionaries preserve insertion order. For additional features:

from collections import OrderedDict

# Remember insertion order (useful before Python 3.7)
od = OrderedDict()
od["a"] = 1
od["b"] = 2
od.move_to_end("a")  # Move to end
Memory Optimization

Dictionaries consume more memory than lists. For memory-critical applications:

  • Use __slots__ in classes to prevent dynamic dictionary creation
  • Consider array.array or NumPy arrays for homogeneous numeric data
  • Use sys.getsizeof() to measure memory usage

3. List Comprehensions

Basic Syntax

Concise way to create lists by applying operations to iterables.

# Basic comprehension
squares = [x**2 for x in range(10)]

# With condition
even_squares = [x**2 for x in range(10) if x % 2 == 0]

# Equivalent to:
even_squares = []
for x in range(10):
    if x % 2 == 0:
        even_squares.append(x**2)

Advanced Patterns

Nested Comprehensions

# Flatten a 2D list
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

Multiple Loops

# Cartesian product
colors = ["red", "green"]
sizes = ["S", "M", "L"]
tshirts = [(color, size) for color in colors for size in sizes]
# [('red', 'S'), ('red', 'M'), ('red', 'L'), 
#  ('green', 'S'), ('green', 'M'), ('green', 'L')]

Performance Considerations

List comprehensions are generally faster than equivalent loops:

  • Optimized at the C level in Python
  • More concise and often more readable
  • But can become unreadable if overly complex

Readability Tip: If a comprehension spans multiple lines or has complex logic, consider using a traditional loop for better readability.

4. Generator Expressions

Basics

Similar to list comprehensions but produce items lazily (one at a time).

# Generator expression
squares_gen = (x**2 for x in range(10))

# Consume generator
for num in squares_gen:
    print(num)

# Can only be consumed once
print(list(squares_gen))  # [] (already exhausted)

Memory Efficiency

Generators don't store all values in memory at once:

# Compare memory usage
import sys

list_comp = [x**2 for x in range(10000)]
gen_expr = (x**2 for x in range(10000))

print(sys.getsizeof(list_comp))  # Large (stores all values)
print(sys.getsizeof(gen_expr))   # Small (constant size)

Advanced Usage

Chaining Generators

def integers():
    """Infinite generator of integers"""
    i = 1
    while True:
        yield i
        i += 1

def squares():
    for i in integers():
        yield i * i

# Take first 5 squares
first_5 = (x for _, x in zip(range(5), squares()))

yield vs Generator Expressions

Generator functions (using yield) are more flexible but generator expressions are more concise:

# Generator function
def squares(n):
    for i in range(n):
        yield i ** 2

# Generator expression equivalent
squares = (i**2 for i in range(n))

Use Cases

  • Processing large files line by line
  • Memory-efficient pipelines
  • Infinite sequences
  • Anywhere you only need to process items once

5. Performance & Optimization

Time Complexity

Understanding Big-O notation helps choose the right data structure:

Operation List Set/Dict Deque
Index Access O(1) N/A O(1) (ends only)
Search O(n) O(1) O(n)
Insert at End O(1)* N/A O(1)
Insert at Start O(n) N/A O(1)
Delete Item O(n) O(1) O(n)

* Amortized O(1) due to occasional resizing

Choosing the Right Data Structure

  • Lists: When you need ordered, indexable data that changes frequently
  • Tuples: For immutable sequences, dictionary keys, or when memory matters
  • Sets: For membership testing and uniqueness guarantees
  • Dictionaries: For key-value associations and fast lookups
  • Deques: When you need fast appends/pops from both ends

6. Specialized Collections

collections.deque

Double-ended queue with fast O(1) appends and pops from both ends:

from collections import deque

d = deque([1, 2, 3])
d.appendleft(0)  # [0, 1, 2, 3]
d.pop()          # Removes 3
d.popleft()      # Removes 0

collections.Counter

Specialized dictionary for counting hashable objects:

from collections import Counter

words = ["apple", "banana", "apple", "orange", "banana", "apple"]
word_counts = Counter(words)
print(word_counts.most_common(2))  # [('apple', 3), ('banana', 2)]

heapq

Priority queue implementation using a binary heap:

import heapq

nums = [5, 7, 9, 1, 3]
heapq.heapify(nums)  # Convert to heap

print(heapq.heappop(nums))  # 1 (smallest element)
heapq.heappush(nums, 2)     # Add new element

7. Memory Management

Internal Storage

Python data structures have different memory layouts:

  • Lists: Dynamic array of pointers to objects
  • Tuples: Fixed-size array of pointers
  • Dictionaries: Hash tables with open addressing
  • Sets: Similar to dictionaries but without values

Memory Measurement

import sys

lst = list(range(1000))
tup = tuple(range(1000))

print(sys.getsizeof(lst))  # ~9024 bytes
print(sys.getsizeof(tup))  # ~8040 bytes

Memory Profiling

For detailed memory analysis:

# Install memory_profiler first: pip install memory-profiler
from memory_profiler import profile

@profile
def process_data():
    data = [i**2 for i in range(100000)]
    return sum(data)

process_data()

Post a Comment

Feel free to ask your query...
Cookie Consent
We serve cookies on this site to analyze traffic, remember your preferences, and optimize your experience.
Oops!
It seems there is something wrong with your internet connection. Please connect to the internet and start browsing again.
AdBlock Detected!
We have detected that you are using adblocking plugin in your browser.
The revenue we earn by the advertisements is used to manage this website, we request you to whitelist our website in your adblocking plugin.
Site is Blocked
Sorry! This site is not available in your country.