Chapter 11 — Tensor Operations & Broadcasting
This chapter covers advanced tensor operations and broadcasting, essential for efficient deep learning computations. We'll see how operations generalize to high-dimensional tensors and why broadcasting makes code simpler.
11.1 Element-wise Operations
Basic operations such as addition, subtraction, multiplication, and division are applied element-wise: each element of one tensor is combined with the corresponding element of the other. AI/ML Context: These operations are used for applying activation functions, combining feature maps, or normalizing inputs.
import torch
A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])
# element-wise addition
C = A + B
# element-wise multiplication
D = A * B
print(C)
print(D)
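The same element-wise behavior extends to subtraction, division, and tensors combined with scalars. A small sketch, continuing with the A and B tensors above (variable names are just for illustration):
# element-wise subtraction and division
diff = A - B
quot = A / B      # integer inputs are promoted to floating point for true division
# a scalar is applied to every element
scaled = A * 2
print(diff)
print(quot)
print(scaled)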
11.2 Matrix Multiplication & Dot Products
Use torch.matmul() or the @ operator. For higher-dimensional tensors, matrix multiplication generalizes to batch operations.
AI/ML Context: All linear layers in neural networks rely on this.
# 2D matrix multiplication
E = torch.matmul(A, B)
# or using @ operator
F = A @ B
print(E)
print(F)
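As a sketch of the batched case, assume two stacks of matrices with matching inner dimensions; matmul multiplies the last two dimensions and carries the leading batch dimension through (the shapes here are arbitrary):
# batch of 8 matrices (2x3) times a batch of 8 matrices (3x4)
P = torch.randn(8, 2, 3)
Q = torch.randn(8, 3, 4)
R = P @ Q          # same as torch.matmul(P, Q)
print(R.shape)     # torch.Size([8, 2, 4])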
11.3 Broadcasting
Broadcasting allows operations between tensors of different shapes by automatically expanding dimensions. Example: Adding a vector to each row of a matrix. AI/ML Context: Makes computations efficient without manually reshaping tensors; widely used in layer-wise operations and feature normalization.
A = torch.tensor([[1,2,3],[4,5,6]])
b = torch.tensor([10,20,30])
# broadcasting adds b to each row of A
C = A + b
print(C)
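Broadcasting aligns shapes from the trailing dimension backwards, stretching any dimension of size 1 (or a missing leading dimension) to match. To broadcast down the columns instead of across the rows, add a singleton dimension; a short sketch reusing A above:
col = torch.tensor([[100], [200]])   # shape (2, 1)
# (2, 3) + (2, 1): the size-1 dimension is stretched across the 3 columns
D = A + col
print(D)
# equivalent: turn a 1D vector into a column with unsqueeze
offsets = torch.tensor([100, 200])
print(A + offsets.unsqueeze(1))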
11.4 Reduction Operations
Reductions compute summary statistics along specific dimensions: sum, mean, max, min.
AI/ML Context: Compute loss, normalize features, or pool over dimensions.
X = torch.tensor([[1., 2., 3.], [4., 5., 6.]])  # floating-point values so mean() works
# sum over columns
col_sum = X.sum(dim=0)
# mean over rows
row_mean = X.mean(dim=1)
print(col_sum)
print(row_mean)
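Reductions and broadcasting combine naturally for feature normalization: reduce over the sample dimension with keepdim=True so the result broadcasts back over the rows. A minimal sketch, assuming rows are samples and columns are features:
data = torch.randn(4, 3)                  # 4 samples, 3 features
mu = data.mean(dim=0, keepdim=True)       # shape (1, 3)
sigma = data.std(dim=0, keepdim=True)     # shape (1, 3)
standardized = (data - mu) / sigma        # broadcasts over the 4 rows
print(standardized.mean(dim=0))           # approximately zero per feature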
11.5 Reshaping & Transposing Tensors
Tensors can be reshaped or transposed without copying data (both typically return views of the same underlying storage), making it easy to prepare data for layers. AI/ML Context: Flatten images for fully-connected layers or permute channels for convolutional networks.
# reshape 2x3 tensor to 3x2
reshaped = X.view(3,2)
# transpose rows and columns
transposed = X.T
print(reshaped)
print(transposed)
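permute reorders dimensions, and reshape (or view on contiguous tensors) flattens them. A small sketch with a hypothetical channels-last image batch (the shapes are made up):
images = torch.randn(8, 28, 28, 3)     # batch, height, width, channels
chw = images.permute(0, 3, 1, 2)       # (8, 3, 28, 28) for convolutional layers
flat = images.reshape(8, -1)           # (8, 2352) for a fully-connected layer
print(chw.shape, flat.shape)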
11.6 Advanced Indexing & Slicing
You can access, slice, or mask tensors easily. AI/ML Context: Extract specific batches, features, or pixels without loops.
Y = torch.randn(5,3,4) # 5 batches, 3 channels, 4 features
batch2 = Y[1] # second batch
channel1 = Y[:,0,:] # first channel of all batches
mask = Y > 0
positive_elements = Y[mask]
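Indexing with a list or tensor of indices selects several batch elements in one operation, again without Python loops. A brief sketch continuing with Y:
selected = Y[torch.tensor([0, 2, 4])]   # batches 0, 2, and 4 at once
print(batch2.shape)                     # torch.Size([3, 4])
print(channel1.shape)                   # torch.Size([5, 4])
print(selected.shape)                   # torch.Size([3, 3, 4])
print(positive_elements.shape)          # 1D tensor of all positive values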
11.7 Combining Multiple Operations
Tensors can undergo multiple operations together efficiently using broadcasting and in-place operations. AI/ML Context: Forward pass in neural networks is just a sequence of tensor operations.
X = torch.randn(2,3)
W = torch.randn(3,4)
b = torch.randn(1,4)
# linear layer: output = X @ W + b
output = X @ W + b
print(output)
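Adding an activation and a second layer gives a tiny two-layer forward pass. A minimal sketch with arbitrary sizes (W2 and b2 are made-up parameters, not a trained network):
W2 = torch.randn(4, 2)
b2 = torch.randn(1, 2)
hidden = torch.relu(X @ W + b)   # (2, 4) after the first linear layer + ReLU
out = hidden @ W2 + b2           # (2, 2) after the second linear layer
print(out.shape)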
11.8 Why Tensor Operations & Broadcasting Matter in AI/ML
Efficient tensor operations reduce memory usage and computation time, especially on GPUs. Broadcasting allows concise code without explicit loops. Both are critical for implementing:
- Neural network layers
- Batch processing
- Data normalization and augmentation
11.9 Exercises
- Create two tensors of shapes (3,4) and (4,) and perform element-wise addition using broadcasting.
- Compute the mean along the last dimension of a 3D tensor (2,3,4).
- Reshape a 4D tensor of shape (2,3,4,5) into (6,20).
- Implement a mini forward pass: linear layer + ReLU activation.
Answers / Hints
- Use tensor + vector; broadcasting automatically adds the vector to each row.
- Use tensor.mean(dim=-1).
- Use tensor.view(6, 20).
- Use torch.relu(X @ W + b).
11.10 Practice Projects / Mini Tasks
- Implement a small feedforward network using batch tensors for training.
- Create a mini data normalization function using broadcasting.
- Perform a convolution-like operation manually using tensor slicing and broadcasting (a starting sketch follows below).
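For the convolution-like mini task, one possible starting point is a 1D sliding-window weighted sum built from slices and broadcasting. A rough sketch only (assumed "valid" windowing, not an efficient or general convolution):
signal = torch.arange(8, dtype=torch.float32)   # [0., 1., ..., 7.]
kernel = torch.tensor([0.25, 0.5, 0.25])
k = kernel.shape[0]
n_out = signal.shape[0] - k + 1
# stack the k shifted views of the signal, weight them with the kernel, and sum
windows = torch.stack([signal[i:i + n_out] for i in range(k)], dim=1)   # (6, 3)
result = (windows * kernel).sum(dim=1)                                  # (6,)
print(result)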
11.11 Further Reading & Videos
- PyTorch documentation — Tensor operations and broadcasting
- TensorFlow guide — Tensor manipulation
- 3Blue1Brown — visual intuition for high-dimensional operations
- Deep Learning Book — chapters on tensor operations in neural networks
Next chapter: Gradients & Automatic Differentiation — compute derivatives for training neural networks efficiently using tensor operations.