Environment Setup

This page walks you through setting up a Python environment with all required packages for the tutorial.

Option A — Local Installation

1. Create a virtual environment

python3 -m venv transformer-tutorial-env
source transformer-tutorial-env/bin/activate   # Linux / macOS
# transformer-tutorial-env\Scripts\activate    # Windows
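Before installing anything, it can help to confirm that the environment is actually active. The sketch below relies on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside a venv; the helper name `in_virtualenv` is illustrative and not part of the tutorial.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtualenv()}")
```

If the second line prints `False`, the activation command was probably run in a different shell session.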

2. Install PyTorch

Visit pytorch.org and select your platform to get the matching install command. For a system with CUDA 12.1:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

For a CPU-only install (sufficient for chapters 1–5):

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

3. Install tutorial dependencies

pip install transformers datasets tokenizers sentencepiece \
    accelerate bitsandbytes peft \
    matplotlib jupyter ipykernel

4. Verify the installation

import torch
import transformers

print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Transformers: {transformers.__version__}")
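The check above covers only PyTorch and transformers. A broader check over the remaining packages might look like the following sketch. It assumes each package's import name matches its pip name (the `jupyter` metapackage is left out), and both `TUTORIAL_MODULES` and `missing_modules` are illustrative names, not part of the tutorial.

```python
from importlib.util import find_spec

# Import names for the packages installed in steps 2 and 3 (assumed to
# match the pip names; the jupyter metapackage is intentionally omitted).
TUTORIAL_MODULES = [
    "torch", "torchvision", "transformers", "datasets", "tokenizers",
    "sentencepiece", "accelerate", "bitsandbytes", "peft",
    "matplotlib", "ipykernel",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    # find_spec returns None for a top-level module that is not installed,
    # and it locates the module without importing it.
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(TUTORIAL_MODULES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All tutorial packages were found.")
```

Because nothing is imported, this runs quickly and does not fail on packages such as bitsandbytes that may complain at import time on machines without a GPU.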

Option B — Google Colab

Every chapter can be run in Google Colab with a free GPU runtime. At the top of each notebook, run:

!pip install transformers datasets tokenizers sentencepiece accelerate

Enable a GPU runtime via Runtime → Change runtime type → T4 GPU.
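After switching the runtime type, a short cell can confirm that the GPU is visible to PyTorch. This is a minimal sketch: `describe_accelerator` is a hypothetical helper name, and the device name it reports depends on whichever GPU the runtime was assigned.

```python
def describe_accelerator() -> str:
    """Summarize which device PyTorch would use in this runtime."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this runtime"
    if not torch.cuda.is_available():
        return "No CUDA GPU visible; running on CPU"
    # Device index 0 is the first GPU visible to this process.
    return f"GPU visible: {torch.cuda.get_device_name(0)}"


print(describe_accelerator())
```

If the cell reports CPU only, the runtime type change may not have taken effect yet.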

Option C — Kaggle Notebooks

Kaggle also provides free GPU access. Install the extra packages in a notebook cell:

!pip install --quiet transformers datasets tokenizers sentencepiece accelerate peft bitsandbytes

Recommended Versions

Package	Minimum version
Python	3.10
PyTorch	2.1
transformers	4.38
tokenizers	0.15
datasets	2.17
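The minimums in the table can be checked against an existing environment with a short script. The following is a sketch under a few assumptions: packages are looked up by their pip distribution names (PyTorch is published as `torch`), only the leading numeric part of each version string is compared, and the names `MINIMUMS`, `version_tuple`, and `meets_minimum` are illustrative.

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimum versions from the table, keyed by pip distribution name
# (assumption: PyTorch is looked up under its PyPI name, "torch").
MINIMUMS = {
    "torch": "2.1",
    "transformers": "4.38",
    "tokenizers": "0.15",
    "datasets": "2.17",
}


def version_tuple(text: str) -> tuple:
    """Turn the leading numeric release of '2.1.0+cu121' into (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release numbers component-wise, ignoring any suffix."""
    return version_tuple(installed) >= version_tuple(minimum)


print(f"Python {sys.version_info.major}.{sys.version_info.minor}:",
      "ok" if sys.version_info >= (3, 10) else "below 3.10")
for name, minimum in MINIMUMS.items():
    try:
        installed = version(name)
    except PackageNotFoundError:
        print(f"{name}: not installed")
        continue
    status = "ok" if meets_minimum(installed, minimum) else f"below {minimum}"
    print(f"{name} {installed}: {status}")
```

Comparing integer tuples rather than raw strings matters here: as strings, "4.4" would sort after "4.38". Pre-release suffixes such as `rc1` are ignored by this simplified comparison.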
