
Environment Setup

This page walks you through setting up a Python environment with all required packages for the tutorial.


Option A — Local Installation

1. Create a virtual environment

python3 -m venv transformer-tutorial-env
source transformer-tutorial-env/bin/activate   # Linux / macOS
# transformer-tutorial-env\Scripts\activate    # Windows
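Before installing anything, it can help to confirm that the environment is actually active. The following is a minimal sketch using only the standard library; the function name `in_virtualenv` is illustrative and not part of the tutorial code.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtualenv()}")
```

If this reports `False` after activation, the shell is probably still resolving `python3` to the system interpreter.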

2. Install PyTorch

Visit pytorch.org and select your platform to get the matching install command. For a system with CUDA 12.1:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

For CPU-only (sufficient for chapters 1–5):

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
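If you script the setup, the two commands above differ only in the index URL. The sketch below builds the matching pip invocation from a variant name. The helper `torch_install_command` and the `INDEX_URLS` mapping are hypothetical names introduced for illustration; only the two URLs shown above are covered.

```python
import sys

# Index URLs copied from the two install commands above.
# Other CUDA versions are deliberately not listed; check pytorch.org for those.
INDEX_URLS = {
    "cu121": "https://download.pytorch.org/whl/cu121",
    "cpu": "https://download.pytorch.org/whl/cpu",
}


def torch_install_command(variant: str) -> list[str]:
    """Build the pip command for the given variant ('cu121' or 'cpu')."""
    if variant not in INDEX_URLS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(INDEX_URLS)}")
    # Using sys.executable keeps the install inside the active environment.
    return [
        sys.executable, "-m", "pip", "install",
        "torch", "torchvision",
        "--index-url", INDEX_URLS[variant],
    ]


print(" ".join(torch_install_command("cpu")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would run the install with the current interpreter's pip.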

3. Install tutorial dependencies

pip install transformers datasets tokenizers sentencepiece \
    accelerate bitsandbytes peft \
    matplotlib jupyter ipykernel

4. Verify the installation

import torch
import transformers

print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Transformers: {transformers.__version__}")
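The check above covers only PyTorch and transformers. As a rough extension, the sketch below reports every package named in steps 2 and 3 without importing it, using the standard library's `importlib.metadata`. The `PACKAGES` list and the `installed_versions` helper are illustrative names, not part of the tutorial code.

```python
from importlib import metadata

# Distribution names taken from the pip commands in steps 2 and 3.
PACKAGES = [
    "torch", "torchvision", "transformers", "datasets", "tokenizers",
    "sentencepiece", "accelerate", "bitsandbytes", "peft",
    "matplotlib", "jupyter", "ipykernel",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:15} {version or 'NOT INSTALLED'}")
```

Reading metadata rather than importing each package keeps the check fast and avoids failures from packages that need a GPU at import time.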

Option B — Google Colab

Every chapter can be run in Google Colab with a free GPU runtime. At the top of each notebook, run:

!pip install transformers datasets tokenizers sentencepiece accelerate

Enable a GPU runtime via Runtime → Change runtime type → T4 GPU.


Option C — Kaggle Notebooks

Kaggle also provides free GPU access. Install the extra packages in a notebook cell:

!pip install --quiet transformers datasets tokenizers sentencepiece accelerate peft bitsandbytes
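When one notebook has to run under all three options, a best-effort check can decide whether the install cell is needed. The sketch below rests on two assumptions that are not stated in this tutorial: that Colab kernels preload the `google.colab` module, and that Kaggle kernels set the `KAGGLE_KERNEL_RUN_TYPE` environment variable. The function name `notebook_host` is hypothetical.

```python
import os
import sys


def notebook_host() -> str:
    """Best-effort guess at where the code is running: 'colab', 'kaggle', or 'local'."""
    # Assumption: Colab kernels have the google.colab module already loaded.
    if "google.colab" in sys.modules:
        return "colab"
    # Assumption: Kaggle kernels define the KAGGLE_KERNEL_RUN_TYPE variable.
    if "KAGGLE_KERNEL_RUN_TYPE" in os.environ:
        return "kaggle"
    return "local"


print(f"Detected host: {notebook_host()}")
```

Treat the result as a hint only. Neither signal is a documented contract in this tutorial, so either check may need adjusting if the platforms change.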

Package         Minimum version
Python          3.10
PyTorch         2.1
transformers    4.38
tokenizers      0.15
datasets        2.17
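The minimum versions in the table above can be checked programmatically. The following is a minimal sketch, assuming that PyTorch is installed under the distribution name `torch` and that comparing the leading numeric components of a version string is precise enough here (pre-release and local suffixes such as `+cu121` are ignored). The helper names are illustrative.

```python
import re
import sys
from importlib import metadata

# Minimum versions copied from the table above.
MIN_PYTHON = (3, 10)
MINIMUMS = {
    "torch": "2.1",
    "transformers": "4.38",
    "tokenizers": "0.15",
    "datasets": "2.17",
}


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric components, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    match = re.match(r"\d+(\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings by their numeric release components."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_environment() -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(f"Python {sys.version_info[0]}.{sys.version_info[1]} is older than 3.10")
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


for problem in check_environment():
    print(problem)
```

For stricter comparisons that handle pre-releases correctly, the `packaging` library's `Version` class is a common alternative to the hand-rolled tuple parsing used here.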
