MLHub Models API 

This document contains the code/API documentation of all models implemented in this repository. Refer to the All MLHub Models document for the detailed information pages.

LeNet-5 CNN 

The detailed information page is at LeNet-5 CNN. This API documentation covers the following modules:

mlhub.lenet.export for external use
mlhub.lenet.models for LeNet-5 model
mlhub.lenet.utils for utility functions
mlhub.lenet.data for dataset pipelines
mlhub.lenet.train for training process
mlhub.lenet.test for testing the trained model

It closely follows section 2 of the LeCun1998 paper.

A minimal demo for external use:

# %%
# Testing LeNet5
import torch
from mlhub.lenet import download_trained_model, LeNet5, \
    MNISTDataset, model_output_to_labels

# Testing example: need 32, 32 normalized image(s)
test_dataset = MNISTDataset(train=False)
test_sample = test_dataset[200] # (img: Tensor[1, 32, 32], label: int)

# Download the trained model
model: LeNet5 = download_trained_model()

with torch.no_grad():
    model_output = model(test_sample[0])
    label = model_output_to_labels(model_output)
    print(f"Label: {label}, true label: {test_sample[1]}")

This module exports the following:

MNISTDataset: Dataset class for MNIST dataset.
LeNet5: The LeNet-5 model.
download_trained_model: Single function that downloads and loads the model from remote storage.
model_output_to_labels: Function to convert model’s output to label (using argmin).

Exporting Utilities 

mlhub.lenet.export.download_trained_model(ckpt_dir: str | None = None) → LeNet5

Download the trained LeNet-5 model from remote storage and load checkpoint. If the checkpoint already exists, then the checksum is verified and it’s loaded (nothing is downloaded in this case).

Note

The checkpoint is loaded in eval mode.

Parameters: ckpt_dir – The checkpointing directory (where the pth file should be stored).
Returns: The loaded PyTorch model
Return type: LeNet5
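
The cache-then-verify behaviour described above can be sketched with the standard library alone. This is an illustration, not the repository's code: fetch_if_missing, the fetch callback, the file name lenet5.pth, and the choice of SHA-256 are all assumptions, since the actual hash algorithm and remote location are not stated here.

```python
import hashlib
import os
import tempfile
from typing import Callable


def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_if_missing(path: str, expected: str, fetch: Callable[[str], None]) -> bool:
    """Ensure `path` exists and matches `expected`; return True if it was fetched.

    `fetch` is a hypothetical callback that writes the file to `path`.
    """
    downloaded = False
    if not os.path.exists(path):
        fetch(path)  # only download when nothing is cached
        downloaded = True
    if sha256_of(path) != expected:
        raise ValueError(f"Checksum mismatch for {path}")
    return downloaded


# Demo: a fake "download" that writes fixed bytes into a temporary directory.
payload = b"fake checkpoint bytes"
expected_digest = hashlib.sha256(payload).hexdigest()


def fake_fetch(path: str) -> None:
    with open(path, "wb") as fh:
        fh.write(payload)


with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_path = os.path.join(tmp_dir, "lenet5.pth")  # hypothetical file name
    first_call = fetch_if_missing(ckpt_path, expected_digest, fake_fetch)
    second_call = fetch_if_missing(ckpt_path, expected_digest, fake_fetch)
```

The second call finds the cached file, verifies it, and downloads nothing, which mirrors the behaviour documented for an existing checkpoint.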

Models 

class mlhub.lenet.models.LeNet5

LeNet-5 network presented in section 2 of the paper. Contains the following Modules as members

SubSamplingLayer
CustomConvLayer
RBFUnits
SigmoidSquashingActivation

__init__() → None: Initializes internal Module state, shared by both nn.Module and ScriptModule.

class mlhub.lenet.models.CustomConvLayer(in_channels: int = 6, out_channels: int = 16, kernel_size: int = 5, bias: bool = True, connected_map: ndarray | str = 'default')

Custom convolution layer for LeNet5. Details in Table 1 and related text in the paper.

__init__(in_channels: int = 6, out_channels: int = 16, kernel_size: int = 5, bias: bool = True, connected_map: ndarray | str = 'default') → None

Parameters:

in_channels – Number of channels in the input image.
out_channels – Number of channels in the output image.
kernel_size – The size of the kernel to use for convolution. Should be an int (only square kernels supported).
bias – If True, use bias. Else bias is 0.
connected_map – The connectivity map to use for convolution. If "default", the connectivity map is taken from Table 1 of the paper. If a numpy array is given, it should be a bool array of shape [out_channels, in_channels], where [i, j] is True if the j-th input channel is connected to the i-th output channel.
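
As an illustration of the expected array layout, the following NumPy sketch builds a [16, 6] bool map using the grouping from Table 1 of the paper (six maps fed by three contiguous inputs, six by four contiguous inputs, three by four non-contiguous inputs, one by all six). The subsets were transcribed by hand for this example and are not taken from the repository, so check them against the paper before relying on them.

```python
import numpy as np

# Input-channel subsets feeding each of the 16 output maps, following the
# grouping in Table 1 of the paper (hand-transcribed; verify before reuse).
contiguous_3 = [[(i + k) % 6 for k in range(3)] for i in range(6)]
contiguous_4 = [[(i + k) % 6 for k in range(4)] for i in range(6)]
split_4 = [[0, 1, 3, 4], [1, 2, 4, 5], [0, 2, 3, 5]]
all_6 = [list(range(6))]
subsets = contiguous_3 + contiguous_4 + split_4 + all_6

# connected_map[i, j] is True if input channel j feeds output channel i.
connected_map = np.zeros((16, 6), dtype=bool)
for out_ch, in_chs in enumerate(subsets):
    connected_map[out_ch, in_chs] = True

num_connections = int(connected_map.sum())
# With 5x5 kernels and one bias per output map:
num_params = num_connections * 5 * 5 + 16
```

The map has 60 connections, which gives 1,516 trainable parameters with 5x5 kernels and 16 biases, the figure the paper reports for this layer. Per the parameter description above, an array of this shape and dtype is what connected_map accepts.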

class mlhub.lenet.models.RBFUnits(in_features: int = 84, out_features: int = 10, param_vect: ndarray | str = 'default', requires_grad: bool = False)

Radial Basis Function units for the final classification head of LeNet. It is described in equation 7 and related text in the paper.

__init__(in_features: int = 84, out_features: int = 10, param_vect: ndarray | str = 'default', requires_grad: bool = False) → None

Parameters:

in_features – Number of input features.
out_features – Number of output features.
param_vect – Parameter vector for the RBF units. Either a numpy array or "default", in which case the default digit templates from Figure 3 of the paper are used.
requires_grad – If True, gradients are enabled for the template weights; otherwise the templates stay fixed (no backprop over the template weights).
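
Equation 7 of the paper defines each RBF output as the squared Euclidean distance between the input vector and that unit's parameter vector. A minimal NumPy sketch of that computation follows; rbf_forward is a hypothetical name, and the random templates are stand-ins rather than the digit bitmaps from Figure 3.

```python
import numpy as np


def rbf_forward(x: np.ndarray, templates: np.ndarray) -> np.ndarray:
    """Squared Euclidean distance from each input to each class template.

    x: [batch, in_features]; templates: [out_features, in_features].
    Returns [batch, out_features], where smaller means a closer match.
    """
    diff = x[:, None, :] - templates[None, :, :]
    return (diff ** 2).sum(axis=-1)


# Toy templates with entries in {-1, +1}: random stand-ins, NOT the digit bitmaps.
rng = np.random.default_rng(0)
templates = rng.choice([-1.0, 1.0], size=(10, 84))
x = templates[[3, 7]]  # two inputs that exactly match templates 3 and 7
y = rbf_forward(x, templates)
```

Because the outputs are distances, the predicted class is the one with the smallest value, which is why the label utilities in this package use argmin rather than argmax.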

class mlhub.lenet.models.SubSamplingLayer(in_channels: int, kernel_size: int = 2)

Sub-sampling layer for LeNet-5. Labeled as Sx in section 2B of the paper.

Note

TL;DR: It adds the kernel_size × kernel_size numbers in each window of a single channel (four for the default kernel_size of 2), multiplies the sum by a weight, adds a bias, and returns the result. The weight and bias are trainable.

__init__(in_channels: int, kernel_size: int = 2) → None

Parameters:

in_channels – Number of channels in the input image. The output has the same number of channels.
kernel_size – The size of the kernel to use for sub-sampling.
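
The sum, scale, and shift described in the note above can be sketched in NumPy as follows. The sketch assumes non-overlapping windows (stride equal to kernel_size) and one weight and bias per channel, and it leaves out any activation; subsample is a hypothetical name, not the layer's actual implementation.

```python
import numpy as np


def subsample(x: np.ndarray, weight: np.ndarray, bias: np.ndarray, k: int = 2) -> np.ndarray:
    """Sum each non-overlapping k x k block per channel, then scale and shift.

    x: [channels, height, width] with height and width divisible by k.
    weight, bias: [channels], one pair per channel.
    """
    c, h, w = x.shape
    blocks = x.reshape(c, h // k, k, w // k, k)
    summed = blocks.sum(axis=(2, 4))
    return summed * weight[:, None, None] + bias[:, None, None]


x = np.arange(16, dtype=float).reshape(1, 4, 4)
out = subsample(x, weight=np.array([0.5]), bias=np.array([1.0]))
# The top-left block holds 0, 1, 4, 5: the sum is 10, so 10 * 0.5 + 1.0 = 6.0
```

Each spatial dimension shrinks by a factor of kernel_size while the channel count is unchanged, consistent with the in_channels description above.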

class mlhub.lenet.models.SigmoidSquashingActivation(A=1.7159, S=0.6666666666666666)

Sigmoid squashing activation function from the LeNet paper (Equation 6). It is a scaled hyperbolic tangent:

\[f(x) = A \; \mathrm{tanh}(S\,x)\]

__init__(A=1.7159, S=0.6666666666666666) → None

Parameters:

A – The amplitude \(A\) in the equation above.
S – The slope parameter \(S\) in the equation above.
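
A quick standard-library check of the formula with the default constants shows why these values are convenient: the function maps 1 to approximately 1 and -1 to approximately -1. The function name squash is only for illustration.

```python
import math


def squash(x: float, A: float = 1.7159, S: float = 2.0 / 3.0) -> float:
    """Scaled hyperbolic tangent: f(x) = A * tanh(S * x)."""
    return A * math.tanh(S * x)


values = [squash(v) for v in (-1.0, 0.0, 1.0)]
# values is approximately [-1.0, 0.0, 1.0]; outputs saturate towards -A and +A
```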

Utilities 

mlhub.lenet.utils.error_rate(model_labels: Tensor, target_labels: Tensor) → Tensor

Compute the error rate: the fraction of model_labels that do not match target_labels.

Parameters:

model_labels – Labels predicted by model
target_labels – Ground-truth labels
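
The computation amounts to the mean of an element-wise mismatch mask. A NumPy sketch follows (the real function operates on PyTorch tensors; error_rate_sketch is a hypothetical name).

```python
import numpy as np


def error_rate_sketch(model_labels: np.ndarray, target_labels: np.ndarray) -> float:
    """Fraction of predictions that differ from the ground truth."""
    return float(np.mean(model_labels != target_labels))


preds = np.array([7, 2, 1, 0, 4])
truth = np.array([7, 2, 1, 0, 9])
rate = error_rate_sketch(preds, truth)  # one of five labels is wrong
```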

mlhub.lenet.utils.model_output_to_labels(model_output: Tensor) → Tensor

Convert model output (from the RBF units) to labels using argmin.

Parameters: model_output – The output of the RBFUnits layer
Returns: The label (as the index of the smallest value)

mlhub.lenet.utils.model_output_to_multi_labels(model_output: Tensor, top_n: int = 1) → Tensor

Same as model_output_to_labels(), but instead of taking the argmin and returning a single label, it returns the indices of the top_n smallest values of the RBF output. This is useful for debugging (for example, to see the next most likely predictions).

Parameters:

model_output – The output of the RBFUnits layer
top_n – The number of smallest RBF outputs whose indices are returned

Returns:

The indices of the lowest top_n values
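
Both conversions can be sketched in NumPy as below. The function names are hypothetical, and returning the indices ordered from closest to farthest is an assumption about the real function's output order.

```python
import numpy as np


def to_labels(rbf_output: np.ndarray) -> np.ndarray:
    """Index of the smallest RBF output per sample."""
    return rbf_output.argmin(axis=-1)


def to_multi_labels(rbf_output: np.ndarray, top_n: int = 1) -> np.ndarray:
    """Indices of the top_n smallest RBF outputs per sample, closest first."""
    return np.argsort(rbf_output, axis=-1)[..., :top_n]


rbf_output = np.array([[9.0, 1.5, 4.0, 0.2],
                       [0.1, 8.0, 0.3, 7.0]])
labels = to_labels(rbf_output)
top2 = to_multi_labels(rbf_output, top_n=2)
```

With top_n=1 the first column of the multi-label result coincides with the single-label result.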

mlhub.lenet.utils.test(test_dataloader: DataLoader, model: Module, device: device | None = None)

Test the model through a DataLoader on the test set. This function is mainly used for internal evaluation/validation.

Parameters:

test_dataloader – The DataLoader to the MNISTDataset test split
model – The model to test
device – The device to use for testing. If None, then it is inferred from the device of the model.

Data 

Datasets and dataloaders for the MNIST dataset.

MNISTDataset: the MNIST dataset class, a subclass of torch.utils.data.Dataset.

__getitem__(idx: int) → tuple[Tensor | ndarray, int]

Returns the sample at index idx.

Parameters: idx – The index
Returns: A tuple containing the (image, label)

Parameters:

download_root – The root directory where to download the files. If None, then it is DOWNLOAD_DIR/mnist.
train – If True, use the training set, else use the test set.
transform –
Transformation to apply to the images. No transformation is applied if None. If "default", then:
1. Convert to PyTorch tensor
2. Zero pad (28, 28) image to (32, 32) shape
3. Normalize input (so that mean and std are approx 0 and 1, respectively)
If some custom transform is given, it should be callable. It’s best to give torchvision transforms.
target_transform – Transformation to apply to the labels. None means no transform.
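
The default transform steps listed above (pad to 32 x 32, then normalize) can be approximated in NumPy as follows. Scaling pixel values to [0, 1] before padding and the placeholder mean and std are assumptions of this sketch; the statistics used by the repository are not given here.

```python
import numpy as np


def pad_and_normalize(img: np.ndarray, mean: float, std: float) -> np.ndarray:
    """Scale a (28, 28) uint8 image to [0, 1], zero pad to (32, 32), standardize."""
    scaled = img.astype(np.float32) / 255.0
    padded = np.pad(scaled, pad_width=2, mode="constant", constant_values=0.0)
    return (padded - mean) / std


# Placeholder statistics; in practice they would be computed on the training split.
MEAN, STD = 0.1, 0.3
img = np.full((28, 28), 255, dtype=np.uint8)
out = pad_and_normalize(img, MEAN, STD)
```

Padding two pixels on every side turns the 28 x 28 MNIST digits into the 32 x 32 inputs that the demo at the top of this page expects.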

__len__() → int: Returns the number of samples in the MNIST (train or test) split.

Training 

LeNet-5 was trained using:

python -m mlhub.lenet.train \
    --ckpt-dir /scratch/mlhub/checkpoints/lenet5 \
    --download-dir /scratch/mlhub --train-epochs 50

The above command invokes the Trainer class.

class mlhub.lenet.train.Trainer(batch_size: int = 32, learning_rate: float = 0.01, training_loss: Callable | None = None, ckpt_dir: str | None = None, device: device | None = None)

Trainer for LeNet-5. Training happens on the GPU, if one is found, else it happens on the CPU.

It’s a wrapper for the following

MNISTDataset wrapped in a DataLoader
LeNet-5 model that loads on the GPU (if found) or CPU.
SGD optimizer
A TrainingLoss-like object.
Checkpointing every epoch and tensorboard logging using a SummaryWriter. This feature is turned off if ckpt_dir is None.

__init__(batch_size: int = 32, learning_rate: float = 0.01, training_loss: Callable | None = None, ckpt_dir: str | None = None, device: device | None = None) → None

Parameters:

batch_size – The batch size
learning_rate – The learning rate
training_loss – The loss function to use for training. If None, then TrainingLoss is used.
ckpt_dir – The directory where to store checkpoints. If None then no checkpoints are saved.
device – The device to use for training. If None, the GPU is used if one is found, else the CPU.

_checkpoint(ckpt_fname: str, model_only: bool = True, **kwargs)

Checkpoint the model and additional keyword arguments. The function does not check the value of ckpt_dir. If model_only is False, the optimizer state is also stored.

Parameters:

ckpt_fname – The full file name where the checkpoint should be stored.
model_only – If True, then the optimizer state is not checkpointed. If False, then the optimizer state dictionary is also included in the checkpoint.
kwargs – Additional information to checkpoint as extra arguments.

Warning

Use this only to store checkpoints in .pt files during training. After training is done, the best model’s state_dict is directly stored in a .pth file.

Warning

This function is private to the class.
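
The model_only switch of _checkpoint described above can be illustrated with plain dictionaries. The key names model_state_dict and optimizer_state_dict and the helper build_checkpoint are assumptions made for this sketch; the layout the repository writes to disk may differ.

```python
from typing import Any, Optional


def build_checkpoint(model_state: dict, optimizer_state: Optional[dict] = None,
                     model_only: bool = True, **extra: Any) -> dict:
    """Assemble the payload to serialize; optimizer state only if requested."""
    payload = {"model_state_dict": model_state, **extra}
    if not model_only:
        if optimizer_state is None:
            raise ValueError("optimizer_state is required when model_only=False")
        payload["optimizer_state_dict"] = optimizer_state
    return payload


weights = {"c1.weight": [0.1, 0.2]}  # stand-in for model.state_dict()
opt = {"lr": 0.01}                   # stand-in for optimizer.state_dict()
light = build_checkpoint(weights, opt, model_only=True, epoch=3)
full = build_checkpoint(weights, opt, model_only=False, epoch=3, loss=0.42)
```

In a PyTorch setting, a dictionary like this would typically be written with torch.save(payload, ckpt_fname).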

_train_epoch(curr_epoch: int = 0) → float

Train the model for a single epoch. It runs a forward pass for a training batch, computes the loss, and then computes and applies the gradients to the model. It also writes to tensorboard (or prints if tensorboard is not enabled).

Parameters: curr_epoch – The current epoch (only for storing the checkpoint).
Returns: The loss value as a float.

Warning

This function is private to the class.

test() → tuple[Tensor, Tensor, Tensor]

Tests the model and returns the statistics on the MNIST test set.

Returns: A tuple of (test_error, model_preds, test_preds) where test_error is the error (percentage wrong), model_preds is a vector of model predictions for the test images, and test_preds holds the ground-truth labels of the test images.

Note

This function is used for validation or selection of the best checkpoint when training. Avoid calling it outside the class.

train(num_epochs: int = 20) → tuple[LeNet5, tuple[int, Tensor]]

The main training function.

Parameters:

num_epochs – The number of epochs to train.

Returns:

The training result as (model, (best_test_epoch, best_test_er)) where

model is the trained LeNet-5 model (after the last epoch)
best_test_epoch is the epoch where the best performance on the test set was achieved
best_test_er is the corresponding test set error
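
The bookkeeping behind best_test_epoch and best_test_er can be sketched without any model at all. In this illustration run_epochs, train_one, and evaluate are hypothetical names, epochs are numbered from 0, and ties keep the earlier epoch; none of these details are confirmed by the documentation above.

```python
from typing import Callable, Tuple


def run_epochs(num_epochs: int, train_one: Callable[[int], float],
               evaluate: Callable[[], float]) -> Tuple[int, float]:
    """Train for num_epochs, evaluating after each; return (best_epoch, best_error)."""
    best_epoch, best_error = -1, float("inf")
    for epoch in range(num_epochs):
        train_one(epoch)
        error = evaluate()
        if error < best_error:  # strict comparison: ties keep the earlier epoch
            best_epoch, best_error = epoch, error
    return best_epoch, best_error


# Scripted test errors stand in for real training and evaluation.
scripted = iter([0.08, 0.03, 0.05, 0.03])
best_epoch, best_error = run_epochs(4, train_one=lambda epoch: 0.0,
                                    evaluate=lambda: next(scripted))
```

Note that, as documented above, the returned model is the one after the last epoch, which is not necessarily the one from the best epoch.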

class mlhub.lenet.train.TrainingLoss(j: float = 0.01)

The training loss as defined in Equation 9 of the paper.

__init__(j: float = 0.01) → None

Parameters: j – The positive constant \(j\) in Equation 9.
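
Equation 9 of the paper averages, over the training samples, the RBF penalty of the correct class plus a competitive term, log(exp(-j) + sum over i of exp(-y_i)), which pushes up the penalties of the incorrect classes. A NumPy sketch of that formula follows; training_loss_sketch is a hypothetical name and this is not the repository's PyTorch implementation.

```python
import numpy as np


def training_loss_sketch(rbf_output: np.ndarray, targets: np.ndarray, j: float = 0.01) -> float:
    """Mean over samples of y_target + log(exp(-j) + sum_i exp(-y_i)).

    rbf_output: [batch, classes] penalties (lower is better); targets: [batch] ints.
    """
    correct = rbf_output[np.arange(len(targets)), targets]
    competing = np.log(np.exp(-j) + np.exp(-rbf_output).sum(axis=1))
    return float(np.mean(correct + competing))


rbf_output = np.array([[0.0, 5.0, 5.0],
                       [4.0, 0.5, 6.0]])
targets = np.array([0, 1])
loss = training_loss_sketch(rbf_output, targets, j=0.01)
wrong_loss = training_loss_sketch(rbf_output, np.array([1, 0]), j=0.01)
```

The loss is lower when the target class has the smallest penalty and the other penalties are large, so wrong_loss exceeds loss in this example.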

Testing 

This module tests the trained model and is mainly internal to the LeNet sub-module. The trained LeNet-5 was tested using:

python -m mlhub.lenet.test \
    --ckpt-dir /scratch/mlhub/checkpoints/lenet5 \
    --download-dir /scratch/mlhub

It runs the model on the test set, reports the test error, and lets you sample results (shown as matplotlib figures).