Francisco Mendes

Posted 2024-05-16 | Updated 2026-05-21 | machine-learning | 27 minutes read (About 4072 words)

A Manual Implementation of Quantization in PyTorch - Single Layer

INT8 quantization in PyTorch from scratch: scale factors, zero points, and integer arithmetic without QuantStub, to show exactly what quantized layers do under the hood.

Posted 2024-04-24 | Updated 2026-04-18 | machine-learning | 5 minutes read (About 775 words)

Part II: Shrinking Neural Networks for Embedded Systems Using Low Rank Approximations (LoRA)

This post explores the Low Rank Approximation (LoRA) technique for shrinking neural networks to fit embedded systems. It focuses on the Convolutional Neural Network (CNN) case and discusses how to select the rank.
