Ridwan Halim

Neural Network from Scratch

A professional implementation of a neural network using only NumPy for MNIST digit classification.


Project Description

This project is a deep dive into neural networks built entirely from scratch with NumPy, offering a comprehensive understanding of how neural networks work at a fundamental level. Unlike implementations built on frameworks such as TensorFlow or PyTorch, this one relies on raw numerical computation, making the mathematical foundations of artificial intelligence explicit and practical to follow.
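As a taste of what computing directly with NumPy looks like, the sketch below implements a single fully connected layer's forward pass. The class and method names are illustrative only, not the project's actual API.

```python
import numpy as np

class DenseLayer:
    """Illustrative fully connected layer computing y = xW + b."""

    def __init__(self, n_in, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights for the demo; proper schemes appear further below.
        self.W = rng.normal(0.0, 0.01, size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        # x: (batch_size, n_in) -> output: (batch_size, n_out)
        return x @ self.W + self.b

# A batch of four flattened 28x28 MNIST images (784 pixels each)
x = np.random.default_rng(1).random((4, 784))
hidden = DenseLayer(784, 128).forward(x)
print(hidden.shape)  # (4, 128)
```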

It achieves an impressive 98.06% test accuracy on the MNIST dataset, demonstrating its effectiveness at classifying handwritten digits. Designed with a clean, object-oriented architecture, it ensures modularity, flexibility, and ease of use when experimenting with different neural network configurations.

The project provides six activation functions (ReLU, Sigmoid, Tanh, Softmax, LeakyReLU, and Linear), allowing users to experiment with different activation dynamics and tune model performance. It also includes five distinct loss functions (CrossEntropy, MSE, BCE, CategoricalCE, and Huber), giving precise control over the optimization objective.
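As an illustration, a few of these functions can be written in just a handful of NumPy lines; the function names below are illustrative and may differ from the project's actual classes.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def softmax(z):
    # Subtract the row-wise max for numerical stability
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, y_true, eps=1e-12):
    # probs: (batch, classes) softmax outputs; y_true: (batch,) integer labels
    picked = probs[np.arange(len(y_true)), y_true]
    return -np.mean(np.log(picked + eps))
```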

Four weight initialization methods (Xavier, He, Random, and Zeros) promote stable training and help avoid common pitfalls such as vanishing or exploding gradients.
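For reference, Xavier and He initialization amount to scaling random weights by the layer's fan-in (and, for Xavier, fan-out). The sketch below is a minimal version of both, with illustrative function names rather than the project's own.

```python
import numpy as np

rng = np.random.default_rng(42)

def xavier_init(n_in, n_out):
    # Xavier/Glorot: variance scaled by fan-in and fan-out, suits tanh/sigmoid
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

def he_init(n_in, n_out):
    # He: variance scaled by fan-in only, suits ReLU-family activations
    return rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))

W1 = he_init(784, 128)      # hidden layer followed by ReLU
W2 = xavier_init(128, 10)   # output layer before softmax
```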

The training framework is designed with advanced optimization techniques, including Stochastic Gradient Descent (SGD) with momentum, learning rate scheduling, and early stopping mechanisms. These tools provide greater control over model convergence and prevent overfitting.
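The sketch below shows the general shape of these techniques: a momentum update with a simple 1/t learning-rate decay, plus a patience-based early-stopping check. It is a minimal illustration under stated assumptions, not the project's exact optimizer; the names and the decay schedule are assumptions.

```python
import numpy as np

class SGDMomentum:
    """Illustrative SGD with momentum: v = mu * v - lr * grad; w += v."""

    def __init__(self, lr=0.1, momentum=0.9, decay=0.0):
        self.lr, self.momentum, self.decay = lr, momentum, decay
        self.velocity = {}
        self.step_count = 0

    def step(self, params, grads):
        self.step_count += 1
        # Simple learning-rate schedule: 1/t decay (assumed for the sketch)
        lr = self.lr / (1.0 + self.decay * self.step_count)
        for name, grad in grads.items():
            v = self.velocity.get(name, np.zeros_like(grad))
            v = self.momentum * v - lr * grad
            self.velocity[name] = v
            params[name] += v

def should_stop(val_losses, patience=5):
    # Stop when the last `patience` epochs show no improvement over the best so far
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) >= best_before
```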

Beyond traditional numerical computation, this project introduces a fully interactive graphical user interface (GUI) that allows users to draw digits in real time and receive immediate model predictions. This visualization component enhances the educational aspect by demonstrating how neural networks process image inputs dynamically.

Comprehensive visualization tools are embedded within the project, generating organized reports on model performance, including accuracy curves, loss graphs, confusion matrices, and sample misclassifications. This provides users with deeper insights into model behavior and enables data-driven refinement.
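As a rough illustration of such a report, the following Matplotlib sketch plots loss curves and a NumPy-built confusion matrix; the `history` dictionary keys and the output file name are assumptions, not the project's actual reporting API.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_training_report(history, y_true, y_pred, n_classes=10):
    """Illustrative report: loss curves plus a confusion matrix."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

    # Curves over epochs (history is assumed to hold per-epoch lists)
    ax1.plot(history["train_loss"], label="train loss")
    ax1.plot(history["val_loss"], label="val loss")
    ax1.set_xlabel("epoch")
    ax1.legend()

    # Confusion matrix built with plain NumPy (y_true, y_pred: integer labels)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    ax2.imshow(cm, cmap="Blues")
    ax2.set_xlabel("predicted digit")
    ax2.set_ylabel("true digit")

    fig.tight_layout()
    fig.savefig("training_report.png")
```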

Ideal for researchers, students, and AI enthusiasts, this project serves as both a learning tool and a solid foundation for custom neural network experiments. Whether you’re diving into deep learning principles or developing specialized models, this implementation offers a streamlined, intuitive environment for gaining hands-on experience.

Key Features

Pure NumPy Implementation

No TensorFlow or PyTorch, just optimized NumPy operations.

High Accuracy

Achieves 98.06% test accuracy on the MNIST dataset.

Comprehensive CLI

20+ configurable parameters for training and optimization.
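For a sense of what such a CLI might look like, here is a hypothetical argparse sketch covering a subset of the kind of flags described above; the actual flag names and defaults in the project may differ.

```python
import argparse

# Hypothetical subset of the training flags; the real CLI may use different names.
parser = argparse.ArgumentParser(description="Train the NumPy MNIST network")
parser.add_argument("--hidden-sizes", type=int, nargs="+", default=[128, 64])
parser.add_argument("--activation", choices=["relu", "sigmoid", "tanh", "leaky_relu"], default="relu")
parser.add_argument("--loss", choices=["cross_entropy", "mse", "bce", "categorical_ce", "huber"], default="cross_entropy")
parser.add_argument("--init", choices=["xavier", "he", "random", "zeros"], default="he")
parser.add_argument("--lr", type=float, default=0.1)
parser.add_argument("--momentum", type=float, default=0.9)
parser.add_argument("--epochs", type=int, default=20)
parser.add_argument("--patience", type=int, default=5)
args = parser.parse_args()
```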

Technical Details

Python

Versatile programming language for web development, data science, and automation

NumPy

Fundamental package for scientific computing in Python

Matplotlib

Comprehensive library for creating visualizations in Python