This project was created and maintained as part of the Scientific Programming course (B.Sc. CS) at TUHH.
The objective of this project is to develop and evaluate a handwritten digit classifier using the MNIST dataset. Rather than trying different architectures, we want you to experiment with how the training data affects the classifier’s performance. You don’t need deep knowledge of machine learning to succeed, but you will need to invest some effort in learning a few of its basic concepts. More specifically, the project includes the following tasks:
1. Download the MNIST dataset either from http://yann.lecun.com/exdb/mnist/ or using the Julia package MLDatasets.jl.
2. Write a method to train a simple convolutional neural network called LeNet to classify the images. To get started, check out the Julia machine learning library Flux.jl. We also encourage you to look at the short guides provided by Flux.jl at https://fluxml.ai/Flux.jl/stable/guide/models/quickstart/ and the following pages. For the following tasks, keep the network, the training algorithm, and other hyperparameters fixed and concentrate on the data instead.
3. Write methods to evaluate the performance of your classifier. What accuracy do you achieve on the test dataset? Do you get the same accuracy in practice? If not, what might be the reasons?
4. Train the classifier using different proportions of the original training data. How does the size of the training set affect the classifier’s performance?
5. Using a fraction of the original training set (e.g., 10%), implement and apply different data augmentation methods to improve the classifier’s performance. Can you achieve a performance similar to or better than that of a network trained on the full training set?
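
The evaluation, training-set-size, and augmentation tasks above rest on a few small building blocks. The following is a minimal sketch in plain Julia (only the `Random` standard library, no Flux.jl), not the code from this repository. All function names (`accuracy`, `confusion_matrix`, `subset_indices`, `shift_image`) are hypothetical, labels are assumed to be integers from 0 to 9, and an image is assumed to be a plain matrix. The pixel shift is just one possible augmentation, chosen here because it is easy to verify.

```julia
using Random

# Fraction of predictions that match the true labels.
accuracy(preds::AbstractVector{<:Integer}, labels::AbstractVector{<:Integer}) =
    count(preds .== labels) / length(labels)

# Confusion matrix for labels 0..nclasses-1 (assumption: digits 0-9).
# Rows are the true classes, columns are the predicted classes.
function confusion_matrix(preds, labels; nclasses::Int = 10)
    cm = zeros(Int, nclasses, nclasses)
    for (p, t) in zip(preds, labels)
        cm[t + 1, p + 1] += 1   # +1 because Julia arrays are 1-based
    end
    return cm
end

# Indices of a random subset covering `fraction` of `n` training samples.
function subset_indices(n::Int, fraction::Real; rng = Random.default_rng())
    k = round(Int, fraction * n)
    return randperm(rng, n)[1:k]
end

# Augmentation sketch: shift an image by `d1` pixels along the first
# dimension and `d2` pixels along the second, filling the gap with zeros.
function shift_image(img::AbstractMatrix{T}, d1::Int, d2::Int) where {T}
    out = zeros(T, size(img))
    n1, n2 = size(img)
    for i in 1:n1, j in 1:n2
        si, sj = i - d1, j - d2          # source pixel for target (i, j)
        if 1 <= si <= n1 && 1 <= sj <= n2
            out[i, j] = img[si, sj]
        end
    end
    return out
end

# Usage: pick 10% of 60,000 samples and shift a dummy 28x28 image.
idx = subset_indices(60_000, 0.1; rng = MersenneTwister(42))
img = zeros(Float32, 28, 28)
img[10, 10] = 1f0
shifted = shift_image(img, 2, -1)
```

Passing an explicit `rng` keeps the chosen subset reproducible across runs, which matters when comparing classifiers trained on different proportions of the data.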
- Starting point is the kickoff meeting (2025-06-20)
- Project phase with weekly meetings (2025-06-27, 2025-07-04, and 2025-07-11) with your supervisor
- Presentations will finalize the project (2025-07-18)

| Open ⬜ | Done ✅ | In progress ⚙️ | Not working ❌ | Other ❓ |
|---|---|---|---|---|

- 2025-06-20 to 2025-06-27: Preparation
  - ✅ Readme.md
  - ✅ Roadmap
  - ✅ Setup Project
- 2025-06-27: Kickoff meeting w/ group and supervisor
- 2025-06-27 to 2025-07-04: Sprint 1
  - ✅ Repository Cleanup and writing train! function ⭐⭐⭐ (refers to 2)
  - ✅ Write confusion matrix and accuracy function ⭐⭐⭐ (refers to 3)
- 2025-07-05 to 2025-07-11: Sprint 2
  - ✅ Save training models ⭐⭐ (refers to 4)
  - ✅ Evaluation/Discussion/Testing ⭐⭐⭐ (refers to 4/5)
  - ✅ Fix Augmentation and Backend ⭐⭐ (refers to 5)
- 2025-07-12 to 2025-07-16: Sprint 3
  - ✅ Write unit tests ⭐⭐ (refers to general tasks)
  - ✅ Rework Frontend text ⭐⭐ (refers to general tasks)
  - ✅ Preparing the presentation ⭐⭐ (refers to general tasks)
- 2025-07-18: Presentation

    .
    ├── FrontendPluto.jl
    ├── Manifest.toml
    ├── models
    │   └── model_54210.bson
    ├── presentation
    │   ├── Presentation.html
    │   └── Presentation.pdf
    ├── Project.toml
    ├── README.md
    ├── src
    │   ├── augmentation_backend.jl
    │   └── model_backend.jl
    └── tests
        └── test_lenet5.jl

Paul Hain - paulhain.developer@gmail.com
Ilya Acik - ilyaacik.dev@gmail.com
Yusa Kaya - yusakaya.dev@gmail.com