ELEC 395 - AI Applications Laboratory
Experiment 4

Image Datasets and CNNs

Working with CIFAR-10 and Transfer Learning

Laboratory Overview

This laboratory focuses on image datasets and Convolutional Neural Networks (CNNs) for computer vision tasks. You will learn to process the CIFAR-10 dataset, implement CNN architectures from scratch, apply data augmentation techniques, and leverage transfer learning with pre-trained models. Through hands-on exercises on the Jetson Orin Nano, you'll develop practical skills in building and deploying image classification systems.

What You'll Learn

  • Image Data Loading: Use torchvision's ImageFolder and transforms for efficient dataset handling
  • CNN Architecture: Design convolutional layers, pooling layers, and feature extraction pipelines
  • Data Augmentation: Apply transformations to improve model generalization
  • CIFAR-10 Training: Train and evaluate CNNs on a standard benchmark dataset
  • Transfer Learning: Fine-tune pre-trained models (VGG, ResNet) for custom tasks
  • Custom Datasets: Organize and load your own image datasets

💡 Why This Matters

CNNs are the backbone of modern computer vision systems, powering applications from facial recognition and autonomous vehicles to medical image analysis and satellite imagery processing. Understanding how to work with image datasets, design CNN architectures, and apply transfer learning is essential for any AI engineer. By deploying these models on edge devices like the Jetson Orin Nano, you're learning practical skills for real-world deployment scenarios.

Lab Structure

This laboratory consists of three progressive parts, each building upon previous concepts:

  • Part 1: CIFAR-10 Classification with CNNs - Building and training convolutional networks
  • Part 2: Loading Image Data - Working with ImageFolder and custom datasets
  • Part 3: Transfer Learning - Fine-tuning pre-trained models for custom tasks

Learning Objectives

By the end of this laboratory session, you will be able to:

  • Load and preprocess image datasets using torchvision's ImageFolder and transforms modules for efficient data handling in computer vision tasks.
  • Design and implement CNN architectures for image classification, understanding the role of convolutional layers, pooling layers, and feature extraction.
  • Apply data augmentation techniques to improve model generalization and prevent overfitting when working with limited training data.
  • Train CNNs on the CIFAR-10 dataset and evaluate their performance using appropriate metrics for multi-class classification.
  • Implement transfer learning by fine-tuning pre-trained models (VGG, ResNet) for custom image classification tasks.
  • Create custom dataset loaders for organizing and loading your own image datasets using PyTorch's Dataset and DataLoader classes.

Background

Introduction to Image Datasets and CNNs

Image datasets are fundamental to computer vision and deep learning applications. Unlike simple numerical data, images require specialized processing and neural network architectures to effectively learn visual patterns and features. This laboratory explores working with standard image datasets like CIFAR-10 and leveraging transfer learning with pre-trained models.

The CIFAR-10 dataset consists of 60,000 32×32 color images across 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. With 50,000 training images and 10,000 test images, CIFAR-10 serves as an excellent benchmark for learning CNN architectures and training strategies. Despite the small image size, CIFAR-10 remains challenging: at 32×32 resolution objects carry little fine detail, and images within a class vary widely in pose, background, and scale. This makes it well suited to understanding fundamental concepts in computer vision.

Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are specifically designed for processing grid-like data such as images. Unlike fully connected networks that treat images as flat vectors, CNNs preserve spatial relationships through several key mechanisms. Convolutional layers apply learnable filters to detect local patterns such as edges, textures, and shapes. Pooling layers reduce spatial dimensions while retaining important features, making the network more efficient and robust. Through multiple layers, CNNs build feature hierarchies that progress from low-level features (edges, corners) to high-level features (objects, faces).
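To make the roles of the convolutional and pooling layers concrete, here is a minimal sketch of a two-block CNN for 32×32 RGB inputs. The class name SmallCNN and the channel counts (16 and 32) are illustrative assumptions, not the architecture used in the lab solutions. The comments trace how each layer changes the tensor shape.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Illustrative two-block CNN for 32x32 RGB inputs (layer sizes are assumptions)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 3x32x32  -> 16x32x32
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 16x32x32 -> 16x16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 16x16x16 -> 32x16x16
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 32x16x16 -> 32x8x8
        )
        # The classifier sees the flattened 32*8*8 feature maps.
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(torch.flatten(x, start_dim=1))


model = SmallCNN()
batch = torch.randn(4, 3, 32, 32)      # four random stand-in "images"
feature_maps = model.features(batch)
logits = model(batch)
print(feature_maps.shape)  # torch.Size([4, 32, 8, 8])
print(logits.shape)        # torch.Size([4, 10])
```

With padding=1, a 3×3 convolution keeps the spatial size unchanged, so only the pooling layers halve the height and width. That is why two pooling steps take a 32×32 input down to 8×8.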

Data Augmentation

Data augmentation artificially expands training datasets by applying transformations like rotations, flips, crops, and color adjustments to existing images. This technique provides several important benefits: it improves model generalization by exposing the network to varied inputs, reduces overfitting especially when training data is limited, makes models more robust to real-world variations in lighting and orientation, and effectively increases dataset size without collecting new data. In PyTorch, data augmentation is implemented through the transforms module in torchvision.

Transfer Learning

Transfer learning takes a model pre-trained on a large dataset such as ImageNet (the version commonly used for pre-training contains roughly 1.28 million training images across 1,000 categories) and adapts it to a new task. This approach offers several advantages: faster training, since the model starts with learned features rather than random weights; better performance, especially with limited training data; reusable low-level features (edges, textures, shapes) that transfer across different visual domains; and resource efficiency, since you avoid the computational cost of training large models from scratch. In this lab, you'll experiment with fine-tuning pre-trained VGG and ResNet models for custom classification tasks.

Pre-lab Preparation

Before starting the laboratory exercises, watch the following video tutorials from Udacity's "Introduction to Deep Learning with PyTorch" course. These videos provide essential background knowledge and practical demonstrations of the concepts you'll implement in this lab.

📺 Required Video Tutorials

Watch all videos from Udacity - Introduction to Deep Learning with PyTorch

Transfer Learning

Chapter: Introduction to PyTorch

📝 Pre-lab Quiz

Instructions: Complete this quiz after watching the required videos to assess your readiness for the lab. Click on your answer choice to see if it's correct.

Question 1: What is the size of images in the CIFAR-10 dataset?

  • A) 28×28 pixels (grayscale)
  • B) 64×64 pixels (RGB)
  • C) 32×32 pixels (RGB)
  • D) 224×224 pixels (RGB)

Question 2: What is the primary advantage of transfer learning?

  • A) It eliminates the need for training data
  • B) It leverages pre-trained features to improve performance with less data
  • C) It always produces 100% accuracy
  • D) It makes training slower but more accurate

Question 3: What is the purpose of data augmentation in image classification?

  • A) To reduce the size of the dataset
  • B) To make training faster by reducing image resolution
  • C) To increase dataset diversity and improve model generalization
  • D) To compress images for storage efficiency

Question 4: In a CNN, what is the primary function of a convolutional layer?

  • A) To detect local patterns and features in the input image
  • B) To reduce the spatial dimensions of the feature maps
  • C) To flatten the image into a vector
  • D) To perform classification at the output layer

Question 5: Which PyTorch module is commonly used for loading and preprocessing image datasets?

  • A) torch.nn
  • B) torchvision
  • C) torch.optim
  • D) torch.autograd

Question 6: What is the primary purpose of pooling layers in a CNN?

  • A) To increase the spatial dimensions of feature maps
  • B) To reduce spatial dimensions while retaining important features
  • C) To apply non-linear activation functions
  • D) To normalize the input data

Question 7: What does a typical CNN filter (kernel) size of 3×3 mean?

  • A) The filter examines a 3×3 pixel region at a time
  • B) The output image will be 3×3 pixels
  • C) The neural network has 3 input and 3 output layers
  • D) The stride is always 3 pixels

Question 8: Which pre-trained model architecture is known for using residual connections (skip connections)?

  • A) VGG
  • B) ResNet
  • C) AlexNet
  • D) LeNet

Question 9: When loading a custom image dataset using ImageFolder in PyTorch, how should the images be organized?

  • A) All images in a single folder with labels in a CSV file
  • B) Images organized in subdirectories, where each subdirectory name represents a class label
  • C) Images with labels embedded in the filename
  • D) Images in random folders with a separate JSON configuration file

Question 10: Which technique is NOT typically used to prevent overfitting in CNNs?

  • A) Dropout
  • B) Data augmentation
  • C) Increasing the learning rate significantly
  • D) L2 regularization (weight decay)

Note: Discuss your answers with your lab instructor before beginning the practical exercises. Understanding these concepts is crucial for successfully completing the lab.

Lab Procedure

This laboratory is divided into three parts, each focusing on a specific aspect of working with image datasets and CNNs. Complete each part sequentially, as they build upon each other. Each part includes exercises that you should attempt before viewing the solutions.

⚠️ Before You Begin:
  • Ensure your Jetson Orin Nano is powered on and connected
  • Verify that PyTorch and torchvision have been pre-configured by the lab technician
  • All datasets will be automatically downloaded when running the exercises
  • Work through parts in order - later parts build on earlier concepts

Part 1: CIFAR-10 Classification with CNNs

Build and train a convolutional neural network to classify images from the CIFAR-10 dataset. You'll design a CNN architecture, implement training loops, apply data augmentation, and evaluate model performance on test data.

Key Topics:

  • CNN architecture design for image classification
  • CIFAR-10 dataset loading and preprocessing
  • Data augmentation techniques
  • Training and validation loops
  • Model evaluation and accuracy metrics
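The training and evaluation loops listed above generally follow the pattern sketched below. This is not the lab solution: the tiny model, the hyperparameters, and the helper names train_one_epoch and evaluate are assumptions, and random tensors stand in for CIFAR-10 so the example runs without a download.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Random stand-in data with CIFAR-10 shapes: 64 images, 10 classes.
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 10),
).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)


def train_one_epoch(model, loader):
    """Run one pass over the data and return the mean loss per sample."""
    model.train()
    running_loss = 0.0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()              # clear gradients from the previous step
        loss = criterion(model(inputs), targets)
        loss.backward()                    # backpropagate
        optimizer.step()                   # update the weights
        running_loss += loss.item() * inputs.size(0)
    return running_loss / len(loader.dataset)


@torch.no_grad()
def evaluate(model, loader):
    """Return the fraction of samples classified correctly."""
    model.eval()
    correct = 0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        correct += (model(inputs).argmax(dim=1) == targets).sum().item()
    return correct / len(loader.dataset)


losses = [train_one_epoch(model, loader) for _ in range(3)]
accuracy = evaluate(model, loader)
print(losses, accuracy)
```

In the lab you would evaluate on a held-out validation or test loader rather than the training data used here. Switching between model.train() and model.eval() matters once the network contains dropout or batch normalization layers.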

Part 2: Loading Image Data

Learn to organize, load, and preprocess custom image datasets using PyTorch's ImageFolder and transforms. You'll work with image datasets, implement data loaders, apply preprocessing pipelines, and visualize loaded images.

Key Topics:

  • ImageFolder for dataset organization
  • Data transforms and normalization
  • DataLoader configuration
  • Batch processing
  • Custom dataset creation

Part 3: Transfer Learning

Apply transfer learning by fine-tuning a pre-trained model (VGG or ResNet) for a custom classification task. You'll load pre-trained weights, freeze/unfreeze layers, replace the classifier head, and compare performance with training from scratch.

Key Topics:

  • Pre-trained model loading (VGG, ResNet)
  • Feature extraction vs fine-tuning
  • Layer freezing techniques
  • Classifier head modification
  • Performance comparison with training from scratch

💡 Tips for Success

  • Work sequentially - Each part builds on previous concepts
  • Experiment - Try modifying hyperparameters to see effects
  • Monitor training - Watch loss curves and accuracy metrics
  • Take screenshots - Capture all outputs and plots for your report
  • Ask questions - Consult your instructor when stuck

Lab Materials

Hardware Requirements

  • Platform: Jetson Orin Nano Developer Kit (assembled in Week 1)
  • Memory: Minimum 4GB RAM available
  • Storage: At least 2GB free space for datasets
  • Connection: Internet access for downloading CIFAR-10 dataset

Software Prerequisites

Pre-installed by the lab technician on your Jetson Orin Nano:

  • Python: Version 3.8 or higher
  • PyTorch: Version compatible with Jetson (with CUDA support)
  • torchvision: For dataset loading and transformations
  • NumPy: For numerical operations
  • Matplotlib: For visualization

Datasets

  • CIFAR-10: Automatically downloaded by torchvision when running the exercises
  • Size: Approximately 170MB compressed
  • Classes: 10 categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)
  • Images: 60,000 total (50,000 training + 10,000 test), 32×32 RGB
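For a sense of scale, the figures above can be turned into a rough memory estimate. The short calculation below works out the uncompressed size of all 60,000 images as 8-bit pixels, and the size if the whole dataset were held in memory as 32-bit floats. It is back-of-the-envelope arithmetic only and ignores labels and any framework overhead.

```python
NUM_IMAGES = 60_000
HEIGHT = WIDTH = 32
CHANNELS = 3

values_per_image = HEIGHT * WIDTH * CHANNELS     # one value per pixel per channel
raw_uint8_bytes = NUM_IMAGES * values_per_image  # 1 byte per value
float32_bytes = raw_uint8_bytes * 4              # 4 bytes per value

print(values_per_image)       # 3072
print(raw_uint8_bytes / 1e6)  # 184.32 (MB, uncompressed 8-bit pixels)
print(float32_bytes / 1e6)    # 737.28 (MB, if fully converted to float32)
```

The uncompressed 8-bit size is close to the download size, and even a full float32 copy fits within the stated 4GB of RAM, although in practice the DataLoader converts images batch by batch rather than all at once.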

Included Files

All necessary code and helper functions are included in the three lab parts. Each HTML file is self-contained and ready to execute on your Jetson Orin Nano through a Jupyter Notebook interface.

  • Part 1: CIFAR-10 CNN Exercise and Solution HTML files
  • Part 2: Loading Image Data Exercise and Solution HTML files
  • Part 3: Transfer Learning Exercise and Solution HTML files

References & Resources

Primary Course Material

Source: Udacity - Introduction to Deep Learning with PyTorch

Chapters: Convolutional Neural Networks & Introduction to PyTorch

See Pre-lab Preparation section above for complete video tutorial list

Additional Resources

  • PyTorch Documentation: https://pytorch.org/docs/
  • torchvision Documentation: https://pytorch.org/vision/stable/
  • CIFAR-10 Dataset: https://www.cs.toronto.edu/~kriz/cifar.html
  • Transfer Learning Tutorial: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html

Lab Report Requirements

Students must submit a comprehensive lab report demonstrating their understanding of CNNs and transfer learning. The report should showcase practical skills acquired through the three laboratory parts and include evidence of all completed exercises.

⚠️ Submission Deadline:

Submit your completed lab report by [Insert Deadline - Typically 1 week after lab session]. Late submissions will be penalized according to course policy (10% per day, maximum 3 days).

Report Structure

Your lab report must include the following sections:

1. Title Page & Formatting (5 points)

  • Lab title, your name, student ID, date, course name, and instructor name
  • Professional formatting with clear headers and page numbers

2. Objectives (10 points)

  • List all learning objectives
  • Briefly explain why each is important (1-2 sentences each)

3. Procedure & Results (50 points)

For each of the 3 parts:

  • Include code snippets with clear outputs
  • Provide screenshots of key results
  • Add plots where applicable (loss curves, accuracy graphs, etc.)
  • Explain what each part demonstrates

4. Discussion (20 points)

  • Analyze your experimental results
  • Compare CNN performance on CIFAR-10 with different architectures
  • Discuss the benefits of transfer learning vs. training from scratch
  • Support all statements with evidence from your experiments

5. Challenges & Solutions (10 points)

  • Describe problems you encountered
  • Explain your debugging process
  • Reflect on what you learned from solving these challenges

6. Conclusion (5 points)

  • Summarize key learnings
  • Reflect on the most challenging concepts
  • Discuss potential applications of this knowledge

📋 Submission Checklist

Before submitting, ensure you have:

  • ✓ Completed all 3 lab parts with working, tested code
  • ✓ Included clear screenshots of all outputs, plots, and visualizations
  • ✓ Answered all discussion questions thoroughly with supporting evidence
  • ✓ Documented challenges and solutions in detail
  • ✓ Checked all code for errors and verified all functions execute correctly
  • ✓ Formatted report professionally with clear section headers and page numbers
  • ✓ Referenced all sources and datasets used
  • ✓ Proofread for grammar, spelling, and technical accuracy
  • ✓ Verified all images are clear, properly labeled, and referenced in text
  • ✓ Included your name and student ID on all pages

📤 Submission Format

  • File Format: Submit the report as a PDF document (required)
  • Code Files: Include Jupyter notebooks (.ipynb) in a separate ZIP file
  • File Naming Convention:
    • Report: Week4_[YourLastName]_[StudentID].pdf
    • Code: Week4_[YourLastName]_[StudentID]_Code.zip
    • Example: Week4_Ahmed_202012345.pdf
  • Submission Method: Upload to University LMS (Blackboard/Moodle)
  • File Size Limit: Maximum 50MB total
    • If exceeded, compress images or use PDF compression tools
    • Ensure PDF is searchable text, not scanned images
  • Required Components:
    • 1. Main PDF lab report
    • 2. ZIP file containing all Jupyter notebooks with outputs
    • 3. Any modified helper files (if applicable)

Important:
  • Ensure the PDF is not password-protected
  • All code must be properly commented and executable
  • Include all necessary imports and dependencies
  • Test that your notebooks run completely from top to bottom

📊 Grading Rubric

Component                | Points | Criteria
Title Page & Formatting  | 5      | Complete, professional
Objectives               | 10     | Clear, comprehensive
Procedure & Results      | 50     | All parts complete, correct code, proper outputs
Discussion               | 20     | Thoughtful analysis, supported by results
Challenges & Solutions   | 10     | Detailed problem-solving process
Conclusion               | 5      | Reflective, insightful
Total                    | 100    |

Grading Notes:

  • All code must execute without errors for full credit
  • Screenshots must be clear, properly labeled, and referenced
  • All discussion questions must be answered with supporting evidence
  • Late penalty: 10% per day (up to 3 days)
  • Plagiarism will result in zero credit