Week 15: Final Examination - Comprehensive Assessment

📋

Examination Overview


The Final Examination for ELEC 395 is a comprehensive assessment covering 12 weeks of practical AI and robotics laboratory work (Weeks 1-5, 7-12, and 14). This examination evaluates your understanding of fundamental AI concepts, neural network architectures, PyTorch implementation, autonomous robotics, sensor interfacing, and edge AI deployment. The exam is designed to test both the theoretical knowledge and the practical problem-solving skills you've developed throughout the semester.

📅

When

Week 15

During scheduled lab time

⏱️

Duration

2-3 Hours

Total examination time

📝

Format

Hybrid

Written + Practical

💯

Total Points

100

Comprehensive assessment

📚 Exam Coverage

The final examination covers material from the following weeks:

Weeks 1-5: Python for hardware, ML/AI fundamentals, neural networks (ANN/DNN/CNN), PyTorch, data representation, ML libraries
Week 7: Sensor interface, CSI cameras, I²C communication, environmental sensing
Week 8: Classification algorithms (SVM, Random Forest)
Week 9: Regression techniques (Linear, Polynomial)
Weeks 10-12: Autonomous driving - collision avoidance, object detection with YOLOv8, road following
Week 14: Large Language Models (LLMs) and Ollama edge deployment

Note: Week 6 (Midterm Exam) and Week 13 (Project Presentations) are NOT included in the final exam coverage.

📖 Open/Closed Book Policy

To Be Announced: Your instructor will announce whether this exam is open-book or closed-book during the review session. Please check your course announcements for the final policy.

Suggested: Prepare as if the exam is closed-book, and you'll be ready for either format!

📖

Detailed Topics Covered


Below is a comprehensive breakdown of the topics from each covered week that will be assessed in the final examination:

Week 1

Python for Real Hardware - Jetson & Raspberry Pi

Key Topics:

NVIDIA Jetson Orin Nano specifications and capabilities (6-core ARM Cortex-A78AE CPU, NVIDIA Ampere GPU, 8GB memory)
JetBot robot assembly, component identification, and system architecture
Hardware interfaces: GPIO, I2C, SPI, UART protocols
Power management and thermal considerations for embedded systems
Basic Linux commands for embedded systems administration
Raspberry Pi comparison and use cases

Week 2

Fundamentals of Machine Learning and Artificial Intelligence

Key Topics:

Introduction to artificial intelligence and machine learning concepts
Supervised vs. unsupervised vs. reinforcement learning paradigms
Gradient descent optimization algorithm and its variants (SGD, Adam, RMSprop)
Loss functions: MSE, cross-entropy, binary cross-entropy
Activation functions: ReLU, Sigmoid, Tanh, Softmax - properties and use cases
Backpropagation and chain rule fundamentals
Overfitting, underfitting, and regularization techniques

Week 3

Neural Network Architectures - ANN, DNN, CNN

Key Topics:

PyTorch tensors: creation, manipulation, and operations
Artificial Neural Networks (ANN) structure and forward propagation
Deep Neural Networks (DNN) with multiple hidden layers
Convolutional Neural Networks (CNN) architecture and components
Convolution operations, kernels/filters, stride, and padding
Pooling layers: max pooling, average pooling
Training neural networks in PyTorch: optimizer, loss, training loop
Fashion-MNIST dataset and image classification
Model evaluation: accuracy, precision, recall, F1-score

Week 4

Data Representation and Image Datasets

Key Topics:

MNIST handwritten digit dataset structure and applications
CIFAR-10 dataset: 10 classes of color images
IRIS dataset for classification tasks
Image preprocessing: normalization, resizing, augmentation
Data loading and batching in PyTorch (DataLoader, Dataset classes)
Train/validation/test split strategies
CNN architectures for image classification
Transfer learning concepts

Week 5

Python ML Libraries - PyTorch & TensorFlow

Key Topics:

PyTorch framework fundamentals and tensor operations
TensorFlow/Keras basics and comparison with PyTorch
Scikit-learn for classical machine learning algorithms
Model training, validation, and testing procedures
Hyperparameter tuning strategies
Model saving and loading (state_dict, checkpoints)
GPU acceleration and CUDA operations
Practical deep learning workflows

Week 7

Sensor Interface and Data Acquisition

Key Topics:

CSI (Camera Serial Interface) cameras and MIPI protocol
IMX219 camera specifications and configuration
I²C communication protocol and addressing
QWIIC system for sensor connectivity
BME280 environmental sensor (temperature, humidity, pressure)
GStreamer for video capture and processing
Camera parameters: resolution, frame rate, exposure, white balance
Sensor data collection and processing for AI applications

Week 8

Classification Algorithms

Key Topics:

Support Vector Machines (SVM): linear and non-linear kernels
Decision Trees: splitting criteria, pruning, depth control
Random Forest: ensemble learning, bagging, feature importance
K-Nearest Neighbors (KNN) algorithm
Classification metrics: confusion matrix, accuracy, precision, recall
Cross-validation techniques
Feature engineering and selection
Comparison of classical ML vs. deep learning for classification

Week 9

Regression Techniques

Key Topics:

Linear regression: simple and multiple regression
Polynomial regression for non-linear relationships
Least squares method and cost functions
Regularization in regression: L1 (Lasso), L2 (Ridge)
Regression metrics: MSE, RMSE, R-squared, MAE
Feature scaling and normalization importance
Neural network-based regression
Prediction vs. interpolation vs. extrapolation

Week 10

Autonomous Driving - Collision Avoidance

Key Topics:

Differential drive robot control and motor commands
Data collection for collision avoidance (blocked vs. free paths)
Training CNNs for binary classification (collision detection)
Real-time inference on edge devices
Motor control integration with vision system
Emergency stop mechanisms and safety protocols
Model optimization for low-latency inference
Testing and validation of autonomous behaviors

Week 11

Object Detection and Following with YOLOv8

Key Topics:

YOLO (You Only Look Once) architecture and evolution
YOLOv8 implementation using Ultralytics framework
COCO dataset and 80 object classes
Bounding box predictions and confidence scores
Object tracking algorithms and filtering
Visual servo control for object following
Proportional control for steering based on object position
Real-time object detection performance optimization

Week 12

Autonomous Road Following

Key Topics:

Regression-based road following vs. classification-based collision avoidance
Data collection: interactive labeling of road center coordinates
ResNet-18 architecture for regression tasks
Training neural networks to predict continuous steering coordinates
Coordinate-to-motor-command transformation
Smooth control and trajectory planning
Combining multiple AI models (collision avoidance + road following)
Real-world deployment and testing strategies

Week 14

Large Language Models (LLMs) and Edge Deployment

Key Topics:

Introduction to Large Language Models (LLMs) and transformer architecture
Ollama framework for running LLMs locally
Edge AI deployment considerations: memory, compute, latency
Quantization techniques for model compression
Running LLMs on Jetson Orin Nano
Prompt engineering and inference optimization
Applications of edge LLMs in robotics
Privacy and security benefits of on-device AI

📊

Exam Format & Structure


The final examination consists of two major components:

✏️ Part 1: Written Component (60 Points)

Duration: Approximately 60-75 minutes

Question Types:

Question Type	Number of Questions	Points Each	Total Points
Multiple Choice Questions (MCQ)	30	1	30
True/False Questions	15	1	15
Short Answer Questions	5	3	15
Total Written Component	50	-	60

Written Component Details:

MCQs: Cover conceptual understanding across all topics with balanced representation from each week
True/False: Test key facts, principles, and common misconceptions
Short Answers: Require brief explanations (3-5 sentences) demonstrating understanding of core concepts
Topics Emphasis: Heavier weight on autonomous driving (Weeks 10-12) and neural networks (Weeks 3-5)

💻 Part 2: Practical/Coding Component (40 Points)

Duration: Approximately 45-60 minutes

Programming Tasks:

Task Type	Description	Points
PyTorch Tensor Operations	Basic tensor creation, manipulation, and operations	8
Neural Network Implementation	Build simple ANN/CNN architecture in PyTorch	12
Data Processing	Load, preprocess, and visualize image dataset	8
Model Training/Inference	Complete training loop or inference pipeline	12
Total Practical Component		40

Practical Component Details:

Work on a pre-configured Jetson Orin Nano or provide solutions in a Jupyter Notebook
Code must be functional and demonstrate understanding of AI concepts
Partial credit awarded for correct approach even if final output has minor errors
Focus on PyTorch, data processing, and model implementation

⚖️ Total Exam Composition

Component	Points	Percentage
Written (MCQ + T/F + Short Answer)	60	60%
Practical (Coding Tasks)	40	40%
Total	100	100%

Final Exam Weight in Course Grade: 30% (as per course syllabus)

📝

Practice Question Bank


This practice question bank is organized by week and question type. Each question is followed by its answer and an explanation. Use it to test your knowledge and identify areas that need more study.

Week 1: Python for Real Hardware - Practice Questions


Multiple Choice Questions

1. What is the primary GPU architecture used in the NVIDIA Jetson Orin Nano?

A) Pascal
B) Volta
C) Ampere
D) Turing

Correct Answer: C) Ampere

The Jetson Orin Nano features NVIDIA's Ampere GPU architecture with 1024 CUDA cores, providing efficient AI inference capabilities for edge computing applications.

2. Which communication protocol uses only two wires (SDA and SCL) for connecting multiple sensors?

A) SPI
B) UART
C) I²C
D) GPIO

Correct Answer: C) I²C

I²C (Inter-Integrated Circuit) is a two-wire protocol using SDA (Serial Data) and SCL (Serial Clock) lines, allowing multiple devices to share the same bus with unique addresses.

3. How much RAM does the Jetson Orin Nano Developer Kit have?

A) 4GB
B) 8GB
C) 16GB
D) 32GB

Correct Answer: B) 8GB

The Jetson Orin Nano includes 8GB of unified memory shared between the CPU and GPU, enabling efficient AI processing without data transfer bottlenecks.

True/False Questions

4. The Jetson Orin Nano uses an x86 processor architecture like typical desktop computers.

Correct Answer: FALSE

The Jetson Orin Nano uses a 6-core ARM Cortex-A78AE processor, not an x86 architecture. ARM processors are generally more power-efficient and are commonly used in embedded systems.

5. GPIO pins on the Jetson can be configured as either input or output depending on the application requirements.

Correct Answer: TRUE

GPIO (General Purpose Input/Output) pins are flexible and can be programmed to function as either inputs (reading sensor data) or outputs (controlling actuators) based on software configuration.

Short Answer Questions

6. Explain the main advantages of using the Jetson Orin Nano for AI applications compared to a standard Raspberry Pi.

Model Answer:

The Jetson Orin Nano offers significant advantages for AI applications including dedicated GPU hardware with 1024 CUDA cores for parallel processing, built-in support for popular AI frameworks like PyTorch and TensorFlow, hardware-accelerated inference engines, and higher memory bandwidth. While the Raspberry Pi is excellent for general-purpose computing and IoT projects, the Jetson is specifically designed for AI workloads with capabilities up to 40 TOPS (Tera Operations Per Second) of AI performance, making it ideal for real-time computer vision, object detection, and autonomous robotics applications.

Week 2: ML/AI Fundamentals - Practice Questions


Multiple Choice Questions

1. Which optimization algorithm adapts the learning rate for each parameter individually?

A) Gradient Descent
B) Stochastic Gradient Descent (SGD)
C) Adam
D) Batch Gradient Descent

Correct Answer: C) Adam

Adam (Adaptive Moment Estimation) combines the benefits of RMSprop and momentum by maintaining adaptive learning rates for each parameter. It computes exponential moving averages of both gradients and squared gradients, making it highly effective for training deep neural networks.
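The update described above can be sketched in plain Python for a single parameter. This is an illustrative sketch, not a replacement for torch.optim.Adam: the function name adam_minimize, the toy objective f(x) = (x - 3)^2, and the hyperparameter values are assumptions chosen for the example.

```python
import math

def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a 1-D function with the Adam update rule, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # moving average of gradients
        v = beta2 * v + (1 - beta2) * g * g    # moving average of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # per-parameter scaled step
    return x

# Toy objective (hypothetical): f(x) = (x - 3)^2, so the gradient is 2 * (x - 3)
x_min = adam_minimize(lambda x: 2 * (x - 3.0), x0=0.0)
```

Dividing by the square root of the squared-gradient average is what gives each parameter its own effective learning rate.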

2. What is the primary purpose of the ReLU activation function?

A) Normalize inputs between 0 and 1
B) Introduce non-linearity while avoiding vanishing gradients
C) Convert outputs to probability distribution
D) Reduce overfitting

Correct Answer: B) Introduce non-linearity while avoiding vanishing gradients

ReLU (Rectified Linear Unit) outputs max(0, x), providing the non-linearity necessary for learning complex patterns while keeping a gradient of exactly 1 for positive inputs. This mitigates the vanishing gradient problem common with sigmoid/tanh functions in deep networks, although units whose inputs stay negative receive zero gradient.
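A small numeric sketch can make the gradient argument concrete. It compares the best-case sigmoid derivative (0.25, at input 0) with the ReLU derivative for a positive input, multiplied across a hypothetical 10-layer chain. The depth and the input values are assumptions for illustration.

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x == 0

depth = 10  # hypothetical number of stacked layers
sigmoid_chain = sigmoid_grad(0.0) ** depth  # 0.25 ** 10, even in the best case
relu_chain = relu_grad(1.0) ** depth        # stays at 1.0 for active units
```

Even at its maximum, the sigmoid derivative shrinks the backpropagated signal to under one millionth after ten layers, while active ReLU units pass it through unchanged.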

3. Which loss function is typically used for multi-class classification problems?

A) Mean Squared Error (MSE)
B) Mean Absolute Error (MAE)
C) Binary Cross-Entropy
D) Categorical Cross-Entropy

Correct Answer: D) Categorical Cross-Entropy

Categorical cross-entropy is designed for multi-class classification where each sample belongs to exactly one class. It measures the dissimilarity between predicted probability distribution and true distribution, working with softmax activation in the output layer.
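As a minimal sketch, with a one-hot true label the categorical cross-entropy reduces to the negative log of the probability assigned to the correct class. The probability vectors below are made-up values for illustration.

```python
import math

def cross_entropy(probs, true_class):
    """Categorical cross-entropy for one sample with a one-hot true label."""
    return -math.log(probs[true_class])

# Hypothetical softmax outputs for a 3-class problem where class 1 is correct
confident_right = cross_entropy([0.05, 0.90, 0.05], true_class=1)
confident_wrong = cross_entropy([0.90, 0.05, 0.05], true_class=1)
```

The loss is small when the model is confidently right and grows sharply when it is confidently wrong.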

True/False Questions

4. Gradient descent always finds the global minimum of the loss function.

Correct Answer: FALSE

Gradient descent can get stuck in local minima, especially for non-convex loss functions common in deep learning. While techniques like momentum and adaptive learning rates help, there's no guarantee of reaching the global minimum. However, in practice, local minima in high-dimensional spaces often perform adequately.

5. The Softmax activation function is commonly used in the output layer for multi-class classification because it converts logits into probability distributions.

Correct Answer: TRUE

Softmax transforms raw output scores (logits) into probabilities that sum to 1.0, making it ideal for multi-class classification. Each output represents the probability of the input belonging to a particular class.
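A plain-Python sketch of softmax is shown here for study purposes. Subtracting the largest logit before exponentiating is a common numerical-stability trick and does not change the result. The example logits are arbitrary.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)                           # avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # arbitrary example logits
```

The largest logit always receives the largest probability, and equal logits receive equal probabilities.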

Short Answer Questions

6. Explain the concept of backpropagation and why it's essential for training neural networks.

Model Answer:

Backpropagation is the fundamental algorithm for training neural networks by computing gradients of the loss function with respect to each weight. It works backward from the output layer to the input layer using the chain rule of calculus to efficiently calculate how each weight contributed to the error. These gradients indicate the direction and magnitude needed to adjust weights to reduce loss. Without backpropagation, we couldn't efficiently train deep networks because computing gradients individually for millions of parameters would be computationally prohibitive. This algorithm made modern deep learning feasible.
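To illustrate the chain rule at the smallest possible scale, the sketch below backpropagates through a single sigmoid neuron with a squared-error loss and checks the result against a finite-difference estimate. The neuron, the loss, and the input values are assumptions for the example. In PyTorch, loss.backward() performs this bookkeeping automatically.

```python
import math

def forward(w, b, x, y):
    z = w * x + b
    y_hat = 1.0 / (1.0 + math.exp(-z))  # sigmoid activation
    loss = (y_hat - y) ** 2             # squared error
    return y_hat, loss

def backward(w, b, x, y):
    """Apply the chain rule from the loss back to w and b."""
    y_hat, _ = forward(w, b, x, y)
    dloss_dyhat = 2.0 * (y_hat - y)
    dyhat_dz = y_hat * (1.0 - y_hat)
    dloss_dz = dloss_dyhat * dyhat_dz
    return dloss_dz * x, dloss_dz       # dz/dw = x, dz/db = 1

def numerical_grad_w(w, b, x, y, h=1e-6):
    # Central finite difference, used only to sanity-check the analytic gradient
    return (forward(w + h, b, x, y)[1] - forward(w - h, b, x, y)[1]) / (2 * h)

w, b, x, y = 0.5, -0.2, 1.5, 1.0  # arbitrary example values
grad_w, grad_b = backward(w, b, x, y)
check_w = numerical_grad_w(w, b, x, y)
```

The finite-difference check needs two forward passes per parameter, which is why it is impractical for millions of weights, whereas backpropagation obtains all gradients from one backward pass.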

Week 3: Neural Network Architectures - Practice Questions


Multiple Choice Questions

1. In PyTorch, what is the purpose of the optimizer.zero_grad() call?

A) Reset the model weights to zero
B) Clear accumulated gradients from previous iterations
C) Initialize the optimizer
D) Set the learning rate to zero

Correct Answer: B) Clear accumulated gradients from previous iterations

In PyTorch, gradients accumulate by default with each backward() call. Calling optimizer.zero_grad() clears these accumulated gradients before computing new ones, preventing incorrect gradient values from mixing across training iterations.

2. What is the primary advantage of max pooling in convolutional neural networks?

A) Increases the number of parameters
B) Reduces spatial dimensions and provides translation invariance
C) Adds non-linearity to the network
D) Prevents overfitting by adding dropout

Correct Answer: B) Reduces spatial dimensions and provides translation invariance

Max pooling downsamples feature maps by selecting maximum values within pooling windows, reducing computational cost and memory usage. It also provides translation invariance, meaning small shifts in input position don't dramatically change the output, making the network more robust.
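A minimal sketch of 2×2 max pooling with stride 2 on a nested list follows. The feature map values are made up, and the function assumes even height and width.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D list (even height and width assumed)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool_2x2(fmap)  # 4x4 input becomes 2x2 output
```

Each output value depends only on the maximum inside its window, so moving that maximum to another position within the same window leaves the output unchanged.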

3. In a convolutional layer with kernel size 3×3, stride 1, and padding 1, what happens to the spatial dimensions of the input?

A) They increase
B) They decrease
C) They remain the same
D) They double

Correct Answer: C) They remain the same

With a 3×3 kernel, padding=1, and stride=1, the output spatial dimensions equal the input dimensions. The formula is: output_size = floor((input_size + 2×padding - kernel_size) / stride) + 1. For example, (H + 2×1 - 3)/1 + 1 = H, so dimensions are preserved.
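The formula is easy to wrap in a helper for checking layer shapes by hand. The function name conv_output_size is hypothetical. Integer division is used because frameworks round the result down when the stride does not divide evenly.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

same = conv_output_size(28, kernel_size=3, stride=1, padding=1)    # size preserved
valid = conv_output_size(28, kernel_size=3, stride=1, padding=0)   # shrinks by 2
pooled = conv_output_size(28, kernel_size=2, stride=2, padding=0)  # halved
```

The same helper works for pooling layers, since they follow the same size arithmetic.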

4. What does the Fashion-MNIST dataset contain?

A) Colored images of fashion products
B) Grayscale 28×28 images of 10 clothing categories
C) High-resolution photos of fashion models
D) Text descriptions of clothing items

Correct Answer: B) Grayscale 28×28 images of 10 clothing categories

Fashion-MNIST is a drop-in replacement for MNIST containing 70,000 grayscale images (60,000 training, 10,000 test) at 28×28 pixels, covering 10 categories: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, and Ankle boot.

True/False Questions

5. In PyTorch, tensors can be moved between CPU and GPU using the .to() method.

Correct Answer: TRUE

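PyTorch tensors can be transferred between devices using .to('cuda') for GPU or .to('cpu') for CPU. This is essential for GPU-accelerated training. For a tensor, .to() returns a tensor on the target device rather than moving the original in place, so assign the result. For example: tensor = tensor.to('cuda:0') places the tensor on the first GPU.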

6. Convolutional layers have more parameters than fully connected layers of similar input/output sizes.

Correct Answer: FALSE

Convolutional layers have significantly fewer parameters due to weight sharing across spatial locations. A conv layer with 64 filters of size 3×3×3 has only 1,728 weights (1,792 parameters including the 64 biases), while a fully connected layer from a 224×224×3 input to 64 outputs would have over 9.6 million weights!
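The counts can be reproduced with two small helpers. The function names are hypothetical. The formulas are the standard weight and bias counts for a 2-D convolution with square kernels and for a fully connected layer.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Parameter count of a 2-D convolution with square kernels."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

def linear_params(in_features, out_features, bias=True):
    """Parameter count of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)

conv_weights = conv2d_params(3, 64, 3, bias=False)             # 64 filters of 3x3x3
conv_total = conv2d_params(3, 64, 3)                           # weights plus 64 biases
fc_weights = linear_params(224 * 224 * 3, 64, bias=False)      # flattened image to 64 units
```

The convolution count does not depend on the image size at all, which is the practical consequence of weight sharing.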

Short Answer Questions

7. Describe the typical training loop structure in PyTorch for a classification task.

Model Answer:

A typical PyTorch training loop includes: (1) Set model to training mode with model.train(), (2) Iterate through batches from DataLoader, (3) Clear gradients with optimizer.zero_grad(), (4) Forward pass to compute predictions, (5) Calculate loss using appropriate loss function, (6) Backward pass with loss.backward() to compute gradients, (7) Update weights with optimizer.step(). After each epoch, validate on validation set using model.eval() and torch.no_grad() context. Track metrics like loss and accuracy to monitor training progress and detect overfitting.
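The seven steps can be sketched as follows. This is a minimal illustration, assuming PyTorch is installed. The synthetic data, the layer sizes, the learning rate, and the helper names train_one_epoch and evaluate are all assumptions made for the example, not the course's lab code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (hypothetical): 20 features, label = index of the
# largest of the first three features, so the task is learnable
X = torch.randn(256, 20)
y = X[:, :3].argmax(dim=1)
train_loader = DataLoader(TensorDataset(X[:200], y[:200]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(X[200:], y[200:]), batch_size=32)

model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 3))
criterion = nn.CrossEntropyLoss()  # expects raw logits and integer class labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

def train_one_epoch():
    model.train()                        # (1) training mode
    total_loss = 0.0
    for xb, yb in train_loader:          # (2) iterate over batches
        optimizer.zero_grad()            # (3) clear old gradients
        logits = model(xb)               # (4) forward pass
        loss = criterion(logits, yb)     # (5) compute loss
        loss.backward()                  # (6) backward pass
        optimizer.step()                 # (7) update weights
        total_loss += loss.item() * xb.size(0)
    return total_loss / len(train_loader.dataset)

def evaluate():
    model.eval()                         # evaluation mode
    correct = 0
    with torch.no_grad():                # no gradient tracking during validation
        for xb, yb in val_loader:
            correct += (model(xb).argmax(dim=1) == yb).sum().item()
    return correct / len(val_loader.dataset)

history = [train_one_epoch() for _ in range(5)]
val_accuracy = evaluate()
```

Comparing the per-epoch training loss in history with the validation accuracy is the simplest way to spot overfitting: training loss keeps falling while validation performance stalls or degrades.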

Week 4: Data Representation - Practice Questions


Multiple Choice Questions

1. What is the primary difference between MNIST and CIFAR-10 datasets?

A) MNIST has more classes than CIFAR-10
B) CIFAR-10 images are color while MNIST is grayscale
C) MNIST is used for object detection, CIFAR-10 for classification
D) They contain the same data in different formats

Correct Answer: B) CIFAR-10 images are color while MNIST is grayscale

MNIST contains grayscale 28×28 images of handwritten digits (0-9), while CIFAR-10 contains color 32×32 RGB images of 10 object categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck). CIFAR-10 is significantly more challenging due to color information, higher resolution, and more complex object classes.

2. What is the purpose of normalizing image data before feeding it to a neural network?

A) To increase image resolution
B) To speed up training and improve convergence
C) To reduce the number of classes
D) To convert color images to grayscale

Correct Answer: B) To speed up training and improve convergence

Normalization scales pixel values (typically 0-255) to a smaller range (e.g., 0-1 or -1 to 1), ensuring all features have similar magnitudes. This stabilizes gradient descent, allows for higher learning rates, and leads to faster, more reliable convergence. Common practice is to normalize using dataset mean and standard deviation.
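Both normalization styles mentioned above can be sketched in plain Python. The pixel values are made up, and in practice the mean and standard deviation would be computed once over the training set (per channel for color images) rather than over a single image.

```python
def scale_to_unit(pixels):
    """Map 8-bit pixel values (0-255) to the range 0-1."""
    return [p / 255.0 for p in pixels]

def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / std for v in values], mean, std

raw = [0, 64, 128, 255]  # hypothetical pixel intensities
unit = scale_to_unit(raw)
standardized, mean, std = standardize(unit)
```

After standardization the values are centered on zero with unit spread, which keeps early-layer activations and gradients in a similar range across features.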

3. What is the purpose of PyTorch's DataLoader class?

A) To define neural network architectures
B) To load data in batches and handle shuffling/parallelization
C) To save trained models
D) To visualize training progress

Correct Answer: B) To load data in batches and handle shuffling/parallelization

DataLoader wraps a Dataset and provides convenient batching, shuffling, parallel data loading (with multiple workers), and automatic collation of samples. It's essential for efficient training, especially with large datasets that don't fit in memory.

True/False Questions

4. Data augmentation techniques like random flips and rotations help reduce overfitting by artificially increasing the training dataset size.

Correct Answer: TRUE

Data augmentation applies random transformations (flips, rotations, crops, color jittering) during training, creating varied versions of each image. This effectively increases dataset diversity without collecting new data, helping the model generalize better and reducing overfitting on the training set.

5. A proper train/validation/test split should have the test set evaluated multiple times during training to tune hyperparameters.

Correct Answer: FALSE

The test set should be used ONLY ONCE at the very end for final model evaluation. Hyperparameter tuning should be done using the validation set. Using the test set multiple times leads to "overfitting on the test set" where hyperparameters are implicitly tuned to that specific data, producing overly optimistic performance estimates.
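One way to keep the three sets separate is to split the sample indices once, up front. The sketch below uses only the standard library. The 70/15/15 proportions, the seed, and the function name split_indices are assumptions for illustration.

```python
import random

def split_indices(n_samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle sample indices once and cut them into train/val/test groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed makes the split reproducible
    n_test = round(n_samples * test_fraction)
    n_val = round(n_samples * val_fraction)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(1000)
```

Hyperparameters are then tuned against val_idx only, and test_idx is touched a single time for the final report.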

Short Answer Questions

6. Explain the concept of transfer learning and why it's beneficial for image classification tasks.

Model Answer:

Transfer learning involves using a pre-trained model (typically trained on ImageNet with millions of images) as a starting point for a new task. The early layers learn general features like edges, textures, and shapes that are useful across many vision tasks. By initializing with these pre-learned features and fine-tuning on your specific dataset, you can achieve better performance with less data and training time. This is especially valuable when you have limited labeled data, as the model already understands fundamental visual concepts and only needs to adapt to your specific classes.

Week 5: Python ML Libraries - Practice Questions


Multiple Choice Questions

1. What is the main difference between PyTorch and TensorFlow in terms of computational graph construction?

A) PyTorch uses static graphs, TensorFlow uses dynamic graphs
B) PyTorch uses dynamic graphs, TensorFlow traditionally uses static graphs
C) Both use the same approach
D) Neither uses computational graphs

Correct Answer: B) PyTorch uses dynamic graphs, TensorFlow traditionally uses static graphs

PyTorch builds computational graphs dynamically (define-by-run), allowing for more flexible and intuitive debugging. TensorFlow 1.x used static graphs (define-and-run) requiring graph construction before execution, though TensorFlow 2.x introduced eager execution similar to PyTorch. Dynamic graphs make PyTorch popular in research for their ease of use.

2. What is the purpose of saving a model's state_dict in PyTorch?

A) To save the model architecture only
B) To save the learned weights and parameters
C) To save the training data
D) To save the optimizer configuration

Correct Answer: B) To save the learned weights and parameters

state_dict is a Python dictionary mapping each parameter (and buffer) name to its tensor. Saving state_dict with torch.save() preserves the trained weights, allowing you to load them later with load_state_dict(). This is the recommended way to save PyTorch models as it's more flexible than saving the entire model object.

3. Which library is specifically designed for classical machine learning algorithms like SVM and Random Forest?

A) PyTorch
B) TensorFlow
C) Scikit-learn
D) Keras

Correct Answer: C) Scikit-learn

Scikit-learn provides simple and efficient tools for classical machine learning including classification (SVM, Random Forest, KNN), regression (Linear, Ridge, Lasso), clustering (K-Means), and preprocessing utilities. While PyTorch and TensorFlow focus on deep learning, scikit-learn is ideal for traditional ML algorithms.

True/False Questions

4. GPU acceleration in PyTorch requires manually coding CUDA kernels for each operation.

Correct Answer: FALSE

PyTorch provides high-level abstractions that automatically handle GPU acceleration. You simply move tensors and models to GPU using .to('cuda') or .cuda(), and PyTorch handles the CUDA operations internally. Manual CUDA programming is only needed for custom operations not available in PyTorch.

5. Model checkpoints should save both the model state and optimizer state for proper training resumption.

Correct Answer: TRUE

For proper training resumption, save both model.state_dict() and optimizer.state_dict(), plus the current epoch and loss. The optimizer state includes momentum buffers and adaptive learning rates that are crucial for continuing training from where it left off.

Short Answer Questions

6. Explain the importance of hyperparameter tuning and name three common hyperparameters that should be tuned.

Model Answer:

Hyperparameter tuning optimizes model configuration choices that aren't learned during training. Poor hyperparameters lead to slow convergence, overfitting, or underfitting. Three critical hyperparameters are: (1) Learning rate - controls step size in gradient descent; too high causes divergence, too low slows convergence; (2) Batch size - affects training stability and memory usage; larger batches provide more stable gradients but require more memory; (3) Number of layers/neurons - determines model capacity; too few limits learning ability, too many causes overfitting. Methods like grid search, random search, or Bayesian optimization help find optimal values systematically.
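The effect of the learning rate can be seen on a toy problem. The sketch below runs plain gradient descent on f(x) = x^2 with three candidate rates. The objective, step count, and rate values are assumptions chosen to show the three regimes.

```python
def gradient_descent(lr, x0=1.0, steps=20):
    """Run gradient descent on f(x) = x**2, whose gradient is 2x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

too_low = gradient_descent(lr=0.001)    # barely moves from the start
reasonable = gradient_descent(lr=0.1)   # approaches the minimum at 0
too_high = gradient_descent(lr=1.1)     # overshoots further on every step
```

Looping over candidate values like this and keeping the best validation result is the idea behind grid search.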

Week 7: Sensor Interface - Practice Questions


Multiple Choice Questions

1. What does CSI stand for in the context of camera interfaces?

A) Computer System Interface
B) Camera Serial Interface
C) Communication Signal Interpreter
D) Centralized Sensor Integration

Correct Answer: B) Camera Serial Interface

CSI (Camera Serial Interface) uses the MIPI (Mobile Industry Processor Interface) protocol for high-bandwidth video transfer from camera sensors to processors. The Jetson Orin Nano has CSI-2 connectors supporting up to 8 lanes, providing lower latency and higher bandwidth compared to USB cameras.

2. What environmental parameters does the BME280 sensor measure?

A) Temperature only
B) Temperature and humidity
C) Temperature, humidity, and pressure
D) Temperature, humidity, pressure, and air quality

Correct Answer: C) Temperature, humidity, and pressure

The BME280 from Bosch measures three environmental parameters: temperature (±1°C accuracy), relative humidity (±3% accuracy), and barometric pressure (±1 hPa accuracy). It communicates via I²C or SPI and is commonly used in weather stations, indoor climate monitoring, and altitude estimation.

3. What advantage does the QWIIC system provide for sensor connectivity?

A) Wireless sensor communication
B) Higher data transfer speeds
C) Standardized plug-and-play I²C connections without soldering
D) Automatic sensor calibration

Correct Answer: C) Standardized plug-and-play I²C connections without soldering

QWIIC (from SparkFun) uses standardized 4-pin JST connectors (SDA, SCL, 3.3V, GND) for I²C communication, allowing daisy-chaining multiple sensors without breadboards or soldering. This dramatically simplifies prototyping and makes sensor integration much faster and more reliable for students and hobbyists.

True/False Questions

4. I²C communication allows multiple devices to share the same two-wire bus because each device has a unique address.

Correct Answer: TRUE

I²C uses 7-bit or 10-bit addresses to identify devices on the shared bus. The master device specifies which slave to communicate with using these addresses, allowing dozens of sensors to coexist on the same SDA and SCL lines. This makes I²C extremely efficient for multi-sensor systems.
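With 7-bit addressing, the first byte of a transaction carries the device address in its upper seven bits and the read/write flag in the lowest bit. The sketch below builds that byte. The helper name is hypothetical, and 0x76 is used as an example address (the BME280 responds at 0x76 or 0x77 depending on how its SDO pin is wired).

```python
def i2c_address_byte(address_7bit, read):
    """Build the first byte of an I2C transaction: 7-bit address plus R/W bit."""
    if not 0 <= address_7bit <= 0x7F:
        raise ValueError("7-bit I2C addresses must be in the range 0x00-0x7F")
    return (address_7bit << 1) | (1 if read else 0)

BME280_ADDR = 0x76  # example address; 0x77 is the alternative for this sensor
write_byte = i2c_address_byte(BME280_ADDR, read=False)
read_byte = i2c_address_byte(BME280_ADDR, read=True)
```

Every device on the bus sees this byte, but only the one whose address matches acknowledges and takes part in the rest of the transfer.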

5. CSI cameras connected to Jetson can leverage hardware-accelerated image processing through the Image Signal Processor (ISP).

Correct Answer: TRUE

CSI cameras have direct access to the Jetson's built-in ISP, which handles operations like demosaicing, noise reduction, color correction, and exposure adjustment in hardware. This offloads CPU/GPU resources and provides better image quality and lower latency compared to USB cameras requiring software processing.

Short Answer Questions

6. Describe GStreamer and explain why it's commonly used for video processing on Jetson devices.

Model Answer:

GStreamer is a multimedia framework that provides a pipeline-based architecture for processing audio and video streams. On Jetson devices, it's especially valuable because it supports hardware-accelerated encoding/decoding through NVIDIA's multimedia API, significantly reducing CPU load and power consumption. GStreamer pipelines can capture CSI camera streams, apply transformations, encode video, and display or save output with low latency. The framework's modular design allows developers to chain together "elements" (source, filters, sinks) to create complex multimedia applications efficiently. For AI applications, GStreamer can feed frames to neural networks while handling all the video I/O operations in hardware.
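As a hedged sketch, a capture pipeline of the kind described above is usually written as a string of elements joined by "!". The element names nvarguscamerasrc and nvvidconv come from NVIDIA's Jetson multimedia stack. The helper name csi_pipeline and the resolution, frame rate, and flip values are assumptions and must match a mode your camera actually supports.

```python
def csi_pipeline(sensor_id=0, width=1280, height=720, fps=30, flip_method=0):
    """Build a GStreamer pipeline string for a CSI camera on a Jetson."""
    return (
        f"nvarguscamerasrc sensor-id={sensor_id} ! "              # source: CSI camera via the ISP
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"format=NV12, framerate={fps}/1 ! "                      # caps: frames kept in GPU-accessible memory
        f"nvvidconv flip-method={flip_method} ! "                 # filter: hardware conversion and flip
        f"video/x-raw, format=BGRx ! "
        f"videoconvert ! video/x-raw, format=BGR ! "              # filter: final conversion for OpenCV
        f"appsink drop=true"                                      # sink: hand frames to the application
    )

pipeline = csi_pipeline()
```

On a Jetson with OpenCV built with GStreamer support, a string like this is typically passed to cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER). That call is not executed here because it needs the camera hardware.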

Week 8: Classification Algorithms - Practice Questions


Multiple Choice Questions

1. What is the primary goal of Support Vector Machines (SVM)?

A) Minimize the number of features
B) Find the hyperplane that maximizes the margin between classes
C) Cluster similar data points together
D) Predict continuous values

Correct Answer: B) Find the hyperplane that maximizes the margin between classes

SVM finds the optimal separating hyperplane by maximizing the margin (distance) between the closest points of different classes (support vectors). This maximum-margin approach provides good generalization and is effective even in high-dimensional spaces. The kernel trick allows SVM to handle non-linear decision boundaries.

2. How does Random Forest improve upon a single decision tree?

A) By using deeper trees
B) By combining predictions from multiple trees trained on different subsets of data
C) By using fewer features
D) By requiring less training data

Correct Answer: B) By combining predictions from multiple trees trained on different subsets of data

Random Forest is an ensemble method that trains multiple decision trees on random subsets of both samples (bagging) and features. Final predictions come from majority voting (classification) or averaging (regression) across all trees. This reduces overfitting, improves generalization, and provides more robust predictions than a single decision tree.

3. What does the confusion matrix diagonal represent?

A) Incorrectly classified samples
B) Correctly classified samples
C) Total number of samples
D) Feature importance

Correct Answer: B) Correctly classified samples

The confusion matrix diagonal shows true positives for each class - instances correctly classified. Off-diagonal elements show misclassifications. For a perfect classifier, all values would be on the diagonal with zeros elsewhere. The confusion matrix provides detailed insight into which classes are confused with each other.
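A confusion matrix is simple to build by hand, which helps when reading one on the exam. In this sketch rows are true classes and columns are predicted classes. The labels are made-up values for a 3-class problem.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

y_true = [0, 0, 1, 1, 2, 2, 2]  # hypothetical ground-truth labels
y_pred = [0, 1, 1, 1, 2, 0, 2]  # hypothetical model predictions
cm = confusion_matrix(y_true, y_pred, n_classes=3)

correct = sum(cm[i][i] for i in range(3))  # diagonal = correctly classified
accuracy = correct / len(y_true)
```

Here the off-diagonal entry cm[0][1] records one class-0 sample mistaken for class 1, and cm[2][0] records one class-2 sample mistaken for class 0.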

True/False Questions

4. Random Forest can provide feature importance rankings showing which features contribute most to predictions.

Correct Answer: TRUE

Random Forest calculates feature importance based on how much each feature decreases impurity across all trees in the forest. Features used for important splits near the root get higher importance scores. This helps identify which features are most predictive and can guide feature selection.

5. K-Nearest Neighbors (KNN) requires training a model before making predictions.

Correct Answer: FALSE

KNN is a lazy learning algorithm that doesn't build an explicit model during training. It simply stores the training data. At prediction time, it finds the K nearest neighbors to a query point and predicts based on majority class (classification) or average value (regression) of those neighbors. This makes training instant but prediction slower.
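A minimal sketch of KNN classification makes the "lazy" behavior visible: there is no fitting step, and all the work happens at prediction time. The 2-D points and labels are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    # "Training" was only storing the data; distances are computed per query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical stored training data
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
train_labels = ["free", "free", "free", "blocked", "blocked"]

near_origin = knn_predict(train_points, train_labels, (0.15, 0.15))
near_corner = knn_predict(train_points, train_labels, (0.95, 0.9))
```

Every prediction scans the whole training set, which is why prediction cost grows with dataset size.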

Short Answer Questions

6. Explain precision and recall, and why both metrics matter in classification tasks.

Model Answer:

Precision measures what fraction of predicted positives are actually positive (TP / (TP + FP)), answering "When the model predicts positive, how often is it correct?" Recall measures what fraction of actual positives were correctly identified (TP / (TP + FN)), answering "Of all actual positives, how many did we find?" Both matter because accuracy alone can be misleading with imbalanced datasets. For medical diagnosis, high recall (finding all sick patients) is critical even if precision suffers slightly. For spam detection, high precision (avoiding false alarms) might be prioritized. The F1-score harmonically averages precision and recall, providing a single metric when both are important.
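A worked example with invented counts shows how accuracy can look excellent on an imbalanced dataset while precision and recall tell a more cautious story.

```python
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical screening of 1000 patients, 10 of whom are sick.
# The model flags 12 patients; 8 of those are truly sick.
tp, fp, fn, tn = 8, 4, 2, 986

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Accuracy is 99.4% here, yet one in three positive predictions is wrong and two sick patients are missed.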

Week 9: Regression Techniques - Practice Questions

Multiple Choice Questions

1. What is the primary difference between Ridge and Lasso regression?

A) Ridge uses L2 regularization, Lasso uses L1 regularization
B) Ridge is for classification, Lasso is for regression
C) They are identical algorithms with different names
D) Ridge is faster than Lasso

Correct Answer: A) Ridge uses L2 regularization, Lasso uses L1 regularization

Ridge regression adds L2 penalty (sum of squared coefficients) to the loss function, shrinking coefficients but keeping all features. Lasso uses L1 penalty (sum of absolute coefficients), which can drive some coefficients exactly to zero, effectively performing feature selection. Both combat overfitting, but Lasso additionally identifies important features.
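The difference in how the two penalties shrink coefficients can be sketched for a simplified special case. The code below assumes orthonormal features, where both penalized solutions have a closed form in terms of the unpenalized least-squares coefficients (loss ½‖y − Xw‖² plus ½λ‖w‖² for Ridge or λ‖w‖₁ for Lasso). The coefficient values and λ are invented.

```python
def ridge_shrink(w_ols, lam):
    """Ridge under the orthonormal assumption: uniform rescaling."""
    return [w / (1.0 + lam) for w in w_ols]

def lasso_shrink(w_ols, lam):
    """Lasso under the orthonormal assumption: soft thresholding."""
    def soft(w):
        magnitude = abs(w) - lam
        if magnitude <= 0:
            return 0.0  # small coefficients are removed entirely
        return magnitude if w > 0 else -magnitude
    return [soft(w) for w in w_ols]

w_ols = [3.0, -0.4, 0.2]  # hypothetical least-squares coefficients
lam = 0.5

ridge = ridge_shrink(w_ols, lam)  # every coefficient shrinks, none reaches zero
lasso = lasso_shrink(w_ols, lam)  # [2.5, 0.0, 0.0]: two features dropped
```

With correlated features there is no such closed form, but the qualitative behavior (Ridge shrinks, Lasso zeroes out) is the same.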

2. When would polynomial regression be preferred over linear regression?

A) When data has a linear relationship
B) When the relationship between features and target is non-linear
C) When working with categorical variables
D) When dataset size is small

Correct Answer: B) When the relationship between features and target is non-linear

Polynomial regression creates polynomial features (x², x³, etc.) to capture non-linear relationships. While still technically linear in parameters, it can fit curves to data. However, high-degree polynomials risk overfitting and should be used with regularization. For complex non-linearities, neural networks often work better.

3. What does R-squared (R²) measure in regression?

A) The average error of predictions
B) The proportion of variance in the target variable explained by the model
C) The number of features used
D) The training time required

Correct Answer: B) The proportion of variance in the target variable explained by the model

R² is at most 1 and usually falls between 0 and 1, indicating how much of the target variance is captured by predictions. R²=1 means perfect predictions, R²=0 means the model is no better than predicting the mean, and a negative R² means the model is worse than predicting the mean. While useful, R² can be misleading: for ordinary least squares it never decreases on the training data when features are added, so it can reward overfitting and is unreliable for comparing models with different numbers of features (use adjusted R² instead).
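The three reference cases (perfect, mean-only, and worse than the mean) can be checked directly from the definition R² = 1 − SS_res / SS_tot. The target values are invented.

```python
def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual error
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # variance around the mean
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(y_true, y_true)                 # 1.0
mean_only = r_squared(y_true, [2.5] * 4)            # 0.0
reversed_fit = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])  # -3.0, worse than the mean
```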

True/False Questions

4. Feature scaling (normalization/standardization) is crucial for regression algorithms that use gradient descent.

Correct Answer: TRUE

Without feature scaling, features with larger magnitudes dominate the loss function, causing gradient descent to converge slowly or poorly. Standardization (zero mean, unit variance) or min-max normalization ensures all features contribute proportionally to learning. This is critical for neural networks and algorithms using gradient-based optimization.
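Standardization itself is a short computation. This sketch uses only the standard library and two hypothetical housing features on very different scales; in the labs a library scaler would do the same job.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Rescale to zero mean and unit variance."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

# Hypothetical features with very different magnitudes
area_sqft = [1000.0, 1500.0, 2000.0, 3000.0]
bedrooms = [2.0, 3.0, 3.0, 4.0]

area_scaled = standardize(area_sqft)
bedrooms_scaled = standardize(bedrooms)
```

After scaling, both features span a similar range, so neither dominates the gradient updates.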

5. Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) are in the same units as the target variable.

Correct Answer: FALSE (the statement holds for RMSE but not for MSE)

MSE squares the errors, resulting in squared units (if predicting price in dollars, MSE is in dollars²). RMSE takes the square root of MSE, returning to the original units and making it more interpretable. RMSE is preferred when you want to understand typical prediction error magnitude in meaningful units.
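A quick numeric check, using invented house prices in thousands of dollars, shows the unit difference.

```python
import math

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical prices in thousands of dollars
y_true = [200.0, 300.0, 400.0]
y_pred = [210.0, 290.0, 420.0]

mse_value = mse(y_true, y_pred)    # 200.0, in (thousand dollars) squared
rmse_value = math.sqrt(mse_value)  # about 14.1, in thousand dollars
```

The RMSE of roughly 14 reads directly as "predictions are typically off by about $14,000", which the MSE of 200 does not.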

Short Answer Questions

6. Explain the difference between interpolation and extrapolation in regression, and why extrapolation can be unreliable.

Model Answer:

Interpolation makes predictions within the range of training data (between known points), while extrapolation predicts beyond this range. Interpolation is generally reliable because the model has observed similar data patterns. Extrapolation is risky because the model hasn't seen data in that region and assumes relationships continue unchanged, which often isn't true. For example, a linear model trained on housing prices from 1000-3000 sq ft might extrapolate poorly for a 10,000 sq ft mansion because price relationships change at extreme values. Models should warn users when extrapolating, and predictions should be treated with appropriate skepticism.
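The gap between the two can be made concrete with a toy experiment: fit a straight line to a few points from a curved relationship, then compare the error inside and outside the training range. The quadratic "ground truth" is an assumption chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def true_relation(x):
    return x ** 2  # hypothetical curved ground truth

train_x = [1.0, 2.0, 3.0]
train_y = [true_relation(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)

def predict(x):
    return slope * x + intercept

interp_error = abs(predict(2.5) - true_relation(2.5))    # inside the training range
extrap_error = abs(predict(10.0) - true_relation(10.0))  # far outside it
```

The line is a reasonable local approximation between x=1 and x=3 but misses badly at x=10, where the curvature it never saw dominates.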

Weeks 10-12: Autonomous Driving - Practice Questions

Multiple Choice Questions

1. In differential drive robots, how do you make the robot turn right?

A) Left motor faster than right motor
B) Right motor faster than left motor
C) Both motors at same speed forward
D) Both motors at same speed backward

Correct Answer: A) Left motor faster than right motor

In differential drive, the robot turns toward the slower wheel. To turn right, the left motor must spin faster than the right motor. For a sharp right turn, you can even run the right motor backward while the left goes forward (pivot turn). Understanding this control scheme is fundamental to programming autonomous navigation.
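One common way to express this in code is to mix a forward command and a turn command into two wheel speeds. The function name, sign convention (positive turn means right), and the [-1, 1] speed range are assumptions for this sketch, not the JetBot API.

```python
def clamp(value, low=-1.0, high=1.0):
    return max(low, min(high, value))

def wheel_speeds(forward, turn):
    """Positive turn steers right: left wheel speeds up, right wheel slows down."""
    left = clamp(forward + turn)
    right = clamp(forward - turn)
    return left, right

gentle_right = wheel_speeds(forward=0.5, turn=0.2)   # left faster than right
gentle_left = wheel_speeds(forward=0.5, turn=-0.2)   # right faster than left
spin_right = wheel_speeds(forward=0.0, turn=0.5)     # wheels in opposite directions
```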

2. What is the primary purpose of data collection in the collision avoidance lab (Week 10)?

A) To test the camera quality
B) To gather labeled images of "blocked" and "free" scenarios for training a classifier
C) To calibrate the motors
D) To measure robot speed

Correct Answer: B) To gather labeled images of "blocked" and "free" scenarios for training a classifier

Data collection involves capturing camera images in various environments and labeling them as "blocked" (obstacle ahead) or "free" (safe to proceed). This supervised learning dataset trains a CNN to recognize dangerous situations. Quality and diversity of training data directly impact collision avoidance performance in real-world scenarios.

3. What does YOLO stand for and what makes it different from traditional object detection methods?

A) You Only Look Once - it processes the entire image in a single forward pass
B) Your Object Locator Online - it uses cloud processing
C) Year of Learned Optimization - it's an optimization technique
D) Yellow Object Locating Operation - it only detects yellow objects

Correct Answer: A) You Only Look Once - it processes the entire image in a single forward pass

YOLO treats object detection as a regression problem, predicting bounding boxes and class probabilities directly from the full image in one evaluation. Traditional methods like R-CNN use region proposals and multiple passes. YOLO's single-pass approach enables real-time detection speeds (30+ FPS) making it ideal for robotics and autonomous driving applications on edge devices.

4. In road following (Week 12), what is the model predicting?

A) Binary classification of left/right turns
B) Continuous x,y coordinates representing the road center target point
C) Object classes in the scene
D) Distance to obstacles

Correct Answer: B) Continuous x,y coordinates representing the road center target point

Road following uses regression to predict continuous (x,y) coordinates of where the robot should aim. This differs from collision avoidance classification (blocked/free) because it needs precise steering angles. The model outputs coordinates in image space, which are converted to motor commands via proportional control - horizontal offset determines steering, vertical offset affects forward speed.
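The conversion from predicted coordinates to motor commands might look like the sketch below. The coordinate convention (x in [-1, 1] with positive meaning right of center, y in [0, 1] with larger meaning the target is farther ahead), the gains, and the function name are all assumptions made for illustration; the lab notebook may use a different mapping.

```python
def road_following_command(x, y, steering_gain=0.4, max_speed=0.3):
    """Map a predicted target point to (left, right) wheel speeds."""
    y = max(0.0, min(1.0, y))
    speed = max_speed * y          # target far ahead: drive faster
    steering = steering_gain * x   # target to the right: steer right
    return speed + steering, speed - steering

straight = road_following_command(x=0.0, y=1.0)     # equal wheel speeds
bend_right = road_following_command(x=0.5, y=0.5)   # left wheel faster
```

Because x and y are continuous, the wheel speeds change smoothly from frame to frame instead of switching between a few fixed actions.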

5. What is visual servo control in the context of object following?

A) Using mechanical sensors for feedback
B) Using camera images to adjust robot motion in real-time
C) Pre-programming robot paths
D) Remote control via video feed

Correct Answer: B) Using camera images to adjust robot motion in real-time

Visual servo control uses camera feedback to guide robot actions. For object following, the robot detects the target, calculates its position relative to image center, and adjusts motor commands proportionally to center the object. This closed-loop control enables smooth tracking despite environmental changes and imperfect models.

True/False Questions

6. The COCO dataset contains 80 different object classes that YOLOv8 can detect.

Correct Answer: TRUE

COCO (Common Objects in Context) is a large-scale dataset containing 80 common object categories including people, vehicles, animals, and household items. Pre-trained YOLOv8 models on COCO can detect these 80 classes out-of-the-box, making them immediately useful for many robotics applications without additional training.

7. In proportional control for object following, larger position errors should produce larger corrective motor commands.

Correct Answer: TRUE

Proportional control multiplies error by a gain constant (Kp). If an object is far right of center (large error), the correction should be stronger (turn sharply right, toward the object). As the object approaches center (small error), corrections become gentler. This creates smooth, stable following behavior. Tuning Kp balances responsiveness with stability: too low and the robot reacts sluggishly, too high and it overshoots and oscillates.
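The effect of Kp can be explored with a deliberately simplified simulation in which each control step removes Kp times the current error. Real robots have delays and motor dynamics that this toy model ignores, so the specific gain values are illustrative only.

```python
def simulate_centering(kp, initial_error=1.0, steps=6):
    """Toy closed loop: the correction applied each step is kp * error."""
    errors = [initial_error]
    for _ in range(steps):
        correction = kp * errors[-1]   # larger error, larger correction
        errors.append(errors[-1] - correction)
    return errors

gentle = simulate_centering(kp=0.5)      # error halves each step
aggressive = simulate_centering(kp=1.8)  # overshoots, sign flips, slowly settles
unstable = simulate_centering(kp=2.2)    # overshoot grows every step
```

In this model any gain between 0 and 2 eventually settles, but values above 1 do so by swinging back and forth across the center.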

8. Classification-based collision avoidance (blocked/free) provides more precise steering control than regression-based road following.

Correct Answer: FALSE

Classification provides discrete categories (blocked/free) useful for binary decisions but lacks precision for steering. Regression predicts continuous coordinates allowing fine-grained control adjustments. For smooth road following, regression enables the robot to make subtle steering corrections, while classification would cause jerky, imprecise movements.

Short Answer Questions

9. Explain why collecting diverse training data is crucial for collision avoidance systems.

Model Answer:

Diverse training data ensures the collision avoidance system generalizes to various real-world scenarios. If data only includes one type of obstacle (e.g., cardboard boxes), the system may fail with different objects (people, furniture, walls). Data should cover: different obstacle types and colors, various lighting conditions (bright, dim, shadows), multiple distances and angles, cluttered vs. sparse environments, and edge cases (partially blocked paths). Limited diversity causes overfitting to specific training scenarios, resulting in poor performance in novel situations. Collecting 100-200 images per class with good variety produces robust behavior across environments.

10. Describe how bounding box size in object detection can be used to estimate distance to an object.

Model Answer:

Bounding box size provides a proxy for object distance because objects appear larger when closer and smaller when farther away. In object following, a large bounding box (object fills much of the frame) indicates the target is very close, prompting the robot to slow down or stop. A small box means the object is far, so the robot can move faster to approach. The relationship is inverse rather than linear (apparent size shrinks roughly in proportion to 1/distance), and it only yields an absolute distance if the object's real size is known, but it's sufficient for basic distance estimation. For precise distance measurement, depth cameras or stereo vision would be needed, but bounding box size offers a computationally cheap approximation suitable for real-time control.
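Under a simple pinhole-camera assumption, the estimate reduces to one line: distance ≈ focal length (in pixels) × real object height ÷ bounding box height (in pixels). The focal length and object height below are hypothetical values; a real camera would need calibration.

```python
def estimate_distance_m(box_height_px, real_height_m, focal_length_px):
    """Pinhole-model estimate; assumes the object's real height is known."""
    return focal_length_px * real_height_m / box_height_px

FOCAL_LENGTH_PX = 700.0  # hypothetical calibrated focal length
OBJECT_HEIGHT_M = 0.5    # hypothetical known height of the target

near = estimate_distance_m(350.0, OBJECT_HEIGHT_M, FOCAL_LENGTH_PX)  # 1.0 m
far = estimate_distance_m(70.0, OBJECT_HEIGHT_M, FOCAL_LENGTH_PX)    # 5.0 m
```

A box five times smaller corresponds to a distance five times larger, which is the inverse relationship the robot can exploit for speed control.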

Week 14: LLMs & Edge Deployment - Practice Questions

Multiple Choice Questions

1. What is Ollama?

A) A cloud-based API for accessing GPT models
B) A framework for running Large Language Models locally
C) A programming language for AI
D) A dataset for training language models

Correct Answer: B) A framework for running Large Language Models locally

Ollama simplifies running LLMs like Llama 2, Mistral, and other open-source models locally on your hardware. It handles model downloading, quantization, and inference, making it easy to deploy LLMs on edge devices like the Jetson without requiring cloud connectivity or paying API fees. This enables privacy-preserving AI applications.

2. What is model quantization and why is it important for edge deployment?

A) Increasing model accuracy
B) Reducing model size by using lower precision numbers (e.g., int8 instead of float32)
C) Training models faster
D) Adding more parameters to models

Correct Answer: B) Reducing model size by using lower precision numbers (e.g., int8 instead of float32)

Quantization converts high-precision weights (32-bit floats) to lower precision (8-bit integers or 4-bit), dramatically reducing memory footprint and computational requirements. A 7B parameter model in FP32 requires ~28GB; quantized to 4-bit, it needs only ~3.5GB, making it feasible to run on edge devices. Some accuracy loss is acceptable for the massive efficiency gains.
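The memory figures follow from simple arithmetic: parameters × bits per weight ÷ 8 gives bytes. This sketch counts weights only and ignores activations, the KV cache, and per-block quantization metadata, so real footprints are somewhat higher.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

params_7b = 7e9

fp32_gb = weight_memory_gb(params_7b, 32)  # 28.0
int8_gb = weight_memory_gb(params_7b, 8)   # 7.0
int4_gb = weight_memory_gb(params_7b, 4)   # 3.5
```

Only the 4-bit figure leaves headroom on a device with 8GB of memory shared between CPU and GPU.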

3. What is a key advantage of running LLMs on edge devices versus using cloud APIs?

A) Faster model training
B) Better accuracy
C) Privacy, no internet required, no API costs
D) Access to larger models

Correct Answer: C) Privacy, no internet required, no API costs

Edge deployment keeps all data processing local, ensuring privacy (no data leaves the device), enables offline operation (critical for robots in areas without connectivity), eliminates ongoing API costs, and reduces latency (no round-trip to cloud). However, edge devices are limited to smaller, quantized models compared to massive cloud-hosted LLMs.

True/False Questions

4. Transformer architecture is the foundation of modern Large Language Models like GPT and Llama.

Correct Answer: TRUE

The transformer architecture, introduced in "Attention is All You Need" (2017), revolutionized NLP through its self-attention mechanism. It processes sequences in parallel (unlike RNNs) and captures long-range dependencies effectively. Nearly all modern LLMs (GPT, BERT, Llama, Mistral) are based on transformer architecture or its variants.

5. Prompt engineering is unnecessary when using LLMs because they understand any input naturally.

Correct Answer: FALSE

Prompt engineering - crafting effective instructions and context - significantly impacts LLM output quality. Well-designed prompts with clear instructions, examples, and constraints produce much better results. Techniques include few-shot learning (providing examples), chain-of-thought prompting (asking for step-by-step reasoning), and role-playing. Poor prompts lead to vague, incorrect, or irrelevant responses.

Short Answer Questions

6. Explain potential applications of LLMs in robotics and why edge deployment might be beneficial for these applications.

Model Answer:

LLMs enable robots to understand natural language commands, making human-robot interaction more intuitive. Applications include: (1) Voice-controlled navigation - "Go to the kitchen and find my keys," (2) Task planning - translating high-level goals into action sequences, (3) Scene understanding - describing what the robot's camera sees, (4) Fault diagnosis - explaining problems in human language. Edge deployment is beneficial because: robots need to work offline (warehouses, outdoor environments), low latency is critical for real-time interaction, sensitive data (inside homes) shouldn't be sent to cloud, and no ongoing API costs for commercial deployments. While edge LLMs are smaller and less capable than cloud versions, they're sufficient for many focused robotics tasks and can be augmented with specialized domain knowledge.

📝 Additional Practice Tips

Review your weekly lab reports and identify any concepts you struggled with
Re-watch key video tutorials from course materials
Practice coding exercises hands-on - reading about PyTorch isn't enough, you must write code
Form study groups to quiz each other on these practice questions
Create your own flashcards for terminology and key concepts
Test yourself without looking at answers first, then check your understanding
Focus extra time on autonomous driving topics (Weeks 10-12) as they represent significant course content

📚

Study Resources

📄 Review All Weekly Materials

Revisit each week's laboratory page to review its objectives, procedures, and concepts.

🎥 Video Tutorial Playlists

PyTorch Fundamentals: Review Udacity's Deep Learning with PyTorch course sections on tensors, neural networks, and training
Computer Vision: Revisit CNN architectures, object detection, and image processing tutorials
Autonomous Robotics: JetBot assembly, motor control, and navigation strategy videos
Edge AI: Model optimization, quantization, and deployment techniques

💻 Coding Practice Resources

PyTorch Official Tutorials: pytorch.org/tutorials
NVIDIA JetBot GitHub: github.com/NVIDIA-AI-IOT/jetbot
Ultralytics YOLOv8 Docs: docs.ultralytics.com
Scikit-learn Documentation: scikit-learn.org

📖 Recommended Reading

Neural Networks and Deep Learning: Foundations of gradient descent, backpropagation, and network architectures
Practical Deep Learning for Coders: Fast.ai course materials on effective DL practices
Hands-On Machine Learning with Scikit-Learn & TensorFlow: Comprehensive ML algorithms reference
NVIDIA Jetson Developer Zone: Edge AI deployment best practices and optimization techniques

💡

Exam Preparation Tips

📅 Three Weeks Before Exam

Create a comprehensive study schedule covering all 10 weeks of material
Review all laboratory reports - identify concepts that were challenging
Compile a list of "must-know" concepts, algorithms, and code patterns from each week
Set up your local PyTorch development environment for practice coding
Form or join a study group with classmates

📚 Two Weeks Before Exam

Begin working through this practice question bank systematically - week by week
Re-watch critical Udacity video tutorials on topics you find confusing
Practice implementing neural networks from scratch in PyTorch
Review all Background sections from weekly lab pages - these contain exam-relevant theory
Test yourself on terminology: Can you explain gradient descent, backpropagation, convolution, etc.?
Practice data preprocessing: loading datasets, normalization, batching

🔬 One Week Before Exam

Focus on autonomous driving material (Weeks 10-12) - this represents major course content
Practice complete coding tasks: tensor operations, building models, training loops
Review common errors and debugging strategies in PyTorch
Quiz yourself on short answer questions without looking at model answers first
Understand the "why" behind algorithms, not just memorizing steps
Attend the instructor's review session and ask for clarification on confusing topics

📝 Day Before Exam

Do a quick review of key concepts from all weeks - don't cram new material
Re-take the practice MCQs and True/False questions for final self-assessment
Review your "must-know" concept list and ensure you understand everything
Get good sleep - your brain needs rest to perform well
Prepare materials: charged laptop, student ID, any allowed reference materials
Stay calm and confident - you've prepared well!

⚠️ Common Mistakes to Avoid

Passive reading: Don't just read code - type it out and run it to truly understand
Memorization over understanding: Understand concepts deeply; don't just memorize formulas
Neglecting early weeks: Weeks 1-5 contain foundational material that everything else builds on
Skipping practice questions: This question bank is your most valuable study resource - use it!
Last-minute cramming: Start studying early; deep understanding takes time
Studying alone: Study groups help identify knowledge gaps and clarify confusing topics
Ignoring practical coding: The practical component requires hands-on coding skill

🆘 Getting Help Before the Exam

Office Hours: Attend instructor and TA office hours for clarification on difficult topics
Review Session: Don't miss the pre-exam review session - bring prepared questions
Study Groups: Collaborate with classmates to explain concepts to each other
Lab Discord: Post questions on the course Discord for community support
Email Instructor: For urgent questions or special accommodations

📊

Grading Information

📝 Detailed Grading Breakdown

Component	Number of Items	Points Each	Total Points
Multiple Choice Questions	30	1	30
True/False Questions	15	1	15
Short Answer Questions	5	3	15
Written Component Subtotal	50	-	60
Practical Component
PyTorch Tensor Operations	1 task	8	8
Neural Network Implementation	1 task	12	12
Data Processing	1 task	8	8
Model Training/Inference	1 task	12	12
Practical Component Subtotal	4 tasks	-	40
GRAND TOTAL	-	-	100

Grading Notes:

All code in the practical component must run without critical errors for full credit
Partial credit is awarded for correct approach even if final output has minor issues
Short answer questions graded on accuracy, completeness, and clarity of explanation
Show your work in practical tasks - comments help graders understand your thinking

⚖️ Final Exam Weight in Course Grade

According to the course syllabus, the Final Examination accounts for 30% of your final course grade.

Assessment Component	Weight in Final Grade
Lab Reports (Weekly)	45%
Term Project	25%
Final Examination	30%
TOTAL	100%

📋 Exam Policies

Attendance: Mandatory - missing the exam without prior approval results in zero credit
Late Arrival: Arrive 10 minutes early; late arrivals may not be admitted
Academic Integrity: All work must be your own; cheating results in course failure and disciplinary action
Device Policy: Laptops allowed for practical component; phones/smartwatches must be turned off and stored
Questions During Exam: Raise hand for instructor assistance; only clarification questions allowed
Early Departure: Not allowed during first 30 minutes or last 15 minutes of exam

💪 Final Encouragement

You've worked hard throughout this semester, building practical AI and robotics skills from scratch. You started by assembling JetBot hardware, learned fundamental ML concepts, mastered PyTorch and neural networks, and developed real autonomous robot behaviors. This comprehensive exam is your opportunity to demonstrate the impressive knowledge and skills you've gained. Trust in your preparation, stay calm during the exam, and remember that your hands-on laboratory experience has prepared you well for both the written and practical components. You've got this! 🎓🤖

Good luck on your final examination! We're proud of your progress this semester.