
Sabina Chen - sabinach@mit.edu

Assignment 2 - Multilayer Networks (Instructions)

Section 1.1: Building models with model builder
  1. The model is invalid because there needs to be a first layer that takes the input image [28,28,1] and flattens it into a 1D vector for the other layers of the network to work with.
  2. The current classifications are almost always wrong because the weights and biases of the network have not been trained yet. The expected probability of the network being right without training is 1 in 15 (for 15 images). However, this is not what was observed, because the network was already initialized with randomly generated weights and biases that affect the output.
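The flattening step described in answer 1 can be sketched in plain JavaScript. This is a minimal illustration, not the Model Builder's own implementation: the helper name `flattenImage` and the row-major ordering are assumptions made for the example.

```javascript
// Flatten a [height][width][channels] nested array into a 1D array.
// Row-major ordering is an assumption made for this illustration.
function flattenImage(image) {
  const flat = [];
  for (const row of image) {
    for (const pixel of row) {
      for (const channel of pixel) {
        flat.push(channel);
      }
    }
  }
  return flat;
}

// Build a dummy 28x28x1 image filled with zeros.
const image = Array.from({ length: 28 }, () =>
  Array.from({ length: 28 }, () => [0])
);

const flat = flattenImage(image);
console.log(flat.length); // 784
```

A [28,28,1] image therefore becomes a vector of 28 * 28 * 1 = 784 values, which is the shape a fully connected layer can consume.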
Section 1.2: Training
  1. MNIST: The accuracy in training MNIST is usually greater than 80%; however, there are some examples with much lower accuracies, such as 15%, 30%, and 60%. The demo performs an average of 740 inferences per second. It trained 1410 examples per second.
    Fashion MNIST: The accuracy in training Fashion MNIST is around 65%. The demo performs an average of 561 inferences per second. It trained 1550 examples per second.
  2. I trained on the CIFAR-10 dataset for about 1 minute 30 seconds. The accuracy is about 25%-40%, which is a lot worse than MNIST.
  3. I trained on the MNIST dataset with 7 fully connected layers in an attempt to improve training accuracy. After training the network with ~5000 examples, the accuracy was worse than when the network had just one fully connected layer: the network with 7 fully connected layers averaged around 40% accuracy.
  4. Changed the model to: Input → Flatten → FC(10) → FC(10) → Softmax → Label. After training, the accuracy plummets to either 0% or NaN. This could be happening because the output dimensions of the two fully connected layers are not compatible with the output layer, so the network is unable to perform the numerical calculations correctly.

    NaN Accuracy

    NaN Model
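Independently of the explanation given in answer 4, one common way a NaN can appear in a softmax classifier is numerical overflow: if the values feeding the softmax grow very large, `Math.exp` returns `Infinity`, and `Infinity / Infinity` is `NaN`. Whether this is what happened in the Model Builder run is not confirmed here. The sketch below only demonstrates the mechanism, and the function names and input values are hypothetical.

```javascript
// Naive softmax: exponentiate each logit, then normalize.
function naiveSoftmax(logits) {
  const exps = logits.map((x) => Math.exp(x));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Stable softmax: subtract the largest logit before exponentiating,
// which leaves the result unchanged mathematically but avoids overflow.
function stableSoftmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical logits large enough to overflow Math.exp.
const hugeLogits = [1000, 999, 998];

console.log(naiveSoftmax(hugeLogits));  // every entry is NaN
console.log(stableSoftmax(hugeLogits)); // finite probabilities that sum to 1
```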

Section 1.4: Activation layers
  1. Added an activation layer by changing the model to: Input → Flatten → FC(10) → ReLU → FC(10) → Softmax → Label. Trained the model with ~5000 images, and it performed even worse than before, with an accuracy of 10%.
    deeplearn.js

    2 FC Layers w/ ReLU
    10% accuracy
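For context on why an activation layer is placed between two FC layers: without one, two stacked FC layers compute a single linear map, so the second layer adds no expressive power. The sketch below illustrates this with hypothetical 2x2 weights (biases omitted for brevity). It is not the code behind the result above and does not explain the 10% accuracy observed.

```javascript
// Multiply a matrix (array of rows) by a vector.
function matVec(W, x) {
  return W.map((row) => row.reduce((sum, w, i) => sum + w * x[i], 0));
}

// Multiply two matrices: (A B)[i][j] = sum_k A[i][k] * B[k][j].
function matMul(A, B) {
  return A.map((row) =>
    B[0].map((_, j) => row.reduce((sum, a, k) => sum + a * B[k][j], 0))
  );
}

// ReLU applied element-wise: max(0, x).
const relu = (v) => v.map((x) => Math.max(0, x));

// Hypothetical weights and input, chosen only for illustration.
const W1 = [[1, -1], [-2, 1]];
const W2 = [[1, 1], [0, 1]];
const x = [1, 2];

const stacked = matVec(W2, matVec(W1, x));         // FC -> FC
const collapsed = matVec(matMul(W2, W1), x);       // one equivalent FC layer
const withRelu = matVec(W2, relu(matVec(W1, x)));  // FC -> ReLU -> FC

console.log(stacked, collapsed, withRelu);
```

Here `stacked` and `collapsed` are identical, while `withRelu` differs because the negative intermediate value is clipped to zero.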

Section 1.6: Exploring with Model Builder
  1. Trained the MNIST model using ~5000 images for different numbers of FC layers, with ReLU between each layer and the same hyperparameters. The accuracy dropped again at the 4th FC layer (from 45% to 30%), meaning that the network most likely started overfitting at this depth. Thus, among the multilayer configurations, we should use 3 FC layers for this network to prevent overfitting.
    MNIST Training Results:
    • 1 FC Layer: 80% accuracy
    • 2 FC Layers: 30% accuracy
    • 3 FC Layers: 45% accuracy
    • 4 FC Layers: 30% accuracy
    • 5 FC Layers: 30% accuracy

  2. The 3 FC layers with a wide first layer (10), a narrow second layer (3), and a wide third layer (10) perform at ~15% accuracy, which is worse than the regular 3 FC layer model in the example above. Narrowing the number of nodes discards some classification information, and widening the number of nodes back to the original then magnifies these distortions, thereby making classification less accurate.
    deeplearn.js

    3 FC Layers w/ Modified Dimensions
    15% accuracy

  3. The Fashion MNIST accuracy stabilized around layers 4 and 5. The CIFAR-10 accuracy got worse around layers 3 and 4, but seems to gradually improve afterwards. Both accuracies decrease when we modify the dimensions of the 3 layers (i.e., 10 wide, 3 narrow, then 10 wide).
    Fashion MNIST Training Results:
    • 1 FC Layer: 80% accuracy
    • 2 FC Layers: 65% accuracy
    • 3 FC Layers: 58% accuracy
    • 3 FC Layers: 35% accuracy (wide-narrow-wide)
    • 4 FC Layers: 55% accuracy
    • 5 FC Layers: 50% accuracy
    CIFAR-10 Training Results:
    • 1 FC Layer: 30% accuracy
    • 2 FC Layers: 30% accuracy
    • 3 FC Layers: 18% accuracy
    • 3 FC Layers: 17% accuracy (wide-narrow-wide)
    • 4 FC Layers: 18% accuracy
    • 5 FC Layers: 22% accuracy
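To make the wide-narrow-wide comparison above concrete, the sketch below counts trainable parameters in a stack of FC layers and reports the narrowest layer, through which all information about an image must pass. It assumes a flattened 784-value input (a 28x28x1 image) and that the regular 3 FC layer model uses widths of 10. The helper names are hypothetical.

```javascript
// Count weights and biases in a stack of fully connected layers.
function denseParamCount(inputSize, widths) {
  let total = 0;
  let fanIn = inputSize;
  for (const width of widths) {
    total += fanIn * width + width; // weights plus one bias per unit
    fanIn = width;
  }
  return total;
}

// The narrowest layer limits how many values can be passed forward.
const bottleneck = (widths) => Math.min(...widths);

const regular = [10, 10, 10];        // assumed regular 3 FC layer widths
const wideNarrowWide = [10, 3, 10];  // widths described in the text

console.log(denseParamCount(784, regular), bottleneck(regular));               // 8070 10
console.log(denseParamCount(784, wideNarrowWide), bottleneck(wideNarrowWide)); // 7923 3
```

The parameter counts are nearly the same, so the main difference is the bottleneck: the narrow model must summarize each image in 3 numbers before distinguishing 10 classes.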
Section 2.1: Setting up to code multilayer models
  1. I examined the effects of changing the ratio of BATCH_SIZE to NUM_BATCHES, as well as the total number of training images. The larger the total number of training images, the smaller the loss gets. Large BATCH_SIZE to NUM_BATCHES ratios (i.e., 100/9, 100/25, 100/49) create smoother loss graphs but larger losses, whereas smaller BATCH_SIZE to NUM_BATCHES ratios produce bumpier loss graphs but smaller overall loss. This is because with larger BATCH_SIZE to NUM_BATCHES ratios there are fewer batches, and therefore fewer opportunities for the network to backpropagate and adjust the weights, so the oscillations in loss are less extreme.

    Total Training Images: 900
    • Batch Size: 9, Num Batches: 100
    • Batch Size: 100, Num Batches: 9
    • Batch Size: 30, Num Batches: 30

    Total Training Images: 2500
    • Batch Size: 25, Num Batches: 100
    • Batch Size: 100, Num Batches: 25
    • Batch Size: 50, Num Batches: 50

    Total Training Images: 4900
    • Batch Size: 49, Num Batches: 100
    • Batch Size: 100, Num Batches: 49
    • Batch Size: 70, Num Batches: 70
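The relationship between the two settings can be sketched as simple arithmetic: the total number of training images is BATCH_SIZE times NUM_BATCHES, and, assuming one weight update per batch (an assumption about the training loop, not something stated above), the number of updates equals NUM_BATCHES. The `summarize` helper is hypothetical.

```javascript
// Summarize one training configuration.
// Assumes one weight update per batch.
function summarize({ batchSize, numBatches }) {
  return {
    totalImages: batchSize * numBatches,
    weightUpdates: numBatches,
    ratio: batchSize / numBatches,
  };
}

// The three configurations that each use 900 training images.
const configs = [
  { batchSize: 9, numBatches: 100 },
  { batchSize: 100, numBatches: 9 },
  { batchSize: 30, numBatches: 30 },
];

for (const config of configs) {
  console.log(config, summarize(config));
}
```

All three configurations see 900 images, but the 9/100 setting performs 100 updates while the 100/9 setting performs only 9, which is consistent with the smoother but higher loss curves reported for large ratios.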

Section 2.2: Training and Testing
  1. With a batch size of 20 and 50 batches, I trained the model with the initial number of 1000 training images. The test accuracy ended at 50%, with a loss of 1.8. There were two interesting classification examples, both involving the digit 8:
    • The network incorrectly classified the training image at index 59231 (number 8). It classified this image as a 9 with a probability of 13.4%, while the probability of the image being an 8 was 6.9%. The three lowest probabilities were for 0, 3, and 8 (6.6-6.9%). In contrast, the network more confidently believed the image to be a 1, 2, or 9 (13.2-13.4%). This is interesting because 8 (the correct number) was among the lowest probabilities. Even 0, which may look more similar to an 8 than a 9 does, was among the lowest probabilities, whereas 1 and 2, which look nothing like an 8, were among the highest.
    • Another interesting example, also an 8(?), was at index 38109. The image is difficult to decipher even for a human. The loops of the 8 are flattened, so at a quick glance it is hard to tell whether the image is an 8, 9, or 1. The network incorrectly classifies the image as a 1 because it misses the flattened loop at the top. In general, the network has more trouble identifying messier numbers that aren't "normal looking".

      Incorrectly Classified, Regular "8"

      Incorrectly Classified, Unique "8"

  2. Increasing the batch size, as well as the total number of images trained, greatly improved the accuracy and decreased the overall loss. On the other hand, super low ratios resulted in extreme accuracy oscillations, but lower losses as well. In general, higher ratios produced a larger gap between the probabilities of the incorrect classes and the correct class.
    • Using high batch_size to num_batches ratios, the lower probabilities are lower (i.e., 4-8% range), whereas the higher probabilities are higher (i.e., 17-23% range). The network is more confident in distinguishing between images.
    • Out of curiosity, I tried a super low batch_size to num_batches ratio (1:100). This resulted in accuracies that jumped discretely between 0% and 100% because there is only 1 image in each batch. However, the loss decreased to 0.33, which is very low compared to the other parameters tried so far, because the network gets more immediate feedback between images.

      High ratio: batch_size: 50, num_batches: 30 -> loss: 0.91, accuracy: 88%

      Super low ratio: batch_size: 1, num_batches: 100 -> loss: 0.33, accuracy: 0%/100%
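The discrete accuracy values seen with a batch size of 1 follow from how per-batch accuracy is computed. The sketch below assumes accuracy is the fraction of correct predictions within one batch. The helper names are hypothetical, not taken from index.js.

```javascript
// Fraction of predictions in a batch that match the labels.
function batchAccuracy(predicted, labels) {
  let correct = 0;
  for (let i = 0; i < labels.length; i++) {
    if (predicted[i] === labels[i]) correct++;
  }
  return correct / labels.length;
}

// Every accuracy value a batch of the given size can produce: k / batchSize.
function possibleAccuracies(batchSize) {
  return Array.from({ length: batchSize + 1 }, (_, k) => k / batchSize);
}

console.log(possibleAccuracies(1)); // [ 0, 1 ]
console.log(batchAccuracy([8], [8]), batchAccuracy([1], [8])); // 1 0
```

With one image per batch the only possible outcomes are 0% and 100%, so the accuracy graph oscillates between the two extremes regardless of how well the network is doing on average.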

  3. Fashion MNIST had the highest accuracy at 62% and the lowest loss at 1.65, while CIFAR_10 had the lowest accuracy at 22% and the highest loss at 2.24. MNIST did almost as well as Fashion MNIST, at 52% accuracy and a loss of 2.02. MNIST and Fashion_MNIST had a more stable upward trend, whereas CIFAR_10's upward trend oscillated more.
    MNIST: loss: 2.02, accuracy: 52%
    Fashion MNIST: loss: 1.65, accuracy: 62%
    CIFAR_10: loss: 2.24, accuracy: 22%
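One way to put these loss values in perspective, assuming the reported loss is a softmax cross-entropy averaged over examples (an assumption, since the loss function is not stated here): a network that assigns a uniform 10% to each of 10 classes has a loss of ln(10), about 2.30. The helper names below are hypothetical.

```javascript
// Cross-entropy for one example: negative log of the probability
// assigned to the correct class.
function crossEntropy(probabilities, labelIndex) {
  return -Math.log(probabilities[labelIndex]);
}

// Probability of the correct class implied by a given loss value
// (a geometric mean when the loss is an average over examples).
function impliedProbability(loss) {
  return Math.exp(-loss);
}

const numClasses = 10;
const uniform = new Array(numClasses).fill(1 / numClasses);
const chanceLoss = crossEntropy(uniform, 0);

console.log(chanceLoss.toFixed(2)); // "2.30"
for (const loss of [2.02, 1.65, 2.24]) {
  console.log(loss, impliedProbability(loss));
}
```

Under that assumption, the CIFAR_10 loss of 2.24 is only slightly better than uniform guessing, while the Fashion MNIST loss of 1.65 is clearly below it.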

  4. MNIST: index.js
    Fashion_MNIST: index.js
    CIFAR_10: index.js
    Network Code

    Flatten -> Leaky ReLU -> Dense -> Leaky ReLU -> Softmax
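The layer sequence above can be sketched as a forward pass in plain JavaScript. This is a minimal illustration under stated assumptions, not the code in index.js: the function names, the toy weights, and the Leaky ReLU slope `alpha = 0.2` are all hypothetical.

```javascript
// Leaky ReLU: keep positive values, scale negative values by alpha.
// alpha = 0.2 is a hypothetical choice for illustration.
function leakyRelu(values, alpha = 0.2) {
  return values.map((v) => (v > 0 ? v : alpha * v));
}

// Dense layer: output[i] = bias[i] + sum_j weights[i][j] * input[j].
function dense(weights, bias, input) {
  return weights.map((row, i) =>
    row.reduce((sum, w, j) => sum + w * input[j], bias[i])
  );
}

// Numerically stable softmax.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Flatten -> Leaky ReLU -> Dense -> Leaky ReLU -> Softmax
function forward(image, weights, bias) {
  const flat = image.flat(Infinity);
  const hidden = leakyRelu(flat);
  const logits = dense(weights, bias, hidden);
  return softmax(leakyRelu(logits));
}

// Toy example: a 2x2x1 "image" and 3 output classes.
const toyImage = [[[0.5], [-1.0]], [[0.25], [1.0]]];
const toyWeights = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 1],
];
const toyBias = [0, 0, 0];

const probs = forward(toyImage, toyWeights, toyBias);
console.log(probs); // three probabilities that sum to 1
```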

Section 2.3: Style transfer examples
    Generated images via Deep Art

  1. Original, Style, Generated

  2. Original, Style, Generated

  3. Original, Style, Generated

  4. Original, Style, Generated

  5. Original, Style, Generated

  6. Original, Style, Generated