Projects > Deep Learning Practicum > Assignment 1

Sabina Chen - sabinach@mit.edu
Overview

Assignment 1 - Teachable Machine (Instructions)

Section 2: Train some classes
  1. The network was able to tell the difference between our two faces with >99% confidence.
    (Class 1: my face showing different expressions, Class 2: my partner's face showing different expressions)
    Class 1

    Class 1: my face
    99% confidence

    Class 2

    Class 2: partner's face
    99% confidence

  2. The network was able to tell the difference between three inanimate objects with >99% confidence.
    (Class 1: pink calculator, Class 2: pink/brown umbrella, Class 3: teal notebook)
    Class 1

    Class 1: calculator
    99% confidence

    Class 2

    Class 2: umbrella
    99% confidence

    Class 3

    Class 3: notebook
    99% confidence

  3. The network was very unstable: from the slightest changes in body movement, it flipped quickly from >90% confidence in one class to a 20%/75% split favoring the other.
    (Class 1: my neutral face, Class 2: my neutral face)

    Class 1: 23% confidence
    Class 2: 76% confidence

  4. The network recognized FACES, not expressions. It could not tell the difference between expressions, but it classified differences between faces with >99% confidence.
    (Class 1: two people with smiling face, Class 2: two people with angry face)
    Class 1

    Class 1: smiling face
    0% confidence

    Class 1

    Class 1: smiling face
    0% confidence

    Class 2

    Class 2: angry face
    0% confidence

    Class 2

    Class 2: angry face
    0% confidence

  5. The network learned distance, not identity. All classifications (even wrong ones) were made with >99% confidence. Tested distances were: far, mid-range, and close. The network classified me (black clothes) correctly at everything but far range, and it classified my friend (peach clothes) correctly at everything but close range.
    (Class 1: my face close up, Class 2: friend's face far away)
    Class 1

    Class 1: my face
    0% confidence

    Class 1

    Class 1: my face
    99% confidence

    Class 1

    Class 1: my face
    99% confidence

    Class 2

    Class 2: friend's face
    99% confidence

    Class 2

    Class 2: friend's face
    99% confidence

    Class 2

    Class 2: friend's face
    0% confidence

  6. Examples of Russian Tank Parable:
    1. (Class 1: my face close up, Class 2: friend's face far away)
      The expected classification is our faces, but the real classification is the amount of background visible or the color of our clothes.

    2. (Class 1: dogs on grass, Class 2: wolves on snow)
      The expected classification is dogs vs. wolves, but the real classification is the background grass/snow.

Section 4: Examining the confidence levels
  1. Stored "nCounts" and "confidences" into the global variables "globNCounts" and "globConf", respectively, and used window.alert() to show the number of closest images and the resulting confidences for each class. code updates

    Code modification to WebcamClassifier.js
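A minimal sketch of the modification described above. `globNCounts` and `globConf` are the globals named in the text; the function wrapper and the Node fallback to console.log are my additions for illustration, since in WebcamClassifier.js this logic would sit inside the prediction callback where `nCounts` and `confidences` are already available.

```javascript
// Globals described in the text: the latest K-nearest-neighbor match counts
// and confidences for each class.
let globNCounts = [];
let globConf = [];

// Hypothetical helper (not the actual WebcamClassifier.js structure):
// store the values globally and surface them to the user.
function recordAndReport(nCounts, confidences) {
  globNCounts = nCounts.slice();
  globConf = confidences.slice();
  const msg = 'Num Closest Matches: [' + globNCounts + ']\n' +
              'Confidences: [' + globConf + ']';
  if (typeof window !== 'undefined' && window.alert) {
    window.alert(msg); // browser: pop-up, as in the screenshots
  } else {
    console.log(msg);  // Node fallback so the sketch runs outside a browser
  }
  return msg;
}
```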

  2. Experiment:
    Trained the network with three inanimate objects. During training, I moved the objects closer and farther away, and also tilted it at various angles to give the network more training data on what the different objects look like from various distances and positions.
    (Class 1: pink calculator, Class 2: pink/brown umbrella, Class 3: teal notebook)

    Observations:
    • The network was able to tell the differences between three inanimate objects with >99% confidence. For each object the network correctly identifies, the number of K closest matches for the correct class is maxed out at 20 closest matches, and the incorrect classes have 0 closest matches. The confidence for the correct class is at 1, and the confidence for the incorrect classes are at 0.
      Class 1

      Num Closest Matches: [20,0,0]
      Confidences: [1,0,0]

      Class 2

      Num Closest Matches: [0,20,0]
      Confidences: [0,1,0]

      Class 3

      Num Closest Matches: [0,0,20]
      Confidences: [0,0,1]

    • Out of curiosity, I tried to classify a beaver and an image with no object (i.e., just me in the background). The network classified the beaver and me as an umbrella with 53% confidence and >99% confidence, respectively. One potential reason the network classifies both images as umbrellas (Class 2): of the three objects the network was trained on (calculator, umbrella, notebook), the umbrella covered the background the least and matched the color of the background the most, whereas the teal notebook covered the background the most and matched the color of the background the least. Thus, to classify objects it had not been trained on, the network fell back on the presence and similarity of the background.
      Class 2

      Num Closest Matches: [0,20,0]
      Confidences: [0,1,0]

      Class 2

      Num Closest Matches: [7,12,1]
      Confidences: [0.35, 0.6, 0.05]

Section 5: Scaling the confidence values
  1. I tested two different ways to compute confidences: weighted, and top-two weighted. Download WebcamClassifier.js
    1. Weighted: (confidence value for class X) = (number of matches in class X) / (number of training samples in class X), then normalized so the confidences sum to 1
    2. Weighted-topTwo: the same per-class ratio, but computed only for the two classes with the most image matches (the rest set to 0), then normalized so the confidences sum to 1
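The two schemes above can be sketched as plain functions, assuming `nCounts[i]` is the number of K-nearest matches for class i and `samples[i]` is the number of training images in class i (the function names are mine, not from WebcamClassifier.js):

```javascript
// Weighted: divide each class's match count by its sample count,
// then normalize so the confidences sum to 1.
function weightedConfidences(nCounts, samples) {
  const raw = nCounts.map((n, i) => (samples[i] > 0 ? n / samples[i] : 0));
  const total = raw.reduce((a, b) => a + b, 0);
  return raw.map((r) => (total > 0 ? r / total : 0));
}

// Weighted-topTwo: keep only the two classes with the most matches,
// zero out the rest, then weight and normalize as above.
function weightedTopTwoConfidences(nCounts, samples) {
  const topTwo = nCounts
    .map((n, i) => [n, i])
    .sort((a, b) => b[0] - a[0])
    .slice(0, 2)
    .map(([, i]) => i);
  const masked = nCounts.map((n, i) => (topTwo.includes(i) ? n : 0));
  return weightedConfidences(masked, samples);
}
```

Run on Example 1 below (Samples [31, 73, 130], matches [3, 1, 16]), the weighted scheme reproduces the reported [0.414, 0.059, 0.527].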
  2. The network performs better using weighted confidences when there is a sample imbalance between different classes.
    1. Example 1: Samples:[31, 73, 130], Num Closest Matches: [3, 1, 16], Original Confidences: [0.15, 0.05, 0.8], Weighted Confidences: [0.414, 0.059, 0.527]
    2. Example 2: Samples:[31, 130, 130], Num Closest Matches: [17, 0, 3], Original Confidences: [0.85, 0, 0.15], Weighted Confidences: [0.938, 0, 0.062]
    The network performs better using weighted-topTwo confidences to eliminate background noise and to distinguish objects more sharply.
    1. Example 1: Samples: [31, 66, 122], Num Closest Matches: [4, 3, 13], Original Confidences: [0.2, 0.15, 0.65], Weighted-topTwo Confidences: [0.552, 0, 0.448]
    2. Example 2: Samples: [39, 112, 114], Num Closest Matches: [2, 5, 13], Original Confidences: [0.1, 0.25, 0.65], Weighted-topTwo Confidences: [0, 0.28, 0.72]
  3. These alternative confidence schemes (weighted vs. weighted-topTwo) might come in useful when the number of training examples per class is imbalanced (i.e., Class 1 has a lot of training examples while Class 2 has very few). In this situation, weighting puts more weight on each of Class 2's matches, and normalizing the overall confidences to sum to 1 yields the actual weighted confidences.
Section 6: Limiting the number of training examples per class
  1. In animate()'s "if (this.isDown)" branch, the code checks whether the image count for the current class has exceeded the allotted number of training samples. If the class has already reached its maximum, an alert window pops up to notify the user that no more samples can be trained for that class.
    code updates

    Code modification to WebcamClassifier.js

    Class 2 Limited

    Class 2: Allotted number of samples pop-up
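A sketch of the cap check described above, under the assumption of a per-class `imageCounts` array and a `MAX_SAMPLES` constant (both names are hypothetical; in WebcamClassifier.js the real check sits inside animate()'s "if (this.isDown)" branch):

```javascript
// Hypothetical cap on training samples per class.
const MAX_SAMPLES = 50;

// Returns true and increments the count if the class can still accept a
// training frame; otherwise alerts the user (in a browser) and returns false.
function canTrain(imageCounts, classIndex, maxSamples = MAX_SAMPLES) {
  if (imageCounts[classIndex] >= maxSamples) {
    const msg = 'Class ' + (classIndex + 1) + ' has reached its allotted ' +
                maxSamples + ' training samples.';
    if (typeof window !== 'undefined' && window.alert) {
      window.alert(msg); // pop-up shown in the screenshot above
    }
    return false; // skip adding this frame to the training set
  }
  imageCounts[classIndex] += 1;
  return true;
}
```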

  2. Even after severely imbalancing the number of training images, easy-to-classify cases are still classified confidently at >99%. More confusing cases that fooled the system previously still produce confidences >90%, even when the classification is wrong, as the sample imbalance can cause matches to be weighted unfairly for classes with smaller sample sizes.
Section 7: Further Explorations
  1. Increasing the overall K value keeps the network from being biased as easily by noise in small sample sizes, so its results reflect the overall neighborhood rather than a few sample matches. For example, after increasing K to 100 (from the original 20), the class confidence levels changed less abruptly, since the network now needed a larger majority of image matches to move each class's overall confidence. Increasing K made the resulting confidences more reliable and less extreme, because they no longer depend on a small, noisy subset of image matches.
    code updates

    Code modification to WebcamClassifier.js

    Class 1

    Class 1: calculator
    67% confidence
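A toy illustration of the effect above (the match counts here are assumed numbers, not measurements from the assignment): in the KNN classifier, each class's raw confidence is (matches for that class) / K, so the same few spurious matches move the confidence far less when K is large.

```javascript
// Raw KNN confidence: fraction of the K nearest neighbors in this class.
function matchConfidence(matchesForClass, k) {
  return matchesForClass / k;
}

// Three noisy matches swing a class by 15% when K = 20,
// but only 3% when K = 100.
const swingK20 = matchConfidence(3, 20);   // 0.15
const swingK100 = matchConfidence(3, 100); // 0.03
```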

  2. Removing a class makes the confidences more sensitive and quicker to flip to extremes, because the image matches are dispersed among fewer classes, putting more weight/confidence into each remaining class.
    code updates

    Code modification to WebcamClassifier.js

    Class 1

    Class 1: calculator
    99% confidence