Episode 8: ML Notes (Part I): Fundamentals of Machine Learning Theory

These days I am reading the book “Machine Learning” by Tom M. Mitchell [1], and I will be documenting some important concepts from its chapters in a series of blog posts.

[Background/Important Terminology]:

  1. Machine learning is about getting better at some task T through experience E, where performance is measured by some measure P. For example, consider teaching a machine to play Ludo: the task T is to win the game, the experience E is the games played by the machine, and the performance measure P is the proportion of games it wins.
  2. Instances refer to examples in the training data.
    For example, in the following table, Outlook, Temperature, Humidity and Wind are the features, and Play Tennis is the output or target class. A machine is given these features and asked to predict the output class. An instance is a row of the table: (<Sunny, Hot, High, Weak>, No) means that for the features <Sunny, Hot, High, Weak>, the target class is “No”.
Source: About logic, and how to do it fast, Geniferology
  3. Target Concept and Hypothesis: The “target concept” is the true function that maps every set of feature values to its correct output. We may never find this function, because we don’t have access to all the possible examples of a concept in the world, e.g. we don’t have data for every game of Ludo ever played. What we can do is try to guess, or learn, the target concept from the examples we do have, and this learned model is called a “hypothesis”. A hypothesis can be equal to the target concept if our learning was perfect, but often it is just the best available approximation of the target concept.
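
To make the terminology concrete, here is a minimal Python sketch. The instance is the one quoted above; the hypothesis `h` is a made-up guess used only for illustration (it is not the true target concept), and the helper name `is_consistent` is my own.

```python
# One training instance: (features, target class), as in the example above.
# Assumed feature order: Outlook, Temperature, Humidity, Wind.
instance = (("Sunny", "Hot", "High", "Weak"), "No")

def h(features):
    """A hypothetical hypothesis: don't play when it is sunny and humid."""
    outlook, temperature, humidity, wind = features
    return "No" if (outlook == "Sunny" and humidity == "High") else "Yes"

def is_consistent(hypothesis, examples):
    """True if the hypothesis predicts the recorded class for every example."""
    return all(hypothesis(x) == y for x, y in examples)

print(is_consistent(h, [instance]))  # True
```

A hypothesis that agrees with every example we have seen may still differ from the target concept on examples we have not seen, which is why it is only an approximation.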

Concepts:

Source: Neural network for machine learning by Ujjawal
  1. C : X → {0,1}
    There is a target concept C that we aim to learn. X is the set of instances over which C is defined. The target concept C maps the set of instances X to a set of boolean values 0 and 1.
    Example: Predicting whether to play outside or not based on a set of features <Temperature, Wind, Humidity>. In this case, for an instance x in X, C(x) = 1 means play outside and C(x) = 0 means don’t play outside.
  2. (∀ x ∈ X)[(h2(x) = 1) → (h1(x) = 1)] This is called the more_general_than_or_equal_to relation. It means that any instance x in X that satisfies hypothesis h2 also satisfies hypothesis h1, which implies that h1 is more general than or equal to h2. For example:
    h1 = <High, ?, ?> Temperature is high and I don’t care about the other variables.
    h2 = <High, Weak, ?> Temperature is high, wind is weak and I don’t care about humidity. h1 is more general than h2 because it classifies every instance with high temperature as class = 1, whereas h2 is more specific and requires the wind to be weak as well as the temperature to be high for the output class to be 1.

  3. (D ∧ x) ≻ L(x, D) Here D is the training data, x is a new instance from the test set and L is a learner. The expression means that L inductively infers, from the training data D and the test instance x, the classification L(x, D) of x, e.g. L(x, D) = 1 (positive).

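  4. (B ∧ D ∧ x) ⊢ L(x, D) Here B is the inductive bias of the learner L. The expression means that the learner “deduces” the class L(x, D) of the test instance x from the bias B, the training data D and the description of x. In other words, B is the set of extra assumptions that turns the learner’s inductive guess into a deductive conclusion.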
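
The more_general_than_or_equal_to relation from item 2 can be checked directly from its definition by enumerating every instance in X. The sketch below assumes hypotheses are tuples over <Temperature, Wind, Humidity> with “?” as the don’t-care symbol. The value sets in `DOMAINS` and the helper name `matches` are my own assumptions for illustration.

```python
from itertools import product

# Assumed value sets for (Temperature, Wind, Humidity); illustrative only.
DOMAINS = [("High", "Low"), ("Weak", "Strong"), ("High", "Normal")]

def matches(h, x):
    """h(x) = 1 when every constraint is '?' or equals the instance's value."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_than_or_equal_to(h1, h2):
    """Check (for all x in X)[h2(x) = 1 -> h1(x) = 1] by enumerating X."""
    return all(matches(h1, x) for x in product(*DOMAINS) if matches(h2, x))

h1 = ("High", "?", "?")
h2 = ("High", "Weak", "?")
print(more_general_than_or_equal_to(h1, h2))  # True
print(more_general_than_or_equal_to(h2, h1))  # False
```

Enumeration is only practical for tiny instance spaces like this one, but it mirrors the logical definition line for line.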

Topics for further exploration:

  1. Version space (the set of all hypotheses in the hypothesis space that are consistent with the training data)
  2. Candidate-elimination algorithm (it computes the version space; a new test instance can then be classified by letting all the hypotheses in the version space vote)
  3. Rote-learner algorithm (it simply stores the training data in memory and classifies a test instance only if it matches a stored example, so it works deductively and uses no inductive bias)
  4. Find-S (it finds the most specific hypothesis consistent with the positive training examples and uses it to classify later instances)
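
Two of the learners above are small enough to sketch in a few lines. This is a minimal illustration, assuming conjunctive hypotheses over <Temperature, Wind, Humidity> with “?” as the don’t-care symbol. The training set `D` is made up, and the function names are my own.

```python
def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positives."""
    h = None  # stands for the most specific hypothesis, which matches nothing
    for x, label in examples:
        if label != 1:
            continue  # Find-S ignores negative examples
        if h is None:
            h = list(x)  # first positive example: copy it exactly
        else:
            # Keep a constraint only where it agrees with the new positive
            h = [c if c == v else "?" for c, v in zip(h, x)]
    return tuple(h) if h is not None else None

def rote_classify(examples, x):
    """Rote learner: return the stored label for x, or None if x is unseen."""
    memory = dict(examples)
    return memory.get(x)

# Made-up training data over (Temperature, Wind, Humidity); label 1 = play
D = [
    (("High", "Weak", "Normal"), 1),
    (("High", "Strong", "Normal"), 1),
    (("Low", "Weak", "High"), 0),
]
print(find_s(D))                                    # ('High', '?', 'Normal')
print(rote_classify(D, ("High", "Weak", "Normal")))  # 1
print(rote_classify(D, ("Low", "Strong", "Normal"))) # None
```

The rote learner refuses to classify anything it has not stored, which is what having no inductive bias looks like in practice. Find-S generalizes beyond the data because it assumes the target concept can be written as a conjunction of attribute constraints.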

Thought of the Week:

We are in the middle of a global emergency. Here are a few things you can do to survive, along with some updates on COVID-19:

  1. Don’t go out. Even if you are able to beat the virus because you are young and fit, you might pass it on to someone else who might not be able to survive it. Be considerate.
  2. Read the following books, both recommended by a friend: “The Plague” by Albert Camus and “Blindness” by José Saramago (also a movie).
  3. Watch the movie Contagion.
  4. Read how Singapore, Taiwan and Hong Kong are beating the pandemic [2].
  5. Read about three researchers in Canada who brought the world closer to developing a vaccine for COVID-19 [3].
  6. Read how Shi Zhengli, China’s “Bat Woman”, hunted down viruses in bats, from SARS to the new coronavirus [4].
  7. Wash your hands.

References:

[1] Machine Learning by Tom M. Mitchell
[2] What We Can Learn From Singapore, Taiwan and Hong Kong About Handling Coronavirus
[3] Canadian Researchers Have Isolated the Novel Coronavirus
[4] How China’s “Bat Woman” Hunted Down Viruses from SARS to the New Coronavirus
