Episode 9: ML Notes (Part II): Fundamentals of Decision Tree Theory

These days I am reading the book “Machine Learning” by Tom M. Mitchell and I will be documenting some important concepts from its chapters in a series of blog posts.

Check out “Part I” with notes from Chapter I,II if you haven’t already!

[Background]:

Decision trees iteratively partition the data and generate rules which help us make necessary predictions.

Source: Helping doctors validate decision trees by Samuel Laurence

Concepts:

Some classic old gold papers on decision trees by the ML gurus themselves:

  1. Induction of Decision Trees by J.R. Quinlan (1986)
  2. An Empirical Comparison of Pruning Methods for Decision Tree Induction by John Mingers (1989)
  3. Machine Learning Bias, Statistical Bias, and Statistical Variance of Decision Tree Algorithms (1995)
  4. Symbolic and Neural Learning Algorithms: An Experimental Comparison (1991)

Thought of the Week:

This week’s picks from articles on COVID-19:

  1. The Shift Americans Must Make to Fight the Coronavirus by Meghan O’Rourke [5]
    “Americans today are not used to the notion that an infectious disease we have no treatment for might sweep through the nation. It startles the mind that our high-tech, modern medical system, which routinely wrests sick people from the arms of death, could actually be overwhelmed by a challenging but not remarkably deadly virus. One hundred or so years ago, most deaths in the U.S. were caused by infectious disease. Today, most are caused by chronic conditions. This is new to us…But we also live in a country stubbornly hung up on a damaging idea of self-reliance, a nation pathologically invested in the idea that we should all “just do it”—an attitude that challenges us to muscle through it—whatever it might be. We have no shared discourse for the idea that the hard thing to do, the truly challenging thing to do, might be to do less in order to help another…We are so addicted to the concept of individual responsibility that we have a fragmented health-care system, a weak social safety net, and a culture of averting our eyes from other people’s physical vulnerability…Unfortunately, there is no prescription for the “right” thing to do, no answer to the question of whether to have the playdate or fly to Florida right now. But there is an ethos to cling to…To be ill is to know our interconnectedness, but to be ill in America today is to be brought up against the pathology of a culture that denies this fact…No person is an island; the nation that believes in individuals more than it values community risks its own survival. Accepting and embracing this might be the tougher path than muscling through, each by each, whatever comes.”

  2. Why the Coronavirus Has Been So Successful by Ed Yong [6]
    “One of the few mercies during this crisis is that, by their nature, individual coronaviruses are easily destroyed. Each virus particle consists of a small set of genes, enclosed by a sphere of fatty lipid molecules, and because lipid shells are easily torn apart by soap, 20 seconds of thorough hand-washing can take one down. Lipid shells are also vulnerable to the elements; a recent study shows that the new coronavirus, SARS-CoV-2, survives for no more than a day on cardboard, and about two to three days on steel and plastic. These viruses don’t endure in the world. They need bodies. The virus has been remarkably stable given how much transmission we’ve seen,” says Lisa Gralinski of the University of North Carolina. “That makes sense, because there’s no evolutionary pressure on the virus to transmit better. It’s doing a great job of spreading around the world right now.”…Researchers can, however, offer a preliminary account of what the new coronavirus does to the people it infects. Once in the body, it likely attacks the ACE2-bearing cells that line our airways. Dying cells slough away, filling the airways with junk and carrying the virus deeper into the body, down toward the lungs. As the infection progresses, the lungs clog with dead cells and fluid, making breathing more difficult. (The virus might also be able to infect ACE2-bearing cells in other organs, including the gut and blood vessels.) The immune system fights back and attacks the virus; this is what causes inflammation and fever. But in extreme cases, the immune system goes berserk, causing more damage than the actual virus. For example, blood vessels might open up to allow defensive cells to reach the site of an infection; that’s great, but if the vessels become too leaky, the lungs fill even more with fluid.”

  3. The Coronavirus Is a Disaster for Feminism by Helen Lewis [7]
    “When people try to be cheerful about social distancing and working from home, noting that William Shakespeare and Isaac Newton did some of their best work while England was ravaged by the plague, there is an obvious response: Neither of them had childcare responsibilities…For those with caring responsibilities, an infectious-disease outbreak is unlikely to give them time to write King Lear or develop a theory of optics…A pandemic magnifies all existing inequalities (even as politicians insist this is not the time to talk about anything other than the immediate crisis). Working from home in a white-collar job is easier; employees with salaries and benefits will be better protected; self-isolation is less taxing in a spacious house than a cramped apartment…In both rich and poor countries, campaigners expect domestic-violence rates to rise during lockdown periods. Stress, alcohol consumption, and financial difficulties are all considered triggers for violence in the home, and the quarantine measures being imposed around the world will increase all three…They also worry that opportunities to collect high-quality data which will be useful for the future are being missed…We shouldn’t make that mistake again. Grim as it is to imagine now, further epidemics are inevitable, and the temptation to argue that gender is a side issue, a distraction from the real crisis, must be resisted. What we do now will affect the lives of millions of women and girls in future outbreaks.”

  4. DeepMind’s Protein Folding AI Is Going After Coronavirus by Shelly Fan [8]
    “How a protein “looks” in 3D is essential for developing new drugs, especially for new viruses. COVID-19, for example, has really spikey proteins that jut out from its surface. Normally, human cells don’t care—they won’t let the virus inside. But COVID-19’s spikey proteins also harbor a Trojan Horse that “activates” it in certain cells with a complementary component. Lung cells have an abundance of these factors, which is why they’re susceptible to invasion. Bottom line: if a drug is going to “fit” into a protein like a key into a lock to trigger a whole cascade of nasty reactions, then the first step is to figure out the structure of the lock. That’s what DeepMind’s AlphaFold is doing…China’s long-time Google surrogate and AI behemoth, Baidu, is using an algorithm to predict the structure of another important biomolecule, mRNA. mRNA shuttles information from the genome to protein factories, so shoot the mRNA messenger, then the viral proteins are never born. Similarly, AI could one day potentially predict epidemics and how a virus changes over time—but it will only help if there’s enough trust to listen to the models.”

#STAYHOME

References:

[1] Machine Learning by Tom M. Mitchell
[2] Tree algorithms: ID3, C4.5, C5.0 and CART
[3] C4.5: Programs for Machine Learning
[4] In simple language, how does C4.5 deal with missing values?
[5] The Shift Americans Must Make to Fight the Coronavirus
[6] Why the Coronavirus Has Been So Successful
[7] The Coronavirus Is a Disaster for Feminism
[8] DeepMind’s Protein Folding AI Is Going After Coronavirus

Share this: