Explainable AI Whiteboard Technical Series: Decision Trees

Learning Bytes | AI & ML
A computer-generated whiteboard image showing 10 letter-sized envelopes and six mailboxes below.

Technical Whiteboard Series: Decision Trees

In this video we cover decision trees, which help identify network issues like faulty cables, AP or switch health, and wireless coverage. For example, they can detect cable faults using features like frame errors and one-way traffic.



You’ll learn

  • How decision trees are used in networking

  • The steps involved in machine learning projects

  • How to build and prune decision trees

Who is this for?

Network Professionals, Business Leaders

Transcript

0:11 Today we'll be looking at decision trees and how they identify common network issues like faulty network cables, AP or switch health, and wireless coverage. The algorithms used include simple decision trees, random forest, gradient boosting, and XGBoost. Our example will be a simple decision tree algorithm.

0:28 The essential steps in any machine learning project involve collecting data, training a model, then deploying that model. Decision trees are no different.

0:37 Let's use a real-world example. In networking, decision trees can be applied to incomplete negotiation detection, MTU mismatch, among others. Our example today applies to the fundamental problem of determining cable faults. A cable is neither 100% good nor bad, but somewhere in between. To isolate a fault, we can create a decision tree using features of a bad cable, such as frame errors and one-way traffic.
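As a rough sketch of those collect, train, and deploy steps applied to the cable-fault example (the telemetry values below are invented, and scikit-learn stands in for whatever is used in production):

```python
# Minimal sketch: train a decision tree on hypothetical cable telemetry,
# then use it to classify a newly observed cable.
from sklearn.tree import DecisionTreeClassifier

# Collected data, one row per cable: [frame_errors, one_way_traffic (1 = yes)]
X = [[0, 0], [2, 0], [1, 0], [45, 1], [30, 0]]
y = ["good", "good", "good", "bad", "bad"]        # known labels

model = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
model.fit(X, y)                                   # train the model

print(model.predict([[40, 1]]))                   # "deploy": classify a new cable -> ['bad']
```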

1:02 We begin a decision tree by asking a question: which one will have the greatest impact on label separation? We use two metrics to determine that question: Gini impurity and Information Gain. Gini impurity, which is similar to entropy, determines how much uncertainty there is in a node, and Information Gain lets us calculate how much a question reduces that uncertainty.

1:26 A decision tree is based on a data set of known results. Each row is an example. The first two columns are features that describe the label in the final column, a good or bad cable. The data set can be modified to add additional features, and the structure will remain the same.
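For illustration, such a data set might look like the following in Python (the values are hypothetical, not measurements from the video):

```python
# Each row is one example: two features (frame errors seen, one-way traffic
# observed?) followed by the known label in the final column.
training_data = [
    # frame_errors, one_way_traffic, label
    [0,  "no",  "good"],
    [2,  "no",  "good"],
    [1,  "no",  "good"],
    [45, "yes", "bad"],
    [30, "no",  "bad"],
]
```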

1:43 A decision tree starts with a single root node, which is given the whole training set. The node will then ask a yes-or-no question about one of the features and split the data into two subsets, each of which becomes the input of a child node. If the labels are still mixed, good and bad, then another yes-or-no question will be asked.

2:02 The goal of a decision tree is to sort the labels until a high level of certainty is reached without overfitting the tree to the training data. Larger trees may be more accurate and tightly fit the training data, but once in production they may be inaccurate in predicting real events. We use metrics to ask the best questions at each point, and once there are no more questions to ask, we prune branches starting at the leaves to address overfitting. This produces an accurate model, with leaves illustrating the final prediction.
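The video doesn't show the pruning step in code; as one generic illustration, scikit-learn's cost-complexity post-pruning grows a full tree and then removes leaves according to the ccp_alpha parameter (the data below is made up, with a couple of noisy labels so the unpruned tree overfits):

```python
# Sketch of post-pruning: grow a full tree, then refit with increasing
# pruning strength and watch the number of leaves shrink.
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [1, 0], [2, 0], [3, 0], [4, 0],
     [5, 1], [30, 0], [45, 1], [50, 0], [60, 1]]
y = ["good", "good", "good", "good", "bad",       # [4, 0] carries a noisy label
     "bad", "bad", "bad", "good", "bad"]          # so does [50, 0]

full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Candidate pruning strengths; larger ccp_alpha prunes more aggressively.
path = full_tree.cost_complexity_pruning_path(X, y)
for alpha in path.ccp_alphas:
    pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X, y)
    print(f"ccp_alpha={alpha:.3f}  leaves={pruned.get_n_leaves()}")
```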

2:30 Gini impurity is a metric that ranges from 0 to 1, where lower values indicate less uncertainty, or mixing, in a node. It gives us our chance of being incorrect. Let's look at a mail carrier as an example. In a town with only one person and one letter to deliver, the Gini impurity would be equal to zero, since there's no chance of being incorrect. However, if there are 10 people in the town with one letter for each, the impurity would be 0.9, because now there's only a one-in-ten chance of placing a randomly selected letter into the right mailbox (a 90% chance of being wrong).
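A minimal Gini impurity function (one minus the sum of squared class probabilities) reproduces both mail-carrier numbers:

```python
# Gini impurity = 1 - sum(p_i^2) over the classes present in a node.
from collections import Counter

def gini(labels):
    counts = Counter(labels)
    return 1.0 - sum((n / len(labels)) ** 2 for n in counts.values())

print(round(gini(["alice"]), 2))                            # one person, one letter -> 0.0
print(round(gini([f"person_{i}" for i in range(10)]), 2))   # ten people, one letter each -> 0.9
```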

3:05 Information Gain helps us find the question that reduces our uncertainty the most. It's just a number that tells us how much a question helps to unmix the known labels. We begin by calculating the uncertainty of our starting set: impurity equals 0.48. Then, for each question, we segment the data and calculate the uncertainty of the child nodes. We take a weighted average of their uncertainty, because we care more about a large set with low uncertainty than a small set with high uncertainty. Then we subtract that from our starting uncertainty, and that's our information gain. As we continue, we'll keep track of the question with the most gain, and that will be the best one to ask at this node.
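Using the hypothetical five-cable data set from earlier (three good, two bad, which happens to reproduce the 0.48 starting impurity mentioned here), the calculation might look like this; the two candidate questions are illustrative:

```python
# Information gain = parent impurity - weighted average of child impurities.
from collections import Counter

def gini(labels):
    counts = Counter(labels)
    return 1.0 - sum((n / len(labels)) ** 2 for n in counts.values())

def info_gain(parent, left, right):
    weight = len(left) / len(parent)
    return gini(parent) - weight * gini(left) - (1 - weight) * gini(right)

parent = ["good", "good", "good", "bad", "bad"]
print(round(gini(parent), 2))                                                  # 0.48

# Question 1: "frame_errors >= 30?" separates the labels perfectly.
print(round(info_gain(parent, ["bad", "bad"], ["good", "good", "good"]), 2))   # 0.48

# Question 2: "one_way_traffic == yes?" leaves one child still mixed.
print(round(info_gain(parent, ["bad"], ["good", "good", "good", "bad"]), 2))   # 0.18
```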

3:46 Now we're ready to build the tree. We start at the root node of the tree, which is given the entire data set. Then we need the best question to ask at this node. We find this by iterating over each of the feature values: we split the data and calculate the information gain for each one, and as we go we keep track of the question that produces the most gain. Found it! Once we find the most useful question to ask, we split the data using this question. We then add a child node to this branch, and it uses the subset of data after the split. Again, we calculate the Information Gain to discover the best question for this data. Rinse and repeat until there are no further questions to ask, and this node becomes a leaf, which provides the final prediction.
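Putting the pieces together, a pure-Python sketch of this build loop (illustrative only, not the production model) could look like the following, reusing the hypothetical cable data from earlier:

```python
# Recursively grow a tree: find the question with the most gain, split,
# and recurse; a node with no useful question left becomes a leaf.
from collections import Counter

def gini(rows):
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())

def match(row, col, value):
    # Numeric features ask "is it >= value?"; categorical ones ask "is it == value?"
    if isinstance(value, (int, float)):
        return row[col] >= value
    return row[col] == value

def partition(rows, col, value):
    true_rows = [r for r in rows if match(r, col, value)]
    false_rows = [r for r in rows if not match(r, col, value)]
    return true_rows, false_rows

def best_question(rows):
    # Iterate over every feature value, tracking the question with the most gain.
    best_gain, best_q = 0.0, None
    parent_impurity = gini(rows)
    for col in range(len(rows[0]) - 1):
        for value in {row[col] for row in rows}:
            true_rows, false_rows = partition(rows, col, value)
            if not true_rows or not false_rows:
                continue
            weight = len(true_rows) / len(rows)
            gain = parent_impurity - weight * gini(true_rows) - (1 - weight) * gini(false_rows)
            if gain > best_gain:
                best_gain, best_q = gain, (col, value)
    return best_gain, best_q

def build_tree(rows):
    gain, question = best_question(rows)
    if gain == 0:
        # No further questions to ask: the node is a leaf holding the final prediction.
        return {"predict": Counter(row[-1] for row in rows).most_common(1)[0][0]}
    true_rows, false_rows = partition(rows, *question)
    return {"question": question,
            "yes": build_tree(true_rows),
            "no": build_tree(false_rows)}

training_data = [
    [0, "no", "good"], [2, "no", "good"], [1, "no", "good"],
    [45, "yes", "bad"], [30, "no", "bad"],
]
print(build_tree(training_data))
# -> {'question': (0, 30), 'yes': {'predict': 'bad'}, 'no': {'predict': 'good'}}
```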

4:31 We hope this gives you insight into the AI we've created. If you'd like to see more from a practical perspective, visit our Solutions page on juniper.net.
