Network as Code Advanced Topic, Part 3 of 3
Advanced concepts of network as code
Are you looking for a master class on network as code? Watch the third video in this three-part series, which covers advanced concepts in network automation and is packed with information you can use to effectively scale your operations and redirect staff time to solve harder problems.
You’ll learn
Best practices for network testing and validation
How to use advanced automated monitoring
Key security policy and compliance considerations
Transcript
Introduction
0:08 You folks are troopers. You've stayed through the first two videos in a series where we introduced the general topic of network as code; in video two we talked through implementing network as code and some of the considerations for when you're starting a project. Now we're going to talk about some of the advanced stuff, some of the gotchas, some of the things you really need to consider.
0:36 We spent quite a bit of time on this in the last video, Ned, and by the way, thank you Juniper for sponsoring the series. It's a very important topic: network testing and validation. What is that, and what are we looking to achieve?
0:57 Right. Well, I think there are multiple levels of testing you can do when it comes to deploying your network as code, and it all starts with the initial check-in of that code to whatever your source control process is. Typically that's going to kick off some sort of integration pipeline, and it's the job of that pipeline to do some very basic checking of what you've submitted: is it formatted properly, is the code syntactically valid, does it make sense? Maybe you're even testing for some best practices by using static code analysis tools.
1:30 We'll touch on specifics later, but the general idea behind static code analysis is that it doesn't try to run the code; it simply looks at the code itself. It's basically a rules engine that says, "Oh, you're opening up port 22 to the entire world through this firewall rule, and we don't do that," so it might flag that as a bad configuration. So you're going to have some sort of static analysis tool in there, and that's all about vetting the code, making sure it looks good before it even runs.
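To make that concrete, here is a minimal sketch in Python of the kind of rule a static analysis step might apply. The JSON rule structure and field names are assumptions for illustration, not any particular vendor's schema; in practice you would more likely reach for an off-the-shelf linter or policy engine, but the shape is the same: read the code, apply rules, fail the pipeline on a violation.

    # static_check.py -- flag firewall rules that expose SSH (port 22) to the world.
    # The rule format below is hypothetical; adapt it to whatever your
    # network-as-code tooling actually emits.
    import json
    import sys

    def find_open_ssh(rules):
        """Return rules that permit port 22 from any source address."""
        findings = []
        for rule in rules:
            allows_ssh = rule.get("port") in (22, "22", "any")
            open_source = rule.get("source") in ("0.0.0.0/0", "any")
            if rule.get("action") == "permit" and allows_ssh and open_source:
                findings.append(rule)
        return findings

    if __name__ == "__main__":
        rules = json.load(open(sys.argv[1]))["firewall_rules"]
        problems = find_open_ssh(rules)
        for rule in problems:
            print(f"POLICY VIOLATION: rule {rule.get('name', '?')} opens SSH to the world")
        # A non-zero exit code fails the pipeline stage.
        sys.exit(1 if problems else 0)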
2:04 So Ned, as you're talking through this, I can't think of a single product that does all of this, and as I'm stringing together the test process I can't help but think: am I introducing another point of failure? Because now there's another human, with their meat hands, stringing together these tests. Isn't that another potential breaking point?
2:39 Of course you're adding more complexity; that's what we do as engineers, right? We add complexity. I would say the benefit here, and the good news, is that when you're building your CI/CD pipelines you can do that declaratively with code, so it's not necessarily somebody sitting there stitching all these pieces together by dragging boxes around the screen. You can develop workflows, standardize those workflows, and publish them for other folks to consume and use.
3:13 Actually, a lot of the vendor platforms have that process baked into them: they already have predefined templates or predefined workflows that you can take advantage of, and then snap in your own tooling and tool sets. So you are still going to have humans involved, because you're always going to have humans involved, at least I hope so, otherwise we're out of a job. But those humans are going to be doing things that further your automation goal, and they'll be doing it in code, not by manually clicking around in a UI.
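As a purely conceptual sketch of what "the workflow itself is code" means, here is a tiny Python pipeline runner with pluggable stages. The stage names and context are invented for illustration; real platforms typically express this as a pipeline definition checked into the same repository as the network code, with vendor-provided templates playing the role of the standard workflow.

    # pipeline.py -- conceptual sketch of a workflow defined as code.
    # Stage names and the context dict are invented for illustration.

    def lint(ctx):
        print(f"linting {ctx['repo']} ...")

    def static_checks(ctx):
        print("running static analysis and policy checks ...")

    def deploy_to_lab(ctx):
        print("applying changes to the lab / virtual environment ...")

    def validate(ctx):
        print("running post-deploy validation ...")

    # A standardized workflow that a platform team can publish; other teams
    # reuse it and snap their own stages or tooling into it.
    STANDARD_WORKFLOW = [lint, static_checks, deploy_to_lab, validate]

    def run(workflow, ctx):
        for stage in workflow:
            stage(ctx)   # any exception stops the pipeline at that stage

    if __name__ == "__main__":
        run(STANDARD_WORKFLOW, {"repo": "network-configs"})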
3:48 So what is this called, this testing of the code? What's the name for it? Is this all just part of the subset of infrastructure as code?
3:58 I think so. We're borrowing a little bit from the concepts behind testing software. Software has a whole bunch of different test types: you have things like unit testing, which is just testing that, say, a function returns the values you expect and errors in the way you want it to. That's the very essential "I've got a unit and I'm testing it." Then you get into integration testing: how does that function work with the rest of my program as a whole?
4:27 Not all of these concepts map one to one to infrastructure and to networking, because they are different in the way they work, so we can try to apply some of these concepts while recognizing that not everything is a perfect mapping from software development to managing infrastructure.
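For example, a "unit" in network as code might be a small function that renders part of a configuration, and a unit test simply checks its output. A minimal sketch, where the render function and its expected output are hypothetical; run it with pytest, and leave exercising the rendered output against a device or virtual lab to the integration tests:

    # test_vlan.py -- unit-test sketch for a configuration-rendering function.
    import pytest

    def render_vlan(vlan_id, name):
        """Render a VLAN configuration stanza (hypothetical format)."""
        if not 1 <= vlan_id <= 4094:
            raise ValueError(f"invalid VLAN id: {vlan_id}")
        return f"vlan {vlan_id}\n  name {name}"

    def test_render_vlan_returns_expected_text():
        assert render_vlan(100, "users") == "vlan 100\n  name users"

    def test_render_vlan_rejects_bad_id():
        with pytest.raises(ValueError):
            render_vlan(5000, "oops")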
4:47 So talk to me about integration testing, like a continuous motion. When I was building, it wasn't quite a greenfield, but it was greenish: we had a chance to reset a massive network for a Fortune 100, and that opportunity doesn't come along often. We were bringing in a mission-critical application; it was my job to manage that application, and they had built all of this network redundancy. I wanted to say, hey, let's take the opportunity to turn off that switch. In theory we don't get to do this in the real world, in production, but with testing I can say, hey, let's actually test the redundancy, turn off the switch, and see if we can still do a transaction. That's nirvana, but how do we gradually get there? How do we integrate and merge these pipelines?
5:51 Sure, yeah. The testing begins when code is checked in; that's the start, and we're going to do some basic tests to make sure your code is good. The next step is how that integrates with the existing system, and that can be really difficult to test because you don't necessarily want to apply changes live to your network. So you could potentially have a development instance of some of your network, maybe a virtualized version of your network, where you can deploy those changes and then have a series of checks that validate: hey, the configuration loaded properly on the switch, it didn't barf on any of the commands or instructions in the configuration, it's a valid config that will actually load on that switch, even though it's a virtualized version of that switch.
6:36 Then there's another aspect of integration, which is how the network functions as a whole once you've deployed your updates, and that can be very, very difficult to test in a non-production environment. There are certainly tools out there that will attempt to make a digital twin of sorts of your existing environment, apply the changes there, and review the results, but ultimately you're always testing in production: eventually it has to hit your production network, and you're essentially testing there. So what's really critical in this whole process is having a complete feedback loop to capture what's happening in the production environment and have that inform your development process for the next iteration of your code.
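A hedged sketch of what a post-deploy validation step might look like, assuming a hypothetical get_running_config() helper in front of whatever device access you actually use (SSH, NETCONF, a controller API) and a crude end-to-end reachability probe standing in for "can we still do a transaction":

    # validate_deploy.py -- sketch of post-deployment checks in a pipeline.
    # get_running_config() is a placeholder for your actual device access
    # method; the shape of the checks is the point here.
    import socket

    def get_running_config(device):
        raise NotImplementedError("wire this to your device access method")

    def config_loaded(device, expected_lines):
        """Did the intended configuration actually land on the device?"""
        running = get_running_config(device)
        return all(line in running for line in expected_lines)

    def service_reachable(host, port, timeout=3):
        """Crude end-to-end probe: can we still open a TCP session?"""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def validate(devices, expected_lines, probe_host, probe_port):
        results = {d: config_loaded(d, expected_lines) for d in devices}
        results["end_to_end"] = service_reachable(probe_host, probe_port)
        return results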
7:29 So let's move on to another topic: monitoring. How does network as code impact monitoring and analytics? It seems like there are opportunities there, but what are some of the advanced things we can start thinking about once we move to network as code?
7:47 Well, certainly, assuming your monitoring and analytics software packages and devices support it, you can deploy them with network as code too. There's a theme developing here; a friend of mine likes to call it "everything as code." If it has an API endpoint and you can program against it, you should be defining its configuration using code as much as possible, and if there are any SREs watching, you know exactly what I'm talking about. The nirvana of the SRE is to automate all the things that can be automated so you can move on and do something else.
8:25 So part of it is just setting up that initial monitoring and analytics. It's something some people forget to do when they set up a switch or a router: they forget to turn on proper monitoring or get it integrated with the monitoring package they have somewhere else. "Oh, I set up a new switch but I forgot to send the ticket to the monitoring team to add it to their list of network devices," that sort of thing. When you have the monitoring portion defined using network as code, you don't have to remember, because you've created an integration where, when a new network device is added, it automatically gets integrated into your existing analytics and monitoring packages. It's that dynamic discovery and integration, and that's certainly one part of it.
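As a sketch of that "new device automatically lands in monitoring" idea, assuming a monitoring platform with an HTTP API; the URL, token handling, and payload fields below are invented placeholders, not any specific product's API:

    # register_monitoring.py -- sketch: register every device in the
    # code-defined inventory with the monitoring platform.
    # MONITORING_API and the payload shape are placeholders for your tool.
    import requests

    MONITORING_API = "https://monitoring.example.com/api/v1/devices"

    def register(device, token):
        payload = {"hostname": device["hostname"],
                   "mgmt_ip": device["mgmt_ip"],
                   "site": device.get("site", "unknown")}
        resp = requests.post(MONITORING_API, json=payload,
                             headers={"Authorization": f"Bearer {token}"},
                             timeout=10)
        resp.raise_for_status()

    def register_inventory(inventory, token):
        # The inventory comes from the same source of truth as the network
        # code, so nothing gets forgotten when a new switch is added.
        for device in inventory:
            register(device, token)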
9:08 But I think another important part is the ability to capture the impact of your changes, and that gets back to what we were just talking about: I deployed my code, did I break anything? That's certainly important, but another, possibly equally important, part is: what was the actual impact of my changes, and did I achieve the goal of those changes to begin with? Because we don't just change the network for funsies, right? It's not "it's Friday, we're going to deploy some new network as code and then go out and have happy hour." You have a business reason or a technology reason to deploy changes to the network.
9:50 So defining what the point of the change is, and then figuring out how to measure the impact of that change to make sure the change you made is actually reflected in the performance, that's the job of monitoring and analytics. "Oh hey, you made this update to the network and now customer requests are coming in 50% faster than they were before, because you streamlined something in the network." That's awesome; you get to report back to your boss, "I improved the network performance so we're getting more customer orders per second." Fantastic.
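A minimal sketch of closing that loop: compare a business-relevant metric before and after the change window. The fetch_metric() helper is a placeholder for a query against whatever analytics backend you actually use.

    # change_impact.py -- sketch: did the change move the metric we cared about?
    # fetch_metric() is a placeholder for your metrics/analytics backend.

    def fetch_metric(name, start, end):
        raise NotImplementedError("query your analytics backend here")

    def impact_report(metric, change_time, window):
        before = fetch_metric(metric, change_time - window, change_time)
        after = fetch_metric(metric, change_time, change_time + window)
        delta_pct = (after - before) / before * 100 if before else float("nan")
        return {"metric": metric, "before": before, "after": after,
                "change_pct": round(delta_pct, 1)}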
10:27 So as I think of the CI/CD process and the things we wished we could do but didn't have the people or processes to do, or that were too expensive or too burdensome to do every time: we could create CI/CD processes or pipelines that kick off specific monitoring for a specific amount of time on a set of ports. Port mirroring, generally speaking, is expensive from a resource perspective, but after a certain change we might want to always mirror a port for, say, two hours so we collect that data. And if there's another trigger from the monitoring tool that says, hey, if we reach this threshold, take this action, that action may not be disruptive like making configuration changes; it could be "monitor this other thing," putting a limited resource to work to collect more data and make better informed decisions.
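A sketch of that idea as a pipeline step, assuming hypothetical enable_port_mirror()/disable_port_mirror() helpers in front of your device or controller API; the two-hour window is just the scenario described above, and a real pipeline would schedule the teardown rather than sleep.

    # post_change_capture.py -- sketch: after a change, mirror the affected
    # port for a fixed window to collect extra data.
    # enable_port_mirror()/disable_port_mirror() are placeholders for your
    # device or controller API.
    import time

    CAPTURE_SECONDS = 2 * 60 * 60  # two hours, per the scenario above

    def enable_port_mirror(device, port, destination):
        raise NotImplementedError

    def disable_port_mirror(device, port):
        raise NotImplementedError

    def capture_after_change(device, port, collector):
        enable_port_mirror(device, port, collector)
        try:
            time.sleep(CAPTURE_SECONDS)   # placeholder for a scheduled teardown
        finally:
            disable_port_mirror(device, port)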
11:38 Yeah, absolutely. At this point I won't say CPU time is cheap, but it's a lot cheaper than it used to be; storage isn't cheap, but it's a lot cheaper than it used to be. So the ability to capture all of this information is certainly there. The other big challenge is: okay, I've got all this additional info, how do I analyze it, how do I munge useful information out of it? That's not really a network as code challenge, but it is something that feeds back into the loop of your network as code development: having some sort of data analysis tool that can give you useful insights into the information you're gathering.
12:20 So let's talk about our last topic in this series. If you're a networking person, you've dealt with both sides of this: implementing your security policy via the network, and then proving that to some internal or external audience. So let's talk about implementing security policies through code. I've talked to a bunch of folks about security as code; that's a thing. Where do we start with our security policies through code?
12:59 Sure. There's a whole bunch of different policy engines out there that will analyze code, compare it to some set of policies, and then give you the results. One of the most popular ones I've been working with for a little while is called Open Policy Agent, or OPA, and it has the capability to analyze anything that is expressed in JSON, compare it to rule sets you've defined, and give you results based on those rule sets. And what can you express with JSON? Well, just about anything.
13:34 So whether that's static code analysis, just what does the code look like, or "I have a planned set of changes I want to apply to my network," where I can look through the planned set of changes and make determinations about whether I find it secure, all the way up to analyzing the actual running configuration on network devices or servers: as long as it can be expressed in JSON, OPA can take a look at it and make policy decisions. Say, "oh, someone went into this switch after the fact and altered something, and it's no longer in compliance." That compliance is usually defined by the security and compliance teams in your organization; they set the policies, and then they allow you to test whether or not you're in compliance with those policies.
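OPA policies are normally written in its Rego language; purely to illustrate the idea of rules evaluated against anything expressible as JSON, here is a small Python stand-in that checks a device configuration document against a couple of invented compliance rules. It is a conceptual sketch, not OPA itself.

    # policy_check.py -- illustration only: evaluate simple compliance rules
    # against a JSON-style device configuration (real deployments would
    # typically express these rules in OPA's Rego language instead).
    import json
    import sys

    def check(config):
        violations = []
        if config.get("telnet_enabled", False):
            violations.append("telnet is enabled")
        ssh = config.get("ssh", {})
        if ssh.get("password_auth", True):
            violations.append("SSH allows password authentication")
        if not config.get("aaa", {}).get("tacacs_servers"):
            violations.append("no TACACS+ servers configured")
        return violations

    if __name__ == "__main__":
        config = json.load(open(sys.argv[1]))
        violations = check(config)
        for v in violations:
            print(f"NON-COMPLIANT: {v}")
        sys.exit(1 if violations else 0)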
14:27 One of the things that frustrated me to no end when I did network administration and operations day to day in large organizations is when the dreaded auditor comes in, and I want to talk about two topics within this. One: how do I answer requests from auditors when I'm living in an infrastructure as code and network as code environment? Because in my mind I'm not going back to individual devices, pulling configs, going to backups, et cetera, to answer the auditors' requests. And the second one is: how do I help the auditors trust the artifacts I'm giving them as proof? So let's do the first one first: how am I answering the request? A sample request is: show me that authentication is configured on every network device.
15:38 Sure, yeah, and that's a pretty common request. Now, let's assume you've defined authentication policies in your network as code for every single network device. All you need to do is run a drift detection, essentially, against all your existing network devices, and that gets back to the get, set, and test we talked about in the previous video. You're basically just running the get and test portions: get the configuration from every network switch, test it against my defined configuration, is there a difference? Hopefully the answer is "no, there is not."
16:18 So you can go to the auditor and say, here's the run I did against all my network devices, it found no differences, and here's the configuration I've defined in code, which clearly has the authentication policy enabled. There you go, I'm done. I don't have to tap every single switch myself, pull the config, and dump it out into a giant document that I deliver to them. It's: here's the runner that I went through that tested against all the switches, and here's the actual configuration it was testing against. You're good to go.
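A sketch of that "get and test" run packaged as an audit artifact, again assuming a hypothetical get_running_config() helper and an example authentication stanza standing in for the policy defined in code:

    # auth_audit.py -- sketch: prove the defined authentication policy is
    # present on every device, and produce a report for the auditor.
    # get_running_config() is a placeholder for your device access method;
    # the required line below is only an example policy.
    import datetime
    import json

    REQUIRED_AUTH_LINES = [
        "aaa authentication login default group tacacs+ local",
    ]

    def get_running_config(device):
        raise NotImplementedError

    def audit(devices):
        report = {"generated": datetime.datetime.utcnow().isoformat() + "Z",
                  "policy": REQUIRED_AUTH_LINES, "results": {}}
        for device in devices:
            running = get_running_config(device)
            missing = [line for line in REQUIRED_AUTH_LINES if line not in running]
            report["results"][device] = "compliant" if not missing else missing
        return report

    if __name__ == "__main__":
        print(json.dumps(audit(["switch01", "switch02"]), indent=2))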
16:53 So the smart auditor will come back and say, well, there's a whole other control plane now: how do we authenticate the folks making the changes? Because we're no longer making switch-level changes, we're not going into the switch to configure changes; this whole other team, a platform team, is doing it. How do we ensure who has rights to make changes if this quote-unquote system is making the changes?
17:22 Right. Well, it depends on how you've secured the workflow. A fairly typical process is that everything goes through code; you're following a GitOps sort of process. The way I make changes to a system is that I submit my changes via code to the repository, and that kicks off, via a webhook, some CI/CD pipeline, and in that pipeline there will be an approvals process for the changes. So someone, whether it's an automated process or a manual one, needs to vet those changes, determine whether they should be allowed, and then approve them.
18:01 What you have in the repository is a record of exactly who committed the code and when they committed it, and in your pipeline you have a record of exactly who approved that code and when they approved it, so you can trace the full change of your environment through that entire process. Now, that doesn't mean you never need to break glass in the case of a hard-down situation where you need to make immediate changes, but that is hopefully an infrequent event, and you have a well-defined process for getting approval to break glass and make changes.
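The commit half of that audit trail already lives in the repository. As a small sketch, the who/what/when of committed changes can be pulled straight out of git history; the approval record would come from your pipeline or code review platform's own records. The repository path and directory below are placeholders.

    # change_record.py -- sketch: extract the who/what/when of committed
    # changes from git history for a given path.
    import subprocess

    def change_record(repo_path, target="."):
        fmt = "%h|%an|%aI|%s"   # short hash | author | ISO date | subject
        out = subprocess.run(
            ["git", "-C", repo_path, "log", f"--pretty=format:{fmt}", "--", target],
            capture_output=True, text=True, check=True).stdout
        entries = []
        for line in out.splitlines():
            commit, author, date, subject = line.split("|", 3)
            entries.append({"commit": commit, "author": author,
                            "date": date, "subject": subject})
        return entries

    if __name__ == "__main__":
        for e in change_record(".", "configs/"):
            print(f'{e["date"]}  {e["author"]}  {e["subject"]}')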
18:37 So this all starts with: you can't automate, and you can't code, processes that don't exist.
18:45 Well, at the end of the day, all of these systems we've talked about in this series are there to automate or codify the things we've already written down on paper: the processes we've already talked about, the operational issues we've controlled for.
19:12 I've seen CI/CD processes break entire systems because people didn't sit down and write down their existing processes and then build a CI/CD pipeline that supported those existing pipelines; they tried to reinvent the wheel and broke literally 30 years of integration and test processes without really thinking it through. And I think that summarizes the whole series. What we're trying to do is scale our operations in a way that fits the bill today. The CTO Advisor's premise is that hybrid infrastructure is here to stay; we cannot afford a bespoke approach to any infrastructure, whether that's network, storage, compute, or public cloud. We have to have processes that scale and take humans out of the loop so we can put our people on smarter and harder problems, such as network-to-cloud networking, cloud-to-cloud networking, and cloud-to-cloud security. These are problems we need to rededicate our staffs to solving. Ned, any last comments for our audience?
20:41 I think you hit the nail on the head there. It really is a matter of automating existing processes. But most importantly, you don't have to twist yourself out of shape to fit the tool. All these different tools that exist are extensible and customizable, so you should customize the workflow and select the tool that fits the existing shape and workflow of your organization.
21:08 All right. With that said, if you want to find out more about the CTO Advisor, you can follow us on the web at thectoadvisor.com, and visit our friends at Juniper Networks. Ned, where can folks find you?
21:16 If you're looking for me, the easiest way is to go to my website, nedinthecloud.com. All of my links and other content are hosted there.
21:26 All right. Until then, we'll talk to you in the next video series.