A powerful network monitoring solution that takes your troubleshooting up a notch.
The Problem
If you're supporting a large remote workforce, you've probably had an executive call on Monday morning to say: "I joined a meeting over the weekend and audio quality was terrible. Can you make sure it doesn't happen again?"
Pre-pandemic, it may have been acceptable for IT to dismiss work-from-home problems without much digging. Troubleshooting was a crapshoot because there were so many variables and so little visibility into the traffic path. What device was the user logged in on? Do they have reasonable wireless coverage? Who is their ISP? Better yet, is their ISP having problems, or is the service they're trying to connect to even up? This was less a can of worms and more like Pandora's Box made into a Rubik's Cube. So, as with most difficult problems that come with limited tools, it was easy to attribute the issue to something uncontrollable. ThousandEyes is an excellent tool for combating this.
The Challenges Solved by ThousandEyes
Today, with work-from-home now the new normal for large numbers of workers, end users, from the front line to the executive level, expect the same experience at home that they get at the office.
Traditional network monitoring provided a good grasp of in-house corporate infrastructure, but effectively treated the internet as the ultimate "black box": it either worked or it didn't, with little attention to the "why" or "how" of a problem and no useful information for correcting it. So, what's to be done now that a large share of your user base relies on it day-to-day to get their jobs done?
In this article I'll address that question by examining one of my favorite ThousandEyes use cases: using it to demystify the internet black box in order to proactively support end users.
Before we get into that, let's look at what ThousandEyes is and how it works.
What is ThousandEyes, anyway?
"The internet is not a single network. For many, the internet has become a 'black box' that is too complex to manage, too big to maintain visibility, and too vast to monitor. It is unpredictable, and composed of thousands of independently managed service providers, any of which can impact the experience of users connecting to an application or site. Even if the enterprises may not directly control the internet, they are ultimately still responsible for the reachability of their service and user experience."
Courtesy of Cisco
ThousandEyes is a "next gen" network monitoring tool, focused on providing granular monitoring across the internet. Think of it as our guide to working through Pandora's Rubik's Cube, helping map specific traffic flow across the internet for your end users and devices. I'll elaborate on that a little later.
ThousandEyes monitoring is accomplished by three different agent types:
Enterprise Agents – Installed on a virtual machine or a container, including a container on a Cisco switch.
Endpoint Agents – Installed on PCs, and most useful for the work-from-home scenario. These are very lightweight but more limited than Enterprise Agents.
Cloud Agents – These are basically Enterprise Agents installed on Cisco's infrastructure, giving you perspectives from places where you can't place your own Enterprise Agents.
We'll be focusing primarily on Endpoint Agents for our current discussion, with a light review of Enterprise Agents. Cloud Agents are out of scope.
Endpoint Agents for ThousandEyes
The endpoint agent is installed on your end-user PCs. It's lightweight, and out of the box it gathers data on the PC itself, the local network (including wireless details), and VPN usage. Customized tests are then typically configured to provide monitoring.
Tests fall into two types on endpoint agents:
Scheduled Tests – to test the ability of the agent to ping hosts, open TCP connections, or successfully open a web page. Scheduled tests also pull underlying network data; more on that below.
Browser Session Tests – to monitor the performance of loading a web page. This is done passively when a user visits domains the administrator has specified, which avoids the overhead of scheduling visits to a web page they're likely to have visited anyway.
Adding New Scheduled Tests in ThousandEyes
Let's start with Scheduled Tests and cover how to configure them. The first thing we'll do is set up a test to open an HTTPS connection to Outlook on Office 365:
In reality I added this test weeks ago. These are the current results:
This is the last two weeks of data. Letās zoom in on a few of the details:
Clearly, the section of dense red (agents with errors) alongside the drop in the dark green (availability) indicates an outage. All ThousandEyes tests gather "lower level" data automatically. The most obvious example is that a traceroute is run with this web test, so we have a network path for every host running the test. This is helpful for identifying regional or ISP-level outages. ThousandEyes builds a fantastic visual of this data as it's gathered. See below:
It's difficult to get the full picture in a blog, but this diagram can be drilled into and reorganized in real time to help isolate problem areas. You can get incredibly granular with this. For some insight, each number within the circles you see above represents "jumps" you can drill down into along the route of the packet.
I also have a variety of other tests running on Endpoint Agents towards Microsoft services:
For brevity, I won't dive into them individually.
Endpoint Agent Browser Sessions
As mentioned above, one of the brilliant things about the Endpoint Agent is that business domains can be monitored passively. Rather than having the agent open an "invisible browser" itself and pull down a website to test granular web performance, it works on the assumption that if the website is vital to the business, you'll be browsing there yourself pretty regularly. The agent has an integration with Chromium-based browsers (Chrome and Edge) that monitors web performance as the user browses the page.
The first step is to set up the domains we want to monitor, and the IP space we want to monitor from.
Our domains:
And for simplicity, we monitor from everywhere, as most of our employees are remote anyway. An easy example of where this might not be ideal is if you're running Enterprise Agents at corporate and don't want to gather a bunch of extra data from users in the office.
We'll focus on the same timeframe as we did above, January 25th, when Microsoft had problems:
My first reaction to seeing this, being a network engineer, was "what the heck is an experience score?" It turns out traditional metrics such as delay or jitter aren't as valuable when measuring an end-user experience, so ThousandEyes built a new one. If you're curious, here is how it's calculated.
Enterprise Agents
So we've seen how to gather data from our endpoint clients. Now let's briefly introduce Enterprise Agents and cross-reference data. Enterprise Agents are a big topic, as they can measure far more situations than the deliberately lightweight Endpoint Agent.
Enterprise Agents can also perform these tests:
UDP Tests
Bandwidth tests (Use carefully!)
Agent-to-Agent Tests – How is the performance between two Enterprise Agents?
Layer 2 discovery & monitoring – Discover, auto-diagram, and monitor inside your network
BGP Tests – Particularly path changes, which indicate instability
Web page load tests – How fast does a page load, and how are individual web objects loading?
Web page transaction tests – This is Application Performance Monitoring (APM). The ThousandEyes Recorder will record your mouse clicks and data entry on any website, and then replay them on a schedule to test performance.
For this particular topic, let's look at BGP changes, as it's the one thing the Endpoint Agent didn't give us:
Microsoft had a series of BGP advertisement changes during this timeframe and experienced significant route table instability.
What we learned from this incident is that Microsoft themselves had the problem, not our end users. Our monitoring with ThousandEyes enabled us to send out a proactive notification to our users that Microsoft had instability, asking them to be patient during the outage rather than open tickets.
But what if it is one of our users?
The example above gave a great way of correlating data, which is why I used it as the primary example. However, we have had three end-user issues in the past weeks that were reported as "My computer isn't working right." Two turned out to be network issues, while one was resolvable with minimal effort. Individual PCs can be pulled up for performance metrics. Let's explore some individual scenarios.
Scenario 1 – Employee having continued audio problems on conference calls
Checking the image above we see:
Memory – 63% used. No problems there.
CPU Load – 4.2% used. No problems there.
Network access – 144 Mbps, 79% signal. Not ideal.
Gateway loss – 0.4%. This is the problem.
In this particular case, we found out that the employee was using the free wireless provided by their landlord, and that wireless performs quite poorly where their work computer is located.
Solution: We recommended that the employee purchase their own internet service.
Scenario 2 – Employee complained of very poor application performance.
During the problem period, we see a large spike in VPN latency. None of our other users were complaining of VPN problems, nor did they see this spike. This took further investigation.
It turns out this user wasn't connected to our VPN. A customer had provided us VPN access for reaching one of their ticketing applications, and it had full-tunnel enabled. As such, the entirety of this end user's traffic, including traffic unrelated to this end customer, was being tunneled through the end customer's network. When the end customer had a network problem, it impacted our user.
Solution: We instructed the user to only connect to that VPN when accessing that one application.
Scenario 3 – An end-user was complaining of poor web browsing performance on their computer.
One look at the memory usage tells us this is a systems problem. In this case, the user had a runaway process, and our desktop support engineers got in and resolved the problem. No network issue this time!
Solution: Local device issue, no network troubles found.
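The three scenarios above follow the same triage order: check the local network, then the VPN, then the device itself. As a rough illustration only, here is a minimal Python sketch of that order. The thresholds, the dictionary keys, and the `triage` function are all hypothetical; none of them come from ThousandEyes, and you would tune the numbers to your own environment.

```python
# Hypothetical thresholds for illustration only; tune them to your environment.
GATEWAY_LOSS_PCT_MAX = 0.1
VPN_LATENCY_MS_MAX = 150.0
MEMORY_USED_PCT_MAX = 90.0


def triage(metrics):
    """Return a list of likely problem areas for one endpoint's metrics."""
    findings = []
    if metrics.get("gateway_loss_pct", 0.0) > GATEWAY_LOSS_PCT_MAX:
        findings.append("local network: packet loss to the default gateway")
    if metrics.get("vpn_latency_ms", 0.0) > VPN_LATENCY_MS_MAX:
        findings.append("vpn: high latency through the tunnel")
    if metrics.get("memory_used_pct", 0.0) > MEMORY_USED_PCT_MAX:
        findings.append("device: memory pressure on the PC")
    return findings or ["no obvious problem in these metrics"]


# Scenario 1 numbers from above: memory 63% used, gateway loss 0.4%.
print(triage({"memory_used_pct": 63.0, "gateway_loss_pct": 0.4}))
```

With the Scenario 1 numbers, only the gateway-loss check fires, which matches the conclusion reached by eye.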
Another Potential Use Case
Residential ISPs are not always known for their reliability. Is a residential ISP having problems that impact a group of users? ThousandEyes can export the data to non-ThousandEyes users to show the problem, so you can export it, give it to the ISP, and ask for a repair.
Conclusion
ThousandEyes can greatly decrease the time-to-detection, time-to-root-cause, and time-to-resolution for remote users, saving a great deal of IT effort as well as reducing lost productivity for the remote employee.
In both my personal education and in work projects, there's been a slow but steady move into network automation. This document is written from the angle of a network engineer, so it approaches the topic as a move from the CLI to a true programmatic interface in an efficient manner.
What you can expect to gain from reading:
The "cliff notes" version of RESTCONF
The "cliff notes" version of YANG
The "cliff notes" version of the pyang tool
Basic use of Postman
A quick & dirty way to implement working RESTCONF on a Cisco device
An elegant way to implement RESTCONF on a Cisco device
What you should not expect:
Any Python (or any other programming language) education. There are countless trainings for Python elsewhere on the web.
A deep dive of REST. This article assumes the reader has familiarity already.
Much detail on NETCONF. While a lot of the information with RESTCONF overlaps with NETCONF (as RESTCONF's origin technology), I chose to focus on RESTCONF due to almost all APIs being REST-based now.
A thorough explanation of YANG. While researching this article, I read some unbelievably good deep-dives of YANG, but this article is about shifting from CLI to RESTCONF, and only a mid-level understanding of YANG is needed.
Some things you’ll need
Postman
A Linux machine or VM
With that said, what's NETCONF?
Although just recently gaining traction, NETCONF has actually been around quite a long time: the RFC was published in 2006. NETCONF is an XML-based interface to configure and monitor network devices. One of the primary drivers for NETCONF is to augment SNMP. SNMP was originally meant to be both read and write, but the "write" element never gained wide adoption, primarily because of the difficulty in navigating MIBs to figure out how to trigger the appropriate outcome. NETCONF typically works over an SSH session to TCP port 830. NETCONF can be informally thought of as SNMPv4.
What is YANG?
Building off the idea of SNMP, if MIBs are the index for SNMP, then YANG is the index for NETCONF. That oversimplifies YANG, however, which is a very deep topic indeed. YANG is a hierarchical, tree-structured language that defines, in a readable format, the generalized models required to configure a network. Understanding YANG at a high level is necessary to use NETCONF.
Interesting note: YANG stands for "Yet Another Next Generation". It's a strange name if you don't know the origin. The competing technology was SNMP-based. SNMP uses SMI as its back-end data structure, and before YANG was created, SMI Next Generation (SMIng) was being developed. Reference RFC 3780: https://tools.ietf.org/html/rfc3780. When Yet Another format was created, it was called YANG.
So, what's RESTCONF?
RESTCONF swaps the SSH session that NETCONF uses for a REST-based API. The YANG models used are identical between NETCONF and RESTCONF. An easy way to think of RESTCONF is as a web API on top of the long-standing NETCONF framework. Additionally, RESTCONF expands on NETCONF's XML interface by optionally offering JSON as a data format (XML can still be used as well). I personally enjoy using RESTCONF because I'm already familiar with REST APIs, so the interface feels familiar.
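To make the JSON-or-XML choice concrete, here is a minimal Python sketch that builds the HTTP headers for either encoding. The two media type strings are the ones RESTCONF uses; the `restconf_headers` helper and the `MEDIA_TYPES` table are names I made up for illustration.

```python
# Media types used by RESTCONF for YANG-modeled data.
MEDIA_TYPES = {
    "json": "application/yang-data+json",
    "xml": "application/yang-data+xml",
}


def restconf_headers(fmt="json"):
    """Build Accept/Content-Type headers for the chosen encoding."""
    media_type = MEDIA_TYPES[fmt]  # raises KeyError for unknown formats
    return {"Accept": media_type, "Content-Type": media_type}


print(restconf_headers("json"))
print(restconf_headers("xml"))
```

The same YANG model sits behind both encodings; only these headers (and the shape of the body) change.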
What else is different between NETCONF and RESTCONF?
NETCONF technically has a few more functional benefits than RESTCONF. The most obvious is that streaming telemetry (example: polling the CPU utilization every X seconds) requires a session to stay open. That's possible with an SSH session, but with REST, every command is transactional and there is no session to keep that kind of data flowing. There are a few other benefits which are beyond the scope of this document.
So why would I want to use either of these?
The main use case is fairly obvious. If you are managing hundreds of devices, making decision-based changes (if X happens, then do Y) is prohibitively slow when it means manually SSHing into every device, determining what needs to change, and then making the change. A well-written script and an API can do in minutes what a human would take hours to perform, at the cost of nearly zero man-hours.
Another, more advanced use case is infrastructure-as-code. This is the idea that intent should define the network configuration, which is then deployed via software. This is beyond the scope of this document.
Scripting against the CLI certainly can be done, but think of using NETCONF/RESTCONF as the "next level". The CLI was written for humans to interpret. Imagine the output from "show ip bgp neighbor": easy for you to read as a human, but try to parse that with automation. It can be done, but it's very clunky. Or, imagine trying to dynamically configure an extended access-list with CLI commands, with a computer making the decisions. It works, but it's clunky. The idea behind NETCONF/RESTCONF + YANG is to take those same tasks and make them computer readable/writable, instead of human readable/writable.
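To illustrate the difference, here is a small Python sketch that answers the same question ("which interfaces are down?") two ways. Both inputs are made up: the text is only in the style of CLI output, not captured from a device, and the JSON keys are hypothetical rather than taken from a real YANG model.

```python
import json
import re

# Made-up, simplified text in the style of CLI output (not captured from a device).
cli_text = "GigabitEthernet1   192.0.2.1   up\nLoopback0   192.0.2.2   down\n"

# Made-up structured equivalent (hypothetical keys, not a real YANG model).
json_text = (
    '{"interfaces": ['
    '{"name": "GigabitEthernet1", "status": "up"}, '
    '{"name": "Loopback0", "status": "down"}]}'
)


def down_from_cli(text):
    """Screen-scrape: depends on column order and spacing staying the same."""
    rows = re.findall(r"^(\S+)\s+\S+\s+(\S+)$", text, flags=re.MULTILINE)
    return [name for name, status in rows if status == "down"]


def down_from_json(text):
    """Structured data: address fields by name, no pattern matching needed."""
    data = json.loads(text)
    return [item["name"] for item in data["interfaces"] if item["status"] == "down"]


print(down_from_cli(cli_text))
print(down_from_json(json_text))
```

Both functions return the same answer here, but the regex breaks as soon as a column is added or reordered, while the JSON version keeps working as long as the field names stay the same.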
YANG in just a little more detail
We're going to come at these topics in little bits, and the next step requires understanding YANG just a little bit, so that we can give some simple RESTCONF examples.
Some quick intro knowledge is that there are several different creators of YANG models. The first, and from my understanding the original, is the IETF. The IETF's goals are idealistic: create a series of models that work with all manufacturers of network equipment. You could re-use the same code against Cisco, Juniper, Arista, etc., and end up with the same outcome on all of them. Sounds great, right? The problem becomes apparent the more you work with programmatic models: vendors just "do things differently", and even though networking is generally standardized, the way things are handled inside a router is completely different. An obvious example is that you'll never see an EIGRP or PFR IETF YANG model.
CALLOUT: Another vendor-neutral model is from OpenConfig. It has similar goals to the IETF models but is backed by a group of manufacturers instead of the IETF: https://www.openconfig.net/projects/models/
Next are the native models. As illustrated above, no matter how good an industry standard model is, it's not going to cover anything vendor-specific (and many things that aren't vendor-specific). I've not looked at any other vendor besides Cisco, but the Cisco native models are very extensive, complex, and can basically perform any router task you'd like. Side note: it's my understanding that the vendor-neutral models are translated into the Cisco native models before processing, but I have no specific way of showing this.
Letās get some basic samples going
I've never cared for reading learning material that doesn't let you get your hands dirty until all the "learning" is done. So, before I go on any longer, let's get this thing rolling.
You're going to need a sample IOS-XE device. I strongly recommend a CSR1K, as it exhibits some different behavior than physical routers. I'm using v17.2.1, for reference. I'll explain more on that different behavior later in the article.
Why Postman? While it does far more than I'm going to write about here, it takes the "code writing" complexity out of testing an API. Writing code (presumably Python) adds a layer of complexity in dealing with data formats and logic. As mentioned at the beginning of the article, this isn't about teaching how to program, it's about teaching practical RESTCONF. Postman allows you to interact with a REST API without writing any code.
Assuming you have those things running, let's make RESTCONF do something.
Prepping your router is very straightforward.
First, since we'll be using TLS, you need an encryption key:
csr1k#crypto key generate rsa
Then you'll need to enable the secure HTTP server and set up local authentication:
csr1k#conf t
Enter configuration commands, one per line. End with CNTL/Z.
csr1k(config)#ip http secure-server
csr1k(config)#ip http authentication local
After that, enable RESTCONF:
csr1k(config)#restconf
You'll also need a local user with privilege level 15:
csr1k(config)#username cisco priv 15 secret cisco123
Now, let's load up Postman and see if we can get RESTCONF to do something.
After you've downloaded and signed into Postman, you should get a page that looks something like mine:
The next page will look like this. Be sure to select the GET method as you see below.
Next, click on Authorization, change the type to "Basic Auth", and put the username and password you created into the Username and Password fields.
Press the Send button in the upper-right.
If you configured the router correctly, the response field should look like this:
NOTE: There's nothing too useful here other than confirmation that RESTCONF is working. Note the output is in XML. If you prefer to get it back in JSON, make the changes in the following steps.
Click on the Headers tab:
Once here, uncheck the default "Accept" header:
Create a new Accept header at the bottom specifying application/yang-data+json:
Press "Send" again, and the output should now return in JSON:
I'll proceed with using JSON from here on out of personal preference.
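For anyone curious what Postman is doing with those settings, here is a minimal Python sketch that builds the same kind of request with the standard library. It only constructs the request and does not send it. The URL is a placeholder (a documentation address, not my lab), and `build_get` is a name I made up for illustration.

```python
import base64
import urllib.request


def build_get(url, username, password):
    """Build (but do not send) a GET with Basic Auth and a JSON Accept header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    # "Basic Auth" is just the base64 of "username:password" in a header.
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/yang-data+json")
    return request


# Placeholder URL; substitute the address and path of your own router.
req = build_get("https://192.0.2.1/restconf/", "cisco", "cisco123")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Actually sending this would take a call such as urllib.request.urlopen, plus some handling for the router's self-signed certificate, which is left out here on purpose.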
Expanding upon the idea
Now that we've confirmed that RESTCONF is running on the router and shown how to change to JSON output, let's do a few more simple interactions to show what we're trying to accomplish here.
I want to specifically call out that my next examples are on a CSR1K. I have found the GET results, on both IETF and Cisco native models, to be considerably different between virtual platforms and physical platforms. So, if you want to replicate my results, be sure you're on the CSR1K. Again, I'm using v17.2.1. I'll show more on this later.
First, perform a GET on: https://10.200.200.100/restconf/data/ietf-interfaces:interfaces/interface=GigabitEthernet1
Since I've preconfigured my GigabitEthernet1, we get back some configuration details:
Let's break down what we asked for in the GET: https://10.200.200.100/restconf/data/ietf-interfaces:interfaces/interface=GigabitEthernet1
10.200.200.100 = The hostname (here, the router's IP address)
/restconf/data/ = The path used for RESTCONF config data (it differs for RPCs, more below)
/ietf-interfaces = We're using the ietf-interfaces YANG module (more on YANG modules below)
:interfaces = Specifying the "interfaces" container inside ietf-interfaces (more on containers below)
/interface = Specifying the list "interface"
=GigabitEthernet1 = For the list "interface", the key is the string "name", and the name is GigabitEthernet1
If that seems like a lot to absorb, I'll break it all down in greater detail later in the article.
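The breakdown above can be sketched as a small Python function that assembles the URL from its pieces. The `restconf_url` helper and its parameter names are hypothetical. The percent-encoding step is there because interface names on physical routers often contain a "/", which would otherwise be read as a path separator.

```python
from urllib.parse import quote


def restconf_url(host, module, container, list_name=None, key=None):
    """Assemble a RESTCONF data URL from its YANG pieces."""
    url = f"https://{host}/restconf/data/{module}:{container}"
    if list_name is not None:
        url += f"/{list_name}"
        if key is not None:
            # Percent-encode the key so names containing "/" stay one path segment.
            url += "=" + quote(key, safe="")
    return url


print(restconf_url("10.200.200.100", "ietf-interfaces", "interfaces",
                   "interface", "GigabitEthernet1"))
print(restconf_url("10.200.200.100", "ietf-interfaces", "interfaces",
                   "interface", "GigabitEthernet0/0/0"))
```

The first call reproduces the URL used in the GET above; the second shows how a slash in the key is escaped.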
Thus far we've focused on using GET. Now let's change the IP address using PUT.
In this case, we're going to re-use a lot of what we just did (authentication, URL, etc.), so duplicating the tab in Postman is the easiest way to create a clone of what we just built.
Right-click on your current tab and press "Duplicate Tab":
On the new tab, change your GET to a PUT:
As I mentioned, this isn't meant to serve as a REST tutorial, but in short: GET retrieves data, POST creates new data, and PUT is used for modifying existing data. Strictly speaking, PUT replaces the whole target resource, which is why we send back the complete object rather than just the changed field.
We'll also need to go and modify the headers so that we're sending JSON.
Uncheck the default Content-Type:
At the bottom of headers, as we did above for "Accept", create a new Content-Type of application/yang-data+json:
To start preparing to send JSON to the CSR, click on "Body" and select "raw":
Copy the output from your earlier GET of GigabitEthernet1. Building off this example, I've grabbed its JSON contents and modified one field: the fourth octet of the IP address, from .102 to .103. I've also enabled the interface. This is what I pasted into the Body field:
Press Send again, and you should get:
You can check your work by running the GET from your prior tab again, or you can just log in to the router and look:
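If you would rather generate a body like this than edit it by hand, here is a minimal Python sketch. The field names follow the standard ietf-interfaces and ietf-ip models as I understand them, so compare the result against your own GET output before trusting it. The `interface_body` helper is hypothetical, and 192.0.2.103 is a placeholder documentation address.

```python
import json


def interface_body(name, if_type, ip, netmask, enabled=True):
    """Build a JSON body shaped like the ietf-interfaces/ietf-ip models."""
    return {
        "ietf-interfaces:interface": {
            "name": name,
            "type": if_type,
            "enabled": enabled,
            "ietf-ip:ipv4": {"address": [{"ip": ip, "netmask": netmask}]},
        }
    }


# 192.0.2.103 is a placeholder documentation address; use your own.
body = interface_body("GigabitEthernet1", "iana-if-type:ethernetCsmacd",
                      "192.0.2.103", "255.255.255.0")
print(json.dumps(body, indent=2))
```

Because PUT replaces the whole interface entry, the body carries every field you want to keep, not only the address that changed.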
Let's also go ahead and create some data. Clearly you can't create a physical interface, but you can certainly make a logical one. Let's craft a new Loopback.
Duplicate your tab again. Change PUT to POST, and remove the remainder of the URL after ietf-interfaces:interfaces. In the body, change the name to Loopback followed by a number of your choosing, change type to softwareLoopback, change the IP address to something that doesn't overlap with other interfaces, and (optionally) change your netmask to a /32.
Press Send.
I think this example speaks for itself, apart from why we trimmed the URL. We can't POST to a list entry (an interface, in this case) that doesn't exist yet, so we POST to its parent instead. I'll show more examples of this as we proceed.
The last HTTP verb to demonstrate is DELETE. Let's wipe out that Loopback we just created. Duplicate your tab again.
Change the POST to DELETE. Add the list back in at the end of our URL: https://your-ip-address/restconf/data/ietf-interfaces:interfaces/interface=Loopback1001
Something to note: The body is irrelevant in this type of request. Since we duplicated the tab, we inherited the body from the POST. You can leave it there or erase it. It doesn't matter.
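The POST and DELETE steps differ mainly in which URL they target and whether a body is sent. Here is a minimal Python sketch of that difference. It only builds the requests and does not send them, and it leaves out authentication for brevity. The `build_request` helper and the loopback address are hypothetical.

```python
import json
import urllib.request

BASE = "https://10.200.200.100/restconf/data/ietf-interfaces:interfaces"


def build_request(method, url, body=None):
    """Build (but do not send) a RESTCONF request; DELETE needs no body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Accept", "application/yang-data+json")
    if data is not None:
        request.add_header("Content-Type", "application/yang-data+json")
    return request


# Hypothetical loopback details; pick an address that does not overlap yours.
loopback = {"ietf-interfaces:interface": {
    "name": "Loopback1001",
    "type": "iana-if-type:softwareLoopback",
    "enabled": True,
    "ietf-ip:ipv4": {"address": [{"ip": "192.0.2.1", "netmask": "255.255.255.255"}]},
}}

create = build_request("POST", BASE, loopback)                      # parent container
delete = build_request("DELETE", BASE + "/interface=Loopback1001")  # the list entry
print(create.get_method(), create.full_url)
print(delete.get_method(), delete.full_url)
```

Note how the POST goes to the parent (the entry does not exist yet), while the DELETE names the specific entry and carries no body at all.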
Additionally: The debugs on the router are near useless. When I first started on this topic, I was hoping for a translation of RESTCONF into CLI to show what was actually going on behind the scenes, but no such luck.
Debugs are turned on with:
csr1k#debug restconf level debug
The output from creating a Loopback looks like this (I have trimmed it slightly for brevity and privacy):
%DMI-5-AUTH_PASSED: R0/0: dmiauthd: User 'cisco' authenticated successfully from <myIP> and was authorized for rest over http. External groups: PRIV15
%SYS-5-CONFIG_P: Configured programmatically by process iosp_vty_100001_dmi_syncfd_fd_179 from console as NETCONF on vty63
%DMI-5-CONFIG_I: R0/0: dmiauthd: Configured from NETCONF/RESTCONF by cisco, transaction-id 189
So basically, the debug shows that I logged in using an API and made a change... but no real details.
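The one thing these messages are good for is an audit trail: who changed something, and under which transaction ID. As a rough sketch, the Python below pulls those fields out of the CONFIG_I line shown above. The regular expression is my own guess based on that single sample line, so treat it as an assumption rather than a documented log format.

```python
import re

LOG_LINE = ("%DMI-5-CONFIG_I: R0/0: dmiauthd: Configured from NETCONF/RESTCONF "
            "by cisco, transaction-id 189")

# Pattern inferred from the single sample line above; not a documented format.
PATTERN = re.compile(
    r"%DMI-5-CONFIG_I: .*Configured from (\S+) by (\S+), transaction-id (\d+)"
)


def parse_config_event(line):
    """Return (source, user, transaction_id) or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    source, user, txn = match.groups()
    return source, user, int(txn)


print(parse_config_event(LOG_LINE))
```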
Now you've seen the basics of retrieving data, changing data, creating data, and deleting data. This is the easy part. Next, the real challenge begins: figuring out how to craft the body without having internet examples.
So, isn't there some documentation? Well...
Well, there is none.
This hasn't changed in the last five years. For writing code around RESTCONF, you're on your own. Instead of documentation, you need to develop strategies for working out how to build the body. There are two strategies that I've used: one lacks finesse but is very fast, and the other is more likely what the YANG developers intended, but takes some patience and a deeper understanding of YANG.
Let's start with the fast method.
Important Note: For some preliminary understanding, it's not possible to configure the router completely with the IETF models or OpenConfig models. However, the Cisco native models have a representation of all standard configuration. So we're going to switch from the IETF example above to the Cisco native models.
When I first started working with RESTCONF, I found myself looking for the equivalent of snmpwalk for RESTCONF. The question I asked myself was "How do I index this thing?"
My natural tendency was to perform a GET at the highest URL "level":
Think of this as the RESTCONF version of "show running-config".
The response was "204 No Content".
This threw me off for quite a while until, on a lark, I tried it on a CSR1K:
As you can see, it works fine on a CSR, but not on an ISR. I would love an explanation if anyone knows why this is. I couldn't find any information on it. Note, I did try multiple ISRs.
For brevity, I couldn't show the entire config here, so I've just shown another relevant snippet below:
As an example, let's create a banner on the CSR:
csr1k#conf t
Enter configuration commands, one per line. End with CNTL/Z.
csr1k(config)#banner exec 1 Restconf Banner 1
I deliberately picked "banner" as it's towards the top of the config, and makes the example easier in screenshots.
Getting the JSON down just takes some practice, but the body looks like this:
And the proof can be seen from the CLI or from another GET:
csr1k(config)#do sh run | s banner exec
banner exec ^C NEW Restconf Banner ^C
As I mentioned, this is quick, dirty, and inelegant. Having to build all your config to understand how to address it in the API just isn't a clean method. The elegant way is to become familiar enough with the YANG files to be able to interpret them as a form of self-documentation.
You Mentioned a More Elegant Method?
In order to go further with this, we need the YANG files. Since we're also going to be using a tool that I'll be running on Linux, you'll need a Linux box or VM from here on in.
All the YANG models are available for download via GitHub. One of the cool things about this is that even the vendor native models are on GitHub, so you get all the relevant YANG files in one shot!
For illustration purposes, I'm going to swap back to the IETF models for now, as they're not as daunting to read as the Cisco native ones.
jeff@linuxlab:~$ cd yang/vendor/cisco/xe/1721
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$
Of Note: While I'm demoing on XE, there are XR and NX-OS models in the same folder structure.
Referencing our prior example above: https://10.200.200.100/restconf/data/ietf-interfaces:interfaces/interface=GigabitEthernet1
Let's take a look in ietf-interfaces and try to gain some basic understanding. As a reminder from the top of the blog, I am not intending to teach YANG thoroughly, but to give enough understanding that you can take the information and interface with RESTCONF efficiently.
Pop open ietf-interfaces.yang in your favorite text editor:
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$ vi ietf-interfaces.yang
ietf-interfaces.yang is one of the smallest "major" YANG files, but it's still 725 lines long. It is considerably more readable than SNMP MIBs are, but it's a lot to digest. I struggled to find a way to illustrate this without bloating the blog... and didn't come up with anything. So seriously, pop these files open and take a look. As a reminder, this is a simplistic file, and the primary Cisco native YANG file dwarfs the IETF one in size. We'll come back to the solution to this shortly.
As I mentioned above, the files are laid out in a tree. I'm going to pick out key bits of the file to show how this works.
Let's start by trying to figure out the URL we used earlier: https://10.200.200.100/restconf/data/ietf-interfaces:interfaces/interface=GigabitEthernet1
You'll note the first line in the file defines the module name:
module ietf-interfaces {
Scrolling down a bit, we'll find the interfaces container:
As mentioned, /hostname/restconf/data is in every RESTCONF URL on IOS-XE.
The important bits are after that: ietf-interfaces:interfaces/interface=GigabitEthernet1.
Let’s pause and talk about data types for a moment
These are definitions to be familiar with for the purpose of this article. Note, this is not exhaustive; it's just the bits needed to get through the common RESTCONF use cases.
Container: Contains other node types, including other containers. This is basically just a logical grouping.
List: Contains a sequence of list entries, each of which is uniquely identified by one or more leafs. The unique identifier is the Key, defined in the list.
Leaf: Contains a single value (leafs are the end of the tree).
Leaf-List: Contains a sequence of leaf nodes.
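When the data comes back as JSON, these four node types show up as familiar shapes: a container is an object, a list is an array of objects, a leaf is a single value, and a leaf-list is an array of single values. The Python sketch below guesses the node type from the shape of a value. The sample data and the `classify` helper are made up for illustration, and the guess is only a heuristic (an empty array, for instance, could be either kind of list).

```python
def classify(value):
    """Guess the YANG node type from how a value looks in RESTCONF JSON."""
    if isinstance(value, dict):
        return "container"            # or one entry of a list
    if isinstance(value, list):
        if value and all(isinstance(item, dict) for item in value):
            return "list"
        return "leaf-list"
    return "leaf"


# Hypothetical data shaped like a RESTCONF JSON reply.
sample = {
    "interfaces": {                                    # container
        "interface": [                                 # list, keyed by "name"
            {"name": "GigabitEthernet1", "enabled": True}
        ],
    },
    "search-domains": ["example.com", "example.net"],  # leaf-list (made-up name)
}

print(classify(sample["interfaces"]))
print(classify(sample["interfaces"]["interface"]))
print(classify(sample["interfaces"]["interface"][0]["name"]))
print(classify(sample["search-domains"]))
```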
Comparing this back to our earlier example:
I'll show this in a better visual when we get to demoing pyang. This probably doesn't seem too complicated just yet, but if you're looking closely, there were a lot more IETF files.
Here's a first major point of understanding: The files are not standalone. They work as a group.
Reference back to our first IETF example:
Go back to the text editor with the ietf-interfaces.yang file and search for "ipv4":
I can assure you we're viewing the right top-level file in ietf-interfaces.yang, but there's no mention of IP addressing. This is where YANG gets trickier to decipher.
The YANG model we're looking for is actually in ietf-ip.yang.
Let's take a look inside ietf-ip.yang:
augment "/if:interfaces/if:interface" {
description
"Parameters for configuring IP on interfaces.
If an interface is not capable of running IP, the server
must not allow the client to configure these parameters.";
container ipv4 {
So the container for ipv4 is in a separate file from ietf-interfaces, even though it augments it. The potentially confusing matter here is that the augmenting file (ietf-ip.yang) refers back to the augmented file (ietf-interfaces.yang).
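One way to picture an augment is as grafting a branch from one file onto a specific point in another file's tree. The Python sketch below models that with plain dictionaries. It is a toy illustration, not how a YANG compiler works: the trees are heavily simplified stand-ins for the two modules, and `apply_augment` is a made-up helper.

```python
def apply_augment(tree, target_path, new_nodes):
    """Graft new_nodes into tree at target_path, as a YANG augment does."""
    node = tree
    for name in target_path:
        node = node[name]        # KeyError if the augmented path is missing
    node.update(new_nodes)
    return tree


# Heavily simplified stand-ins for the two modules (most nodes omitted).
interfaces_tree = {
    "interfaces": {"interface": {"name": "leaf", "type": "leaf", "enabled": "leaf"}}
}
ip_augment = {"ipv4": {"address": {"ip": "leaf", "netmask": "leaf"}}}

# Mirrors: augment "/if:interfaces/if:interface" { container ipv4 { ... } }
apply_augment(interfaces_tree, ["interfaces", "interface"], ip_augment)
print(sorted(interfaces_tree["interfaces"]["interface"]))
```

After the graft, ipv4 sits under interface alongside name, type, and enabled, even though it was defined in a different file.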
Letās demonstrate
Run this GET in Postman: https://10.200.200.100/restconf/data/ietf-interfaces:interfaces/interface=GigabitEthernet1/ipv4/address
This is the same URL weāve been using for our example, but with /ipv4/address at the end.
Youāll get this more-specific subset of the body:
With ietf-ip.yang augmenting ietf-interfaces.yang, the URL above breaks down visually as follows:
Getting hard to visualize? Hopefully you're following along in the actual files. The IETF files are some of the easiest to interpret via plain text, yet even they show how complex YANG can be to read that way.
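The URL breakdown can also be written down as code. This is a minimal sketch in Python, assuming the path is supplied as a sequence of node names and (list, key) pairs. The helper name restconf_url is my own invention, not part of any library. Keys are percent-encoded, which matters for interface names that contain a slash.

```python
from urllib.parse import quote


def restconf_url(host, *segments):
    """Build a RESTCONF data URL from path segments.

    A segment is either a node name ("ipv4") or a (list_name, key) tuple,
    which becomes "list_name=key" with the key percent-encoded.
    """
    parts = []
    for seg in segments:
        if isinstance(seg, tuple):
            name, key = seg
            parts.append(f"{name}={quote(str(key), safe='')}")
        else:
            parts.append(seg)
    return f"https://{host}/restconf/data/" + "/".join(parts)


url = restconf_url(
    "10.200.200.100",
    "ietf-interfaces:interfaces",       # top-level container from ietf-interfaces
    ("interface", "GigabitEthernet1"),  # list entry, selected by its key
    "ipv4",                             # container added by the ietf-ip augment
    "address",                          # list inside that container
)
print(url)
```

With these inputs the function reproduces the URL used in the Postman example.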
Introducing pyang
NOTE: It's worth mentioning that Cisco has tools available that are potentially more powerful for these particular operations than pyang is.
Yang Explorer is end-of-support (it was Flash based), and I have not tried installing it. Yang Suite is brand new, as in it launched while I was typing this document. I attended the kick-off. It looks rather impressive, and according to the webinar, it apparently sorts out the confusion around augments. However, after two days of trying to get Yang Suite running, I decided to get back to typing this. If you have the time to figure it out, Yang Suite is potentially a better tool for this operation than pyang.
With that covered, back to pyang.
As I mentioned above, pyang only runs in Linux, so back to your Linux box!
Installation varies slightly from Linux distro to distro, but the basics are simple:
jeff@linuxlab:~$ pip install pyang
pyang does more than I'm going to cover here, but what we basically want it for is to summarize YANG files in tree format (as well as help with augments...).
Our initial usage of pyang will be:
pyang -f tree <file1.yang> <file2.yang> ... <fileX.yang>
Run this against ietf-interfaces.yang
Now we can easily conceptualize the YANG module in a tree:
That sure simplifies reading a large YANG file, but it doesn't get us the IP address information that we noted above is missing. This requires a little bit of interpretative work.
I have already pointed it out, but it's pretty obvious from the file structure that IP address information would be inside ietf-ip.yang. Now we just need to see them both in the same tree. Note that I've asked pyang to create a tree for both ietf-interfaces.yang and ietf-ip.yang simultaneously.
One benefit is that pyang is smart enough to process the augment in ietf-ip and insert it into the correct spot in the ietf-interfaces tree. Compare this to the prior screenshot of pyang that didn't have the ipv4 tree information in it.
Now it's much easier to figure out the needed URL: https://10.200.200.100/restconf/data/ietf-interfaces:interfaces/interface=GigabitEthernet1/ipv4/address
That's an easy way to show some simple usage. Where pyang (or a similar tool) is absolutely needed is when it comes to the Cisco native YANG data. For reference, all the Cisco-supported IETF YANG files combined are less than 14,000 lines. However, on 17.2.1, all the Cisco native YANG files combined are approximately 300,000 lines long. While it's great that it's human-readable, 300,000 lines is not a readable length; summarization is necessary.
Let's take a quick look at the Cisco-IOS-XE-native.yang file with pyang:
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$ pyang -f tree Cisco-IOS-XE-native.yang
This looks great at first glance, but if you run the same command in your lab, you'll find that the tree index alone for just Cisco-IOS-XE-native.yang is 34,709 lines long (just shy of three times the size of all the plaintext data from the IETF files combined!). As noted above, this doesn't include any of the other augmenting files, which are absolutely necessary to do most functions.
We need to narrow this down further before we start adding in more files.
This is where the tree-depth argument comes in handy:
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$ pyang -f tree Cisco-IOS-XE-native.yang --tree-depth=2
Tree-depth limits how deep the tree is displayed. When you're searching for a starting point in building a RESTCONF request, it's not necessary to have all the various containers, lists, and leaves displayed; a high-level view of where to begin is what you're after. A tree depth of 2 is a little small to be useful, but it made for a better screenshot.
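If you end up running pyang over and over with different files and depths, the command line is easy to assemble from a script. This is a small Python sketch, assuming pyang is on the PATH. The helper name pyang_tree_command is hypothetical, and the block only builds and prints the command rather than running it.

```python
import shlex


def pyang_tree_command(yang_files, depth=None):
    """Build the argument list for `pyang -f tree`, optionally depth-limited."""
    cmd = ["pyang", "-f", "tree", *yang_files]
    if depth is not None:
        cmd.append(f"--tree-depth={depth}")
    return cmd


cmd = pyang_tree_command(["Cisco-IOS-XE-native.yang"], depth=2)
print(shlex.join(cmd))
# To actually run it (requires pyang installed and the YANG files present):
#   import subprocess
#   tree = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Keeping the arguments as a list avoids shell quoting problems once file names or paths get longer.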
Much like the IETF YANG files, there are quite a lot of additional Cisco YANG files augmenting the Cisco-IOS-XE-native module. On IOS-XE 17.2.1, there are 306 of them! Let's start by trying to find BGP.
The logical place to start would be to see if it's included natively (no pun intended) inside the main module. We'll want to start redirecting the output to a file to make this manageable.
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$ pyang -f tree Cisco-IOS-XE-native.yang --tree-depth=3 > native.out
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$ vi native.out
Search for "bgp":
Let's take a look at the other Cisco native YANG files in the directory, filtering for the word "bgp" in the file names:
The correct file is fairly obvious: Cisco-IOS-XE-bgp.yang.
Let's add it in to our pyang tree:
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$ pyang -f tree Cisco-IOS-XE-native.yang Cisco-IOS-XE-bgp.yang --tree-depth=3 > native.out
jeff@linuxlab:~/yang/vendor/cisco/xe/1721$ vi native.out
Searching for "bgp" produces several hits, but having a working knowledge of networking, and a basic understanding of YANG, makes the correct one obvious:
This requires scrolling up a bit to figure out the tree leading up to router, and frankly, you should be pulling the files out to Notepad++ or a similar tool to make following a large tree easier.
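The scrolling can also be handed to a short script. Below is a rough Python sketch that reads pyang tree output and prints the full path of every node whose name contains a search term. It assumes the usual tree layout, where each node line has a "+--" marker followed by flags and a name and deeper nodes are indented further. The sample text is a hand-written stand-in, not real output from the Cisco files, where node names may also carry a module prefix.

```python
import re

# Matches pyang tree node lines such as "  |  +--rw interface* [name]".
NODE = re.compile(r"^(?P<indent>[ |]*)\+--(?P<flags>\S+)\s+(?P<name>[^\s(]\S*)")


def find_paths(tree_text, needle):
    """Yield the slash-separated path of every node whose name contains needle."""
    stack = []  # (column, name) for each ancestor of the current line
    for line in tree_text.splitlines():
        m = NODE.match(line)
        if not m:
            continue  # module header, choice/case markers, blank lines
        col = len(m.group("indent"))
        name = m.group("name").rstrip("?*!")  # drop optional/list/presence marks
        while stack and stack[-1][0] >= col:
            stack.pop()  # leave any subtree that ended before this line
        stack.append((col, name))
        if needle in name:
            yield "/" + "/".join(n for _, n in stack)


sample = """\
module: example
  +--rw native
     +--rw hostname?   string
     +--rw router
     |  +--rw bgp* [id]
     |     +--rw id          uint32
     |     +--rw neighbor* [id]
     +--rw username* [name]
"""
for path in find_paths(sample, "bgp"):
    print(path)
```

Run against a file like native.out, the same function would list every candidate path in one pass, leaving only the choice of the right one to networking knowledge.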
So, if I'm crafting a URL for this, I would use: https://10.200.200.100/restconf/data/native/router/bgp
Note the small trick there: Cisco-IOS-XE-native:native can be abbreviated as just "native".
Let's say our goal is to turn up the BGP process and add a neighbor. We still need to know more than what we have, because ideally, we should be able to build the full PUT or POST straight off the YANG data and our own pre-existing network know-how. What we want is a deeper view of the tree starting at that one location.
Introducing --tree-path:
pyang -f tree Cisco-IOS-XE-native.yang Cisco-IOS-XE-bgp.yang --tree-path /native/router/bgp --tree-depth=5
Inspecting the outcome from the data, we can find the next key elements:
Further down the output, we find how to create neighbors:
Note the "201 Created". Double-checking our work at the command line:
csr1k#sh run | s router bgp
router bgp 100
bgp log-neighbor-changes
neighbor 4.4.4.4 remote-as 101
neighbor 5.5.5.5 remote-as 102
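A request body like this can be generated instead of typed by hand. Treat the following Python as a sketch under assumptions: the node names ("Cisco-IOS-XE-bgp:bgp", "id", "neighbor", "remote-as") are my reading of the BGP model and should be checked against your own pyang tree output, and the helper name bgp_payload is hypothetical.

```python
import json


def bgp_payload(asn, neighbors):
    """Build a JSON body for one BGP process and its neighbors.

    neighbors maps neighbor address -> remote AS number.
    Node names are assumptions; confirm them against the pyang tree.
    """
    return {
        "Cisco-IOS-XE-bgp:bgp": [           # list keyed by "id" (the local AS)
            {
                "id": asn,
                "neighbor": [               # list keyed by "id" (neighbor address)
                    {"id": addr, "remote-as": remote_as}
                    for addr, remote_as in neighbors.items()
                ],
            }
        ]
    }


body = bgp_payload(100, {"4.4.4.4": 101, "5.5.5.5": 102})
print(json.dumps(body, indent=2))
```

Each key in the input dictionary becomes one entry in the neighbor list, which lines up with the two neighbor statements in the running config.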
Some final thoughts…
What’s up with lists?
I referred to lists throughout the document without really covering why they exist. The BGP example is a good use case. Each BGP neighbor, and all the config associated with it, is an entry in a list. An entry in a list is usually not a 1:1 match with a single line of IOS configuration, though sometimes it is.
That's two entries in the list "username". The key to the list is "name", which must be unique, so that each entry can be independently referenced, modified, or deleted.
Each entry equals one line of configuration in IOS:
csr1k#sh run | s username
username admin1 privilege 15 secret 9 <omitted>
username admin2 privilege 15 secret 9 <omitted>
The BGP example is also a good one, because there a single list entry can create more than one line of IOS configuration. Let's say on neighbor 5.5.5.5 we also wanted to enable ebgp-multihop.
The POST would've looked like this:
Now in the user example, one list entry = one line of IOS config. However, in this example, one list entry = multiple lines of config:
csr1k(config)#do sh run | s router bgp
router bgp 100
bgp log-neighbor-changes
neighbor 4.4.4.4 remote-as 101
neighbor 5.5.5.5 remote-as 102
neighbor 5.5.5.5 ebgp-multihop 255
This takes a little practice to wrap your head around, but it's really not too bad. Going back to my original statement that the CLI was built for humans and APIs are built for code, it really makes a lot of sense.
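To see the one-entry-to-many-lines relationship from the other direction, here is a toy Python function that renders a single neighbor entry as the IOS lines it corresponds to. It is purely illustrative. The shape of the entry, in particular "ebgp-multihop" as a container holding a "max-hop" leaf, is an assumption to verify against the pyang tree, and the device does its own translation internally.

```python
def neighbor_to_cli(entry):
    """Render one BGP neighbor list entry as the IOS lines it corresponds to."""
    lines = [f"neighbor {entry['id']} remote-as {entry['remote-as']}"]
    multihop = entry.get("ebgp-multihop")   # assumed container with a max-hop leaf
    if multihop is not None:
        lines.append(f"neighbor {entry['id']} ebgp-multihop {multihop['max-hop']}")
    return lines


entry = {"id": "5.5.5.5", "remote-as": 102, "ebgp-multihop": {"max-hop": 255}}
for line in neighbor_to_cli(entry):
    print(line)
```

One structured entry, two lines of CLI: the list entry groups everything about the neighbor, while the CLI spreads it over as many lines as there are settings.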
Read-Only vs Read-Write
This blog has focused entirely on read-write configuration. There is actually quite a lot of read-only data specified in YANG models that can be retrieved with RESTCONF. Think about a BGP neighbor state, or an interface error count: things you would've perhaps previously monitored with SNMP. All the samples I've pasted above have had an "rw" next to them for read/write, as my focus was on creating configuration, but there's a whole side of this just for programmatically monitoring status.
For a quick example:
If you're looking inside the YANG file itself, this is denoted differently:
container interfaces-state {
  config false;
  description
    "Data nodes for the operational state of interfaces.";
"config false" is what denotes read-only.
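Reading operational data works the same way as reading configuration: it is a GET against the read-only node. The Python sketch below builds such a request with only the standard library. The helper name restconf_get and the credentials are placeholders, and the request is built but not sent, since sending needs a reachable device and suitable TLS settings.

```python
import base64
import urllib.request


def restconf_get(host, path, username, password):
    """Build (but do not send) a RESTCONF GET request for a data resource."""
    url = f"https://{host}/restconf/data/{path}"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Accept": "application/yang-data+json",  # RESTCONF JSON media type
            "Authorization": f"Basic {token}",
        },
    )


req = restconf_get(
    "10.200.200.100", "ietf-interfaces:interfaces-state", "admin", "example"
)
print(req.get_method(), req.full_url)
# Sending it needs a reachable device and a TLS context that trusts its cert:
#   with urllib.request.urlopen(req, context=ssl_context) as resp: ...
```

Because the node is marked config false, only reads make sense here; a PUT or POST against it should be rejected by the device.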
Remote Procedure Calls (RPC)
If you've tested SNMP writes, you've probably seen the example of why never to leave unguarded "write" SNMP access on: you can actually write a value to reboot the router. That's an example of an SNMP-triggered RPC. NETCONF and RESTCONF have their own rich set of RPCs.
A brief introduction can be had by performing a GET on https://your-router-ip/restconf/operations (RPC operations are underneath /restconf/operations, instead of /restconf/data):
For simplicity's sake, let's just demonstrate rebooting the router:
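In script form, invoking an RPC is a POST to /restconf/operations/ followed by the RPC's name, with any input parameters wrapped in an "input" object. The Python sketch below only builds the request. The RPC name "example-module:example-rpc" is a placeholder and not a real Cisco RPC; take the real name from the GET on /restconf/operations. Authentication headers are left out to keep it short.

```python
import json
import urllib.request


def restconf_rpc(host, rpc_name, rpc_input=None):
    """Build (but do not send) a POST that invokes an RPC under /restconf/operations."""
    url = f"https://{host}/restconf/operations/{rpc_name}"
    data = None
    headers = {"Accept": "application/yang-data+json"}
    if rpc_input is not None:
        module = rpc_name.split(":", 1)[0]   # input is qualified by the module name
        data = json.dumps({f"{module}:input": rpc_input}).encode()
        headers["Content-Type"] = "application/yang-data+json"
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Placeholder RPC name; substitute one listed by your own device.
req = restconf_rpc("your-router-ip", "example-module:example-rpc")
print(req.get_method(), req.full_url)
```

Given what an RPC like a reload does, this is worth trying in a lab before pointing it at anything in production.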
In closing, with the increasing use of network automation, it's important to familiarize yourself with RESTCONF and YANG. As shown in this article, you can use the RESTCONF protocol to simplify the management of network configuration and operational data. I've always been a believer in working smarter, not harder. While this article was written as a high-level overview, there are a myriad of resources for a deeper dive into YANG, the pyang tool, and how to implement RESTCONF on Cisco devices.
Cisco's DNA Center appliance is generally talked about in the context of SD-Access (SDA), but SDA is a complex technology that involves significant planning and re-architecture to deploy. DNA Center is not just SDA, though: it has multiple features that can be used on day 1 to cut down on administrative tasks and reduce the likelihood of errors or omissions. From conversations with our customers, the most asked-for capability is software image management and automatic deployment, and that is something DNA Center handles extremely well compared to many other solutions out there.
Wait…I can manage software updates with DNA?
Managing software on network devices can be a substantial time burden, especially in businesses that have significant compliance requirements and need regular software updates. Add to this the increasing size of network device images (pretty much all the major switch and router vendors' products now have image sizes in the hundreds of megabytes up to several gigabytes), and software management can take up a significant chunk of an IT department's time. One of our customers is interested in DNA Center for this specific purpose: with 500+ switches, being able to automate software deployment saves several weeks of engineer time over the course of a year.
That may leave you asking…
So, what devices can I manage?
DNA Center can manage software for any current production Cisco router, switch, or wireless controller. Additionally, some previous-generation hardware is also supported. Of this hardware, the Catalyst 2960X and XR switches as well as the Catalyst 3650/3850 switches are the most commonly used with DNA Center. Now let's talk about how DNA Center does this.
Neat! Now, tell me how to do it.
First, be sure that every device you want to manage is imported into DNA Center. Once that's done, the image repository screen will automatically populate itself with available software image versions by device type.
Here’s an example:
From here, select the device family to see details. Once you've decided on the version you want to use, click on the star icon, and DNAC will mark that as the golden image (aka the image you want to deploy). If not already present on the appliance, the image will also be downloaded.
Next, go to Provision > Network Devices > Inventory to start the update process. From here, select the devices you want to update, then click on Actions > Software Image > Update Image. You'll be given the option to either distribute the new images immediately or to wait until a specific time to start the process. Different upgrade jobs can be configured for different device groups as well.
Here, I've set DNAC to distribute images on Saturday the 19th at 1pm local time for all my sites. This process is just the file copy, so no changes are made to the devices at this time. The file copy process is also tolerant of slow WAN connections, though not poor-quality connections. We've tested this process in our lab and found that it'll happily work even over a 64k connection (though it'll take quite a while). Poor-quality connectivity, however, will cause this process to fail. Finally, once the image is copied to the target devices, a hash check is performed to ensure the image hasn't been corrupted.
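For anyone who has only ever verified images by hand at the CLI, the idea behind that hash check is simple enough to sketch in Python. This illustrates the general technique and makes no claim about what DNA Center does internally. The choice of SHA-512 and the function names are assumptions, and the "image" is a few bytes held in memory so the example can run anywhere.

```python
import hashlib
import io


def stream_digest(stream, algorithm="sha512", chunk_size=1024 * 1024):
    """Hash a binary stream in chunks so large images never sit fully in memory."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def image_is_intact(stream, expected_hex, algorithm="sha512"):
    """Compare the stream's digest with the published value, ignoring case."""
    return stream_digest(stream, algorithm) == expected_hex.strip().lower()


# Stand-in for an image file; a real check would use open(path, "rb").
fake_image = io.BytesIO(b"not a real IOS XE image")
published = hashlib.sha512(b"not a real IOS XE image").hexdigest()
print(image_is_intact(fake_image, published))
```

A single flipped bit anywhere in the file changes the digest, which is why a mismatch after the copy is a reliable sign that the image should not be activated.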
The next step is to activate the image. Activation here means "install the image and reboot the device".
Like the distribution process, DNAC can either install immediately or wait until a scheduled time. Note that for IOS XE devices, this process will do a full install of the image vs. just copying the .bin file over. Once the software activation is complete, the devices will show their status in the inventory screen. As you can see, DNA Center's software image management capability can save substantial time when updating software, as well as ensure that no devices fail to receive updates through error or omission.