Zero Trust in the Campus

Controlling Network Access

Securing Infrastructure Access

When looking at what the major risks are to the security and functionality of IT infrastructure, near the top is access to that infrastructure.  Being able to ensure that only authorized devices and users can connect to the network is one of the most effective ways of protecting your infrastructure and data.  Users who pick up malware from outside the corporate network can easily bring it in via their machines, or bad actors can attempt to place a device within the network to easily launch attacks on the infrastructure from the inside.  And it’s not just a security thing – one of the most aggravating components of ‘shadow IT’ is the random devices users bring in and attach to the network.  Printers are probably the most common, but other things like networked music players are a notable hassle.  And the worst of all is users bringing in network devices like unmanaged switches that can actually cause an outage.  Situations like this are what network access control (NAC) systems are for.

So what is a NAC system?

In a nutshell, the primary job of a NAC system is to authenticate users and devices so that they can use the organization’s network and then log those accesses.  More advanced NAC systems can also provide conditional access for devices – a device joining the network can be profiled and screened for things like an up-to-date OS or the presence of antimalware software, or even be forced to run an antimalware scan before gaining full access to the network.  The most complex NAC deployments will add network segmentation to all of the above – users are dynamically assigned to a VLAN or SSID, or have a dynamic ACL applied based on their role in the organization.
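To make the conditional access idea concrete, here is a minimal sketch in Python of the kind of decision a NAC policy might encode.  The `Posture` fields and the three access levels are assumptions made for illustration, not any vendor's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Posture:
    """Facts a hypothetical NAC agent reports about a connecting device."""
    authenticated: bool
    os_up_to_date: bool
    antimalware_present: bool


def access_level(posture: Posture) -> str:
    """Return 'deny', 'quarantine', or 'full' for a connecting device."""
    # Authentication is the baseline: no valid credential, no access at all.
    if not posture.authenticated:
        return "deny"
    # Authenticated but unhealthy devices get limited access for remediation.
    if not (posture.os_up_to_date and posture.antimalware_present):
        return "quarantine"
    return "full"
```

For example, `access_level(Posture(True, False, True))` evaluates to `"quarantine"`: the device has valid credentials but an out-of-date OS, so it would be held back until patched.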

How do NAC systems function?

NAC systems, regardless of vendor, all rely on the same core technologies to function, which are RADIUS and 802.1x.  This is meant more as a short overview, not a deep dive into the intricacies of RADIUS or 802.1x.  We’ll also avoid looking at vendor-specific capabilities, such as Cisco’s Scalable Group Tags (SGTs), as those don’t see much use and rely on a single vendor environment.  First is 802.1x, the core technology for allowing or disallowing access to the network. Keep in mind that 802.1x is a function of the access device (switch port or wireless AP), not the NAC software – it’s common to see a NAC system from one vendor and access switching from another vendor in the same environment. 

NOTE: All modern OSes (Windows, Linux, macOS) have 802.1x client (aka supplicant) functionality built in, so no 3rd party software is necessary on the client side.

Besides standard client OSes, 802.1x client functionality also exists in a number of specialty devices, including network infrastructure like routers, switches, and APs.  We’ll address that use case later.  The other thing to keep in mind about 802.1x is that it’s a layer 2 protocol – authentication information is sent even before a device can get an IP address.  That way, it’s simply not possible for unauthenticated devices to obtain an IP address in a well-designed NAC deployment. 

What if the device in question is a printer, IP phone, or other device without 802.1x awareness? 

There is a feature available called MAC Authentication Bypass (MAB) that forgoes the 802.1x process and instead uses the device’s MAC address (learned from a single frame) as the credential to authenticate against.  Keep in mind that MAB is not very secure – spoofing MAC addresses is very easy to do, especially with wireless devices.  MAB shouldn’t be used except as a last resort, and additional security, such as network segmentation via a firewall, should always be considered when dealing with devices that must use MAB.
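As a rough illustration of how little MAB actually checks, the sketch below treats the MAC address as the only credential and looks it up in a pre-populated list.  The allowlist contents and function names are hypothetical, and the address shown comes from the range reserved for documentation.

```python
import re


def normalize_mac(raw: str) -> str:
    """Reduce common MAC notations (colon, dash, dotted) to 12 hex digits."""
    digits = re.sub(r"[-:.]", "", raw.strip()).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a MAC address: {raw!r}")
    return digits


# Hypothetical allowlist: normalized MAC address -> device description.
MAB_ALLOWLIST = {
    normalize_mac("00:00:5e:00:53:01"): "lobby-printer",
}


def mab_lookup(raw_mac: str):
    """Return the device description if the MAC is known, else None."""
    return MAB_ALLOWLIST.get(normalize_mac(raw_mac))
```

Nothing in this lookup proves the device really is the lobby printer.  Any machine that presents the same MAC address gets the same answer, which is why MAB devices deserve extra segmentation.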

Further explaining the authentication process

The other part of the authentication process is the access device communicating with the NAC server.  This back-end communication is done using RADIUS.  RADIUS has a very long history as an AAA protocol, dating back to the days of dial-up Internet, and has stuck around because its capabilities are broadly useful for all manner of network access control regardless of media type.  Once an access device has received 802.1x information, it will translate that to a RADIUS request and send it on to the NAC server.  From there, the NAC server will consult its internal user and machine database or an external user database (e.g. Windows Active Directory) to determine if the device should be granted access to the network.  In the case of MAB, that will be a pre-populated list of MAC addresses that the NAC server will refer to.  RADIUS doesn’t just handle authentication, either.  RADIUS can be used to push a dynamic ACL based on user role or device type, or dynamically assign a port to a specific VLAN based on user role or device type.  This kind of dynamic segmentation is quite advanced, though, and is definitely not for a NAC beginner.
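For dynamic VLAN assignment, the RADIUS server typically returns three standard tunnel attributes (described in RFC 3580) in its Access-Accept, and the switch uses them to place the port in the right VLAN.  The sketch below only builds that attribute set as a Python dictionary.  The role names and VLAN numbers are made up for illustration, and a real NAC product would do this through its own policy engine.

```python
# Hypothetical mapping of user role / device type to VLAN ID.
ROLE_TO_VLAN = {
    "employee": 10,
    "contractor": 20,
    "printer": 30,
}
FALLBACK_VLAN = 99  # assumed restricted VLAN for unknown roles


def vlan_attributes(role: str) -> dict:
    """Build the RADIUS reply attributes that request a VLAN for a role."""
    vlan = ROLE_TO_VLAN.get(role, FALLBACK_VLAN)
    return {
        "Tunnel-Type": "VLAN",                 # attribute 64, value 13
        "Tunnel-Medium-Type": "IEEE-802",      # attribute 65, value 6
        "Tunnel-Private-Group-ID": str(vlan),  # attribute 81, the VLAN ID
    }
```

Because these attributes are standardized, this is one of the pieces that tends to work across vendors, which fits the mixed-vendor environments mentioned earlier.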

Now, let’s discuss NAC deployment

Our prior topics have generally been invisible to end users and limited to key points within the enterprise network.  Thus, even with areas that have complex security needs, like machine-to-machine security in the datacenter, there is minimal worry about messing with the end user experience or causing high-visibility problems.  NAC changes that dramatically – now we’re directly impacting end user experience if things go wrong.  To be honest, we’ll be directly impacting end user experience when things go right – remember that a NAC system is not just security, but shadow IT control. Expect user complaints when the printer they snuck in no longer works.  It still beats having people sneak in printers, in my opinion.

Deploying NAC is going to be complex

This article is not trying to build a complete guide to NAC deployment.  Rather, I want to discuss general principles and best practices to make sure that your NAC deployment doesn’t cause too much pain.  The first thing to consider is where 802.1x should be enabled.  Best practices for NAC say that any means of network access (switch ports, APs) that aren’t in a secured environment should be enabled for NAC.  That’s why APs and switches come with 802.1x supplicant capabilities – a user-accessible AP or small switch in an office could otherwise be unplugged and an unapproved device attached to the network in its place.  You’ll also want to inventory network devices and determine where using MAB will be necessary.  Be sure to record those MAC addresses, too.

So what’s next?

The initial deployment is the next step. Build the NAC servers, integrate with any external resources like AD, and start with a small test deployment.  The IT department always makes for a good guinea pig.  This test deployment should start in open mode – network access is always allowed, but all 802.1x and RADIUS exchanges are logged.  Review the logs and address any errors, then move to closed mode, where network access is now conditional on successful authentication.   There will likely be some issues that weren’t caught or only appear in closed mode.  Make note of anything particularly troublesome for when the wider deployment is carried out.
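The difference between open and closed mode, and the log review that happens in between, can be sketched in a few lines of Python.  The event format (a device identifier plus a pass/fail flag) is an assumption for illustration.  Real NAC logs carry far more detail.

```python
from collections import Counter


def port_decision(auth_ok: bool, mode: str) -> str:
    """Decide whether a port passes traffic in the given enforcement mode."""
    if mode == "open":
        # Open mode: always allow; the result is only logged for review.
        return "allow"
    if mode == "closed":
        # Closed mode: access is conditional on successful authentication.
        return "allow" if auth_ok else "deny"
    raise ValueError(f"unknown mode: {mode!r}")


def failures_by_device(events) -> Counter:
    """Count failed authentications per device from (device_id, auth_ok) pairs."""
    return Counter(device for device, auth_ok in events if not auth_ok)
```

Running `failures_by_device` over the open-mode logs gives a list of the devices that would have been cut off, which is exactly what needs fixing before switching to closed mode.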

Next, prepare for the general rollout

Since this involves client machines, make sure the helpdesk team is in the loop and trained to deal with the inevitable issues that will surface.  Work closely with the desktop team to make the changes needed to enable 802.1x (for a Windows shop, traditional GPOs or Intune can be used to do this).  Procedures will also need to be updated – don’t forget that.  It’s way too easy for NAC to turn into an unmanageable mess without good procedures.  Above all, communicate with the users about the process well in advance.  Be sure to have a good answer for printers and other devices brought in by users.  Just cutting them off causes more problems than it solves, no matter how satisfying it may be.  This is, honestly, the hardest part of a NAC deployment and the biggest cause of failures.  Nothing sinks a project faster than agitated users and their managers.  Now is also the time to remediate any issues found in the pilot deployment if they appear to be widespread.

You’re done prepping, it’s time to deploy

Now that the preparatory work has been done, it’s time to move on to the large-scale deployment.  Feel free to split this up further for large deployments or if there are lots of remote sites.  Just like the pilot, start with everything in open mode and log errors or failed authentications.  Remedy those issues as appropriate.  Once the technical issues have been resolved and any ruffled feathers have been un-ruffled, it’s time to move to closed mode.  Once again, communication from IT to the rest of the organization is key.  I can’t state enough how important it is, really.  No matter how much work is done prior to this step, there will be issues.  Address the users with respect and courtesy, and always be flexible.  It’s not uncommon for some devices to just plain refuse to work with 802.1x no matter what, short of a full reimage.

Some final thoughts

NAC systems are a powerful tool in your journey towards zero trust, but the user impact should always be top of mind.  This is definitely one of those projects where a good services partner can have a big impact – a team that’s done dozens of NAC deployments has seen numerous ways things can go wrong and can streamline the NAC deployment dramatically.

By Chris Crotteau

User to Machine Security

Zero Trust in the Datacenter – Protecting Your Servers from Your Users

For the first part of our explorations of the zero trust philosophy, we’re going to look at the datacenter. 

It’s All in the Flow

When we look at the datacenter, we have two types of traffic flows, each of which needs to be looked at from a security perspective.  First is user to machine security.  Protecting one’s datacenter resources from the users has always been a necessity, but the types of threats and what we consider a user have changed a lot over the years.  Second is machine to machine security.  This area of datacenter security is much newer and has historically been challenging and expensive to implement.  We’ll be focusing on user to machine security for now – machine to machine security will be discussed in a future post.

Note: What we discuss here can easily be applied to servers located on-prem, co-located, or even in the public cloud. 

On to User to Machine Security

The primary type of user to machine security is what’s commonly referred to as north-south security, where the focus is on Internet users.  Exposing necessary resources to the Internet is a requirement for obvious reasons, but Internet-based threats are omnipresent and can be of considerable sophistication.  It may seem obvious, but it bears repeating that any security policy should be built with only the necessary access privileges granted.  For Internet-facing users, this is usually easy – modern websites/applications usually just require that HTTP/HTTPS traffic coming in on ports 80 and 443 is allowed.  Legacy applications can complicate this process, though, so always work with the application team to understand all requirements for the application to function and build access policies appropriate to your environment.

Beyond simple access controls, it’s important to also consider what the users are inputting into the application. 

For example, putting malicious information in an HTTP POST request is a very common way of trying to get the application to give up information, grant inappropriate access requests, or otherwise misbehave in a way beneficial to the attacker.  How to abuse application inputs is going to be firmly out of scope for this post – whole books have been written on how to exploit things at the application level.  Addressing this kind of abuse is also more complicated.  Port and protocol filtering is really a go/no go type of rule, while inspecting inputs is much harder because user inputs are so open-ended.  There are application best practices for sanitizing user inputs, but especially with proprietary applications, it’s not always possible to apply them.  It’s also better to be able to stop malicious traffic before it touches the application.  For this, we most commonly use a web application firewall (WAF).

What is a WAF?

With a WAF, we can look directly at the payload of interesting packets and filter based on their contents.  For example, a field on a website that’s meant for name input shouldn’t ever have SQL syntax appearing in it.  On the WAF, a rule is created (using lots of regexes!) to identify anything like this and block it.  WAFs and similar application specific security tools are, unfortunately, a Very Hard Thing to implement.  The nature of a WAF means that HTTPS traffic needs to be decrypted, which presents challenges in not breaking TLS.  Once that’s been dealt with, building appropriate rules needs close collaboration between the security and application teams to ensure that the WAF is blocking everything it should be, and that rules are updated as applications are updated or as threats emerge.  The infamous log4j vulnerability is one that a good WAF rule can easily block, but building that rule requires an understanding of how the vulnerability is exploited: log4j is a Java logging library, and the attack arrives as a JNDI lookup string planted in request data that the application then logs.  To see what a sample WAF rule blocking log4j exploit attempts looks like, F5 has a ready to go iRule available here
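To give a feel for what regex-based input inspection looks like, here is a deliberately naive sketch in Python.  It is not an F5 iRule (those are written in Tcl) and it is not production-grade.  The patterns and the `inspect_name_field` function are assumptions for illustration only.

```python
import re

# A few obviously suspicious SQL fragments.  Real rule sets are far larger.
SQL_PATTERN = re.compile(
    r"\b(union\s+select|drop\s+table|or\s+1\s*=\s*1)\b|--|;",
    re.IGNORECASE,
)

# The plain form of a log4j JNDI lookup string.
JNDI_PATTERN = re.compile(r"\$\{jndi:", re.IGNORECASE)


def inspect_name_field(value: str) -> str:
    """Return 'block' if a name field contains suspicious content, else 'allow'."""
    if JNDI_PATTERN.search(value) or SQL_PATTERN.search(value):
        return "block"
    return "allow"
```

Even this toy shows why WAF rules are hard to get right.  The JNDI pattern only catches the plain form of the lookup string, so obfuscated variants that use nested lookups would slip past it, and loosening or tightening the SQL pattern quickly trades missed attacks for blocked legitimate input.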

Other User to Machine Security Considerations

The next part of user to machine security is a somewhat newer topic in security – protecting your datacenter resources from your own users.  This is also referred to as internal segmentation.  Historically, trying to firewall your users from your datacenter was difficult, expensive, and of limited value.  Times have changed, and we’re at a point where your users can be as dangerous to the business as an Internet based threat.  Studies done on where attacks gain their initial foothold show that 90% or more of attacks begin with a user opening a malicious email or running a bad executable file.  Once run, the malware will begin to crawl the network looking for other vulnerabilities to gain a foothold in the datacenter and then go to work exfiltrating or encrypting data or otherwise disrupting business operations.  With a traditional setup where internal users are considered trustworthy, their traffic is considered good by default and doesn’t warrant being firewalled.  This is not a good stance to have considering the above statistic.

Callout: Internal segmentation of some variety should be considered a need to have in a modern security-first network design. 

Implementing Internal Segmentation

This is one of those tasks whose complexity is hard to determine.  On the simple end of the spectrum, putting some basic security ACLs in place at the border device between the DC and the users is surprisingly effective for the amount of effort it takes to implement.  Most organizations should be looking at a firewall for this purpose, though.  Modern firewalls will have many more options for inspecting traffic, detecting threats, and alerting IT staff when something is detected.  Some features are actually easier to implement on an internal segmentation firewall, too.  SSL inspection is one of the best examples of this.  Cracking open TLS traffic is a notoriously resource intensive task and if done for Internet-bound traffic, can easily overwhelm a firewall, break websites, or potentially cause HR issues when specific types of user communications are inspected (banking and health information are two major no-nos for deep packet inspection).   SSL inspection, when done between users and the datacenter, has none of these concerns.  Traffic volumes between users and the DC are often well-known and don’t change much, so firewalls can be effectively sized to avoid performance issues.

For user information concerns, interaction with internal resources and data is pretty much open season for whatever security you want to implement – there’s nothing in, say, an employee’s interactions with the organization’s ERP system that would be out of bounds to inspect, log, and audit.
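The basic security ACL mentioned above boils down to a first-match rule list with an implicit deny at the end.  Here is a minimal Python sketch of that evaluation logic.  The subnets, ports, and rule layout are invented for illustration and would come from your own addressing plan and application requirements.

```python
from ipaddress import ip_address, ip_network

# Each rule: (action, source network, destination network, destination port).
# A port of None matches any port.  All values here are hypothetical.
ACL = [
    ("permit", ip_network("10.10.0.0/16"), ip_network("10.50.1.0/24"), 443),
    ("permit", ip_network("10.10.20.0/24"), ip_network("10.50.2.0/24"), 1433),
]


def evaluate(src: str, dst: str, port: int) -> str:
    """Return the action of the first matching rule, or 'deny' if none match."""
    source, destination = ip_address(src), ip_address(dst)
    for action, src_net, dst_net, rule_port in ACL:
        if source in src_net and destination in dst_net and rule_port in (None, port):
            return action
    return "deny"  # implicit deny, as on most ACL implementations
```

With these rules, any user subnet can reach the web tier on 443, but only the one subnet in the second rule can reach the database tier on 1433.  Everything else is dropped.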

Some Final Considerations

One final thing to consider is how to treat access for 3rd party contractors and how to properly categorize their traffic.  The prevailing wisdom these days is to treat 3rd party contractors as equivalent to Internet-facing users, as several high profile intrusions were launched via a 3rd party with direct access to sensitive resources.  It’s a little more difficult, though – just allowing ports 80 and 443 through to a list of servers isn’t enough.  3rd party contractors may need specialized access or just require a large number of permit rules compared to either an employee or Internet user, so close coordination with your 3rd party is required to keep access requirements to a minimum.  Due to the complexity of building network-based security policies suitable for contractors, another option that we’re seeing adopted more lately is deployment of jump servers with enhanced access management software, such as what Bomgar or SecureLink provide.  This type of software provides additional capabilities to the IT staff, such as full session monitoring, access notification, and the ability to only allow logins at specific times or only if explicitly approved by the IT staff.  With this level of control on the jump server, it becomes easier to build other security policies in a much more general way.  Since all users are accessing resources via a jump server, access control rules only have to permit a small number of hosts from known subnets, and detailed application based or port/protocol based rules can often be omitted or reduced in complexity.
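The time-window and explicit-approval controls described above can be pictured as a simple check made before a contractor session is allowed to start.  The sketch below is a hypothetical illustration in Python and does not reflect how Bomgar, SecureLink, or any other product implements it.

```python
from datetime import datetime, time


def login_allowed(now: datetime, window_start: time, window_end: time,
                  approved: bool) -> bool:
    """Allow a jump server login only inside the window and with IT approval."""
    # Note: this simple comparison does not handle windows that span midnight.
    in_window = window_start <= now.time() < window_end
    return approved and in_window
```

For instance, with a window of 08:00 to 18:00, an approved login attempt at 10:00 returns `True`, while the same attempt at 22:00, or an unapproved attempt at any hour, returns `False`.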

Zero Trust Philosophy Index

This series on security philosophy explores the areas of security that need to be addressed to make your zero trust plan a reality, and discusses how to apply a zero trust mindset in each area of focus.

  • Datacenter
  • Route/switch
    • Features that ensure integrity of network operations
  • Wireless
    • Capabilities for detection and mitigation of RF based attacks
  • Endpoint
    • Network Access Control (NAC) – Ensuring that endpoint activity is controlled and that security threats are detected and mitigated before an exploit can occur
  • Network Security
    • Ensuring that all traffic through the network is controlled and monitored for malicious activity
  • Cloud Security
    • Ensuring that user to cloud access is controlled and that cloud resources are appropriately provisioned and accounted for.
  • The Human Factor
    • Ensuring that end users are aware of security issues and responsible for their security choices.