In our prior track, we discussed several ways to keep east-to-west traffic secured using Faucet, especially with the now-common threat of compromised servers attacking deeper into a datacenter network. In this track, we will be expanding on network security and how security-centric controller northbound APIs may be implemented and used to deal with the evolving threats in a datacenter.
Before getting too far into a generic security API, let’s discuss how ACLs are generally implemented today. ACLs are usually defined in a configuration file or database and then converted to flow entries in the appropriate tables of the switches they apply to. ACLs are often applied on a port-by-port basis, requiring the network engineer to consider every port on every switch in the network. Manually defining every port on every switch, along with the ACLs applied to each, makes mistakes easy, and a change in network topology can force the network engineer to parse through thousands of lines of configuration to find the specific ports that need to be reconfigured. This makes for brittle security and networking.
If we look at the most common type of ACLs in use, especially for network security, these ACLs apply to forwarding traffic. This is quite common on a dedicated firewall, but these ACLs can (and should) be implemented between each host when possible, and at every switch port if not. There are many cases where a single server in a network is compromised and can then start attacking other hosts behind the classic router/firewall. These forwarding ACLs only really need to be applied on ports that have hosts directly attached to them. More specifically, they should be applied on “edge” ports in the controlled OpenFlow domain. Links between switches managed by the same controller should not need to apply the ACLs, and spine or aggregator switches often allow for fewer ACL entries as a tradeoff for higher switching bandwidth. Furthermore, not only should forwarding ACLs be applied to edge ports, but in most networks, they should be applied to all edge ports. Wouldn’t it be nice if you could just tell the controller to apply a specific ACL to a certain port classification, no matter the topology?
Let’s say for a moment that we can tell the controller to enforce certain ACL rules based on the port class. We’ll discuss how the controller will know what the port class is later. The port class can include whether or not the port is on the edge of the network as we mentioned above, but it may also include information such as “untrusted” for WLAN access points or a guest network. Now when we define ACL rules, we can say something like “apply the general ACL where the port is on the network edge” rather than “apply the general ACL on ports 1, 2, 3, 4… on switch DPID 0x…, …”. Not only does this make it easier for the network and security engineer to specify how ACLs are enforced, it also provides a separation of concerns. The security engineer can write the appropriate ACLs to match their security policy and the network engineer can ensure the controller provides any custom classification that isn’t already provided by the controller’s internal topology discovery process.
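To make the idea concrete, here is a minimal Python sketch of selecting ACLs by port class instead of by port number. The `Port` type, the `ACL_POLICY` mapping, and the `acls_for_port` helper are all hypothetical names made up for illustration; they are not part of any existing controller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """A switch port plus the classes the controller has attached to it."""
    dpid: int
    port_no: int
    classes: frozenset = frozenset()


# Hypothetical policy written by the security engineer:
# ACL name -> port class on which it must be enforced.
ACL_POLICY = {
    "general": "edge",
    "allowPrinting": "guestAccessPoint",
}


def acls_for_port(port, policy=ACL_POLICY):
    """Return the sorted ACL names whose target class the port carries."""
    return sorted(name for name, cls in policy.items() if cls in port.classes)


ports = [
    Port(0x1, 1, frozenset({"edge"})),
    Port(0x1, 2, frozenset({"edge", "guestAccessPoint"})),
    Port(0x1, 3),  # inter-switch link: no classes, so no ACLs
]
assignments = {(p.dpid, p.port_no): acls_for_port(p) for p in ports}
```

The policy never mentions a DPID or port number. Moving an access point to another port only changes the classes on that port, not the ACL definitions.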
Topology Management and Port Classification
Just about every major OpenFlow controller framework provides network topology discovery in one form or another. In the articles in this track, I’ll be using Ryu’s ryu.topology library, but the methods I describe should apply equally to ONOS, OpenDaylight, and any other OpenFlow controller. All that matters is that the part of the controller that converts ACLs into flow entries has access to a data structure that lists all managed switches, the links between them, and perhaps the hosts discovered on each port. The underlying implementation doesn’t matter so long as the information is there.
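As a rough sketch of how the “edge” classification could be derived from such a data structure, the function below treats every port that is not an endpoint of a switch-to-switch link as an edge port. The input shapes are assumptions chosen for this example; in a Ryu application they would most likely be built from the results of `get_switch` and `get_link` in `ryu.topology.api`.

```python
def edge_ports(switch_ports, links):
    """Return the (dpid, port_no) pairs that are not part of any inter-switch link.

    switch_ports: {dpid: iterable of port numbers} for every managed switch
    links: iterable of ((src_dpid, src_port), (dst_dpid, dst_port)) pairs
    """
    internal = set()
    for src, dst in links:
        # Both endpoints of a discovered link face another managed switch.
        internal.add(src)
        internal.add(dst)
    all_ports = {(dpid, port) for dpid, ports in switch_ports.items() for port in ports}
    return all_ports - internal


# Two switches joined by one link: port 3 on switch 1 to port 1 on switch 2.
switch_ports = {1: [1, 2, 3], 2: [1, 2]}
links = [((1, 3), (2, 1))]
edges = edge_ports(switch_ports, links)
```

Because the result is recomputed from the discovered topology, adding a switch or re-cabling a link changes the set of edge ports without anyone editing a per-port ACL list.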
There are other port attributes that might be useful when describing where an ACL should be applied. In addition to a port being an “edge”, a classification such as “untrusted” might be useful. These would most likely be manually defined attributes in the controller’s configuration. The main point is to have a set of attributes that can be checked to determine which ACLs are applied. As an example, here is what the configuration fragment might look like for an ideal controller if we wanted to apply an ACL called “allowPrinting” to a port attached to a wireless access point (AP), identified by the custom port class “guestAccessPoint”.
allowPrinting: rules: [
# Match on incoming connections for
# IPP ports to all printer IPs
ip_dst: eachOf: variable: 'printerIPs'
tcp_dst: eachOf: [631, 443]
tcp_flags: macro: 'tcpFlagsIncoming'
# Reverse match for IPP traffic on printer IPs
ip_src: eachOf: variable: 'printerIPs'
tcp_src: eachOf: [631, 443]
tcp_flags: macro: 'tcpFlagsOutgoing'
printerIPs: as: 'set'
accessPointPorts: as: 'set'
printerIPs: add: [
accessPointPorts: add: [
'mainOffice:5' # named
'0123456789:2' # dpid
This configuration requires that the controller know that a specific port is considered a “guestAccessPoint”, a classification that would most likely be defined by the network administrator responsible for that portion of the network. Although it’s entirely possible just to apply the “allowPrinting” ACL to specific ports rather than a port class, this allows the netadmin to add more APs without having to bother the security team.
This type of programmatic configuration demonstrates some of the awesome power of Software Defined Networking. The switch itself doesn’t need to be manually configured to allow IPP (or some other printing protocol) on a set number of ports. It can be abstracted in the controller to meet the requirements of the deployment itself, keeping network management and security as separate concerns while still serving both. Of course, this is just an example configuration for a fictional ideal controller, but it should show what would be possible if controllers supported this kind of abstraction.
Static and Dynamic ACLs
Now that we’ve discussed how forwarding ACLs could be defined and applied based on port class, let’s discuss two major classifications of ACLs: static and dynamic. Static ACLs are the kind we have used so far. They are generally defined before any network traffic is observed and cover the general security policy of the network. Static ACLs can be defined in several ways and come from several sources. In the example in the previous section, we had a configuration file that defined some rules around printing, which the network or security team would write for their own organization. ACLs could also come in the form of IP Reputation Lists obtained from third-party organizations, which we call External Intel. These IP lists would most likely be blacklists of IPs that other organizations have observed actively attacking networks, and they might be updated on a daily basis. A controller might allow these lists to be imported and automatically generate flow entries in the switches that block traffic from these IPs, or funnel their traffic for further processing elsewhere.
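A minimal sketch of such an import might look like the following. It assumes a plain-text list with one IPv4 address or CIDR block per line and `#` comments, which is a common but not universal format. The output dictionaries are a simplified stand-in for flow entries rather than any controller’s real data model; the field names `eth_type` and `ipv4_src` follow OpenFlow match-field naming.

```python
import ipaddress


def blacklist_to_drop_entries(lines, priority=1000):
    """Turn blacklist lines into simplified drop-entry descriptions."""
    entries = []
    for raw in lines:
        text = raw.split("#", 1)[0].strip()  # strip comments and whitespace
        if not text:
            continue
        net = ipaddress.ip_network(text, strict=False)
        if net.version != 4:
            continue  # this sketch only handles IPv4
        entries.append({
            "priority": priority,
            "match": {
                "eth_type": 0x0800,  # IPv4
                "ipv4_src": (str(net.network_address), str(net.netmask)),
            },
            "actions": [],  # an entry with no actions drops matching packets
        })
    return entries


sample = [
    "# daily reputation list",
    "203.0.113.7",
    "198.51.100.0/24  # reported scanner range",
    "",
]
drop_entries = blacklist_to_drop_entries(sample)
```

Funnelling traffic for further processing, rather than dropping it, would amount to replacing the empty action list with an output action toward the inspection device.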
Dynamic ACLs are ACLs that are generated in response to monitoring the current network traffic for ongoing attacks or misuse. A great example is the simple but powerful utility Fail2Ban. This application monitors log files on a host for repeated authentication failures and performs actions that block traffic from the attacking IP for a period of time. The IPs provided by Fail2Ban, and by other applications like it that monitor your internal network, are considered Internal Intel. Fail2Ban, for example, could be configured to report these attacks on an individual host to the controller, so the intelligence can be used to protect that specific host or all hosts in the network. This would stop the attack traffic from even reaching the host, reducing local load. More advanced setups using security monitors like Bro could also provide specific connections to block within the network once an attack is detected.
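The time-limited nature of these bans can be sketched with a small bookkeeping class. `BanTable` and its methods are hypothetical names for illustration only. Timestamps are passed in explicitly so the behaviour is easy to follow, whereas a real controller might instead rely on the hard timeout of the installed flow entry to expire the block in the switch.

```python
class BanTable:
    """Tracks temporarily banned source IPs reported by a monitor such as Fail2Ban."""

    def __init__(self, default_ttl=600.0):
        self.default_ttl = default_ttl
        self._expires = {}  # ip -> timestamp at which the ban ends

    def ban(self, ip, now, ttl=None):
        """Ban an IP; a repeated report never shortens an existing ban."""
        expiry = now + (self.default_ttl if ttl is None else ttl)
        self._expires[ip] = max(expiry, self._expires.get(ip, 0.0))

    def expire(self, now):
        """Remove bans that have run out and return the released IPs."""
        released = [ip for ip, end in self._expires.items() if end <= now]
        for ip in released:
            del self._expires[ip]
        return sorted(released)

    def active(self):
        """Return the IPs that are currently banned."""
        return sorted(self._expires)


table = BanTable(default_ttl=600)
table.ban("192.0.2.10", now=0)            # banned until t=600
table.ban("192.0.2.11", now=100, ttl=60)  # banned until t=160
released = table.expire(now=200)
```

Each call to `ban` would correspond to installing a drop rule, and each released IP to removing it again.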
Most of the ACLs, especially those dealing with IP reputation or created by network monitoring tools, can have the matching condition simply defined using a 5-tuple of source address, source port, destination address, destination port, and IP protocol. Those dealing with just IPs only need to match on the source IP, but applications like Bro can be specific about the actual network flow that should be matched. These applications should not need to know anything about the network topology in order to add these blocking rules. This is where a security-centric API can be beneficial for a controller to provide. A controller’s northbound API may include endpoints that allow more specific ACLs to be specified, but a simplified version should be available for easier integration with other parts of the network security infrastructure. The controller can then take advantage of the topology discovery and port classification to automatically apply these ACLs where they are needed, which is usually the edge ports.
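One possible shape for such a simplified rule is sketched below: a 5-tuple in which every field is optional, plus a helper that fans the rule out to whatever edge ports the controller has discovered. The `FiveTuple` class and `expand_to_edge_ports` function are assumptions made for this example. The match keys follow OpenFlow field naming, and the sketch is limited to IPv4 with TCP or UDP.

```python
from dataclasses import dataclass
from typing import Optional

IP_PROTO = {"tcp": 6, "udp": 17}  # IANA protocol numbers


@dataclass(frozen=True)
class FiveTuple:
    """Topology-free description of traffic to block; None means wildcard."""
    src_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None
    proto: Optional[str] = None  # "tcp" or "udp"

    def to_match(self):
        match = {"eth_type": 0x0800}  # IPv4 only in this sketch
        if self.src_ip is not None:
            match["ipv4_src"] = self.src_ip
        if self.dst_ip is not None:
            match["ipv4_dst"] = self.dst_ip
        if self.proto is not None:
            match["ip_proto"] = IP_PROTO[self.proto]
            # Port fields only make sense once the protocol is fixed,
            # so ports are ignored when proto is left as a wildcard.
            if self.src_port is not None:
                match[self.proto + "_src"] = self.src_port
            if self.dst_port is not None:
                match[self.proto + "_dst"] = self.dst_port
        return match


def expand_to_edge_ports(rule, edge_ports):
    """Return (dpid, match) pairs, one per edge port, each pinned to its in_port."""
    return [(dpid, dict(rule.to_match(), in_port=port))
            for dpid, port in sorted(edge_ports)]


reputation_rule = FiveTuple(src_ip="203.0.113.7")
flow_rule = FiveTuple("203.0.113.7", 40000, "10.0.0.5", 22, "tcp")
per_port = expand_to_edge_ports(reputation_rule, {(1, 1), (2, 2)})
```

The caller supplies only the 5-tuple. The DPIDs and port numbers come entirely from the controller’s own view of the network.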
Later in this track, we will describe one way such an API could be implemented. It can serve as an example for your own controller applications and should be applicable regardless of your chosen controller framework.
Datacenter Test Environment
Having all these ACLs is fine and dandy, but it does little good if we can’t test the mechanisms that create and enforce them. This is where virtual networking environments come into play. The next article, which will include practical examples of working with IP reputation and static ACLs, will also include a Mininet-based virtual network with a topology similar to the Datacenter Topology used in previous articles. In addition to the topology itself, tools will be included to test IP blacklists in a simulated routed network. A “WAN” host can be configured with many IP addresses to simulate the external WAN link for a network. In addition, testing tools will be provided to automatically ping or connect to each of the simulated IPs and check the results against the blacklist that the OpenFlow controller is meant to enforce.
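The checking half of such a test can be sketched independently of Mininet. The helpers below are hypothetical and are not the tools from the next article. They compute which simulated WAN addresses should be reachable given a blacklist, then compare that expectation against whatever a ping sweep actually observed.

```python
import ipaddress


def expected_reachability(simulated_ips, blacklist):
    """Map each simulated IP to True if it should be reachable, False if blocked."""
    nets = [ipaddress.ip_network(entry, strict=False) for entry in blacklist]
    return {
        ip: not any(ipaddress.ip_address(ip) in net for net in nets)
        for ip in simulated_ips
    }


def mismatches(expected, observed):
    """Return the IPs whose observed reachability differs from the expectation."""
    return sorted(ip for ip in expected if expected[ip] != observed.get(ip))


simulated = ["203.0.113.7", "198.51.100.20", "192.0.2.1"]
blacklist = ["203.0.113.7", "198.51.100.0/24"]
expected = expected_reachability(simulated, blacklist)

# Pretend ping results: the controller failed to block 198.51.100.20.
observed = {"203.0.113.7": False, "198.51.100.20": True, "192.0.2.1": True}
failures = mismatches(expected, observed)
```

An empty list from `mismatches` would mean the flow entries generated by the controller agree with the blacklist for every simulated address.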
Introducing IOF Security Switch
For this security track, we’ll be using a custom proof-of-concept controller based on the Simple Switch Reimagined controller: the IOF Security Switch Controller. This will allow us to explore security concepts in a learning-friendly environment. Since some of the concepts will be simplified for this purpose, not all edge cases will be covered. Instead, this is meant as a stepping off point for working with (or expanding) fully functional controllers like Faucet, ONOS, and OpenDaylight. My hope is that our educational controller will provide a simplified code base that is easy to follow and read.
IOF Security Switch will use the topology discovery built into Ryu to provide some of the information needed to implement the concepts we covered above. By doing so, we don’t have to reinvent the wheel, and we can still implement features like shunt flows, which prevent the controller from being sent traffic for hosts that it has already learned, much like Faucet’s implementation.
One of the biggest features (or limitations by design) is that all configuration and interaction with the controller will be through a Northbound API over REST and WebSockets (WS). This will provide a single entry point and event source so features like adding an ACL don’t need to be implemented multiple times, such as through both a configuration file and a northbound API. More specifically, this follows the programming concept of Don’t Repeat Yourself (DRY). In keeping with that concept, we will be using the REST/WS implementation already used in Ryu. There will still be configuration files, but these will be loaded by shims that “speak” to the API. That same API will be used to allow other specialized shims to quickly load IP reputation (blacklist) files and interface with applications like Fail2Ban and Bro. This concept allows great flexibility in a network where existing services are already in use and are not specifically designed to work with our controller.
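A configuration-file shim of this kind could be as small as the sketch below. The `/acls` path, the JSON body layout, and the base URL are placeholders invented for this example, not the actual IOF Security Switch API. The function that sends the request is passed in, so the same code can use `urllib.request.urlopen` against a live controller or a stub during testing.

```python
import json
import urllib.request


def build_acl_request(base_url, acl):
    """Build (but do not send) a POST request carrying one ACL as JSON."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/acls",  # hypothetical endpoint
        data=json.dumps(acl).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def push_acls(base_url, acls, send):
    """Send every ACL from a parsed config file through the northbound API."""
    return [send(build_acl_request(base_url, acl)) for acl in acls]


# ACLs as a shim might produce them after parsing a config file (made-up layout).
config_acls = [
    {"name": "blockScanner", "src_ip": "203.0.113.7", "action": "drop"},
    {"name": "blockTelnet", "dst_port": 23, "proto": "tcp", "action": "drop"},
]
request = build_acl_request("http://127.0.0.1:8080/", config_acls[0])

# Stub sender that records the target URL instead of opening a connection.
sent_urls = push_acls("http://127.0.0.1:8080", config_acls, send=lambda r: r.full_url)
```

Because the shim only ever talks to the API, a blacklist loader or a Fail2Ban action script would follow the same pattern with a different source of ACLs.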
Finally, by using both REST and WebSockets, the controller can allow for event-based interaction with the shims. For example, a shim could listen for host-add events and automatically trigger orchestration software to either add more ACLs to the switch or work with some other system entirely. With standard web-based API design, shims can easily be written in entirely different programming languages, and a single-page web application could even act as a graphical interface to the controller. Also, having a well-defined API will allow applications to continue to interface with the controller even if the controller’s internal implementation changes significantly. This sort of abstraction is important for any controller that has to work within a larger system.
Now that we’ve discussed how topology can be applied to security and how we plan on demonstrating that, I hope you will join us in our next article covering IP Reputation and the setup and interaction with our test environment and the IOF Security Switch controller. Sign up to our mailing list to be notified when new articles are published and please be sure to share and like this article if you found it useful. If you have questions or ideas for improvement of these articles, leave us a comment below. We appreciate any feedback from our readers! =)