
This is Part IV in a four-part series of blogs. This blog has been co-written with Vincent Esposito (@vesposit).

This is the last in a series of blogs dedicated to explaining some of the use cases that can leverage ACI micro-segmentation capabilities. In the first blog we described how to use ACI micro-segmentation to implement a 2-tier web application on a single flat subnet. In the second blog we illustrated how to leverage the APIC API to dynamically create a sandboxed Development and Testing environment for that application, and how to use VM-attribute-based micro-segmentation to easily promote workloads from Dev to Test to Production, including automation of L4-7 services. In the third blog we looked at operating the environment from the previous blogs, and covered how the ACI integrated overlay approach and various operational capabilities of the APIC facilitate Day-2 operations in a micro-segmentation environment.

In this post, we look at how the ACI Policy Model and its micro-segmentation features can be used to enhance the security posture of the physical server infrastructure by minimizing its attack surface.

As noted above, this post has been written with my friend and colleague Vincent Esposito (@vesposit), who also created the demos. So let's look at how Acme will enhance security for its infrastructure.

Acme enhances security for its infrastructure!

If we think back to the previous blogs referenced above, we can imagine that in order to run the online shop that was showcased, Acme Co. requires a server infrastructure running a hypervisor. In those examples, the application was implemented on Virtual Machines running on vSphere 6.0 DRS clusters. In our lab, these clusters run on UCS C220 M4 servers running ESXi 6.0. Just like with the previous examples, the principles of the design discussed in this blog can be extended to other virtualization and physical server environments as well.

Of course Acme Co. is also very concerned about security, and is aware that any server infrastructure is itself subject to vulnerabilities and can therefore be attacked. If the infrastructure were compromised, an attacker could gain complete control of its assets.

What we mean here by server infrastructure is in essence the following two components:

  • The Baseboard Management Controller (BMC) of each of the servers. In this case, it is the Cisco Integrated Management Controller (CIMC) on UCS platforms. This component is part of the Intelligent Platform Management Interface (IPMI) specification and is in itself a micro-server embedded on the motherboard of the server. It runs its own Operating System and set of applications, and as such has its own attack surface.
  • The hypervisor and its “utility” interfaces for management, live migration and storage access. These interfaces can be used to access and compromise the hypervisor through vulnerabilities affecting the kernel and/or the services or applications it runs.

Compromising either of these two components could have catastrophic consequences. At first glance, the idea of compromising the infrastructure at this level may sound highly unlikely, but it is not such an incredible scenario: vulnerabilities in these components are found periodically across all vendors, and some of them even carry a price tag for writing the associated exploit. A quick Internet search turns up cases where exploits targeting the infrastructure layer have been used.

While we cannot entirely prevent such a scenario, we can limit the exposure of these components in order to reduce their attack surface and minimize risk. This is what we will work on for the remainder of this blog post, using the ACI Policy Model and its micro-segmentation capabilities.

 

Lab Physical Setup

The lab setup we are going to use for this is very simple: we have two ACI leafs connected to the ACI spines, and two UCS C220 M4 servers running ESXi 6.0 in a DRS cluster, connected to the ACI leafs with the following topology:

pic1

 

  • CIMC interface: each UCS CIMC interface is directly connected to an ACI leaf. We use the “dedicated port” mode for the CIMC here, but we could also use the “shared LoM” or “shared LoM extended” modes, which would let us share either the 1Gbps LoM ports of the C220 M4 or the 10Gbps ports of the VIC 1225/1227 card plugged into the server.
  • ESXi management interfaces: we use a pair of vmnic interfaces connected to vSwitch0 in active/standby mode for management purposes. This would be the case if the vSphere admin wants out-of-band management for the ESXi host.
  • ESXi vMotion & storage interfaces: we use another pair of vmnic interfaces connected to a vSphere Distributed Switch (vDS) created by APIC through a VMM Domain. We use an APIC-controlled vDS so that we can easily define different EPGs, and the corresponding dvPortGroups, for vMotion traffic and storage (NFS) traffic.
  • Virtual Machine data interfaces: we use the last pair of vmnic interfaces for Virtual Machine traffic. We configure them in LACP mode and connect them to the Application Virtual Switch (AVS) in order to have the most complete set of features and capabilities.

While this setup works properly, it is clearly overcomplicated. This blog and its design are not at all about providing recommended designs; our objective is mostly to show the capabilities and options of the ACI Fabric with both the standard vSwitch and distributed switches (both VDS and AVS). In a real deployment, many customers configure the ESXi management vmkernel on the same distributed switch used for all other types of traffic, which could be the native vSphere VDS or the AVS. Also, many customers connect the CIMC to a separate network or perhaps to a FEX. Again, the intention here is to share ideas and show possibilities, not actual design recommendations.

 

Logical Setup

Now that we have everything connected physically, let's move on to the logical setup.

For the CIMC connectivity, we know exactly which protocols are required for it to work. We also know that one CIMC interface does not need access to the neighboring CIMC interfaces of other servers. Therefore, we can benefit from a white-list model: we use a very simple Application Profile called “SERVER_MGMT” that contains a single EPG called “CIMC” to group all of these interfaces. This EPG is mapped to a physical domain and configured with intra-EPG isolation in order to prevent CIMC-to-CIMC communication. It also consumes a set of contracts allowing access to shared services like DHCP, NTP and DNS servers, and provides a contract so that the server management station can access the CIMC interfaces (using HTTPS, SSH and KVM):

 

pic2

This reduces the exposure of the CIMC interfaces to the protocols strictly required, and also minimizes lateral movement in case one of the CIMCs gets compromised.
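
For readers who want to try this at home, here is a minimal sketch of how such an EPG could be pushed to the APIC through its REST API using Python and the requests library. The tenant, physical domain and contract names (ACME, PHYS_DOM, CIMC_MGMT_ACCESS and the shared-services contracts) are placeholders we made up for illustration; substitute your own object names.

```python
# Minimal sketch: create the SERVER_MGMT application profile and the CIMC EPG
# with intra-EPG isolation enabled, via the APIC REST API.
# Tenant, domain and contract names below are illustrative placeholders.
import requests

APIC = "https://apic.example.com"
session = requests.Session()

# Authenticate; the APIC token is kept as a cookie on the session
login = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}
session.post(f"{APIC}/api/aaaLogin.json", json=login, verify=False)  # lab only: no TLS verification

payload = {
    "fvAp": {
        "attributes": {"name": "SERVER_MGMT"},
        "children": [{
            "fvAEPg": {
                # pcEnfPref=enforced turns on intra-EPG isolation,
                # blocking CIMC-to-CIMC traffic inside the EPG
                "attributes": {"name": "CIMC", "pcEnfPref": "enforced"},
                "children": [
                    # Map the EPG to the physical domain used for bare-metal ports
                    {"fvRsDomAtt": {"attributes": {"tDn": "uni/phys-PHYS_DOM"}}},
                    # Consume the shared-services contracts (placeholder names)
                    {"fvRsCons": {"attributes": {"tnVzBrCPName": "DHCP"}}},
                    {"fvRsCons": {"attributes": {"tnVzBrCPName": "NTP"}}},
                    {"fvRsCons": {"attributes": {"tnVzBrCPName": "DNS"}}},
                    # Provide the contract consumed by the management station
                    {"fvRsProv": {"attributes": {"tnVzBrCPName": "CIMC_MGMT_ACCESS"}}},
                ],
            }
        }],
    }
}
# Post under an illustrative tenant named ACME
resp = session.post(f"{APIC}/api/mo/uni/tn-ACME.json", json=payload, verify=False)
resp.raise_for_status()
```

The same pattern, with different names and contract relations, applies to the EPGs described in the next sections.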

We follow a very similar approach for the ESXi management interfaces, this time using a different Application Profile called “VSPHERE_INFRA”. We again have a single EPG, “VSPHERE_MGMT”, mapped to a physical domain with intra-EPG isolation turned on to prevent ESXi-to-ESXi communication over the management vmk interface. This EPG again consumes the same set of contracts for accessing shared services like DHCP, NTP and DNS servers, and it provides a couple of contracts, consumed by both vCenter and the management station, to allow access to the management vmknic using SSH, the vSphere Agent and the Console:

 

pic3
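
Each of these contracts is, under the hood, just a named set of filters on specific protocols and ports. As an illustration of the white-list approach, here is a hedged sketch of how a management-access contract permitting only SSH and HTTPS could be defined through the same API. The ACME tenant and the MGMT_ACCESS and MGMT_PROTOCOLS names are again placeholders, and the actual contracts for the CIMC and ESXi management EPGs would simply carry additional entries (KVM, vSphere Agent, Console, and so on).

```python
# Sketch: a white-list contract permitting only SSH and HTTPS, built from a
# vzFilter with two entries and a vzBrCP (contract) referencing that filter.
# Tenant and object names are illustrative placeholders.
import requests

APIC = "https://apic.example.com"
session = requests.Session()
session.post(f"{APIC}/api/aaaLogin.json", verify=False,  # lab only: no TLS verification
             json={"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}})

mgmt_filter = {
    "vzFilter": {
        "attributes": {"name": "MGMT_PROTOCOLS"},
        "children": [
            {"vzEntry": {"attributes": {"name": "ssh", "etherT": "ip", "prot": "tcp",
                                        "dFromPort": "22", "dToPort": "22"}}},
            {"vzEntry": {"attributes": {"name": "https", "etherT": "ip", "prot": "tcp",
                                        "dFromPort": "443", "dToPort": "443"}}},
        ],
    }
}
mgmt_contract = {
    "vzBrCP": {
        "attributes": {"name": "MGMT_ACCESS"},
        "children": [{
            "vzSubj": {
                "attributes": {"name": "mgmt"},
                # The subject references the filter defined above
                "children": [{"vzRsSubjFiltAtt": {"attributes": {"tnVzFilterName": "MGMT_PROTOCOLS"}}}],
            }
        }],
    }
}
for obj in (mgmt_filter, mgmt_contract):
    session.post(f"{APIC}/api/mo/uni/tn-ACME.json", json=obj, verify=False)
```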

 

Finally, we build on the same principles for the vMotion and storage vmkernel interfaces, using the same Application Profile “VSPHERE_INFRA”. We have two different EPGs: one for storage traffic called “NFS” and one for live migration traffic called “VMOTION”. Both are mapped to the VMM Domain with our vSphere Distributed Switch (vDS), so that the corresponding port-groups are automatically created in vCenter. We leave intra-EPG communication allowed for the “VMOTION” EPG so that live migration traffic works properly, but we enable intra-EPG isolation for the “NFS” EPG, since storage traffic only flows between the ESXi hosts and the NFS server. The NFS EPG is the only one consuming the contract that allows access to our filers, providing access to the centralized storage:

pic4
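
In API terms, the main differences for these two EPGs compared to the previous ones are that they attach to the VMM domain (so that APIC pushes the corresponding dvPortGroups to vCenter) and that isolation is enforced only for NFS. Below is a hedged sketch along the same lines as the earlier ones, with the ACME tenant, the VDS_DOM VMM domain and the NFS_ACCESS contract being placeholder names.

```python
# Sketch: NFS and VMOTION EPGs under the VSPHERE_INFRA application profile.
# Both attach to the APIC-managed VMM domain so that the dvPortGroups are
# created automatically in vCenter; only NFS enforces intra-EPG isolation.
# Tenant, VMM domain and contract names are illustrative placeholders.
import requests

APIC = "https://apic.example.com"
session = requests.Session()
session.post(f"{APIC}/api/aaaLogin.json", verify=False,  # lab only: no TLS verification
             json={"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}})

# Relation to the VMware VMM domain managed by APIC (placeholder name VDS_DOM)
vmm_domain = {"fvRsDomAtt": {"attributes": {"tDn": "uni/vmmp-VMware/dom-VDS_DOM"}}}

vsphere_infra = {
    "fvAp": {
        "attributes": {"name": "VSPHERE_INFRA"},
        "children": [
            {"fvAEPg": {  # storage traffic is host-to-filer only, so isolate the hosts
                "attributes": {"name": "NFS", "pcEnfPref": "enforced"},
                "children": [
                    vmm_domain,
                    # Only NFS consumes the contract granting access to the filers
                    {"fvRsCons": {"attributes": {"tnVzBrCPName": "NFS_ACCESS"}}},
                ]}},
            {"fvAEPg": {  # live migration is host-to-host, so leave isolation off
                "attributes": {"name": "VMOTION", "pcEnfPref": "unenforced"},
                "children": [vmm_domain]}},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/tn-ACME.json", json=vsphere_infra, verify=False)
```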

 

If we tie all of those logical setups together, here’s the entire view of what it looks like in the APIC GUI:

 

pic5

 

The application profile diagram above represents the same traffic flows that you can find in VMware's Network Port Diagram for vSphere 6.x. By using the ACI Policy Model and its micro-segmentation capabilities, we are able to secure the entire vSphere environment as if it were just another application, using a white-list model that minimizes exposure and lateral movement.

Notice how we did not discuss subnetting at all. All of the EPGs, and therefore all interfaces, could be on the same subnet or on different ones, and the security model would be the same: the policy is not tied to IP addressing.

In the following video, you can see how this security is enforced both for physical endpoints, like the CIMC and ESXi management interfaces, and for virtual endpoints, like the storage and vMotion interfaces:

 

 

 

Automation: adding a new hypervisor to the cluster

 

One of the major benefits of the ACI Fabric and its policy model is that it enables complete physical and virtual network automation through the API exposed by the APIC. Using this API together with the UCS CIMC API, we have built a simple web application that takes care of provisioning the network connectivity for a new server using the model described above for maximum security (or, put another way, for minimal lateral movement options).

This way, as soon as a new server has been racked and cabled, an operator can use this web application to indicate the leaf and interface IDs the server is connected to, and the application takes care of provisioning everything automatically: it configures all network interfaces and the server policies (BIOS, boot order, etc.), powers on the server and boots it using PXE. The hypervisor is then automatically installed using a kickstart file, and the host joins the cluster in maintenance mode. All interfaces are automatically connected to the right EPGs as per the model shown above.
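
The application itself is not something we are publishing, but to give a flavor of the fabric-facing part of it, here is a hedged sketch of how such a tool could statically bind the new server's interfaces to the existing EPGs through the APIC REST API once the operator has entered the leaf and interface IDs. The leaf and port numbers, VLAN encaps, tenant and EPG names are placeholders, and the CIMC, PXE/kickstart and vCenter steps are only indicated as hypothetical stubs.

```python
# Sketch of the fabric-provisioning portion of the onboarding tool: given the
# leaf and interface IDs entered by the operator, bind each physical interface
# to the corresponding EPG with a static path binding (fvRsPathAtt).
# Tenant, EPG names, VLAN encaps, leaf and port IDs are illustrative
# placeholders; the CIMC, PXE and vCenter steps are shown only as stubs.
import requests

APIC = "https://apic.example.com"
TENANT = "ACME"

def apic_session():
    """Return a requests session authenticated against the APIC."""
    s = requests.Session()
    s.post(f"{APIC}/api/aaaLogin.json", verify=False,  # lab only: no TLS verification
           json={"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}})
    return s

def bind_port(session, app, epg, pod, leaf, port, encap, mode="regular"):
    """Statically bind a leaf front-panel port to an existing EPG."""
    path = f"topology/pod-{pod}/paths-{leaf}/pathep-[eth{port}]"
    payload = {"fvRsPathAtt": {"attributes": {
        "tDn": path, "encap": encap, "mode": mode, "instrImmedcy": "immediate"}}}
    url = f"{APIC}/api/mo/uni/tn-{TENANT}/ap-{app}/epg-{epg}.json"
    session.post(url, json=payload, verify=False).raise_for_status()

def onboard_server(leaf, cimc_port, mgmt_ports):
    s = apic_session()
    # CIMC interface: untagged access port into the CIMC EPG
    bind_port(s, "SERVER_MGMT", "CIMC", 1, leaf, cimc_port,
              encap="vlan-100", mode="untagged")
    # ESXi out-of-band management vmnics into the VSPHERE_MGMT EPG
    for port in mgmt_ports:
        bind_port(s, "VSPHERE_INFRA", "VSPHERE_MGMT", 1, leaf, port,
                  encap="vlan-200", mode="untagged")
    # The vMotion/NFS vmnics need no static binding: those EPGs use the VMM domain.
    # Remaining steps are out of scope for this sketch:
    #   configure_cimc_and_pxe_boot(...)   # hypothetical: CIMC API + kickstart
    #   join_vsphere_cluster(...)          # hypothetical: vCenter API, maintenance mode

onboard_server(leaf=102, cimc_port="1/10", mgmt_ports=["1/11", "1/12"])
```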

The following video demonstrates this simple application automating the addition of a third server to the cluster running Acme Co.'s secure infrastructure in a matter of minutes:

 

Once again, this quick demo shows how simple it is to achieve full network automation thanks to the APIC. The same result could naturally be achieved using more sophisticated automation tools like Cisco UCS Director, which uses the APIC API and the UCS CIMC API and enables the creation of sophisticated workflows encompassing network, server, storage and virtualization aspects from multiple vendors.

 

Wrapping it all up …

The use of micro-segmentation combined with a white-list policy model contributes to enhancing the security posture inside the perimeter of the data center for three main reasons:

  • By allowing only the protocols and ports required in each micro-segment, we minimize the exposure to vulnerabilities.
  • By creating segments that can be as small as a single endpoint, the possibilities for lateral movement are greatly reduced.
  • And by dynamically assigning endpoints to the right micro-segment based on a number of endpoint attributes, automation can be accomplished in simpler ways.

Over the course of the last four blogs, we have seen that these benefits are available to applications that run in virtual machines, to the infrastructure itself, and to any type of physical endpoint. We have also seen that the APIC provides complete visibility of the applied policies and endpoint locations, as well as automatic correlation of events, statistics and configuration changes, simplifying audits, compliance and Day-2 operations in general from a single tool and interface.

But perhaps most important: we have seen that you do not need to deploy and operate two networks, one physical and one virtual, in order to achieve all of these benefits. A single programmable fabric is all that is required. This translates into a lower TCO when compared to alternatives, and also ensures that the security benefits provided through micro-segmentation are not restricted to a single vendor's virtualization platform or operating system.

Security has to be everywhere or it is not security at all.