How does Openflow, SDN help Virtualization/Cloud (Part 2 of 3)

Using Openflow – state of the ART

In my last article I discussed the components of Openflow and building blocks of a Software Defined Network. In this part, let me discuss some of the things people are doing to make it all work. One of the pieces that needs to be discussed beforehand is the various ways in which a packet can be matched against a flow and what kind of actions can be taken.

Flow Classification and the split between Hardware and Software

A flow is a simple mechanism to identify a group of packets on the wire. So packets coming from a particular machine can be identified by the machine’s MAC or IP address, which appears as the source MAC in the L2 header or the source IP in the L3 header. By putting a flow rule around either of those fields and just counting the packets going through the switch that hit that rule, we can determine the number of packets being sent by the machine. It’s useful information. To make it more useful, one could add another flow to measure the packets going to our target machine. Adding a destination MAC or destination IP rule based on the machine’s MAC or IP address will accomplish that, and using our 2 flows we can find out how many packets are coming into and going out of the machine.
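
To make the idea concrete, here is a minimal sketch in plain Python (all field names and addresses are hypothetical) of classifying packets against the two flow rules above and counting the hits. A real switch would do the matching in hardware; this only illustrates the concept.

    # Minimal sketch of flow classification with per-flow packet counters.
    TARGET_MAC = "00:11:22:33:44:55"   # hypothetical MAC of the machine we measure

    flows = [
        {"name": "from-machine", "match": {"src_mac": TARGET_MAC}, "packets": 0},
        {"name": "to-machine",   "match": {"dst_mac": TARGET_MAC}, "packets": 0},
    ]

    def classify(headers):
        """Bump the counter of every flow whose match fields all appear in the packet."""
        for flow in flows:
            if all(headers.get(k) == v for k, v in flow["match"].items()):
                flow["packets"] += 1

    # Two packets seen on the wire: one sent by the machine, one sent to it.
    classify({"src_mac": TARGET_MAC, "dst_mac": "aa:bb:cc:dd:ee:ff"})
    classify({"src_mac": "aa:bb:cc:dd:ee:ff", "dst_mac": TARGET_MAC})

    for flow in flows:
        print(flow["name"], flow["packets"])   # from-machine 1, to-machine 1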

The next question is: who implements the rules we just discussed? There are several options, each with its own advantages and disadvantages:

  • Server Based Approach – This is the easiest way since the server already has to process the packets and it can easily keep track of some statistics as well. The issue arises when it is not a real server but a virtual machine on the server that we want to track. We can still let the hypervisor track the packets or ask the Virtual Machine to track them. The big disadvantage of this approach is that asking the server to do things on your behalf needs a certain level of trust (security holes and digital certificates come to mind), depends on Operating System capability, and translates directly into lower performance. Since the server hypervisor has to classify packets into flows and take actions on them, it needs to see the packets, making hardware based virtualization (SR-IOV) impossible to adopt. Most of the data center bridging and IO virtualization standards are moving towards a hardware based switch in the server, so doing things in the S/W layers of the server is not going to be possible. The other major disadvantage is scaling: as the number of servers grows, orchestrating policies becomes more difficult.
  • Server based with H/W offload – There is more talk around this than real implementations, but it is worth mentioning that people have discussed putting special capabilities in the NICs on servers to offload some flow processing. The advantage is performance and security (since the Hypervisor controls the NIC, the Virtual Machine can’t circumvent it). The disadvantage is cost and scaling. The chips capable of doing this (TCAMs etc.) are expensive, and trying to orchestrate across large numbers of servers severely limits the scale. We are already seeing the Intel Sandy Bridge architecture coming to life with integrated 10GigE NICs; adding TCAMs would increase the basic cost by $800-900 and also add significant complexity.
  • Probe Based Approach – Have probes in the network analyze flows. There are companies out there that specialize in inserting probes in the network and collecting the data, and they can do this well as long as you only want to observe things. If redirection, traffic shaping, header modification or other actions are needed, these passive probes will not work. In addition, inserting them requires intrusive work in cabling etc. Needless to say, this is one of the least favorite approaches.
  • Switch based approach – Since all the traffic passes through the switches anyway, having them deal with flows and take the associated actions makes a lot of sense. Modern switch chips have H/W based CAMs and TCAMs which can take a rule and apply it without hurting the latency or throughput of the packet stream. In my past life, as Architect of Solaris Networking and Virtualization, I took the software based approach, but given the growing Virtual Machine density, SR-IOV type features, and the growing need for analytics and traffic shaping with performance, I think the switch based approach is far superior. So the CAM and TCAM measuring flows is the hardware piece. The software piece is the ability to add and delete rules on the fly (see the sketch right after this list), and Openflow provides a pseudo standard that allows a programmer to program any switch. But the biggest advantages of this approach are scale, ease of use, and administrative separation. The scale comes from orchestrating your flows and policies across a smaller number of devices (roughly one switch per 50 servers). Also, the people in charge of networks and storage networks are at times different, and keeping the administrative separation is useful although not required.
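
As a rough illustration of the hardware/software split in the switch based approach, here is a sketch of the software piece: a small rule manager that adds and deletes flow rules on the fly, while the dictionary below merely stands in for the hardware CAM/TCAM that does the real line-rate matching. All class and method names are hypothetical, not any vendor’s or Openflow library’s API.

    # Sketch of on-the-fly rule management; the hardware does the actual matching.
    class FlowRuleManager:
        def __init__(self):
            self._rules = {}        # rule_id -> (match fields, action)
            self._next_id = 0

        def add_rule(self, match, action):
            """Program a new rule (e.g. pushed down by an Openflow controller)."""
            rule_id = self._next_id
            self._next_id += 1
            self._rules[rule_id] = (dict(match), action)
            return rule_id

        def delete_rule(self, rule_id):
            """Remove a rule when a VM goes away or a policy changes."""
            self._rules.pop(rule_id, None)

        def lookup(self, headers):
            """Return the action of the first rule matching the packet headers."""
            for match, action in self._rules.values():
                if all(headers.get(k) == v for k, v in match.items()):
                    return action
            return None

    # Usage: mirror one tenant's web traffic for analytics, then retire the rule.
    mgr = FlowRuleManager()
    rid = mgr.add_rule({"dst_ip": "10.1.2.10", "tcp_port": 80}, action="mirror")
    print(mgr.lookup({"dst_ip": "10.1.2.10", "tcp_port": 80}))   # -> mirror
    mgr.delete_rule(rid)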

So needless to say, I have taken the approach of solving this problem on the switch, in conjunction with coordinating with the host using standards like EVB/DCB, which we will discuss at a later time. Given that new generation switch chips are pretty similar in complexity to a server CPU, the problem begs for a real Operating System on top which can help us write openflow based applications. This is where we step in. One part of the Pluribus Networks effort is implementing a distributed network hypervisor (called Netvisor™, a key component of the nvOS™ Operating System) to give openflow programmers real teeth. We treat any switch chip the same as a server chip: most of the code is platform independent, with very little written to the chip’s instruction set, just the same way the same Linux code (with a little platform specific stuff) runs on x86 and Power, or OpenSolaris code runs on x86 and SPARC.

Current implementations

So here is a little overview of the projects and people who are leading the charge in the brave world of flows and Software Defined Networking. Before raking me over the coals for missing things, the list below is what I consider mainstream implementations that apply in the world of data centers today (Disclaimer: I have purposely left out most of the research efforts that didn’t reach a mainstream product since there are too many):

  • The discussion has to start with project Crossbow, which I believe is the first flow implementation with the dedicated H/W resources approach; it was available in OpenSolaris in 2007 and finally shipped in Solaris 11 (delayed due to the Oracle/Sun merger). The virtual-switching-in-host and H/W based patents (7613132, 7643482, 7613198, 7499463, etc.) were filed by me and fellow conspirators from 2004 onwards and awarded from 2009 onwards. Keep in mind that when Crossbow had virtual switching with a H/W classifier running in OpenSolaris, Xen etc. were just coming out with S/W based bridging. The two commands – flowadm and dladm – allow users to create flows and S/W or H/W based virtual NICs that can be assigned to virtual machines. This is the Server Based Approach discussed earlier that ships in a mainstream OS and is pretty widely deployed.
  • An approach similar to the above was later adopted by our fellow company Nicira in the form of their NVP Architecture. They enhanced the offering by allowing an Openflow based Orchestrator to control the virtual switching in the host, although their focus has primarily been on the virtualization side and not so much on the application flows side.
  • Another of our sister and partner companies, Big Switch Networks, has taken a hybrid approach of orchestrating any openflow capable device, which can be a switch or a virtual switch inside a hypervisor. Since they are still partially in stealth, it would not be my place to talk details.
  • Obviously, every existing network vendor claims that they are working on SDN and openflow. But by definition, SDN requires programmability and Operating Systems to run your programs on. Most of the existing network vendors lack the know-how or the ability to do this. They have rich bank balances, and if they can acquire the right companies and leave them alone, then they can potentially bridge the chasm (although it is going to be painful).

And then there is our effort at Pluribus Networks. It’s a well kept secret that we are building Server-Switches that run Netvisor™, which has massive flow capabilities and would be ideal for all the people developing things in the SDN space. But then we are in stealth mode and there is a lot more to us, which we will get around to discussing in coming days.

April 16, 2012 at 7:18 am 3 comments

How does Openflow and SDN help Virtualization/Cloud

Introduction to Software Defined Networking and OpenFlow

Often I hear the terms Openflow and Software Defined Networking used in many different contexts, ranging from solving something simple and useful to literally solving the world hunger problem (or fixing the world economy for that matter). I often get asked to explain the various aspects of how Openflow is changing our lives. So here goes an explanation of the religion called Openflow (and Software Defined Networking) and the various ways it is manifesting itself in our day to day lives. Again, it is too much to write in one article, so I will make it a series of 3 articles. This one focuses on the protocol itself. The 2nd article will focus on how people are trying to develop it and some end user perspective that I have accumulated in the last year or so. The last article in the series will discuss the challenges and what we are doing to help.

Value Proposition

The basic piece of Openflow is nothing more than a wire protocol that allows a piece of code to talk to another piece of code. The idea is that for a typical piece of network equipment, instead of logging in and configuring it via its embedded web or command line interface (the way you configure your home wifi router), you can get the Controller from someone other than the equipment vendor. Now technically, and in the short term, you are probably worse off because you are getting the equipment from one guy and the management interface from another guy, and there are bound to be rough edges.

Openflow creates a standard around how the management interface or Controller talks to the equipment, so the equipment vendors can design their equipment without worrying about the management piece, and someone else can create a management piece knowing well that it will manage any equipment that supports Openflow. So people who understand standards ask: what’s the big deal? I still can’t do more than what the equipment is designed to do!! And that is the holy grail around any standard. By creating the standard, you are separating the guys who make equipment to focus on their expertise and the guys doing management to make the controllers better. This is in no way different from how computers work today: Intel/AMD create the key chips, vendors like Dell, HP etc. create the servers, and the Linux community (or BSD, OpenSolaris, etc.) creates the OS, and it all works together offering a better solution. It achieves one more thing – it drives the H/W cost lower and creates more competition, while allowing an end user to pick the best H/W (from their point of view) and the best controller based on features, reliability, etc. There is no monopoly, there are plenty of choices, and it is all great for the end user.

This matters especially in the networking space, where innovation has been lacking for a while and a few companies were used to huge margins because users had no choice. One trend that is driving the fire behind SDN is virtualization. Both the server and storage sides (H/W and OS) have made good progress on this front, but the network is far behind. By opening up the space, SDN is allowing people like me (who are OS and Distributed Systems people) to step into this world and drive the same innovation on the network side. So Openflow/SDN are great standards for the end user, and people who understand them see the power behind them.

Key Features

The Openflow Spec 1.1.2 is just out with minor improvements, while 1.1.1 has been out for a few months. Most of the vendors only have 1.0.0 implemented. If you look at the spec, you will see the data structures and message syntax needed for a controller to talk to a device it wants to control. Functionality-wise, it can be grouped into the following parts (understand that I am trying to help people who don’t want to read hundreds of pages of specs):

  • The device discovery and connection establishment part, where you tie a controller to a device that it wants to control.
  • Creating the Flows. In a typical network there are different types of traffic mixed in, and packets can be grouped together in the form of flows. If you look at the Layer 2 header, packets for the same VLAN can be a flow, packets belonging to a pair of MAC addresses can be a flow, and so on. Similarly, packets belonging to an IP subnet, or to an IP address plus TCP/UDP port (service), can be termed a flow. Any combination of Layer 2, 3 and 4 headers that allows us to uniquely identify a packet stream on the wire is termed a flow, and the Openflow protocol makes special efforts to specify these flows. An Openflow controller can specify a flow to a switch, which can apply to specific ports or to all ports, and ask the switch to take special actions when it matches a packet to a flow.
  • Action on matching a flow. As part of specifying the flow, the protocol allows the controller to specify what action to take when a packet matches the flow. The actions range from copying the packet, decrementing the Time to Live, changing/adding a QoS label, etc. But the most important action (in my view) is the ability to direct the original packet (or a copy) to a specific port or to the controller itself.
  • The Flow Table where the flows are created. For an actual device, this is typically the TCAM where the flow is instantiated and applied to incoming packets. Most devices are pretty limited by this and can typically support only a very small set of flows today. The protocol allows for specifying multiple tables and the ability to pipeline across those tables, but given the state of today’s and mid-term hardware, a single table is all we can work with.
  • The last piece is the Counters. Most devices support port level counters which the openflow controllers can read. In addition, the protocol supports flow level counters, but the current set of devices is very limited on that as well. (A rough sketch of a complete flow entry follows this list.)
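
A single flow-table entry as described in the list above can be modeled roughly as follows. This is a sketch in plain Python; the field and action names are illustrative, not the exact Openflow wire format.

    # Rough model of one flow entry: an L2/L3/L4 match, a list of actions,
    # and the per-flow counters a controller can read back.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class FlowMatch:
        in_port:  Optional[int] = None   # ingress port, or any
        vlan_id:  Optional[int] = None   # L2: VLAN
        src_mac:  Optional[str] = None   # L2: source MAC
        dst_mac:  Optional[str] = None   # L2: destination MAC
        src_ip:   Optional[str] = None   # L3: source IP
        dst_ip:   Optional[str] = None   # L3: destination IP
        tcp_port: Optional[int] = None   # L4: TCP/UDP service port

    @dataclass
    class FlowEntry:
        match:   FlowMatch
        actions: List[str] = field(default_factory=list)  # e.g. "output:3",
                                                          # "send-to-controller",
                                                          # "dec-ttl", "set-qos:5"
        packets: int = 0   # per-flow counters
        bytes:   int = 0

    # Example: copy all packets for a web service at 10.0.0.5 to the controller
    # while still forwarding them normally.
    entry = FlowEntry(match=FlowMatch(dst_ip="10.0.0.5", tcp_port=80),
                      actions=["send-to-controller", "output:normal"])
    print(entry)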

Putting it all together

So now that we understand the components, we can see how it all works. A controller (which is a piece of code) running on a standard server box starts and discovers a device that it wants to manage. In today’s world, that device is typically an ethernet switch. Once connected, the controller puts the device under its control, sets flows with actions, and reads status from the device.

As an example, assume that a user is experimenting with a new Layer 3 protocol: he can add a flow that makes the switch redirect all matching packets to the controller, where each packet gets modified appropriately and redirected through a specific egress port on the device. This is much easier to implement since the controller itself is a piece of code running on a standard OS, so adding code to it to do something experimental is pretty easy. The most powerful thing here is that the user is not impacting the rest of the network and doesn’t need his/her own dedicated network.
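
A sketch of what that experiment could look like on the controller side is below. All class and method names are hypothetical stand-ins (not any real Openflow controller framework); the user’s code only has to install the redirect flow and handle the packets that arrive.

    # Hypothetical controller-side sketch of the experimental-L3-protocol example.
    EXPERIMENTAL_ETHERTYPE = 0x88B5   # an ethertype reserved for experimental use

    class FakeSwitch:
        """Stand-in for an Openflow-managed switch."""
        def add_flow(self, match, actions):
            print("installed flow:", match, "->", actions)
        def send_packet(self, packet, out_port):
            print("sent", len(packet), "bytes out port", out_port)

    class ExperimentalL3App:
        def __init__(self, switch):
            self.switch = switch
            # Only the experimental protocol is punted to the controller;
            # all other traffic is forwarded by the switch as usual.
            switch.add_flow(match={"eth_type": EXPERIMENTAL_ETHERTYPE},
                            actions=["send-to-controller"])

        def on_packet_in(self, packet, in_port):
            modified = self.rewrite_header(packet)
            # Push the rewritten packet out a specific egress port on the device.
            self.switch.send_packet(modified, out_port=5)

        def rewrite_header(self, packet):
            # Placeholder for whatever the new Layer 3 protocol needs to do.
            return packet

    app = ExperimentalL3App(FakeSwitch())
    app.on_packet_in(b"\x00" * 64, in_port=1)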

My own favorite (one that we have experimented with) is a debugging application for a data center or enterprise, where a user needs to debug his own client/server application. The user can try to capture the packets on the multiple machines running his clients and server, but the easier thing is to set a flow on the switch based on the server IP address and TCP port (for the service), with an action that sends a copy of all matching packets to the controller with a timestamp. This allows the user to debug his application much more easily.
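
A rough sketch of that debugging flow follows, again with hypothetical names and a stand-in switch object rather than a real controller API: one flow copies every packet for the server’s IP and TCP port to the controller, which timestamps and stores it for later analysis.

    # Sketch of the application-debugging use case described above.
    import time

    class FakeSwitch:
        """Stand-in for an Openflow-managed switch."""
        def add_flow(self, match, actions):
            print("installed flow:", match, "->", actions)

    class AppDebugTap:
        def __init__(self, switch, server_ip, tcp_port):
            self.trace = []   # (timestamp, packet) pairs for later analysis
            switch.add_flow(
                match={"dst_ip": server_ip, "tcp_port": tcp_port},
                actions=["copy-to-controller", "output:normal"])  # copy, keep forwarding

        def on_packet_in(self, packet, in_port):
            self.trace.append((time.time(), packet))

    tap = AppDebugTap(FakeSwitch(), server_ip="10.0.0.5", tcp_port=8080)
    tap.on_packet_in(b"GET / HTTP/1.1\r\n", in_port=7)
    print(len(tap.trace), "packets captured")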

Again, the power of Openflow and Software Defined Networking is that it allows people to innovate and solve their problems by writing simple code (or using code provided by others). It is important to keep in mind that the switch is a really powerful device, since everything goes through it, and allowing it to be controlled by C, Java, or Perl code is very powerful. The control moves from the switch designer to application developers (to the discomfort of the switch vendors 🙂)

So finally, how does it help Virtualization and Cloud?

This is the reason why I am so excited and ended up spending time writing this blog. The key premise in the world of virtualization is dynamic control of resource utilization. Again, network utilization and SLAs are important, but the key part we need to solve is the utilization of servers. The holy grail is a large pool of servers, each running 20-50 virtual machines, controlled by software which optimizes for CPU/memory utilization. The key issue is that the Virtual Machines are grouped together in terms of the application they run or the application developer that controls them. To prevent a free-for-all, they are typically tied together with some VLAN or ACL configuration, have a network identity in terms of IP/MAC addresses, and have SLA/QoS settings, etc. For the controlling software to migrate a VM freely, it wants to manage the VM’s network parameters on the target switch port as well. And this is where the current generation of switches fails. They require human intervention to configure the various network parameters on the switch that match the VM.

So today, in order for a VM to migrate freely under software control, human intervention is still required on the network side. With Openflow, the software orchestrating server utilization by scheduling the VMs based on policies/SLAs can set the matching network policies without human intervention.
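
To make that concrete, here is a hedged sketch (all names hypothetical; the switch objects stand in for Openflow-controlled devices) of a migration hook: when the orchestrator moves a VM, its VLAN, MAC and QoS flows are removed from the old switch port and programmed onto the new one, with no human in the loop.

    # Hypothetical sketch of network policy following a VM during migration.
    class FakeSwitch:
        def __init__(self, name):
            self.name = name
        def add_flow(self, port, match, actions):
            print(f"{self.name}: port {port} += {match} -> {actions}")
        def delete_flows(self, port):
            print(f"{self.name}: cleared flows on port {port}")

    def apply_vm_policy(switch, port, vm):
        """Program the VM's VLAN, MAC and QoS as flows on its switch port."""
        switch.add_flow(port,
                        match={"dst_mac": vm["mac"], "vlan": vm["vlan"]},
                        actions=["set-qos:%d" % vm["qos"], "output:%d" % port])

    def migrate_vm(vm, old_switch, old_port, new_switch, new_port):
        # The VM scheduler decided to move the VM; move its network identity too.
        old_switch.delete_flows(old_port)
        apply_vm_policy(new_switch, new_port, vm)

    vm = {"mac": "00:11:22:33:44:55", "vlan": 100, "qos": 5}
    migrate_vm(vm, FakeSwitch("rack1-sw"), 12, FakeSwitch("rack7-sw"), 3)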

Just the way a typical server OS has a policy driven scheduler which controls the various application threads on dozens of CPUs (yes, even a low end dual socket server has 6 cores per socket, each with multiple hardware threads), Openflow allows us to build a combined server/storage/network scheduler that can optimize VM placement based on configured policies.

Again, Openflow is just a wire protocol and a pseudo standard, but it allows people like me to add huge value which wasn’t possible before. In the next article, we will go deeper into what people are trying to build and look at some more specific use cases. Stay tuned and Happy Holidays!!

December 21, 2011 at 3:04 am 3 comments

Network 2.0: Virtualization without Limits

So the theme of the day is Network Virtualization, Software Defined Networks, and taking virtualization to its logical conclusion, i.e. server, storage and network in a giant resource pool that can be allocated/assigned any which way. Although it is easier said than done. Server and storage virtualization were a bit simpler since we were dealing with one OS that needed to provide the right abstraction layer. The H/W resource pool (disk, cpu, network, memory, etc.) was managed by the single OS, so provisioning it between various virtual machines or storage pools was a bit simpler. The network, by definition, is useful only when multiple devices are connected, and trying to treat them as a single resource pool is harder. A virtual network has to deal with not just links, bandwidth, latency and queues but also higher level functionality like routing, load balancing, firewalling, DNS, DHCP, VPN, etc. And we haven’t even talked about how this will all hook up together along with virtual machines and virtual storage pools in an easy manner. Now before you argue that every component is already virtualized (which is very true), one could argue that it still doesn’t give me a virtual network. It is the same as someone wanting dinner and instead being served raw potatoes, onions, tomatoes, eggs, etc. and shown the stove to make his own omelette.

So, having pioneered virtual switching and resource control in the server OS (Solaris to be specific – the project was called Crossbow, which I started in 2003 and which was integrated into OpenSolaris in 2008), I set out to do the same for larger networks in the form of Pluribus Networks Inc and apply the hard lessons we learned from enterprise customers. This is what we call Network 2.0: Virtualization without Limits. The real reason it is a tough problem to solve is that switching needs to be very high performance and low latency. This forces all the switching functionality into a highly complicated ASIC which does all the hard work of shuffling 1.2 Terabits per second of data at sub-microsecond latencies, and as such doesn’t need much software on top. The embedded OS controlling the switch is mostly used for configuring the switch chip via a cli (command line interface) that allows the administrator to control and configure each component on the switch, but almost nothing else. So when we started playing with some of the prototype next generation boxes that our friends at Fulcrum and Broadcom gave us, we just kept asking if we could have a real OS running the chip to be able to do something more useful. So we asked our friends again if there was some way to put a full fledged OS on top (being the OS person I have been for most of my life). And that was when I realized that to solve the network virtualization problem, we really need an OS that understands resource pools and virtualization on the chip. But a single switch by itself is not very interesting, so we need an OS that controls all the switches. Hmm – one OS that controls them all (borrowing from LOTR, which reminds me to ask Peter Jackson whatever happened to the prequel)!! So before we could even start building anything more complicated, we built a network hypervisor that has semantics similar to a tightly coupled cluster but controls a collection of switches and scales from one instance to a hundred plus instances.

The Network OS is finally taking life and is able to treat the network as one giant resource pool. Please don’t confuse the Network OS with a typical management layer that manages a collection of devices. We still need a management layer to configure and manage the OS, but the policy enforcement, congestion control and resource management across all devices is done by the OS. It is the same as how a server cluster doesn’t get rid of the management layer but actually gives the management layer something that is more manageable.

June 19, 2011 at 6:49 pm 2 comments

Solaris as an Open Source alternative to Linux

When I left Solaris after the Sun/Oracle merger, it was because I wanted to try some new things in life, possibly based on OpenSolaris. I had led Solaris in the networking and network virtualization space for a long time and wanted to make a bigger mark in that space compared to what Oracle might have wanted. But my hope was that Solaris as an Open Source Operating System would continue to prosper and I could possibly use OpenSolaris as a base for whatever I decided to do next. Well, the exodus from Solaris has continued over the past few months and now Mike has also decided to call it quits. Mike was one of my counterparts, running the storage side of the house (other leaders in the storage and filesystem space, like Jeff and Bryan, had already bailed out of Solaris a few months after I left).

So at this point, I am forced to consider the fact that Solaris and OpenSolaris are on the brink of death unless something serious is done about it. Having spent so much time and energy over the last 15 years on Solaris (including bringing it back to life after the last tech bust, when Solaris had been labeled Slowlaris), I would personally like to see it go on. Given the richness of Solaris and what it offers to developers, the opensource community doesn’t deserve to lose it. We had relentlessly added APIs for all the networking and virtualization code (Crossbow and Zones, to name a few) in the past few years. Dealing with creating VNICs, walking links, creating Zones, etc. from a developer’s point of view is very easy (more on what’s there for developers some other day).

So the question I have been pondering for the last few weeks is what it takes to create a truly vibrant OpenSource kernel as an alternative to Linux. During the Sun days, we had tried to set up Solaris as an open source alternative to Linux and we moved all development, processes, architectural review, etc. into the open, but somehow the community never truly believed us. But now, with Oracle having closed-sourced the OS and struggling to keep it alive, there might truly be an opportunity for OpenSolaris to be reborn as a true opensource alternative to Linux. There is already some effort in the form of illumos, led by Garrett, and an OpenIndiana distro. Since Garrett works for Nexenta, which has a business based on Solaris, I do believe they can throw some resources at keeping it running.

What does an open source OS need?

There are several things that are needed in the short term that don’t take too many resources. Not in any particular order of importance:

  • Drivers for new devices
    I have seen Garrett personally dish out drivers faster than people can install and test them, so he can at least keep one part of the OS alive, i.e. drivers for new devices.
  • Packaging, Delivery and Install
    Then there is the packaging, delivery etc. which someone has to pick up. Perhaps the OpenIndiana guys can make that their core competency. Maybe they can finish the IPS system and make the changes for the file based URIs that it had already started to move towards. Things like:

    • Allow someone to make a non-network clone of the repo (at least for the true opensource packages, including the kernel). This allows people to feel that they truly have control over all aspects of the kernel without needing a repo server that they don’t control. Maybe something as simple as installing a server from the network repo along the lines of
      % pkg image-create file:///var/pkg file:///my_repo
      % pkg image-create -g origin_server file:///my_repo
    • Allowing someone to save a copy of a package locally that he can later install. Something like this
      % pkg clone [-g origin_server] pkg_fmri_pattern /local_path
    • And allow someone to install the saved package, overriding the dependency check if needed.
      % pkg install [-f] file:///net/hostname/local_path
      % pkg install [-f] [-g origin_server] pkg_fmri_pattern

    Of course, the directions above are just some thoughts – the details need to be refined based on input from end users. The hosting etc. necessary for repo servers for delivery is perhaps the easy part. The install is another story altogether, given that a large amount of the code for automatic install has not even been opensourced, but I think we can get by for a few years before that has to be addressed.

  • Bug fixing
    Again, given the Solaris talent now outside of Oracle, this is easy. For Bryan, who works at Joyent, which again depends on Solaris, doing some P1 fixes is easy. I can do some critical fixes when necessary, and there are so many others now outside of Oracle that we need to reach out to.
  • New Platform Support
    As new chips come out, you need to add support at the minimum. Now we are getting into some tricky business. I have been discussing the idea of an OpenSolaris Foundation with some companies that support such open source initiatives. I have discussed it with some of the ex-Sun DEs and Sr. Engineers as well, and it seems appealing to people. Two things needed to happen for this to take off. One was Oracle truly killing OpenSolaris so there is a clean fork (which I think has already happened). The other is a harder problem, and it is the last thing on my list.
  • Mission and Innovation
    An open source OS for the sake of another OS is not very palatable to the people who can fund the foundation. I got suggestions around the mobile space, for which I don’t think Solaris is the ideal OS (yet). There seems to be interest in the direction of evolving Solaris into a distributed OS for the cloud space. This gels with what I have been working on along with a few others – a distributed network operating system geared towards clouds. Maybe moving my effort on top of Solaris would provide the mission in one direction (and my own requirement for using Solaris has now been met, i.e. Oracle should kill it so there is no ambiguity). There is a revolution happening in that space already with Openflow, Trill, etc., and I am trying to figure out how to break the last barrier in the datacenter.

So to answer the question "What have I been up to?" – now you know. And to answer the question "Will I allow my work on Solaris to die?" – I guess the answer is a resounding No. Will I port things like Crossbow and Zones to FreeBSD? – the answer is: if OpenSolaris truly dies, then hell yes. And before someone points out that Zones is really the same as BSD Jails, you should look again carefully.

So Solaris users and Solaris lovers, I would love to hear your thoughts.

October 23, 2010 at 9:23 am 15 comments

It’s not a goodbye. Leaving Oracle but not Solaris!!

This is probably one of the most difficult entries I have ever written. I have decided to leave my job at Oracle. I don’t have a forward destination yet, but I intend to take some time thinking about it before I take the next step. I am leaving Oracle, but I will still be involved with Solaris and OpenSolaris in some form or another. Having spent 14 years writing a million+ lines of code and architecting some of the most complex subsystems, I don’t intend to just walk away.

The last 2-3 days have been a very emotional journey for me. I thought I was a very strong willed person, but it was amazing how many times I came close to tears when so many people stopped by. All I can say is that I am so grateful the community feels I have done something useful (both personally and professionally) for Solaris. The journey has been nothing but wonderful and I will surely miss everyone. But I have learned one thing in the last several years – never say goodbye, because our paths will cross again!!

Best of luck to everyone in the Solaris community who helps make it the best operating system on the planet, and especially to the people who help write it. Keep the flag flying!! With Oracle, Solaris will reach more places …

April 2, 2010 at 7:17 am 9 comments

Network in a box

Crossbow Virtual Wire allows us to create a full fledged network comprising Hosts, Switches and Routers as a Virtual Network on a laptop. The Virtual Network is created using the OpenSolaris project Crossbow technology, and the hosts etc. are created using Solaris Zones (a lightweight virtualization technology). All the steps necessary to create the virtual topology are explained.

Continue Reading March 25, 2010 at 7:10 am 1 comment

CrossBow: Solaris Network Virtualization & Resource Provisioning

Crossbow provides network virtualization and resource provisioning to the OpenSolaris Networking Stack. An architect’s overview of the stack to understand how it works at a high level and why it performs way better than other architectures.

Continue Reading March 25, 2010 at 6:32 am 7 comments
