FCoE vs. iSCSI: The Cagefight! – Performance

In a previous post, I posed a series of questions about the nature of the FCoE and iSCSI marketplaces. In this post I’m going to address one of those questions:

Is iSCSI fast enough at 10GbE, 40GbE, and 100GbE – even with its overhead concerns – to rival the performance and efficiency of FCoE?

[2012.08.09 Update: This has finally been tested and I’ve written an update about the performance of the two protocols on my Cisco Blog]

Defining The Terms

I want to try to avoid the “yeah, but” or fanboi comments from the outset. First, I understand FCoE much, much better than I understand iSCSI. So, there may be some specifics or details that I am missing, and I highly encourage corrections or additions. My motive here is to examine the technologies in as detached and unbiased a manner as possible to get to the true performance numbers.

Also, I’m looking here at the question of performance. By itself, performance is a Pandora’s box of “it depends,” and I understand and accept that burden from the get-go. Performance, like price, must be handled as a purchase criterion in context, so I’m not suggesting that any recommendations be made solely upon any one element over another.

Having said that, what exactly are the performance concerns we should have with iSCSI vs. FCoE?

The Nitty Gritty

At first glance, it appears that FCoE provides a more efficient encapsulation method: Fibre Channel frames ride directly inside standard Ethernet frames at Layer 2. There is no need to travel as far up and down the OSI layer stack, for example, which means that there is less processing required on either end of a point-to-point network for dealing with additional headers.

If you’re new to this, think of it this way: You have a letter you want to send to Santa Claus. You write your letter and place it in an envelope and then drop it in the mail. That letter then arrives at the North Pole (if you addressed it properly) and Santa’s helpers open the letter and hand it to him. That’s the FCoE metaphor. (Actually, here’s a much better – and visually appealing – description).

How many layers?

The TCP/IP metaphor (with respect to layers) means that you have to take that letter to Santa Claus, place it into a larger envelope, and then put that larger envelope into a box before sending it on its way. The extra layers of packing and unpacking take time and processing power.

iSCSI requires more packing and unpacking in order to get to the letter, the argument goes, so over time that would mean that Santa would – in theory – be able to open fewer letters in the same amount of time.
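To put rough numbers on the metaphor, here is a minimal sketch comparing per-frame encapsulation overhead. The header sizes are assumptions I’m using for illustration (FCoE: Ethernet plus FCoE plus FC headers; iSCSI: Ethernet plus IP plus TCP headers, with the 48-byte iSCSI PDU header ignored because it is amortized across many segments), not vendor-measured figures.

```python
# Rough per-frame efficiency: payload bytes delivered per byte on the wire.
# All header sizes are assumptions for illustration, not measured values.

def efficiency(payload_bytes, overhead_bytes):
    """Fraction of each frame that is actual data."""
    return payload_bytes / (payload_bytes + overhead_bytes)

# FCoE: Ethernet framing (~18 B incl. FCS) + FCoE encapsulation (~18 B)
# + FC header/CRC (~28 B) carrying a full 2112-byte FC data field.
fcoe_payload, fcoe_overhead = 2112, 18 + 18 + 28

# iSCSI on a standard 1500-byte MTU: Ethernet (~18 B) + IP (20 B) + TCP (20 B);
# the 48-byte iSCSI PDU header is spread across many segments and ignored here.
iscsi_payload, iscsi_overhead = 1500 - 20 - 20, 18 + 20 + 20

print(f"FCoE  per-frame efficiency: {efficiency(fcoe_payload, fcoe_overhead):.1%}")
print(f"iSCSI per-frame efficiency: {efficiency(iscsi_payload, iscsi_overhead):.1%}")
```

The per-frame difference is small; the larger cost the metaphor points at is the extra CPU work of climbing the TCP/IP stack, which these byte counts don’t capture.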

There is evidence to suggest that this conventional wisdom may be misleading, however. There are a lot of factors that can affect performance, to the degree that a properly tuned iSCSI system can outperform an improperly configured FC system.

In fact, an iSCSI storage system can actually outperform a FC-based product depending on more important factors than bandwidth, including the number of processors, host ports, cache memory and disk drives and how wide they can be striped. (Inverted.com).

Ujjwal Rajbhandari from Dell wrote a blog piece comparing the performance of iSCSI, FCoE, and FC, in which he found that iSCSI’s efficiency gains can be profound, especially when jumbo frames are enabled.
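To illustrate why jumbo frames matter so much for iSCSI, here is a small sketch (using the same assumed header sizes as above, for illustration only) comparing payload efficiency at a standard 1500-byte MTU versus a 9000-byte jumbo MTU.

```python
# Effect of jumbo frames on iSCSI payload efficiency.
# Header sizes are assumptions for illustration only.
ETHERNET = 18   # Ethernet header + FCS
IP = 20
TCP = 20

def iscsi_efficiency(mtu):
    """Payload bytes per byte on the wire for a full-sized TCP segment."""
    payload = mtu - IP - TCP
    return payload / (mtu + ETHERNET)

for mtu in (1500, 9000):
    print(f"MTU {mtu}: ~{iscsi_efficiency(mtu):.1%} of each frame is data")
```

Fewer, larger frames also mean fewer interrupts and less per-packet processing on the host, which is where much of the real-world gain tends to show up.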

Dell’s measurements are somewhat difficult to place in context, however. Although the article was written in late October 2009, only 4Gb throughput was used, even though FCoE cards running at line rate had been available for more than half a year. (The graphs are also difficult to interpret: one of them doesn’t make much sense, as it presents CPU utilization as a continuum from reading to writing rather than as separate categories of activity.)

It seems to me that the whole point of understanding protocol efficiencies becomes more salient as speeds increase. The immediate question I have is this: if Dell points out that comparing 1GbE iSCSI efficiencies against faster FC speeds is inappropriate, why would Dell compare slower FC speeds and efficiencies against 10Gb iSCSI?
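For context, here are the nominal usable data rates of the links being compared. The figures are the commonly cited ballpark values (8b/10b encoding for 1GbE and 4/8Gb FC, 64b/66b for 10GbE), so treat them as approximations rather than vendor specifications.

```python
# Commonly cited ballpark figures for usable storage bandwidth per direction.
# Treat these as approximations, not vendor specifications.
usable_mb_per_s = {
    "1GbE iSCSI":          125,   # 1 Gbit/s of data
    "4Gb FC":              400,   # ~4.25 Gbaud with 8b/10b encoding
    "8Gb FC":              800,   # ~8.5 Gbaud with 8b/10b encoding
    "10GbE iSCSI / FCoE": 1200,   # 64b/66b encoding, ~10 Gbit/s of data
}

for link, rate in usable_mb_per_s.items():
    print(f"{link:>18}: ~{rate} MB/s")
```

That is the crux of the complaint: a 4Gb FC baseline simply can’t tell us much about how 10Gb iSCSI and 10Gb FCoE compare to each other.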

Link speed interacts with efficiency in non-obvious ways, too: when moving from 4Gb to 8Gb HBAs, even within a pure 4Gb switching environment using 4Gb storage, the overall throughput and bandwidth efficiency can increase significantly due to improved buffer-credit handling.
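As a rough illustration of why credit handling matters, here is a sketch of the textbook buffer-to-buffer credit limit: if a link doesn’t have enough credits to cover its bandwidth-delay product, the sender stalls waiting for R_RDY acknowledgements and never reaches line rate. The credit counts, frame size, and distance below are made-up inputs, not measurements of any particular HBA.

```python
# Buffer-to-buffer credit sketch: achievable throughput is roughly capped at
# (credits * frame_size) / round_trip_time. All inputs are illustrative only.

def credit_limited_throughput_mb_s(credits, frame_bytes, data_gbps, distance_km):
    """Return min(usable line rate, credit-limited rate) in MB/s."""
    light_in_fiber_km_s = 200_000                      # ~2/3 the speed of light
    rtt_s = 2 * distance_km / light_in_fiber_km_s      # frame out, R_RDY back
    serialization_s = frame_bytes * 8 / (data_gbps * 1e9)
    credit_rate = credits * frame_bytes / (rtt_s + serialization_s)
    line_rate = data_gbps * 1e9 / 8
    return min(credit_rate, line_rate) / 1e6

# ~6.8 Gbit/s is the usable data rate of an 8Gb FC link after 8b/10b encoding.
for credits in (8, 16, 64):
    mb_s = credit_limited_throughput_mb_s(credits, 2148, data_gbps=6.8, distance_km=50)
    print(f"{credits:>2} credits over 50 km: ~{mb_s:.0f} MB/s")
```

The distance here exaggerates the effect to make it visible; the same arithmetic is what lies behind the credit-handling gains described above.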

Nevertheless, there is plenty of evidence to suggest that iSCSI performance is impressive. In February, Frank Berry wrote an article about how Intel and Microsoft are tweaking iSCSI for enterprise applications, improving CPU efficiency and blasting through some very impressive IOPS numbers. Stephen Foskett has a very interesting article on how it was done and rightfully asks the more important question: can your storage handle the truth?

Now, it’s very easy to get sidetracked into other aspects of an FCoE/iSCSI decision tree. “Yeah, but…” becomes very compelling to say, but for our purposes here we’re going to stick with the performance question.

How much performance is enough?

Ultimately, the question comes down to the criteria for data center deployment. How much bandwidth and throughput does your data center need? Are you currently getting 4Gb/s of storage bandwidth in your existing infrastructure?

There is more to SAN metrics than IOPS, of course; you need to take it hand-in-hand with latency (which is where the efficiency question comes into play). Additionally, there is the question of how well the iSCSI target drivers have been written and tuned.
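The IOPS-versus-latency point can be made concrete with Little’s Law: the IOPS a host can sustain is bounded by its outstanding I/Os divided by the average latency per I/O. Here is a minimal sketch with made-up queue depths and latencies, purely for illustration.

```python
# Little's Law for storage: sustained IOPS ~= outstanding I/Os / average latency.
# Queue depths and latencies below are illustrative, not benchmark results.

def max_iops(queue_depth, latency_ms):
    return queue_depth / (latency_ms / 1000.0)

for qd, latency in [(1, 0.5), (32, 0.5), (32, 2.0)]:
    print(f"queue depth {qd:>2} at {latency} ms -> ~{max_iops(qd, latency):,.0f} IOPS")
```

Headline IOPS numbers, in other words, say as much about queue depth and latency as they do about the transport.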

So, obviously iSCSI can be highly tuned to deliver jaw-dropping performance when given the right circumstances. The question that comes to mind, then, is…

How does performance scale?

iSCSI best practices require a completely separate iSCSI VLAN or network, which helps dedicate bandwidth to SAN traffic. Nevertheless, what’s not clear is what happens to the performance at larger scales:

  • What happens with boot-from-SAN (e.g., PXE) environments?
  • What is the theoretical maximum node count?
  • What is the practical maximum node count?
  • What is the effect of in-flight security (e.g., encryption) upon performance? What is the threshold for performance degradation?
  • How does scaling affect the performance of IQN management (e.g., an iSNS name server)?
  • Where is the retransmission threshold for congestion, and what is its impact on the performance curve? (See the sketch just after this list.)
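On that last question, a classic back-of-the-envelope model for TCP under loss (the Mathis et al. approximation: throughput ≈ MSS / (RTT · √p)) gives a feel for how quickly retransmissions bend the performance curve. It is a textbook approximation applied to made-up inputs here, not a measurement of any particular iSCSI stack.

```python
# Mathis et al. approximation: TCP throughput ~= (MSS / RTT) * 1 / sqrt(loss_rate).
# MSS, RTT, and loss rates below are illustrative inputs, not measurements.
from math import sqrt

MSS_BYTES = 8960     # full-sized segment with 9000-byte jumbo frames
RTT_S = 0.001        # 1 ms round trip, e.g. with queueing delay under congestion

def tcp_throughput_gbps(loss_rate):
    return (MSS_BYTES / RTT_S) * (1 / sqrt(loss_rate)) * 8 / 1e9

for loss in (1e-6, 1e-5, 1e-4, 1e-3):
    print(f"loss rate {loss:.0e}: ~{tcp_throughput_gbps(loss):.1f} Gb/s "
          f"(model value; real throughput is capped at the 10Gb line rate)")
```

The √p in the denominator is the point: a small increase in drop rate can pull sustained throughput well below a 10Gb line rate.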

This is where my limited experience with iSCSI is likely to get me into trouble. I’m having a hard time finding the answers to those questions as it relates to 10Gb iSCSI, so I’m open to input and clarification.

Bottom line.

Even with these open questions about the factors that affect performance, it’s clear that iSCSI has the performance capability for data center storage traffic. There are other considerations, of course, and I’ll be addressing them over time. Nevertheless, I think it’s quite clear that, all things being equal (and yes, I know, they never are), iSCSI can easily put up the numbers to rival FCoE.


You can subscribe to this blog to get notifications of future articles in the column on the right. You can also follow me on Twitter: @jmichelmetz