Re-Examining FCoE and iSCSI Pros and Cons


It should be obvious that I’ve been doing a lot of thinking about FCoE over the past 2 years, despite only blogging about it for a couple of months (hey, I’ve been busy!). But someone brought up something recently that makes me second-guess the role that FCoCEE will play in the marketplace in comparison to, say, iSCSI. In particular, with vendors now offering 1 Million IOPS in their NICs, is there enough of an advantage performance-wise?

Now, this is more of an open-ended, brainstorming thinking-out-loud kind of blog post, as opposed to a declarative “this is my carefully considered analysis” kind of post. I mention this up front because this is usually when I make boneheaded statements about things about which I actually know better, but somehow forget when thinking through the process.

I’m not the first to do it on this subject. Scott Lowe did it with a “thinking out loud” post, to which Chad Sakac wrote a fantastic response.

I say this specifically because I have no intention of throwing any person, technology, company, or strategy under any fast-moving public transportation vehicles. Having said that, my thoughts examine a couple of different aspects of FCoE adoption as it relates to the overall business case.

Here’s what FCoE has going for it:

1) It’s Fibre Channel (FC).

Despite running over 10GbE, the protocol is pure FC, which is well-understood. The tools for management, provisioning, administration, etc. are familiar to enterprise data centers, and adopting FCoE does not require a wholesale shift to alternative SAN protocols.

2) It’s cool.

Large enterprise data centers with oodles of servers, networks and storage have a wee problem of power and cooling. As I’ve mentioned before, it wasn’t uncommon for customers to have up to 16 NICs/HBAs in their servers.

For grins and giggles, I decided to run a quick cost comparison of Cat 6a vs. TwinAx® power consumption (TwinAx® is the copper cabling appropriate for the TOR switching used for FCoE).

Without even looking at the costs on the FC side of things, the Ethernet cabling comparison alone was striking: Cat 6a draws 16W of power per port, while TwinAx® draws 0.1W (yes, that’s point-one watts).

The cost of running one Cat 6a cable for a year works out to $14.02 (assuming a very generous $0.10/kWh rate).

The cost of running one TwinAx® cable for a year works out to $0.09.

Skip to the punchline: One data center with 1000 servers, cost of power and cooling:

  • Cat 6a: $70,100/yr
  • TwinAx®: $225/yr

This is just for cabling. That’s not even including the power reduction from not needing to have both NICs and HBAs (let alone multiples of each) in every server.
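If you want to check that arithmetic, here’s a minimal sketch of the per-cable math, assuming only the per-port wattages and the $0.10/kWh rate quoted above (the 1,000-server totals obviously also fold in assumptions about cable counts per server and cooling overhead):

```python
# Minimal sketch: annual electricity cost of one cable/port running 24x7,
# using only the per-port wattages and $0.10/kWh rate quoted above.

HOURS_PER_YEAR = 24 * 365      # 8,760 hours
PRICE_PER_KWH = 0.10           # USD, the "very generous" rate assumed above

def annual_cable_cost(watts: float) -> float:
    """Annual electricity cost (USD) for one port drawing `watts` continuously."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000.0
    return kwh_per_year * PRICE_PER_KWH

print(f"Cat 6a  (16 W):  ${annual_cable_cost(16):.2f}/yr")   # ~$14.02
print(f"TwinAx (0.1 W):  ${annual_cable_cost(0.1):.2f}/yr")  # ~$0.09
```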

3) It’s got a distinct performance advantage.

Despite the hype about IOPS (which are important, don’t get me wrong!), they’re not the whole story. Latency becomes a big deal, especially the farther up the TCP/IP stack you go. As Chad points out, anything that depends on TCP/IP (like iSCSI) has to deal with ARP traffic, which can delay transmission on the order of seconds.

For those applications with strict SLAs, that’s simply unacceptable. FCoE keeps link negotiation to a minimum and expands the Ethernet frame size to about 2.5 KB (which means that, when comparing vendor IOPS capabilities for FCoE, you should look at the results at the next-highest block size measured, 4 KB).

With the link rate increasing from 10Gb to 40 and 100, it seems to me that you’re asking the TCP stack to handle a corresponding multiple of overhead transactions per second. The flow-control mechanisms (particularly variable flow control and windowing) add even more overhead that FCoE simply doesn’t have to deal with. If I’m correct about this (and there’s no guarantee I am; after all, I’m only a lowly CCNA, not a CCIE or iSCSI guru), it seems that the faster you go with iSCSI, the bigger the performance hit you’ll take. Am I wrong about this?
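To put a rough number on the encapsulation side of that argument, here’s a back-of-the-envelope sketch comparing static header overhead for iSCSI (Ethernet + IP + TCP + iSCSI) against FCoE (Ethernet + FCoE encapsulation + FC) on a 2 KB payload. The header sizes are typical values with no optional fields (no VLAN tags, TCP options, or iSCSI digests), and the sketch deliberately ignores the dynamic costs such as ARP, TCP acknowledgements, windowing, and retransmits, which is exactly where I suspect the bigger hit shows up as link rates climb:

```python
# Back-of-the-envelope encapsulation overhead for a 2 KB payload.
# Header sizes are typical values with no optional fields; TCP ACKs,
# windowing, and retransmits (the dynamic overhead) are NOT counted here.

PAYLOAD = 2048  # bytes of SCSI data per frame/PDU

# iSCSI over TCP/IP over Ethernet
iscsi_headers = {
    "Ethernet header + FCS": 18,
    "IPv4 header": 20,
    "TCP header": 20,
    "iSCSI Basic Header Segment": 48,
}

# FCoE: FC frame encapsulated directly in Ethernet (no IP/TCP)
fcoe_headers = {
    "Ethernet header + FCS": 18,
    "FCoE encapsulation (version/SOF/EOF/padding)": 18,
    "FC frame header": 24,
    "FC CRC": 4,
}

for name, headers in (("iSCSI", iscsi_headers), ("FCoE", fcoe_headers)):
    overhead = sum(headers.values())
    print(f"{name}: {overhead} bytes of headers per {PAYLOAD}-byte payload "
          f"({overhead / (PAYLOAD + overhead):.1%} of the wire frame)")
```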

4) It’s evolutionary, not revolutionary.

There’s no “R ’n’ R,” as in Rip and Replace, here. FCoE allows data center managers to implement it as an “add-on”: they don’t have to create an additional SAN silo that can’t be connected to existing infrastructure, nor must they replace thousands of NICs and HBAs with CNAs.

What this means is that FCoE becomes a more longitudinal, strategic approach to the data center. In many ways, this is different from the situational approach to determining connectivity that has been in practice for some time. Planning for long-range integration of legacy (and future 40/100GbE) systems becomes key, but it is ultimately rewarding in providing a unified business plan, not just a unified technological fabric.

5) It’s “wire-once.”

As Chad points out in his blog post, CEE (and by extension FCoE) means that once you wire for FCoE, the cabling is automatically applicable to a broader market (LAN/NAS/iSCSI). Given #4, this means that FCoE infrastructure opens up the data center to broader use of Ethernet storage.

For these reasons, and several others, FCoE has been touted as the foundation of “the Next Generation Data Center.”

However, the prediction of a victorious FCoE, triumphant over a hapless iSCSI, seems a bit premature. For one thing, as noted above, there have been several recent announcements of iSCSI/Hyper-V performance records that are, indeed, impressive.

Here’s what iSCSI has going for it:

1) It’s cheap.

Let’s face it: iSCSI is a fantastic technology for the non-enterprise space. The overall sandbox where iSCSI plays is much, much larger than the set of enterprise data centers that need the type and kind of connectivity FCoE is built for.

As David Dale pointed out over a year ago, iSCSI’s “sweet spot” is “in support of windows or Virtual Server environments: in tier 2 and tier 3 data centers in large organizations; in the core data center of small/medium enterprises; and in remote offices. In other words, iSCSI is usually chosen for environments that FC has had difficulty penetrating due to cost, complexity, functionality, and support issues.”

Indeed, FCoE is geared towards maintaining backwards compatibility with existing FC infrastructures, and is “unlikely to replace or displace continued iSCSI growth in these sweet spot environments.”

Nevertheless, 10GbE is not free. It involves purchasing 10GbE switches, NICs, and other CapEx items, which means the same types of budgetary constraints will be placed on iSCSI, even if the initiator portion is “free” in software.

2) It’s in a growth phase.

Because iSCSI was closely tied to Windows growth in Tier 2 and Tier 3 data centers (as opposed to FC in Unix environments in the Tier 1 and Tier 2 DCs), it’s easy to see that as the Windows deployment numbers increase, so will iSCSI.

According to IDC, iSCSI has consistently grown faster than the overall networked storage market:

Through the third quarter of this year iSCSI was expected to account for 13% of revenues in the networked storage market, with Fibre Channel accounting for 61% and NAS for 26%. In terms of capacity, iSCSI accounts for 15% of the networked storage market, with Fibre Channel SANs at a 52% market share and NAS with the remaining 33% of the market.

In terms of revenue growth, IDC estimates that the iSCSI market will grow 58% this year, vs. 17% for the overall networked storage market. And in terms of capacity growth, the iSCSI market is expected to surge 117% this year, vs. 90% for the total networked storage market (David Dale, Dec. 21, 2009).

3) It’s popular.

There are still a lot of Open Source people who are fond of the fact that iSCSI permits them to avoid vendor lock-in. It’s almost impossible to look at any FCoE vs. iSCSI conversation (though it’s a false dichotomy) and not find anti-Cisco, anti-Brocade, or any other anti-FC vendor sentiment.

Even leaving aside the emotional aspects, iSCSI has allowed IT organizations to do more with less, utilizing software initiators when necessary and hardware initiators to capitalize on virtualization needs. As CPU speeds, hardware speeds, link speeds all increase, iSCSI is a flexible approach that can be easily configured by nearly anyone with TCP/IP experience.

For that reason iSCSI runs on a much, much wider base of equipment, and for customers who do not already have an existing shared storage infrastructure it is a no-brainer for adoption. In fact, I can’t think of any FCoE vendor who is not also a major iSCSI supplier (though I could be wrong – it’s happened once or twice before).

4) It can be wicked fast.

Early in 2010 Microsoft and Intel hit some very impressive numbers with iSCSI performance and SSDs in a highly tuned setup. Even at 4K block sizes, they pulled in over 600K IOPS, which works out to roughly 2.5 GB/s of storage bandwidth. Regardless of how tuned that setup was, it’s still impressive enough to force FC admins to sit up and take notice.
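As a sanity check on numbers like these, converting IOPS at a given block size into raw throughput is simple arithmetic; the figures below are just the ones quoted above:

```python
# Quick IOPS-to-throughput conversion for the figures quoted above.

def iops_to_gb_per_sec(iops: int, block_bytes: int) -> float:
    """Throughput in gigabytes per second for a given IOPS rate and block size."""
    return iops * block_bytes / 1e9

# ~600K IOPS at 4 KB blocks (the Microsoft/Intel demo cited above)
print(f"{iops_to_gb_per_sec(600_000, 4096):.2f} GB/s")  # ~2.46 GB/s
```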

5) It’s routable

For some reason I find this to be a common thread among anti-FCoE commenters. iSCSI, by virtue of running over TCP/IP, is routable over the WAN, and apparently that’s a crucial “pro” for iSCSI when comparing it to FCoE.

Personally, it seems to me that “routable” != multi-hop. While FCoE’s multi-hop standards ratification has yet to occur (as of this writing), this doesn’t appear to me to be a strict apples-to-apples comparison.

6) It’s evolving (too!)

It’s important to remember that iSCSI can run at 10Gb speeds, just like FCoE. In fact, it can run on the same CNAs and enterprise switches (not just the CEE switches that Brocade and Cisco have released in the past year). It’s got mature software initiators for both Microsoft and Linux OSes, and there are other advanced features of IP that have been emerging in the storage world, only a few of which have actually been put into widespread practice.

This entry is long enough without belaboring the point further, but given the assumptions outlined above, here are the questions worth exploring:

  1. Is iSCSI fast enough at 10GbE, 40GbE, and 100GbE, even with overhead concerns, to rival the performance and efficiency of FCoE?
  2. Does iSCSI provide the same type of traffic flexibility that FCoE can, for the same level of service?
  3. Does iSCSI provide the same type of legacy support and future-proofing that FCoE does?
  4. Does iSCSI’s market maturity create a barrier to entry for FCoE, and if so in which markets?
  5. Does iSCSI provide expansion capabilities similar to FCoE when 40- and 100GbE emerge into the market?

As I mentioned before, my initial interpretation is that FCoE “vs.” iSCSI is a false dichotomy and, quite frankly, asking the wrong questions. CEE is an overarching umbrella that can, conceivably, encompass both protocols in the right circumstances.

Several of these questions have been addressed elsewhere on the Intarwebs but I reiterate them here because I want to re-examine my own initial assumptions in light of where iSCSI is in terms of adoption and where FCoE’s adoption curve can hiccup:

  • If FCoE has a slowdown in adoption just when iSCSI surges forward, could that kill FCoE’s momentum altogether?
  • If FCoE’s momentum were to slow, what would the likely reasons be?
  • If you were in charge of a DC, with or without existing FC infrastructure, would FCoE already be a foregone conclusion?
  • If the FC vendors are continuing to push forward with 16Gb and 32Gb FC, as QLogic has already admitted, where exactly does FCoE fit in?
  • How should these questions affect DC customers’ plans for upgrading their infrastructure?

No matter what the answers to these questions are, they should at the very least be asked and understood. So, that’s my (rather lengthy) brainstorming “thinking-out-loud” for the day.

Update: Stu Miniman reminds me of the issues surrounding scalability. iSCSI is easy and manageable for small implementations, but nothing like what you can do for FC at a massive scale (that’s a quote from his tweet, btw). So the question becomes: where is the crossover range for administration capabilities/ease-of-use between FCoE and iSCSI?


Comments

  1. iSCSI Point 1 – the costs you cite to make the point that iSCSI isn’t free are misleading compared with the costs of the CNAs/HBAs required for FCoE. iSCSI NICs are significantly simpler technology and cheaper in both CapEx and OpEx.

    iSCSI Point 5 – the _ability_ to route is vital because the cost to enable FCoE to traverse WAN links is enormous. You must have Gigabit Bandwidth to enable FCoE in the WAN. iSCSI requires no special equipment and is easily accelerated using WAAS equipment.

    FCoE Point 5 – FCoE is not a requirement for DCB (CEE) implementation. That is, DCB has been underway for many years; FCoE is just the latest protocol to use it. Cisco has put about 1 billion dollars into FCoE, and that’s why you are hearing about DCB now: because of the marketing “noise”. DCB is vital to enable ALL functions in the Data Centre, and storage traffic is just one.

    FCoE Point 4 – FCoE definitely is Rip ’n’ Replace (excluding storage). All existing networking equipment must be upgraded, and the data centre network needs a complete redesign and re-evaluation. If you think that “keeping your existing storage” but “replacing the entire data network” isn’t Rip ’n’ Replace, then I guess that’s true.

  2. Author

    Thanks for posting your thoughts. It’s great to have someone to bounce ideas off of. With that, I have some thoughts to continue your brainstorming points.

    iSCSI Point 1 – From a 1:1 comparison, I think you are right about the CapEx costs, but I’m not so sure about the OpEx. I know it’s feasible, for instance, to get 10GbE iSCSI with a software initiator using NetXen’s NC375 for about $500, whereas a full-fledged FCoE single-chip CNA runs close to 3x that. As many DC budgets are segmented, that may be enough of a deal-breaker. However, when you combine multiple traffic types with the ability to share bandwidth intelligently between traffic classes (802.1Qaz), you get much more protocol flexibility with FCoE, negating the need for multiple card types per server in FC environments.

    iSCSI Point 5 – It seems to me that you have the ability to route traffic with the use of 10GbE routers, such as the QLogic iSR6250, which uses the same single-chip 10GbE CNA, so you aren’t prevented from doing SAN-over-WAN when going with FCoE at all.

    FCoE Point 5 – No argument here at all. I completely agree.

    FCoE Point 4 – I’m afraid I must push back a little. I completely agree that the data center needs redesign and re-evaluation, and that’s not cheap. But it’s also not true RnR. It seems to me that as a TOR solution, implementing new equipment with CNAs and a Nexus 5020 or Brocade 8000 means connecting to existing LAN networks the same way that you’d add additional switches to an expanding network. Unless you were planning on upgrading 1GbE to 10GbE in your servers *anyway*, there isn’t a mandate to do so with FCoE/CEE.

  3. FCoE is way more complicated than iSCSI and requires much more expensive software and hardware. Therefore maintenance costs will be several times more expensive than the alternative.

    FCoE needs to be translated to FCIP or encapsulated in VPLS to traverse geographically diverse sites. Neither is trivial.

    ToR is a “patch of green” in the data centre, not an overhaul. Where the ToR connects needs to be overhauled for full benefit. No one is talking about this yet, but you need to look five years ahead to see what’s coming. No one seems to be doing that.

  4. Author

    Excellent points. This is the reason why I started pondering these questions again from a “fork-in-the-road” neutrality, attempting to bypass my own FCoE biases and re-examine the notion that FCoE is a foregone conclusion.

    I think it’s important to note that any SAN-over-WAN solution isn’t going to be trivial, nor would massive roll-outs be simplified by an FCoE schematic – especially when accommodating legacy equipment and management. Stu’s point about the manageability for iSCSI over a few dozen nodes is an excellent one, and should be taken into consideration when weighing the pros and cons of FCoE/CEE vs. iSCSI. At that point wouldn’t we simply be trading off one complexity for another? Isn’t this the same for *any* refresh of existing TOR connects as well, regardless of whether or not it’s FCoE-based? That’s what I’m doing research on now.

    I do disagree with the comment that “no one” seems to be looking 5 years down the road; I know that’s preoccupying my thoughts quite a bit, for instance (see some of the other FCoE posts on this blog). $1B doesn’t mean a technology will necessarily survive. Let’s not talk about the investment in Windoze Vista. 🙂

  5. Great post! I’ve been thinking about this for awhile as well.

    I come from a networking background, so I have a strong bias towards IP-based technologies. The IT trash heap is littered with standalone technologies that have been subsumed by IP. Performance has never been the concern; it’s the flexibility provided by IP that eventually wins out.

    As an example, voice over dedicated circuits is clearly superior to VoIP, from a quality perspective. The best we have gotten (to date) with VoIP is to equal dedicated voice channels. I’m leaving out HD Voice for now, since it is still ‘on the horizon’ for most of us. The primary advantage was consolidating the infrastructure for cost savings.

    An even more direct comparison is SNA. Again, the best we were able to do with DLSW+, RSRB, etc, was to get the same performance over IP that we had with dedicated circuits. The IP value-add was to consolidate infrastructure, especially in the WAN.

    The interesting twist on this discussion is that Ethernet has the same history of winning battles (see Token-Ring, FDDI, etc., which arguably had superior performance to their Ethernet foe). By hitching its future to Ethernet, Fibre Channel has a fighting chance. The one significant sticking point is with the WAN, which basically requires an IP-based transport for efficiency.

    I see this playing out a lot like the FDDI/Ethernet battle in the Data Center back in the late 90s. For the highest performance needs, we’ll see FC transition to FCoE. For everyone else, we’ll see iSCSI. And in short order, iSCSI performance will reach an acceptable level for nearly everything, and FCoE will slowly fade away.

    I blogged about this @ http://www.jeremyfilliben.com/2009/10/thoughts-on-fiber-channel-over-ethernet.html

    Jeremy

    1. Author

      Jeremy,
      Thanks for the contribution! Personally, I have a heavy FC-bias and I confess that many of the non-storage acronyms listed above were alphabet soup, but I certainly understand the direction you’re taking.

      Having said that, the idea of FCoE slowly fading away is something of an enigma for me. There are $Bs invested in FC infrastructure, and the elegance and flexibility of FCoE seem to indicate that, for extremely large implementations at least, FCoE and FC management techniques may be somewhat resilient. For instance, in terms of storage systems, which is easier: zoning or ACLs?

      On the other hand, if you’re correct, why are FC manufacturers continuing to pursue 32Gb FC? Aside from the obvious (that FC vendors believe there will be a market or one can be created), what would the compelling reasons be to pursue such an advanced storage technology?

      Thanks for the link, too. I’m looking forward to reading it.

      1. I suppose my networking biases are even more obvious with the acronyms I used above 🙂

        I don’t have real-world experience on the storage side. In fact, I’m only somewhat familiar with zoning, etc. I’m 100% willing to concede that FC/FCoE is superior to any IP-based storage (CIFS, iSCSI, etc). I’m basing my thoughts on recent history. “Better” doesn’t usually trump flexibility when it comes to these sorts of things.

        Without knowing a lot about FC, I’ll venture a guess as to the 32Gb FC question. It is very difficult to get a tiger to change his stripes. For years there were vendors working on 100Mb Token-Ring when it was painfully obvious to all that Ethernet had already won the war. Sometimes even when a company knows that there is little chance of success, they still need to move forward on their present course, because it’s what they know how to do. For example, think about Polaroid over the last decade.

        I certainly could be wrong about the future of FCoE, but if I am, it might very well be the first time a technology went up against IP and won (maybe MPLS vs. L2TPv3 is a counter-example?). The near-term future of FCoE is bright, but the 5 – 10 year outlook is not good, IMHO.

        Jeremy

  6. Pingback: FCoE vs. iSCSI: The Cagefight! – Performance « J Metz's Blog

  7. Pingback: FCoE vs. iSCSI: The Cagefight! – Flexibility « J Metz's Blog

  8. Pingback: A Collection of FCoE Posts - blog.scottlowe.org - The weblog of an IT pro specializing in virtualization, storage, and servers
