Monday, February 18, 2008

Software iSCSI in VI 3: Multipathing and Redundancy

Last week, I did my first VI (3.5) installation using the MD3000i iSCSI SAN. Don't expect many features or a Navisphere-like interface; what you do get is a lightweight, cost-effective iSCSI solution that is up and running in a matter of minutes. Moreover, it is a supported storage device for VI 3.5.

We set up the storage network, just like I usually do with a Fibre Channel array. The topology is presented below:

[Image: storage network topology]

I always thought that iSCSI is very similar to FC in its network configuration, and in principle it is, as long as you have two HBAs.

With software iSCSI (as opposed to hardware iSCSI), you can only have one iSCSI initiator (think of it as a virtual HBA). Redundancy is obtained by connecting multiple physical NICs to the storage virtual switch. So far so good: replace HBA 1 and HBA 2 with physical NIC 1 and physical NIC 2 of Server 1, knowing that both pNICs are connected to the storage vSwitch.



Scanning the SAN reveals ... 2 paths (instead of the naively expected 4). Failover testing showed that no failover occurred when we disconnected, for instance, the link between pNIC 1 and the physical switch. The SAN simply disappeared!

We quickly realized that the whole problem is caused by the fact that there is only one iSCSI initiator (with a specific MAC and IP address) and no real load balancing (the "route based on the originating virtual port ID" teaming policy is used). Only when we remove the primary uplink of the server does it switch over to the second pNIC, which connects to a different physical switch and also to a different NIC on the SAN. In other words, one only sees the third and fourth paths in case of a link or NIC failure!

For the server to see 4 paths to the SAN and have complete redundancy for every physical component, one needs an interlink between the two physical switches. This effectively solved our issue.
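
To make the path counting concrete, here is a small Python sketch that models the storage network as a graph and lists the SAN ports the single software initiator can reach through its active uplink. The node names and the cabling (two SAN ports per physical switch) are my assumptions for illustration; they are not read from the array or from ESX.

```python
from collections import deque

# Hypothetical node names. The cabling below (two SAN ports per switch)
# is an assumption made for illustration only.
SWITCHES = {"SW1", "SW2"}
TARGET_PORTS = {"ctrl0-p0", "ctrl1-p0", "ctrl0-p1", "ctrl1-p1"}
BASE_LINKS = [
    ("pNIC1", "SW1"), ("pNIC2", "SW2"),
    ("SW1", "ctrl0-p0"), ("SW1", "ctrl1-p0"),
    ("SW2", "ctrl0-p1"), ("SW2", "ctrl1-p1"),
]
INTERLINK = ("SW1", "SW2")


def visible_paths(links, active_uplink):
    """Return the SAN ports reachable from the one active uplink.

    Breadth-first search over the cabling; only switches forward
    traffic, so NICs and SAN ports are treated as endpoints.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {active_uplink}
    queue = deque([active_uplink])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor in SWITCHES:
                queue.append(neighbor)
    return sorted(seen & TARGET_PORTS)


without_interlink = visible_paths(BASE_LINKS, "pNIC1")
with_interlink = visible_paths(BASE_LINKS + [INTERLINK], "pNIC1")
print(len(without_interlink), without_interlink)
print(len(with_interlink), with_interlink)
```

Under these assumptions the active uplink reaches only the SAN ports on its own switch until the interlink is added, which matches the 2 versus 4 paths we observed.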

Note: one might be tempted to think that setting the teaming policy to IP hash would solve the above situation of having the second NIC on standby. This is true, but in that case one would need a NIC bond spanning the two physical switches, which also requires an interlink. The effect, in other words, is the same.
