Wed Jun 24, 2009 12:03 pm
I'd like to see how you handle write caching in combination with HA.
Basically, I would want a way to force both synchronous replication and cache mirroring. The overall objective is that if an initiator issues a write command, it's not notified that the write is committed until both StarWind boxes have it in their write cache. Write caching is useless if it increases the risk of data corruption, which would be possible if it's combined with mirroring in the wrong way!
I guess it should work something like this:
Initiator issues a write to the iSCSI target via its primary path as defined by MPIO.
StarWind gets the write data and sticks it in RAM. It then forwards the data to the other StarWind server, which also puts it in cache.
When both servers agree that they both have the same data in cache, the write is acknowledged to the initiator.
Independently of each other, the StarWind servers write the cached data to disk. Once both servers have written the data successfully, the cached data is marked as safe to delete from RAM, but isn't actually deleted until either a new write to the same area of disk comes in, or space is needed for other writes. That way, if a read request comes in, it can be served from cache without having to go to disk.
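To make the steps above a bit more concrete, here's a rough Python sketch of the write path I have in mind. It's only a toy model under my own assumptions: `Node`, `mirrored_write` and `flush_both` are names I made up for illustration, not anything from StarWind, and the "network" between the two nodes is just a function call.

```python
from dataclasses import dataclass, field

DIRTY, CLEAN = "dirty", "clean"  # CLEAN = on disk on both nodes, safe to evict


@dataclass
class Node:
    """Hypothetical storage node with a RAM write cache in front of a disk."""
    name: str
    cache: dict = field(default_factory=dict)  # lba -> [data, state]
    disk: dict = field(default_factory=dict)   # lba -> data

    def cache_write(self, lba, data):
        # Store the block in RAM only; it is not on disk yet.
        self.cache[lba] = [data, DIRTY]
        return True

    def flush(self):
        # Write every dirty block to the local disk; return the LBAs written.
        flushed = set()
        for lba, entry in self.cache.items():
            if entry[1] == DIRTY:
                self.disk[lba] = entry[0]
                flushed.add(lba)
        return flushed

    def mark_clean(self, lbas):
        # Keep the data in cache for reads, but allow it to be evicted later.
        for lba in lbas:
            self.cache[lba][1] = CLEAN

    def read(self, lba):
        # Serve from cache when possible, otherwise fall back to disk.
        if lba in self.cache:
            return self.cache[lba][0]
        return self.disk.get(lba)


def mirrored_write(primary, partner, lba, data):
    # Acknowledge to the initiator only when BOTH caches hold the data.
    return primary.cache_write(lba, data) and partner.cache_write(lba, data)


def flush_both(a, b):
    # Blocks become evictable only once both nodes have them on disk.
    done = a.flush() & b.flush()
    a.mark_clean(done)
    b.mark_clean(done)
    return done
```

The point of the sketch is the ordering: the acknowledgement depends on both caches, while marking a block as evictable depends on both disks. A real implementation would obviously have to cope with a partner that doesn't answer, which this ignores.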
As for UPS events, those and other conditions (e.g. a shutdown command) should automatically disable caching on the affected node, which should then concentrate on writing the contents of its cache to disk. Affected iSCSI targets on the other node should also have caching turned off, and replication disabled, until the failing node has shut down completely. There should be a way of limiting the size of the cache so that you can make sure the UPS will give you enough run-time to write the cache to disk. It might also be a good idea for the still-running node to take a snapshot when it gets notification that the other node is going to shut down, so that the changes can be resynced more quickly when that node comes back up again.
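For the cache size limit, the arithmetic I'm thinking of is roughly the following. This is a back-of-the-envelope sketch, not a StarWind setting: the function name, the 60 second reserve for the OS shutdown and the 0.5 safety factor are all assumptions of mine, and you'd need to measure your own sustained flush rate.

```python
def max_cache_bytes(flush_rate_bytes_per_s, ups_runtime_s, reserve_s=60, safety=0.5):
    """Largest cache that can be flushed to disk within the UPS run-time.

    reserve_s is time held back for the rest of the shutdown (assumed value);
    safety derates the measured flush rate for worst-case random writes.
    """
    usable_s = max(ups_runtime_s - reserve_s, 0)
    return int(flush_rate_bytes_per_s * usable_s * safety)


# Example: 100 MiB/s sustained flush rate, 5 minutes of UPS run-time.
limit = max_cache_bytes(100 * 1024**2, 5 * 60)
print(limit / 1024**3, "GiB")  # 11.71875 GiB
```

So with those made-up numbers you wouldn't want to allow much more than about 11 GiB of write cache on that node.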
cheers,
Aitor