OpenSolaris 2008.11 – A Preview For The Storage Admin

Many reviews have been written about OpenSolaris since its release, but nearly all of them stop at the desktop, offering the obligatory screenshots of the GNOME environment and a high-level description of only the major features most readers are already familiar with, or have at least heard of.

I’d like to take a different approach with this review, one that descends below the GUI to highlight aspects that server administrators in particular would be more interested in.

OpenSolaris is just a few months old now, with its first version, 2008.05, released two months ago. Since then, Sun engineers and community members alike have been making bi-weekly updates to its various components (currently at build 93) in preparation for the next full release, 2008.11. Anyone with an OpenSolaris 2008.05 installation can track these intervening builds as they become available using the Image Packaging System’s update features. Haters of SVR4 package management, rejoice.
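If you want to follow along from a 2008.05 install, jumping onto the development builds is a two-command affair. A minimal sketch (run as root or via pfexec; the repository URL and publisher name below are the defaults at the time of writing):

    pkg set-publisher -O http://pkg.opensolaris.org/dev opensolaris.org
    pkg image-update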

Before I begin, I’m going to assume that you, the reader, have already heard plenty about the big, oft-quoted features such as ZFS, DTrace, and SMF. If you haven’t, get on over to those links and read up as you’re missing out on some really good stuff. If you have, I’ll show off some compelling new sub-features of those systems, as well as other, unrelated ones. So sit back and read along and try this stuff out when you’re done!

OpenSolaris Is A Storage Multi-Tool

This preview focuses on the new storage-related components of OpenSolaris. This is a category that has received quite a bit of attention over the past two years with many new components being integrated into the OS.

  • ZFS. Prior to the availability of ZFS, Solaris was pretty much on par with the built-in storage management features of peer OSes, with advantages in some areas of storage management (multipathing and fibre channel) and deficiencies in others (UFS and LVM were getting tough to manage in ever-growing, multi-TB environments). The inception and inclusion of ZFS drastically improved the situation and brought storage management to a level of capability and availability (free!) that the industry had yet to see. But don’t think that the designers of ZFS took a break from the action.

    ZFS as it exists in OpenSolaris sports major performance improvements and several additional features of note:

    • GZIP Compression. GZIP compression may be applied to a ZFS filesystem or ZVOL in addition to the original LZJB scheme. GZIP offers better compression ratios than LZJB, but at a higher cost in CPU power. If the best possible compression is required and CPU capacity is not an issue, this new compression method will make you happy.
    • Case Insensitivity. One may set a ZFS filesystem to operate with no regard to the case of file names, a la Windows NTFS/FAT and default Mac OS HFS filesystems. This feature was added in conjunction with the new CIFS server (see below).
    • Autoreplace. This is a new zpool property, which defaults to “off”. When set to “on”, if a drive in the pool dies and is subsequently pulled and replaced, ZFS detects the new device, automatically brings it into the pool, and resilvers it to restore the pool to an optimal state. When left at the default, manual intervention is required after the drive replacement.
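    To give a feel for these three additions, here is a minimal sketch (the pool and dataset names are made up for illustration; note that casesensitivity can only be set when a filesystem is created):
    zfs set compression=gzip pool/my/fs              # or gzip-1 through gzip-9 to trade CPU for ratio
    zfs create -o casesensitivity=insensitive pool/my/winfs
    zpool set autoreplace=on pool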
  • Easy iSCSI. One of the first big feature additions after ZFS went GA was the ability to easily create LUNs for export via iscsitgtd. Remember that iscsitgtd serves out block devices, not filesystems, so one must create and use raw ZVOLs; these get their own device entries under /dev/zvol/… and can be treated like any raw disk device. After creating a ZVOL, exporting it via iSCSI is as simple as:
    zfs create -V 256G pool/my/zvol
    zfs set shareiscsi=on pool/my/zvol
    - or do it in one command -
    zfs create -o shareiscsi=on -V 256G pool/my/zvol

    You can also create LUNs that are hosted on UFS, or any other supported filesystem for that matter. Management of the iSCSI LUNs presented by iscsitgtd is accomplished via the iscsitadm(1M) utility, so you can set up things such as your custom IQNs, ACLs, CHAP or RADIUS auth, iSNS properties, and so on.
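    As a quick sanity check once a ZVOL is shared as above, something like the following will list the auto-generated target and the daemon’s global settings (output omitted):
    iscsitadm list target -v
    iscsitadm show admin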
  • Served Via CIFS. No, this is not a re-packaged Samba with Solaris-specific tweaks. This is the real deal – a native, fully integrated CIFS server that implements the CIFS/SMB LM 0.12 protocol and MSRPC services. It can run in simple Workgroup mode, or as a member of a Windows AD domain with the full ability to use a domain controller for conferring access and other rights, including the mapping of AD users to UNIX users (which means that the ZFS or UFS filesystem that comprises a CIFS share can also be exported via NFS in dissimilar environments). This makes OpenSolaris a truly viable alternative to Windows Server for high-performance, integrated CIFS share serving. Combined with the filesystem management of ZFS, this new CIFS server is very compelling. Have a ZFS filesystem that you need exported to some Windows (or Mac, or Linux) boxes? Just as with iSCSI, it’s this simple:
    zfs set sharesmb=on pool/my/fs
    You can set additional share parameters, such as the advertised share name, by replacing “=on” with other arguments. See the section for the set option in the sharemgr(1M) man page. Management of LM users, groups, and server mode is accomplished with the separate smbadm(1M) command.
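    As a rough sketch, bringing the CIFS service up, joining a workgroup (or an AD domain), and giving the share a friendlier name might look like this – the workgroup, domain, and share names here are made-up examples:
    svcadm enable -r smb/server
    smbadm join -w WORKGROUP            # or, for AD: smbadm join -u Administrator ad.example.com
    zfs set sharesmb=name=myshare pool/my/fs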
  • NDMP Backups. That’s right, a new NDMP service is now present for all your enterprise backup needs. Have some of those expensive Legato NetWorker NDMP licenses to burn, or want to back up a NetApp or other NDMP-capable device to a Sun Fire X4540 so you can pitch it and its support contract out the window? Fire up the NDMP daemon and go to town with the ndmpadm(1M) command. This new service in OpenSolaris supports NDMP versions 2, 3, and 4. There is also a nifty way to get statistics on your NDMP sessions with the ndmpstat(1M) command.
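    A hedged sketch of getting the NDMP service going – the username is a made-up example, and the exact authentication options may vary by build, so check ndmpadm(1M):
    svcadm enable ndmpd
    ndmpadm set -p version=4                       # speak NDMPv4 to the DMA
    ndmpadm enable -a cram-md5 -u backup_admin     # prompts for the account password
    ndmpadm get                                    # display the current NDMP service properties
    ndmpstat 5                                     # session and throughput statistics every 5 seconds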
  • COMSTAR. Short for Common Multiprotocol SCSI Target, COMSTAR is quickly becoming the in-kernel nexus for exporting a generic “block device” outside the system over an array of protocols and transports. What this subsystem allows you to do is take a ZVOL and export it over transports such as Fibre Channel and Fibre Channel over Ethernet (FCoE). That’s right: if you have a system with QLogic QLA/QLE24xx cards, you can turn them into a target rather than an initiator and serve LUNs over FC on your SAN. Your OpenSolaris box is looking like a classic storage array now, but with far more features and flexibility. Future plans include bringing the aforementioned iSCSI server under COMSTAR’s domain as well. Earlier QLogic HBAs such as the 2Gb models (QLC23xx) are not supported as targets, as those cards lack the firmware features required to put them in such a mode.
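    As a sketch (the ZVOL path is reused from the iSCSI example and the GUID is a placeholder), registering a ZVOL with COMSTAR and exposing it to initiators goes roughly like this; stmfadm(1M) also lets you build per-host-group and per-target-group views rather than the catch-all shown here:
    svcadm enable stmf
    sbdadm create-lu /dev/zvol/rdsk/pool/my/zvol   # register the ZVOL as a SCSI logical unit
    stmfadm list-lu -v                             # note the GUID assigned to the new LU
    stmfadm add-view 600144F0...                   # catch-all view: visible to all initiators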
  • Replicate With AVS. Sun has long offered a filesystem-agnostic, block-level replication software suite called StorageTek Availability Suite, or AVS. This was a pay-for product, but Sun has graciously donated it in full to OpenSolaris, so it is now free to use. AVS allows you to configure synchronous or asynchronous replication over the network to a remote server, with additional capabilities such as shadow images (termed “Instant Image” in AVS). AVS lives in-kernel, situating itself between the filesystem (ZFS, UFS, etc.) and the disk devices and copying blocks off to their configured destination. Here are some demos by lead developer Jim Dunham that demonstrate the use and capabilities of AVS.
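    For flavor, enabling a basic asynchronous SNDR replication set looks roughly like the following – the hostnames and device paths are placeholders, and each side needs a small bitmap volume alongside the data volume (see sndradm(1M) for the full syntax):
    sndradm -e primaryhost /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t2d0s0 \
            remotehost  /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t2d0s0 ip async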
  • SAM-QFS. As with AVS, SAM-QFS was formerly an unbundled, pay-for product from Sun, but has been open sourced and provided as an integrated part of OpenSolaris. SAM-QFS has two major components – QFS, a SAN-based multi-reader/multi-writer filesystem, and SAM, a hierarchical storage management (HSM) system that sits on top of QFS. QFS in particular is found in both HPC and service-provider data center roles, where multiple nodes require concurrent read/write access to the same filesystem over fibre channel or iSCSI. It can be used in a single-writer setup, where one node can write but all others are read-only, or a multi-writer setup, where more than one node requires write access. In the latter case, additional infrastructure is required in the form of a metadata server, whose role is to manage and coordinate locks and write access amongst the involved nodes.

    Along with QFS, there is the SAM component, which allows one to age data off to bulk storage (inexpensive SATA arrays) and/or tape-based long-term storage.
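    To give a flavor of what a basic QFS setup involves, here is a skeletal /etc/opt/SUNWsamfs/mcf for a standalone “ma”-type filesystem – the device paths and family set name are made up, and a shared, multi-writer configuration additionally needs a hosts file and a designated metadata server, omitted here:
    #
    # Equipment Identifier   Eq Ord  Eq Type  Family Set  Device State
    #
    qfs1                     10      ma       qfs1        on
    /dev/dsk/c2t0d0s0        11      mm       qfs1        on
    /dev/dsk/c2t1d0s0        12      mr       qfs1        on

    After that, the filesystem is built and mounted in the usual fashion:
    sammkfs qfs1
    mkdir /qfs1 && mount -F samfs qfs1 /qfs1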

That’s a snapshot of the major additions, but there are still a lot of smaller projects that have been integrated, or are on the verge of being made available, such as pNFS, MMS, ADM, and Honeycomb. As you can see, OpenSolaris offers quite a bit for those looking for highly flexible ways to store, manage, and export data to and from other systems… and it’s all built in and fully functional. No licenses or hidden costs.

The OpenSolaris Storage Community is a great place to keep track of what’s new, and provides a very nice visual representation of the various storage layers and components in OpenSolaris.

18 Replies to “OpenSolaris 2008.11 – A Preview For The Storage Admin”

  1. “Earlier Qlogic HBAs such as the 2Gb models (QLC23xx) are not supported as targets as those cards lack features required in their firmware to put them in such a mode.”

    If you’re saying these cards don’t feature Target mode… you’re wrong.

  2. Will the next version of OpenSolaris have better SATA support? I have an Intel DG965RYCK motherboard with SATA DVD and hard drives – every Linux distro I’ve tried over the past 18 months installs on this board – but not OpenSolaris 2008.5. Difficult to find more mainstream hardware than this. Looking through the OpenSolaris forums shows folks are having to ask which add-on SATA card works with OpenSolaris. I can’t learn and benefit from ZFS if my Intel disk controller isn’t supported.

    http://forums.opensolaris.com/search.jspa?q=SATA

  3. @Peter Griffen

    The ahci driver supports Intel ICH6/7/8/9, VIA vt8251 and JMicron AHCI controllers (per its man page) and according to Intel’s documents, your mobo has the ICH8 controller, so the ahci driver in theory *should* be attaching to it.

    Are you using a RAID mode on it, perchance? Do you have any error strings? Or is your machine booting from it but using the disks in IDE compat mode?

  4. Could I install OSOL right now and guarantee forward compatibility with all these new ZFS features and the new CIFS stuff?

    I want to take advantage of them, but I don’t want to install OSOL now and be SOL when the time comes to upgrade. I assume you can, but perhaps the internal ZFS format will change and it will be only backwards compatible?

    Also – do you have any good recommendations for 6-8 drive chassis that will be quiet so I can throw 8x 1TB disks in it and use it for ZFS/home storage/daily snapshots of my remote servers I administer?

    Thanks a ton. I’m getting more excited and I’m contemplating attempting to run OSOL now, I’ve always been a Linux guy and a small amount of FreeBSD…

  5. @rocky, AFAIK, Qlogic owns and writes the drivers (so you won’t find it in opensolaris)

  6. Just a note on the reference related to SAM-QFS… The author references that SAM (HSM/ILM aspect) “sits on top of QFS”. Technically this is not correct. Unlike all other HSM type products, e.g., DMF, StorNext, ADM (work in progress) with ZFS, the SAM functions and features are fully integrated WITHIN the file system structure of QFS. The choice to use SAM with QFS is a feature-driven decision, not a product level one. Only SAM-QFS allows all metadata about the file (Unix and HSM-type) to be fully stored in the inode of the file. All other systems use the more communication intensive approach of a file system linking to HSM executables via API calls (often set as DMAPI). The SAM-QFS approach also allows the system to provide all relevant file metadata (Unix and HSM file copy status/location, etc.) in a single 512-byte inode “open/read” whereas the other HSM approaches normally require much more communication between file system and HSM with database lookups and queries to gain that same information.
