Shouting at disks and Sun's Fishworks

Sun Microsystems releases a video about a source of disk latency; it is at once funny and a nice demonstration of the underlying technologies, including DTrace.

I just ran across this video from Sun Microsystems that demonstrates a source of disk latency that we usually don't think about. It's short and quite amusing. Take a look.

But being the analyst that I am, I also wanted to highlight some of the pieces that make the demonstration possible.

The technology that's under the covers, measuring the disk latency and so forth, is DTrace, a component of Solaris, originally developed by Bryan Cantrill, Mike Shapiro, and Adam Leventhal. By way of background, here's what I wrote upon its introduction in Solaris 10:

DTrace stands for "Dynamic Tracing" and, indeed, it's that dynamism that most distinguishes it from other approaches. A developer, administrator, or performance tuner uses the DTrace scripting language to dynamically establish monitoring points of interest, whether in the OS kernel or user processes.

A probe is a location or activity to which DTrace can bind a request to perform a set of actions, like recording a stack trace, a time stamp, or a function argument. Think of them as programmable software sensors that gather interesting information about the system, and report it.

DTrace in Solaris 10 comes with something like 37,000 predefined probes; users can also define their own. Probes come from a variety of kernel modules that Sun calls providers, each of which creates a unique type of instrumentation.

For example, there's a provider that creates a simple time-based counter (profile) and another to understand lock contention and other sorts of locking behavior (lockstat). But essentially any function entry or exit is a potential probe location; DTrace hot-patches the function entry point in memory to insert the probe.
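To make the idea of binding actions to function entry and exit a little more concrete, here is a loose analogy in Python. It is only a sketch: it is not the D scripting language, and it is not how DTrace patches code. The names `system`, `read_block`, `record`, and `enable_probe` are all hypothetical, invented for this illustration.

```python
import time
import types
from collections import defaultdict


def read_block(n):
    """Hypothetical function standing in for code we want to observe."""
    return bytes(n)


# Hypothetical "running system": a namespace whose functions can be swapped.
system = types.SimpleNamespace(read_block=read_block)

# Probe name -> list of (timestamp in ns, payload) tuples.
events = defaultdict(list)


def record(probe, payload):
    """Action bound to a probe: store a time stamp and the observed value."""
    events[probe].append((time.perf_counter_ns(), payload))


def enable_probe(owner, name, action):
    """Replace owner.<name> with a wrapper that fires entry/return actions.

    Returns a function that restores the original, i.e. disables the probe.
    """
    original = getattr(owner, name)

    def patched(*args, **kwargs):
        action(f"{name}:entry", args)      # fire on function entry
        result = original(*args, **kwargs)
        action(f"{name}:return", result)   # fire on function return
        return result

    setattr(owner, name, patched)

    def disable():
        setattr(owner, name, original)

    return disable


disable = enable_probe(system, "read_block", record)
system.read_block(4)   # observed: one entry event, one return event
disable()
system.read_block(8)   # not observed: the original function is back in place
```

The point of the analogy is the dynamism: the instrumentation is switched on and off while the "system" keeps running, and nothing is recorded once the probe is disabled.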

DTrace is an incredibly powerful tool in that it can do all this monitoring and measuring while minimizing its impact on a running system. (SystemTap is a Linux analog that's considerably less mature.) This allows DTrace to be used in conjunction with production systems--and thereby look for performance bottlenecks under real loads rather than synthetic test cases. The downside of DTrace is that you have to be a bit of a Solaris performance guru to actually make effective use of it.

Enter Fishworks, which Bryan calls DTrace-based appliance analytics.

With analytics, we sought to harness the great power of DTrace: its ability to answer ad hoc questions that are phrased in terms of the system's abstractions instead of its implementation. We saw an acute need for this in network storage, where even market-leading products cannot answer the most basic of questions: "what am I serving and to whom?" The key, of course, was to capture the strength of DTrace visually--and the trick was to give up enough of the arbitrary latitude of DTrace to allow for strictly visual interactions without giving up so much as to unnecessarily limit the power of the facility.

You can also find a more detailed overview of the analytics in this Sun presentation (PDF).

Fishworks is a big part of the secret sauce and value-add of Sun's recently introduced Storage 7000 Unified Storage Systems. It also reflects Sun's new, or at least "tweaked," software strategy. While DTrace is open source (under the CDDL) along with the rest of OpenSolaris, the GUI dashboard and the other analytics software that makes use of DTrace are not. This is consistent with a more pragmatic approach to open source on the part of Sun, one that allows for keeping some modules and components proprietary, mostly those aimed at large-scale production deployments.

Fishworks is one of those--and an impressive one at that. It will know if you shout at your disks!
