Temperature variations between my server’s HDDs

In December 2009 I rebuilt my home file server as a FreeBSD 8.x machine with six 1.5TByte Western Digital HDDs in a ZFS raidz2 configuration. Since February 2nd 2010 I’ve been sampling the internal SMART temperature sensor of each drive every 6 minutes. This post summarises some of what I’ve captured. The drives /dev/ada0 through /dev/ada5 are housed vertically in a small tower case, as shown in the photos below.

Each drive reports temperature in integer degrees C, collected via the SMART interface using the following line:

smartctl -a /dev/ada${i} | grep Temperature_Celsius | awk '{print $10}'
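For anyone wanting to log something similar, here is a minimal sketch of a sampling helper. It is not the script used for this post: the function names, the log format (epoch seconds, drive name, temperature) and the idea of running it from cron every 6 minutes are all assumptions for illustration. It uses smartctl's -A option, which prints only the attribute table.

```shell
#!/bin/sh
# Print the raw value (10th column) of the Temperature_Celsius attribute
# from smartctl attribute output supplied on standard input.
extract_temp() {
    awk '$2 == "Temperature_Celsius" { print $10 }'
}

# Hypothetical helper: append one "epoch-seconds drive temperature" line
# per drive to the log file named in $1. Assumes drives ada0..ada5.
sample_once() {
    log=$1
    now=$(date +%s)
    for i in 0 1 2 3 4 5; do
        temp=$(smartctl -A "/dev/ada${i}" | extract_temp)
        echo "${now} ada${i} ${temp}" >> "${log}"
    done
}
```

Matching on the attribute name in column 2 rather than grepping the whole line avoids picking up other lines that happen to mention the attribute.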

To aid visualisation, I’ve smoothed the temperature data by calculating the average temperature in 18hr windows of time, moving across the entire time period in steps of 2.25hrs. A cumulative distribution of the smoothed temperatures recorded between 2nd February 2010 and 2nd April 2013 for each drive is shown in Figure 1. Each drive’s mean temp (in degrees C) is shown in parentheses after its name.
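The windowed averaging can be sketched roughly as follows. This is an illustrative reconstruction, not the code behind the figures. It assumes one drive's samples arrive on standard input as "epoch-seconds temperature" lines sorted by time, and the function name smooth is made up. Because 2.25hrs is 22.5 six-minute samples, the windows are defined in seconds rather than in sample counts.

```shell
#!/bin/sh
# Sliding-window mean over time-stamped samples read from stdin.
# $1 = window length in seconds (default 18 h), $2 = step (default 2.25 h).
# Prints "window-midpoint mean" for every window that ends on or before
# the last sample. Each window covers [start, start + win).
smooth() {
    awk -v win="${1:-64800}" -v step="${2:-8100}" '
    { t[NR] = $1; cum[NR] = cum[NR - 1] + $2 }   # prefix sums of temperature
    END {
        lo = 1; hi = 1
        for (start = t[1]; start + win <= t[NR]; start += step) {
            while (lo <= NR && t[lo] < start) lo++          # first sample inside
            while (hi <= NR && t[hi] < start + win) hi++    # first sample past end
            n = hi - lo
            if (n > 0)
                printf "%d %.2f\n", start + win / 2, (cum[hi - 1] - cum[lo - 1]) / n
        }
    }'
}
```

The prefix sums mean each window costs a subtraction rather than a fresh pass over its samples, which matters little for one drive but keeps three years of 6-minute data quick to process.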

Figure 1: Cumulative distribution of smoothed temperatures per HDD
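A cumulative distribution like the one in Figure 1 can be computed with a sort and a single pass. The sketch below assumes one temperature value per line on standard input. The name ecdf is hypothetical and this is not necessarily how the figure was produced.

```shell
#!/bin/sh
# Empirical CDF: for each distinct value, print the value and the
# fraction of all samples that are less than or equal to it.
ecdf() {
    LC_ALL=C sort -n | awk '
    { v[NR] = $1 }
    END {
        for (i = 1; i <= NR; i++)
            if (i == NR || v[i + 1] != v[i])   # last occurrence of this value
                printf "%s %.4f\n", v[i], i / NR
    }'
}
```

The two output columns can be fed straight to a plotting tool as x and y.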

The drives are physically arranged from ada0 to ada5 (top to bottom). But if ranked by ascending mean temperature we get a different order: ada0, ada3, ada4, ada2, ada5, ada1. Drive ada1 is consistently hotter than ada0 over time. Figure 2 shows the smoothed temperatures of ada0 and ada1 versus number of days since 2nd February 2010. You can see what I take to be seasonal variations in 2010 and 2011 (trending lower for winter in the middle of each year), but I am as yet unsure why this variation seems to disappear from the end of 2011 and during 2012.
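Ranking drives by mean temperature is a small aggregation. The sketch below assumes a combined log of "epoch-seconds drive temperature" lines on standard input. That format and the name rank_by_mean are assumptions for illustration.

```shell
#!/bin/sh
# Print "drive mean" for each drive, coolest first.
rank_by_mean() {
    awk '{ sum[$2] += $3; n[$2]++ }
         END { for (d in sum) printf "%s %.2f\n", d, sum[d] / n[d] }' |
    LC_ALL=C sort -k2,2n    # numeric sort on the mean column
}
```

Forcing the C locale keeps the numeric sort from misreading the decimal point on systems configured for a comma as the decimal separator.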

Figure 2: Temperature versus time over ~3 years

Figure 3 zooms in on a handful of days, revealing a definite 24hr periodicity to the temperature fluctuations. (Although elided for clarity, the other four drives all show essentially identical 24hr cycles and maintain the relative differences in absolute temperature implied by Figure 1.)

Figure 3: Temperatures versus time revealing clear 24hr periodicity
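One simple way to check for a daily cycle without plotting is to fold the samples onto the hour of the day and average each hour. The sketch below assumes "epoch-seconds temperature" lines for one drive on standard input and bins by UTC hour, so the peak will appear shifted by the local time zone offset. The name hourly_profile is hypothetical.

```shell
#!/bin/sh
# Mean temperature for each hour of the day (UTC) that has samples.
hourly_profile() {
    awk '{ h = int(($1 % 86400) / 3600); sum[h] += $2; n[h]++ }
         END {
             for (h = 0; h < 24; h++)
                 if (n[h] > 0) printf "%02d %.2f\n", h, sum[h] / n[h]
         }'
}
```

A clear 24hr periodicity would show up as a smooth rise and fall across the 24 output rows, while a flat profile would suggest no daily pattern.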

How might we explain the small (yet relatively consistent) temperature differences between drives? Could ada0 be coolest because it has the best airflow across the top? And is ada1 the hottest because it has four more drives below it and ada0 sitting immediately above, limiting airflow? The next hottest after ada1 is ada5, the drive right at the bottom of the stack, which perhaps indicates worse airflow there. However, the absolute temperature differences are not that great (a few degrees C), so perhaps I’m reading too much into the evidence. It might be that different levels of dust accumulation, and manufacturing differences between the drives themselves, are key influences too.

[Edited April 13th 2014 to fix graphs.]

