Data Storage Solution: Hardware & OS

Overview
This entry covers the technical details of the implementation.

I approached this by breaking the stack down into individual layers, then conducting performance, security and data management reviews.

The Physical Stack

  • Storage Devices
  • Storage Controller
  • Server
  • Network
  • Site
  • User

Storage Devices
This is a 100% SSD solution. I provide a separate SSD for boot (1x500GB Samsung 850 EVO) and an array (2x1TB Samsung 850 EVO) for storage. This division is mostly historical, but it saves me the trouble of setting up a bootable partition on the RAID array.

I could not afford more than two discs, or more than 1TB each, so I will have to make do with a 1TB RAID1 mirror. This is about 3x larger than the existing array, so I expect it to give me a three-year buffer until larger-capacity SSDs are available.

Storage Controller
This Intel bridge provides 5x3Gbps and 1x6Gbps SATA ports.

The boot drive is attached to the 6G port, and the storage array (as well as a DVD drive) is attached to the 3G ports. 3G is more than sufficient for any application, including streaming video, and the 6G port should provide the best boot times, desktop latency, app loading and server response.
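
As a rough check on the port speeds, the usable payload rate of a SATA link can be estimated from its line rate. SATA uses 8b/10b encoding, so 10 bits on the wire carry one byte of data. This is a minimal sketch; the function name is illustrative and not part of any tool.

```shell
# Estimate usable SATA throughput in MB/s from the line rate in Gbit/s.
# 8b/10b encoding: every 10 line bits carry 8 data bits (1 byte).
sata_mb_per_s() {
    gbps=$1
    # Gbit/s -> Mbit/s (x1000), then 10 line bits per payload byte.
    echo $(( gbps * 1000 / 10 ))
}

sata_mb_per_s 3   # 3G ports (storage array, DVD drive)
sata_mb_per_s 6   # 6G port (boot drive)
```

This prints 300 and 600, the theoretical ceilings in MB/s before any protocol overhead.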

Server
This is an Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz. This provides 4x 64bit cores, 1 thread per core. I've installed 4GB DDR3. I find that this is more than sufficient for any task except for processing audio and video. It does well in these applications for all but the impatient.

Network
The network is a gigabit switch and a DD-WRT firewall which also acts as an access point and a gateway. All connections on the network are ethernet except for laptops and smartphones. All servers are firewalled and the gateway is also firewalled.

Site
This is my physical residence, so it is subject to power outages, break-ins, burst water pipes and things of that nature.

User
This is primarily for myself and those I choose to share my data with.

The Logical Stack

  • Partitioning
  • File System
  • Raid Controller
  • IO Scheduler
  • Operating System
  • Apps

Partitioning
The boot drive contains the root partition which stores the operating system, a swap partition and a scratch partition. The scratch partition is for project experiments and in no way contains data that should be preserved.

A typical root partition holds user data in /home and /var, so these need preservation. As a result the array contains a 20GB partition each for /var and /home, with the remainder dedicated to data storage. For details see Data Management.

Boot Drive Partitioning (/dev/sda):

Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000ee05f

Device     Boot    Start       End   Sectors   Size Id Type
/dev/sda1  *          63  41961471  41961409    20G 83 Linux
/dev/sda2       41961472  50350079   8388608     4G 82 Linux swap / Solaris
/dev/sda3       50350080 976773167 926423088 441.8G 83 Linux

Storage Array Partitioning (/dev/sd{b,c}):

Disk /dev/sdb: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 23ABB100-66E6-4A0A-B0A4-E7560756A29A

Device        Start        End    Sectors   Size Type
/dev/sdb1      2048   39063551   39061504  18.6G Linux filesystem
/dev/sdb2  39063552   78125055   39061504  18.6G Linux filesystem
/dev/sdb3  78125056 1953525134 1875400079 894.3G Linux filesystem

File System
EXT4 is one of the most mature, yet state-of-the-art, file systems available. It provides consistent and still exceptional performance over many workloads, platforms and configurations. It also reserves a portion of the drive for the root user, which helps when the drive is full and you are trying to recover. For these reasons I chose EXT4 for the root partition.

On the remaining partitions, data tends to be read and written in large chunks, so I looked for a file system known for its performance and scalability. Any optimization for large transfers that adversely affects smaller pieces of data is made up for by the fact that the hardware is so fast. I looked at JFS and XFS, and briefly at the flash-specific file systems. The flash-specific file systems are quite interesting, but I'm not familiar enough with them. JFS is great, but XFS is well known for its performance and scalability, so I chose XFS for the remainder of the partitions.

There were a few specific EXT4 optimizations:

  • Maximum Mount count (tune2fs -c 100)
  • Check interval (tune2fs -i 6m)
  • Reserved Block Count (tune2fs -m 5 (/, /var) -m 1 (all others))
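
The three settings above can be combined into one tune2fs call per EXT4 device. The sketch below is a hedged illustration: the helper names and the DRY_RUN switch are hypothetical, and the mapping of devices to percentages in the example comment (5% for the root partition, 1% for scratch) is an assumption based on the list.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Apply the mount count, check interval and reserved block percentage
# to one EXT4 device.
#   $1 = block device, $2 = reserved block percentage
tune_ext4() {
    run tune2fs -c 100 -i 6m -m "$2" "$1"
}

# Example (as root, device names assumed from the fdisk listing):
#   tune_ext4 /dev/sda1 5
#   tune_ext4 /dev/sda3 1
```

Setting DRY_RUN=1 prints the command instead of running it, which is a cheap way to review it before touching a live file system.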

EXT4 by default chooses a block size of 4096 bytes, which for a modern computing environment is supposed to provide a good balance of performance and disc-use efficiency (for files < 4096 bytes). 4096 is also an even multiple of 512, the drives' sector size, so each file system block maps onto whole sectors.
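
Block alignment also depends on where each partition starts. A small sketch, with a hypothetical function name, that checks a start sector from the fdisk listings against a byte boundary. Whether 4096-byte alignment matters for these particular drives depends on their internal page size, which the fdisk output does not show.

```shell
# Report whether a partition's first sector falls on a given byte boundary.
#   $1 = start sector
#   $2 = sector size in bytes (default 512)
#   $3 = boundary in bytes (default 4096, the file system block size)
check_alignment() {
    start=$1
    sector=${2:-512}
    boundary=${3:-4096}
    if [ $(( start * sector % boundary )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

check_alignment 63         # start of /dev/sda1
check_alignment 41961472   # start of /dev/sda2
check_alignment 2048       # start of /dev/sdb1
```

With these start sectors the function reports sector 63 as misaligned to 4096 bytes and the other two as aligned.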

XFS needs few switches, so the /var and /home partitions are not customized.

For the /var and /home partitions XFS chooses a default block size of 4096 bytes with no striping.

For the media storage partition I wanted larger allocation units to reflect the files that will be stored there (i.e. almost always >1MB, usually >10MB and sometimes >1GB in size). Therefore I gave the partition a 64kB stripe unit (su=64k) with a stripe width of one data disk (sw=1), which tells XFS to align its allocations for a 2-disk RAID1 mirror.

  • mkfs -t ext4 /dev/sda1
  • mkswap /dev/sda2
  • mkfs -t ext4 /dev/sda3
  • mkfs -t xfs /dev/md0
  • mkfs -t xfs /dev/md1
  • mkfs -t xfs -d su=64k -d sw=1 /dev/md2

Raid Controller
This implementation uses software RAID. I have always avoided motherboard-based (Intel) RAID solutions for fear of portability issues and a lack of restoration options, so I will continue to use mdadm.

Device    Options                   Metadata
/dev/md0  default                   v1.2
/dev/md1  default                   v1.2
/dev/md2  default,64k chunk,bitmap  v1.2

Command Summary:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 /dev/sdc2
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdb3 /dev/sdc3
mdadm --detail --scan /dev/md{0,1,2} >> /etc/mdadm/mdadm.conf
cat /proc/mdstat
mdadm --detail /dev/md{0,1,2}

/proc/mdstat:

Personalities : [raid1]
md0 : active raid1 sdb1[0] sdc1[1]
      19514368 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb2[0] sdc2[1]
      19514368 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sdb3[0] sdc3[1]
      937568960 blocks super 1.2 [2/2] [UU]
      bitmap: 0/7 pages [0KB], 65536KB chunk

unused devices: <none>
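
The member map at the end of each status line ([UU] above) is the quickest health indicator: mdstat replaces a U with an underscore when a member is missing or failed. A minimal sketch of a check built on that convention; the function name is hypothetical.

```shell
# Read /proc/mdstat-style text on stdin and print the name of each array
# whose member map (e.g. [UU]) contains "_", meaning a member is missing.
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md0 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1; next }
        # The next line ending in a member map belongs to that array.
        name != "" && $NF ~ /^\[[U_]+\]$/ {
            if (index($NF, "_") > 0) print name
            name = ""
        }
    '
}

# Example: degraded_arrays < /proc/mdstat
```

For the healthy output shown above the function prints nothing.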

Mounting Options
Because both file systems store most configuration options in their superblocks, there is not much to put in fstab beyond telling the operating system that they exist.

/etc/fstab:

Device    Mount Point   Mount Options
UUID      /             relatime,errors=remount-ro
UUID      /mnt/scratch  defaults,relatime
/dev/md0  /var          defaults,relatime
/dev/md1  /home         defaults,relatime
/dev/md2  /mnt/files    defaults,relatime

# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
UUID=3f227b45-acd3-478a-bccd-d700143e9951       /               ext4    relatime,errors=remount-ro 0       1
UUID=2508db3d-ad42-402d-8c8c-c49dbce3679f       /mnt/scratch    ext4    defaults,relatime          0       2
UUID=bc65b7d4-69e3-4ec2-8593-22fd219b133b       none            swap    sw                         0       0
/dev/md0                                        /var            xfs     defaults,relatime          0       2
/dev/md1                                        /home           xfs     defaults,relatime          0       2
/dev/md2                                        /mnt/files      xfs     defaults,relatime          0       2
/dev/sr0                                        /media/cdrom0   udf,iso9660 user,noauto            0       0

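The relatime option is a write optimization. Instead of updating a file's access time (atime) on every read, a feature rarely (if ever) used by the OS or apps, it updates atime only when the previous value is no newer than the file's last modification or change, or is more than a day old. This can significantly reduce the number of small writes to a drive, thereby increasing its performance and lifespan.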
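
The relatime decision can be sketched as a simple comparison of timestamps. This follows the rule as described in mount(8): atime is refreshed when it is not newer than mtime or ctime, or when it is more than 24 hours old. The function name is hypothetical and the exact boundary handling (equal timestamps count as "not newer") is an assumption.

```shell
# Decide whether a read would update atime under relatime.
# All arguments are epoch seconds:
#   $1 = atime, $2 = mtime, $3 = ctime, $4 = current time
relatime_updates_atime() {
    atime=$1; mtime=$2; ctime=$3; now=$4
    if [ "$mtime" -ge "$atime" ] || [ "$ctime" -ge "$atime" ] \
        || [ $(( now - atime )) -ge 86400 ]; then
        echo yes
    else
        echo no
    fi
}

# Example: a file read again shortly after its last access, unmodified since.
#   relatime_updates_atime 1000 900 900 2000
```

The common case, repeated reads of an unchanged file within a day, causes no atime write at all.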

Disk Organization Summary

Device    Start     End         Type   Mount Point   dev        Size
sda1      63        41961471    Linux  /             /dev/sda1  20.0GB
sda3      50350080  976773167   Linux  /mnt/scratch  /dev/sda3  441.8GB
sd{b,c}1  2048      39063551    Linux  /var          /dev/md0   18.6GB
sd{b,c}2  39063552  78125055    Linux  /home         /dev/md1   18.6GB
sd{b,c}3  78125056  1953525134  Linux  /mnt/files    /dev/md2   894.3GB

IO Scheduler
The deadline scheduler is known to provide the best performance on XFS, and from the various tests I've seen online it provides a small benefit, or at least does no harm, on EXT4. The deadline scheduler gives each request an expiry time, which bounds how long it can wait behind other I/O.

/etc/default/grub:

  • Add to GRUB_CMDLINE_LINUX_DEFAULT: "elevator=deadline"
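
To confirm which scheduler is active after a reboot, the kernel lists the available schedulers in /sys/block/<dev>/queue/scheduler and marks the active one with square brackets. A minimal sketch that extracts the bracketed entry; the function name is hypothetical.

```shell
# Extract the active scheduler (the bracketed entry) from the contents of
# /sys/block/<dev>/queue/scheduler, read on stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example: active_scheduler < /sys/block/sda/queue/scheduler
```

On Debian the edit to /etc/default/grub only takes effect after running update-grub and rebooting.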

Operating System
The operating system is Debian "Jessie" 8.0.

File system Permissions:

  • /var and /home inherit distro (LSB) permissions
  • /mnt/files has 775 permissions, so only the owner and group can write. This prevents unwanted writes (by daemons or accidents) from other accounts.
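
For readers less used to octal modes, this small sketch expands a mode such as 775 into the rwx notation shown by ls. The function name is hypothetical and only the three basic permission digits are handled.

```shell
# Expand a three-digit octal mode into rwx notation (owner, group, other).
mode_to_rwx() {
    out=""
    # Split the mode into single digits and translate each one.
    for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
        case $digit in
            7) out="${out}rwx" ;;
            6) out="${out}rw-" ;;
            5) out="${out}r-x" ;;
            4) out="${out}r--" ;;
            3) out="${out}-wx" ;;
            2) out="${out}-w-" ;;
            1) out="${out}--x" ;;
            0) out="${out}---" ;;
        esac
    done
    printf '%s\n' "$out"
}

mode_to_rwx 775
```

This prints rwxrwxr-x: the owner and group may write, everyone else may only read and traverse.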

Apps
There are several types of apps that make use of this data:

  • Dolphin (and desktop apps)
  • Baloo (desktop search)
  • Media apps
  • System Monitoring

Dolphin (and desktop apps)
Not much needs to be changed; apps will work out of the box. But tweaking Dolphin can bring you to your data quicker:

  • Set up shortcuts to Scratch, Storage and Projects
  • Enable full path inside location bar
  • Use common properties for all folders
  • Set up Trash: 90days/10%

Baloo (desktop search)
This is KDE's semantic search. The default settings will attempt to index your entire disk, which is incredibly wasteful in both disk space and activity and also tends to fill your search results with noise.

  • Add "do not search" exclusions in KDE System Settings->Desktop Search.
  • Make sure to Enable Desktop Search in KDE System Settings->Desktop Search.

It should leave /home, /media and /mnt enabled for search. This results in approximately 200k files.

Media apps
Amarok is my primary Linux desktop media app, but I am rarely on the desktop and usually have my Blu-ray player or my game console on. These use DLNA or Plex Media Server. Both are enabled and configured. I also stream from my smartphone using Plex.

System Monitoring
The goal is to enable monitoring and reporting at all levels of the stack.

Item                     Monitor                                     Notes
Storage Devices          smartmontools, Munin                        The drives are S.M.A.R.T. enabled and I use Munin to track usage, temperature and other S.M.A.R.T. metrics. I set alarms for critical items.
Partitions               backupninja                                 Tool to back up partition information nightly.
File System              Munin                                       Capacity and load tracking. Does systemd automatically check integrity and fragmentation?
Operating System/Server  lm-sensors, atop, sysstat, Munin, logcheck  Environmental, load and system metrics. Logcheck monitors logs for abnormal events, RAID failures, alarms, etc.
IO Scheduler             NA                                          Not needed.
Network                  firewall                                    Firewall logs abnormal events and an IDS looks for attack signatures.
Site                     NA                                          Not needed/possible.
Apps                     Munin                                       Most services are tracked with Munin. However, Baloo and service or network pinging are not enabled.
User                     Munin                                       Munin tracks load status.
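
As a stand-alone complement to the Munin capacity alarms, a threshold check can be sketched over df output. This is only an illustration and not how Munin itself works: the function name and the default threshold of 90% are assumptions, and mount points containing spaces are not handled.

```shell
# Read `df -P` output on stdin and print each mount point whose use is at or
# above the given percentage.
#   $1 = threshold in percent (default 90)
over_capacity() {
    awk -v limit="${1:-90}" '
        NR > 1 {                      # skip the header line
            use = $5                  # "Capacity" column, e.g. 95%
            sub(/%/, "", use)
            if (use + 0 >= limit + 0) print $6, use "%"
        }
    '
}

# Example: df -P | over_capacity 85
```

The -P flag keeps each file system on one line, which is what makes the fixed column positions safe to rely on.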

Notes on Partition Copying
More googling revealed that rsync is a simple-to-use and extremely effective disc copy utility in situations where the partition size or file system differs.

The commands I used for copying partitions were as follows. The first performs the actual copy and the second double-checks it.

rsync -avxHAWX --info=progress2   > ~/rsync.out
rsync -avxHAWX --info=progress2  
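
A hedged sketch of the same invocation wrapped in a function, with the flags spelled out. The source and destination arguments are hypothetical mount points supplied by the caller, and the DRY_RUN switch is an illustrative addition that prints the command instead of running it.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy the contents of one mounted partition into another.
#   $1 = source mount point, $2 = destination mount point (both hypothetical)
# Flags: -a archive mode, -v verbose, -x stay on one file system,
#        -H preserve hard links, -A preserve ACLs,
#        -W copy whole files (no delta transfer), -X preserve extended attributes
copy_partition() {
    # The trailing slash on the source copies its contents, not the directory.
    run rsync -avxHAWX --info=progress2 "$1/" "$2/"
}

# Example: copy_partition /mnt/old /mnt/new > ~/rsync.out
```

Running the copy a second time should transfer nothing if the first pass completed, which is what makes it usable as a check.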
