
Friday, May 29, 2015

CentOS 7 ZFS install


I have already installed Asterisk on this system (see my post on that if you wish). During that install I upgraded the kernel to 3.19.8, and that is the kernel I am running.

I am using PuTTY/SSH as the root user.

I am at home behind a firewall and do not use iptables, firewalld, or SELinux on my CentOS 7 system, so I have turned them off.

Turn off firewalld and iptables if they are running.

# systemctl disable firewalld.service
# systemctl stop firewalld.service
# systemctl disable iptables.service
# systemctl stop iptables.service


Disable SELinux if it is enforcing.

# vi /etc/sysconfig/selinux
    SELINUX=disabled


Reboot if you changed any of these settings.

# reboot
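
After the reboot you can confirm both are off; getenforce should report Disabled and firewalld should show as disabled:

# getenforce
# systemctl is-enabled firewalld.service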

Now get the packages and install them.

# cd /root
# yum -y localinstall --nogpgcheck https://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
# yum -y localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el7.noarch.rpm
# yum -y install kernel-devel zfs
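
The zfs-release repository builds the kernel modules through DKMS, so it is worth checking that the build succeeded against your running kernel before going further (this assumes the DKMS packages rather than the kmod ones):

# dkms status
# uname -r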


You can now load the ZFS module:

# modprobe zfs

modprobe itself prints nothing on success. Use lsmod to confirm the ZFS modules are loaded:

# lsmod | grep -i zfs
zfs                  2179437  3
zcommon                47120  1 zfs
znvpair                80252  2 zfs,zcommon
spl                    89796  3 zfs,zcommon,znvpair
zavl                    6784  1 zfs
zunicode              323046  1 zfs


You should now make sure the module is loaded automatically at boot. We need to create a new file and add a small script to it.

# vi /etc/sysconfig/modules/zfs.modules

Add the following code:

#!/bin/sh
# Load the ZFS module at boot if its control device is not already present.

if [ ! -c /dev/zfs ] ; then
        exec /sbin/modprobe zfs >/dev/null 2>&1
fi


Make this file executable:

# chmod +x /etc/sysconfig/modules/zfs.modules
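
Before rebooting you can run the script by hand as a quick check; it should return silently and the /dev/zfs control device should exist:

# /etc/sysconfig/modules/zfs.modules
# ls -l /dev/zfs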

Now reboot and make sure everything loaded

# reboot

After the reboot, run lsmod again and make sure the modules are loaded.

# lsmod | grep -i zfs
zfs                  2179437  3
zcommon                47120  1 zfs
znvpair                80252  2 zfs,zcommon
spl                    89796  3 zfs,zcommon,znvpair
zavl                    6784  1 zfs
zunicode              323046  1 zfs



Create the pool. I have four WD Red 3TB drives on /dev/sdb through /dev/sde. I will create a RAID10-style pool (striped mirrors) with these now.

First I make sure my drives are on the latest firmware. I use the WD tool for that:

# ./wd5741x64
WD5741 Version 1
Update Drive
Copyright (C) 2013 Western Digital Corporation
-Dn   Model String           Serial Number     Firmware
-D0   Samsung SSD 850 PRO 128GB   S1SMNSAG301480T   EXM02B6Q
-D1   WDC WD30EFRX-68EUZN0   WD-WMC4N0J0YT1V   82.00A82
-D2   WDC WD30EFRX-68EUZN0   WD-WMC4N0J2L138   82.00A82
-D3   WDC WD30EFRX-68EUZN0   WD-WCC4N2FJRTU9   82.00A82
-D4   WDC WD30EFRX-68EUZN0   WD-WCC4N7SP4HHF   82.00A82


As you can see, I have a Samsung SSD 850 PRO 128GB that I use as my /boot and OS drive.

Then I turn off head parking on my WD Red drives with idle3ctl:

# ./idle3ctl -d /dev/sdb
Idle3 timer disabled
Please power cycle your drive off and on for the new setting to be taken into account. A reboot will not be enough!


I do this on all four drives, then reboot.
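
If the drives are still sdb through sde, a quick shell loop covers all four (just a convenience sketch; adjust the device names to match your system):

# for d in /dev/sd{b..e}; do ./idle3ctl -d $d; done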

# reboot

I then make sure the setting stuck.

# ./idle3ctl -g /dev/sdc
Idle3 timer is disabled


I check all four drives for the "disabled" value above.

Now I zero out the MBR to remove any legacy info that may have been on them.

# dd if=/dev/zero of=/dev/sdb bs=1M count=1

Repeat for all four drives.
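
A loop does all four in one go (again assuming the data drives are /dev/sdb through /dev/sde; double-check the names first, as dd will happily wipe the wrong disk):

# for d in /dev/sd{b..e}; do dd if=/dev/zero of=$d bs=1M count=1; done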

I will now create the RAID10-style pool, named myraid.

# zpool create -f myraid mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde

Make sure it was created

# zpool status
  pool: myraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        myraid      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sde     ONLINE       0     0     0


Check it is mounted

# mount | grep zfs
myraid on /myraid type zfs (rw,xattr)


# df -h | grep myraid
myraid                5.3T  284G  5.0T   6% /myraid


If you don’t see it mounted try

# zfs mount myraid

Have ZFS auto-mount /myraid at boot if wanted:

# echo "zfs mount myraid" >> /etc/rc.local           

To speed things up

# zfs set sync=disabled myraid

Read the descriptions below before disabling it, though.

sync=standard
  This is the default option. Synchronous file system transactions
  (fsync, O_DSYNC, O_SYNC, etc) are written out (to the intent log)
  and then secondly all devices written are flushed to ensure
  the data is stable (not cached by device controllers).

sync=always
  For the ultra-cautious, every file system transaction is
  written and flushed to stable storage by a system call return.
  This obviously has a big performance penalty.

sync=disabled
  Synchronous requests are disabled.  File system transactions
  only commit to stable storage on the next DMU transaction group
  commit which can be many seconds.  This option gives the
  highest performance.  However, it is very dangerous as ZFS
  is ignoring the synchronous transaction demands of
  applications such as databases or NFS.
  Setting sync=disabled on the currently active root or /var
  file system may result in out-of-spec behavior, application data
  loss and increased vulnerability to replay attacks.
  This option does *NOT* affect ZFS on-disk consistency.
  Administrators should only use this when these risks are understood.

 
 
You can also turn on lz4 compression for your pool, which speeds things up at the cost of some CPU. I have an i5 4590 with 16GB RAM, so I have the resources to do so.
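
For example (compression only applies to data written after the property is set, and compressratio shows how well it is doing):

# zfs set compression=lz4 myraid
# zfs get compressratio myraid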

Tuning

# cat /sys/module/zfs/parameters/zfs_prefetch_disable
0

# modprobe zfs zfs_prefetch_disable=1


To make this persist across reboots, put the setting in the /etc/modprobe.d/zfs.conf file, as shown below.

I have an i5-6600k with 64GB DDR4 RAM on a Supermicro C7Z170-OCE motherboard, and 4x WD Red 3TB drives in mirrored stripes on a Supermicro 8-port 600MB/s SAS/SATA HBA AOC. Don't use the parameters below unless you know your hardware :)

Below are my settings for my NAS with ZFS, used for running ESXi VMware guest images over a 10Gb network between ESXi 5.5 U2 and the NAS via NFS. I am getting 340+ MB/s read and write across the wire from a Linux client (SSD to NAS ZFS pool) using scp; I think I have reached my drives'/zpool's performance limit, so it is time to add more drives. I also tested a Windows 10 guest image on the zpool over 10Gb NFS and get 280 MB/s read and write between the Windows image on the zpool and the NAS SSD drive; I think SMB/Samba adds some overhead there.

Edit /etc/modprobe.d/zfs.conf to reflect:

# disable prefetch
options zfs zfs_prefetch_disable=1
# set arc max to 48GB. I have 64GB in my server
options zfs zfs_arc_max=51539607552
# set size to 128k same as file system block size
options zfs zfs_vdev_cache_size=1310720
options zfs zfs_vdev_cache_max=1310720
options zfs zfs_read_chunk_size=1310720
options zfs zfs_vdev_cache_bshift=17
# Set these to 1 so we get max IO at cost of bandwidth
options zfs zfs_vdev_async_read_max_active=1
options zfs zfs_vdev_async_read_min_active=1
options zfs zfs_vdev_async_write_max_active=1
options zfs zfs_vdev_async_write_min_active=1
options zfs zfs_vdev_sync_read_max_active=1
options zfs zfs_vdev_sync_read_min_active=1
options zfs zfs_vdev_sync_write_max_active=1
options zfs zfs_vdev_sync_write_min_active=1


# reboot


Sanity check

# cat /sys/module/zfs/parameters/zfs_prefetch_disable
# cat /sys/module/zfs/parameters/zfs_arc_max
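
With the zfs.conf above these should print 1 and 51539607552 respectively. To dump every module parameter at once:

# grep -H . /sys/module/zfs/parameters/*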


Example commands to see settings

# zfs get all
# zfs get all myraid
# zfs get checksum
# zfs get checksum myraid

*ALWAYS* use Mirror / RAID10 – never, never, ever use RAIDz !

Data compression : LZ4 ( Yes, on *everything*, make sure you have enough CPU though. )
    zfs set compression=lz4 myraid
   
Checksum : Fletcher4
    zfs set checksum=fletcher4 myraid

Use Cache for : Data & Metadata*1
    zfs set primarycache=all myraid

Write bias : Latency*1
    zfs set logbias=latency myraid

Record size / block size : 128k ( This is vital people – we go against the “use record size as in workload” recommendation )
    zfs set recordsize=128k myraid

Update access time on read : disable
    zfs set atime=off myraid
   
Do not use dedupe.
    # zfs set dedup=off myraid

Enable Jumbo Frames
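    For example, with eth1 as a placeholder for your 10Gb interface (also set MTU=9000 in its ifcfg file so it persists):
    # ip link set dev eth1 mtu 9000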

Disable sync
    # zfs set sync=disabled myraid

Have fun!
