selfhost:storage [2024/09/02 14:56] – [Automate RAID at boot] willy
selfhost:storage [Unknown date] (current) – removed - external edit (Unknown date) 127.0.0.1
-===== Storage Setup ===== 
- 
-You have already installed your Gentoo Linux, fresh and nice. Now it's time to organize and set up the storage.
- 
-First of all: the disk or partition that holds the operating system must be separate from where you store your data and your services. The basic idea is that if you need to migrate your services and data to a new server, because of an upgrade or a failure, all it takes is reinstalling the OS and re-plugging the storage drives.
- 
-The best approach is to use one drive for the OS and different drives (usually at least 2) for your data and services. These should be physically separate drives, not different partitions on the same drive.
- 
-So you will have your **OS** drive and your **services** drive (which will also hold the data).
-//note:// having data and services on the same filesystem ensures that you can create hard links, which in some cases are mandatory to avoid data duplication (I am talking about torrents).
- 
-The idea is to store your **services** on a redundant software RAID-1 array. There are other solutions you can choose from: RAID-5, RAID-0+1, RAID-6, RAID-10 and many more combinations. I will let you research and find out what works best for your use case and drive availability.
- 
-RAID-1 has a few advantages for me, which are:
-  * Fast enough on read (reads will be balanced across both disks, writes will take slightly longer)
-  * Solid enough to survive one disk failure (provided you monitor the RAID status and replace failed disks)
- 
-Its main drawback is the wasted space: 50% is a big waste, but it buys peace of mind. Just remember: **RAID does NOT mean BACKUP**, so **do your backups** anyway. More on this later on.
- 
-(note: I will always say //disk//, but that can be an SSD; it does not have to mean a mechanical hard drive)
- 
-==== Software or Hardware RAID? ==== 
- 
-RAID can be implemented in hardware or in software, using the Linux software RAID implementation.
-I have been using the software RAID approach for more than two decades and I have never been let down:
-  * It's solid.
-  * It's simple.
-  * It works and it's efficient.
-  * Each disk can always be mounted as a single drive, without the RAID array.
- 
-If you choose to use a commercial external RAID solution, skip the RAID part ahead. 
- 
-==== Disk Preparation ====
- 
-I will assume you have two external drives called **/dev/sdb** and **/dev/sdc**. I will assume that **/dev/sda** is the drive where Gentoo is installed.
- 
-The size of the two disks is not important: get the biggest ones you can afford. 
- 
-I suggest, if you can afford them, using SSDs because they are quieter and consume less power, which is a plus for a home server, although they are still more expensive than traditional drives.
-Anyway, for the setup itself it doesn't matter whether you choose expensive high-end data-center grade HDDs or cheap SSDs of dubious origin (well, I assume you factor in the value of losing your data, of course).
- 
-A good way to add more drives, when you run out of internal slots in your server, is to use USB-3/USB-C external drives. You can buy a JBOD box (Just a Bunch Of Disks) where you can store 2, 4 or even 8 or 16 disks sharing one USB plug. I have been using this type of setup for the better part of the last 15 years without any data loss or corruption. Speed-wise, you will be streaming your data over your home network, which more often than not means WiFi. A good USB-3 SSD is more than capable of keeping up with the data transfer requirements of any streamed media today, even 4K, so there is no need to worry that external disks or USB-3 might be a bottleneck.
- 
-Note: I will refer to //two// disks, but you can create more RAID arrays if you have //four// disks!
- 
-===== Partitioning ===== 
- 
-To create a software RAID, you first need to partition the two drives. For this job you can use the good old fdisk:
- 
-<code bash> 
- > su 
- > fdisk /dev/sdb 
-... do the partitioning ... 
- > fdisk /dev/sdc 
-... do the partitioning ... 
-</code> 
-You will need to be root for fdisk to work. You should be root at this point, though, so the //su// command might be redundant. I will not cover fdisk usage in detail: since these should be new and clean disks, there is little risk of messing things up. Create a GPT partition table, for future-proof support, and one single partition filling up the disk, unless you want a more complex setup.
- 
-Using //fdisk//, create one partition filling each drive; they will be called **/dev/sdb1** and **/dev/sdc1**, and both must be of type //Linux RAID//. I assume the two drives are the same size. If not, the bigger one will have unused space: create partitions of identical size on both drives, and the bigger drive will be left with spare space that you can partition again as a non-RAID partition.
- 
-Save the changes and quit fdisk. Since the disks are not being used yet, you will not need to reboot the server.
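As an alternative to the interactive fdisk session, the same layout (GPT label, one partition filling the disk, RAID type) can be scripted. This is only a sketch under assumptions: it assumes //parted// is installed, which this guide does not otherwise require, and //partition_for_raid// is a hypothetical helper name. It destroys the partition table of whatever disk you pass to it, so double-check the device name first. It also assumes both disks are the same size, since the partition always runs to the end of the disk.

```shell
# Hypothetical helper: prepare one whole disk for use in a software RAID.
# $1: the disk to partition, e.g. /dev/sdb (ALL data on it is lost).
partition_for_raid() {
    # -s            script mode, never prompt
    # mklabel gpt   write a new GPT partition table
    # mkpart ...    one partition from 1MiB (for alignment) to the end of the disk
    # set 1 raid on flag partition 1 as Linux RAID
    parted -s "$1" mklabel gpt mkpart primary 1MiB 100% set 1 raid on
}
```

You would call it once per disk, for example //partition_for_raid /dev/sdb// and then //partition_for_raid /dev/sdc//.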
- 
-Remember that with Linux software RAID you can create more than one partition per disk, and so more than one RAID-1 from two disks. For example, if you have huge disks and want to separate two areas (say, one for data and one for webcam storage) you can create **two** RAID-1 arrays by splitting both disks into two partitions each and pairing one partition from each disk. Just don't create a RAID from two partitions on the **same** disk: a single disk failure would take out both halves of the mirror.
- 
-(if you need to retain your data and you have only two disks, you can create the RAID on just one of the two, which will be wiped, mount it as a degraded array with a single drive, copy the data over from the other disk, then repartition the other disk and hot-add it to the RAID-1. The details are not difficult to figure out, but be careful not to lose your data in the process)
- 
- 
-===== Creating the RAID array ===== 
- 
-You need to create a new software RAID array out of **/dev/sdb1** and **/dev/sdc1**. For this you will use the //mdadm// command we installed previously:
- 
-<code bash> 
- > mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1 
-</code> 
- 
-Since this is the first RAID array of the server (and probably the only one) it's called **/dev/md0**. If you have more than one RAID array, the others will be numbered accordingly (**/dev/md1** and so on).
- 
-If you want to do the trick of copying existing data from one of the two disks, at this point you can create a RAID-1 array with only one drive by replacing the drive you do not want to add now with the word //missing//. You can then add the second disk at a later time with the //--add// option of the //mdadm// tool.
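The single-drive trick described above can be sketched as two small shell functions. The function names are hypothetical, and the device names in the usage note follow the assumptions made earlier in this page (**/dev/sdb1** is the empty partition, **/dev/sdc1** is on the disk that still holds your data). Treat it as an outline to adapt, not as a tested migration procedure.

```shell
# Hypothetical helper: create a two-slot RAID-1 with only one real member.
# $1: array device (e.g. /dev/md0)   $2: the partition available now.
# The literal word "missing" keeps the second slot empty (degraded array).
create_degraded_raid1() {
    mdadm --create "$1" --level=1 --raid-devices=2 "$2" missing
}

# Hypothetical helper: fill the empty slot once the data has been copied.
# $1: array device   $2: the partition to add (its content is overwritten).
add_second_disk() {
    mdadm "$1" --add "$2"
}
```

A possible sequence: //create_degraded_raid1 /dev/md0 /dev/sdb1//, format and mount the array, copy the data over from the old disk, repartition the old disk, then //add_second_disk /dev/md0 /dev/sdc1// and let the mirror resync.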
- 
- 
-===== Format and mount ===== 
- 
-You need to format and mount your newly created RAID array, and you need to choose which filesystem to use. A common choice on Linux, and probably the most straightforward, is EXT4. This might not be your best choice if you have SSDs or you want to leverage the error correction and balancing of an advanced filesystem like ZFS or Btrfs, but I like to keep it simple, so I will choose EXT4 for you here. I have been running my software RAID-1 on EXT4 since it became stable well over a decade ago, and again I have never lost any data to bugs or corruption.
- 
-Format the RAID array: 
-<code bash> 
- > mkfs.ext4 /dev/md0 
-</code> 
- 
-Now, I want to describe how I organize my services and data storage, because it's key to properly partitioning and securing your services (and data!).
- 
-Let's assume the following structure: 
-  * **/**: Gentoo root 
-  * **/data**: mount point for the storage 
-  * **/data/daemons**: will contain all services 
-  * **/data/Film|Tv|Music|Podcasts|Books|Audiobooks**: will contain media files 
-  * **/data/Common**: will contain files shared via WebDAV & Browser & NFS/Samba 
-  * **/data/htdocs**: root for your reverse-proxy and web services
- 
-Here **/data** is the storage mount point, where you will mount your RAID array.
-All services will be stored under **/data/daemons**, one folder for each service, with each service running as its own user. Groups will be used when data needs to be shared between services. **/data/Tv|Media|Etc...** will contain all your movies, music, podcasts, and such, while **/data/Common** will be shared for file access. Of course, you can add all the folders you want.
- 
-Here is how you can create this structure:
-<code bash> 
- > mkdir /data 
- > mount /dev/md0 /data 
- > mkdir /data/daemons 
- > for i in Tv Music Film Books Audiobooks Podcasts; do mkdir /data/$i; done
- > mkdir /data/Common 
- > mkdir /data/htdocs 
-</code> 
- 
-This layout will speed up recovery from any hardware failure or reinstallation you might face in the future, and will also ensure that your main Gentoo partition does not get filled up by the various tools and the downloaded data.
- 
-The newly formatted drive needs to be automatically mounted at every boot, so you need to add a line like this to **/etc/fstab**: 
-<code> 
-/dev/md0        /data     ext4            noatime         0 0 
-</code> 
- 
-The //noatime// option will reduce USB traffic and wear-and-tear. Of course, change the filesystem type to whatever you choose. 
- 
-===== Automate RAID at boot ===== 
- 
-You also want to automate Linux RAID startup, so that upon reboot everything will still work just fine. To do so, the **mdraid** service needs to be started in the //boot// runlevel. Do NOT start it in the //default// runlevel or things will break badly after the first reboot.
- 
-<code bash> 
- > rc-update add mdraid boot 
-</code> 
- 
-The //mdadm// service is not required, unless you want monitoring of your RAID array (including email reporting), which is a **TODO**.
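Until proper monitoring is set up, a very rough check can be built on **/proc/mdstat**, where the kernel prints a status map such as //[UU]// for each array and replaces a //U// with //_// for every missing member. The function below is a minimal sketch, not a replacement for real monitoring: //degraded_arrays// is a hypothetical name, and the parsing assumes the usual mdstat layout (array name on one line, status map on a following line).

```shell
# Hypothetical helper: print the name of every md array that has a
# missing member.
# $1: path to an mdstat-format file (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        /^md/          { name = $1 }    # array header, e.g. "md0 : active raid1 ..."
        /\[U*_[U_]*\]/ { print name }   # status map with at least one "_"
    ' "${1:-/proc/mdstat}"
}
```

Running //degraded_arrays// with no argument prints nothing when all arrays are healthy, so it could be called from a cron job that only sends mail when there is output.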
- 
-You also want to ensure the **/dev/md0** device doesn't change name upon reboot (this can happen when the USB drives change order on boot, for example, or because you plug/unplug them), so put this line into your **/etc/mdadm.conf**:
- 
-<code> 
-ARRAY /dev/md0 UUID=1758bcfa:67af3a42:d3df2d83:ecbb0728  
-</code> 
- 
-where the UUID can be read from the output of the command:
- 
-<code bash> 
-mdadm --detail /dev/md0
-</code> 
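Instead of copying the UUID by hand, you can let //mdadm// write the line for you: //mdadm --detail --scan// prints one ARRAY line per running array in the format **/etc/mdadm.conf** expects (the line may carry extra fields, such as the metadata version and array name, next to the UUID). The wrapper below is a small sketch; //save_array_config// is a hypothetical name.

```shell
# Hypothetical helper: append the ARRAY line(s) of all running arrays.
# $1: config file to append to (defaults to /etc/mdadm.conf).
save_array_config() {
    mdadm --detail --scan >> "${1:-/etc/mdadm.conf}"
}
```

Because it appends, run it only once, or check the file afterwards for duplicate ARRAY lines.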
- 
-One last bit that might be required if you use USB storage is to add a delay before the RAID is assembled at boot. USB drives (even SSDs) might be slow to spin up and be recognized, so this trick might be needed if you find that upon reboot your RAID array has not been assembled or mounted properly.
- 
-Add these lines to your **/etc/init.d/mdraid** script:
-<code> 
-start() { 
-        local output 
-        ebegin "Waiting a little bit longer for USB stuff to popup..." # line added 
-        sleep 10 # line added
-        ebegin "Starting up RAID devices" 
-</code> 
-Adapt the ten seconds to your liking.
- 
- 
-===== Prepare the disk for the media collection ===== 
- 
-Something you can do at this point is ensure your media folders share the proper //group// permissions, since they will need to be accessible across services, and all related services will be part of the **media** group (which must already exist):
-<code bash> 
- > for i in Tv Music Film Books Audiobooks Podcasts; do chgrp media -R /data/$i; chmod g+w -R /data/$i; done
-</code> 
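The loop above fixes the group on what exists today, but files created later by a service will normally get that service's own primary group. One common way around this, offered here as an optional sketch and not as part of the original setup, is to also set the setgid bit on the directories, so that new files and subfolders inherit the directory's group. //share_media_dirs// is a hypothetical helper name.

```shell
# Hypothetical helper: make directories group-writable and group-inheriting.
# $1: group name (e.g. media)   remaining arguments: directories to process.
share_media_dirs() {
    group="$1"
    shift
    for dir in "$@"; do
        chgrp -R "$group" "$dir"                   # hand everything to the group
        chmod -R g+w "$dir"                        # let group members write
        find "$dir" -type d -exec chmod g+s {} +   # setgid on directories only
    done
}
```

For the layout used in this page, the call would be //share_media_dirs media /data/Tv /data/Music /data/Film /data/Books /data/Audiobooks /data/Podcasts//.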
  
