Storage Systems

Persistent Space

Modern computers need some form of persistent storage to do most things, with multiple processes potentially using it concurrently

This storage is either an HDD or an NVM device (ex. SSD or USB drive), with some systems even using part of RAM for this, known as a RAM drive

HDDs

HDDs use spinning platters that are read and written by a read-write head, which floats above the surface on a thin layer of gas, with each platter surface divided into tracks → sectors (the tracks at one arm position across all platters form a cylinder)

These rotate at 5.4k-15k RPM, with some disk-based media like CDs being removable

2023-04-18_03-09.png

Sectors nowadays are usually 4 KB, and since the outer tracks are longer they can hold more sectors than the inner ones, increasing data density and capacity

The performance depends on transfer rate (data flow speed) and positioning time (getting to the right place, seek time + rotational latency)
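To put rough numbers on this, here is a minimal sketch of the usual back-of-the-envelope estimate: average rotational latency is half a rotation, and access time is seek time plus that latency plus transfer time. The function names and the example figures (5 ms seek, 100 MB/s transfer, 4 KB request) are assumptions for illustration, not measurements of any real drive.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """On average the head waits half a rotation for the sector to arrive."""
    seconds_per_rotation = 60.0 / rpm
    return seconds_per_rotation / 2 * 1000


def access_time_ms(seek_ms: float, rpm: float,
                   transfer_mb_s: float, request_kb: float) -> float:
    """Positioning time (seek + rotational latency) plus transfer time."""
    transfer_ms = (request_kb / 1024) / transfer_mb_s * 1000
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms


if __name__ == "__main__":
    # Assumed example drive: 7200 RPM, 5 ms average seek, 100 MB/s transfer.
    print(round(avg_rotational_latency_ms(7200), 2))   # 4.17
    print(round(access_time_ms(5, 7200, 100, 4), 2))   # 9.21
```

Note how positioning time dominates for small requests, which is why reducing head movement matters so much for HDDs.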

NVMs

NVMs are electrical rather than mechanical (no moving parts), which can make them more reliable than HDDs, but they have shorter lifespans and are more expensive with less capacity

Standard storage busses can be too slow for them, so some connect straight to the system bus instead, and having no moving parts also makes these well suited for phones and laptops

These are read and written in pages, similar to main memory, making them efficient for reads but inefficient for writes, since a page cannot be overwritten in place and has to be erased first, and each erase also degrades the memory cells over time

2023-04-18_03-13.png

Scheduling

These devices are addressed as 1D arrays of logical blocks, with each block mapped to a sector
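As a sketch of what this mapping looks like, the classic idealized conversion between a logical block address (LBA) and a cylinder/head/sector (CHS) position is shown below. The geometry constants are assumed values for the example, and real drives with more sectors on outer tracks do not follow such a uniform layout, so treat this as a simplified model only.

```python
HEADS_PER_CYLINDER = 16   # assumed geometry, for illustration only
SECTORS_PER_TRACK = 63    # assumed geometry, for illustration only


def lba_to_chs(lba: int) -> tuple[int, int, int]:
    """Map a 0-based logical block number to (cylinder, head, sector).

    Sectors are numbered from 1 by convention; cylinders and heads from 0.
    """
    cylinder, rest = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector + 1


def chs_to_lba(cylinder: int, head: int, sector: int) -> int:
    """Inverse mapping back to the 1D block index."""
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


if __name__ == "__main__":
    print(lba_to_chs(0))      # (0, 0, 1)
    print(lba_to_chs(2000))   # (1, 15, 48)
```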

The OS should use these devices efficiently, decreasing access time by minimizing head movement

Processes that need the device wait in an I/O queue, with modern drives also holding disk requests in a queue on the drive itself

These requests contain read/write info, disk address, memory address, number of sectors to transfer, etc.

Managing the queue intelligently can increase efficiency, with the following being examples of algorithms to use

2023-04-18_03-17.png

2023-04-18_03-17_1.png

2023-04-18_03-17_2.png

2023-04-18_03-17_3.png

There are others as well, including random, LIFO and priority, but the best choice depends on the expected workload and most OSes use a combination
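As a rough illustration of why queue order matters, the sketch below compares three common textbook policies (FCFS, SSTF and SCAN) by total head movement. The request queue, the starting head position of 53 and all function names are assumptions made up for the example, and these policies may not be the same ones shown in the figures above.

```python
def total_movement(start: int, order: list[int]) -> int:
    """Sum of cylinder distances travelled when visiting `order` from `start`."""
    total, pos = 0, start
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total


def fcfs(start: int, requests: list[int]) -> list[int]:
    """First come, first served: service requests in arrival order."""
    return list(requests)


def sstf(start: int, requests: list[int]) -> list[int]:
    """Shortest seek time first: always pick the closest pending request."""
    pending, pos, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order


def scan_down(start: int, requests: list[int]) -> list[int]:
    """SCAN (elevator): sweep toward cylinder 0, touch it, then sweep back up."""
    below = sorted((c for c in requests if c <= start), reverse=True)
    above = sorted(c for c in requests if c > start)
    return below + [0] + above   # the 0 is the end of the disk, not a request


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]   # assumed example queue
    head = 53                                     # assumed starting cylinder
    print(total_movement(head, fcfs(head, queue)))        # 640
    print(total_movement(head, sstf(head, queue)))        # 236
    print(total_movement(head, scan_down(head, queue)))   # 236
```

On this particular queue FCFS moves the head far more than the other two, though SSTF can starve distant requests, which is one reason the choice depends on workload.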

Error Detection

We can detect read/write errors with parity bits, and both detect and correct them with ECCs, which you should already know by now
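As a quick refresher, here is a minimal sketch of a single even-parity bit. The function names and the sample value are assumptions for the example. It shows that one flipped bit is caught while two flipped bits slip through, which is why real devices rely on stronger ECCs.

```python
def even_parity_bit(data: int) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    return bin(data).count("1") % 2


def parity_ok(data: int, stored_parity: int) -> bool:
    """True if the data still matches the parity bit stored alongside it."""
    return even_parity_bit(data) == stored_parity


if __name__ == "__main__":
    word = 0b1011001                    # assumed sample data, four 1 bits
    parity = even_parity_bit(word)      # 0
    print(parity_ok(word, parity))              # True: no error
    print(parity_ok(word ^ 0b0000001, parity))  # False: single-bit error detected
    print(parity_ok(word ^ 0b0000011, parity))  # True: double-bit error missed
```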

Device Management

At a low level, we divide the disk into sectors that the controller can read and write, with the OS then recording its own data structures on the disk

We partition this disk into one or more groups of cylinders, each treated as a logical disk (C and D on Windows; /boot, /usr and /home on UNIX), with each partition being logically formatted with a file system type

Partitions also carry metadata about bootability, with the computer having a bootstrap program baked into the firmware

The last thing this program does is read the MBR, which is the first sector on the disk, and the boot code found there loads the rest of the OS

2023-04-18_03-41.png
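To make the bootability metadata concrete, below is a minimal sketch that parses the partition table of a classic MBR sector: a 512-byte sector with four 16-byte entries starting at offset 446 and the signature bytes 0x55 0xAA at the end. The helper names and the synthetic demo sector are assumptions for illustration; the sketch does not read a real disk and ignores newer schemes such as GPT.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # partition table starts after the boot code
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR


def parse_mbr(sector: bytes) -> list[dict]:
    """Return the non-empty partition entries of a classic MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # status byte, (skip CHS start), type byte, (skip CHS end),
        # first LBA, sector count; all little-endian
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:          # type 0 marks an unused slot
            continue
        partitions.append({
            "bootable": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "sectors": count,
        })
    return partitions


def make_demo_mbr() -> bytes:
    """Build a synthetic sector with one bootable partition (hypothetical values)."""
    sector = bytearray(SECTOR_SIZE)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = struct.pack(
        "<B3xB3xII", 0x80, 0x83, 2048, 204800)
    sector[510:512] = SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    print(parse_mbr(make_demo_mbr()))
```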

Swap-Space Management

When we use swap space for pages, we can either put it on a raw partition or in files on a formatted partition, with the latter being subject to the file system structure

Storage Attachment

Disks are attached to the I/O bus and data is transferred on this bus by controllers: the host controller on the motherboard and the device controller on the disk

We can also attach storage over a network such as a LAN, known as network-attached storage (NAS)

2023-04-18_04-24.png

Cloud storage also exists, which is similar to NAS except it’s over the Internet, with examples including Google Drive and Dropbox

If cloud storage is undesirable (ex. servers) we can use storage-area networks (SANs) to connect servers to a local array of storage, with disks being added and replaced transparently

2023-04-18_04-26.png

RAID

All disks are prone to failure, with loss being catastrophic, so Redundant Array of Inexpensive Disks (RAID) exists as a way to add redundancy through extra disks

The logic for this can exist at the hardware level (for supercomputers), the software level (for desktops) or the firmware level (ex. BIOS)

There are 6 different types, generally

2023-04-18_04-28.png

2023-04-18_04-28_1.png

2023-04-18_04-28_2.png

2023-04-18_04-28_3.png

2023-04-18_04-29.png

Other variants exist as well, such as RAID 2, 3, 4, 50, 60, etc.

The disks in a RAID set should generally be the same size, and it’s hard to grow or shrink the array afterwards

We can also keep hot spares so a rebuild can start immediately after a failure, but the rebuild itself can temporarily impact performance
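For parity-based levels such as RAID 5, the rebuild onto a spare boils down to XOR: the parity block is the XOR of the data blocks in a stripe, so any single missing block is the XOR of everything that survived. Here is a minimal sketch of that idea, where the block contents and function name are assumptions for the example and a real array would also rotate parity across disks.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    # One stripe spread over three data disks (assumed contents).
    stripe = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(stripe)            # stored on a fourth disk

    # Disk 1 fails: rebuild its block from the survivors plus parity.
    rebuilt = xor_blocks([stripe[0], stripe[2], parity])
    print(rebuilt == stripe[1])            # True
```

Every surviving disk has to be read in full during a rebuild, which is where the temporary performance hit comes from.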