Sage-Code Laboratory

Computer Hardware

This tutorial has two parts. In the first part we teach you how to build and repair a personal computer. In the second part we teach you about data storage and RAID configurations. Both parts are optional for Software Engineering.

Infrastructure

Company infrastructure generally consists of servers and desktop computers. These days, some computer enthusiasts also own several computers in a network. Building and maintaining your own server is a fun and interesting activity. Though cloud computing and data centers are replacing small servers, some companies continue to maintain private networks for development or private data management, to avoid security issues or higher costs.

In addition to security and cost, another reason some companies maintain private networks is compliance. For example, a company that handles sensitive data, such as financial or medical information, may be required to keep that data on-premises in order to comply with regulations.

Permanent Storage

Permanent data storage is a device that can store data for a long period of time. This is the most important feature of any computer. Once saved, data can be searched, loaded back into memory and shown on screen or sent to a printer. Stored data can also be the source of content for a dynamic website.

The first storage devices used in the computer industry were punched tapes and perforated cards. Later the floppy disk was invented. These devices are no longer used and have been replaced by CD-ROM, DVD-ROM and BD (Blu-ray Disc).

Motivation

An engineer is not just a regular programmer, but much more. As a software engineer you need a broad horizon. When you develop software you must know that it has a purpose. One purpose of software can be to store data on disk. Usually the operating system has the capability to do so, but as an engineer you must know how it works.

Before making any software, you must know what your computer is capable of, so that you can organize data streams to and from the devices you are working with and manipulate your data in workflows. Therefore you must learn this topic, even though other instructors may consider it irrelevant. And do not think for a second that this is not an interesting subject.

Magnetic Tape

Magnetic tapes have been used as external storage devices since the early days of the PC. The first tapes were like audio or video tapes, but a little more robust. These were DAT tapes. Later, several tape storage standards were created to increase capacity and reliability.

DLT Tape

Two base types are most important: DLT (Digital Linear Tape) and SDLT (Super DLT), with several variations of different capacity. Usually the variations are not compatible with each other.

Tape media is guaranteed to store data for 30 years and has a specific color that indicates the format. For example, in the next picture we have Super DLT Type 1 (Dark Green).

Picture

Several companies produced tapes of different types and capacities, each specific to a device, so you cannot buy a drive from one manufacturer and tapes from another. A tape drive usually uses the SCSI protocol and is not suitable storage for a desktop computer.

LTO Tape

The latest standard for backup tape is LTO. This standard allows tapes to be interchanged between devices, so you can use a Sony tape in an HP drive, for example.

LTO = Linear Tape Open

The LTO standard has evolved from LTO-1 to LTO-7 and beyond. A standard LTO-6 tape has a native capacity of 2.5 TB uncompressed and up to 6.25 TB with compression. Such a tape costs about 25 USD. The LTO-7 tape has reached a capacity of 6 TB uncompressed and 15 TB with compression. HP has pushed the frontier further with the LTO-8 Ultrium RW standard, with up to 30 TB compressed.
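The relation between native and compressed capacity is simple arithmetic. The sketch below assumes the 2.5:1 compression ratio that vendors advertise for LTO-6 and later, and a native LTO-8 capacity of 12 TB. The helper name `compressed_capacity_tb` is only illustrative, and real compression depends on the data you back up.

```python
# Native (uncompressed) capacity in TB for each generation.
NATIVE_TB = {"LTO-6": 2.5, "LTO-7": 6.0, "LTO-8": 12.0}

# Assumption: the advertised 2.5:1 ratio; real data may compress less.
COMPRESSION_RATIO = 2.5


def compressed_capacity_tb(generation: str) -> float:
    """Return the advertised compressed capacity for one LTO generation."""
    return NATIVE_TB[generation] * COMPRESSION_RATIO


for gen, native in NATIVE_TB.items():
    print(f"{gen}: {native} TB native, "
          f"{compressed_capacity_tb(gen)} TB compressed")
```

Already compressed files, such as videos or archives, will usually fit only the native capacity.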

Picture

RDX Device

RDX is not a tape but a removable HDD cartridge used like a tape for backups. This technology is not very popular due to the high price of each cartridge. But it has an advantage: it gives direct access to files, unlike a normal tape that has only sequential access. RDX is also very fast, and cartridges can use SSDs.

Picture

What are CD, DVD and BD?

In the next sections I will present a series of optical disks. This can be confusing if you are not familiar with the notation. When you buy a disk you must know what it is, right? So I will try to explain some of the notations and the advantages one disk has over another.

DVD+RW

The most common disk is the DVD. It can be single layer, double layer or double sided, and there is a difference. A double-layer disk does not have to be turned over to write on the second layer. A double-sided disk can be written on both sides but cannot have a label on either side, and you must turn the disk over to read the other side. A DVD-R can be written once, while a DVD+RW can be written multiple times. However, it needs to be erased before you can write on it a second time.

Picture

DVD-RAM

This disc stores up to 9.4 GB and can be read and written very fast using a DVD-RAM capable drive. Not all drives can handle the DVD-RAM standard. It is organized differently than a DVD-RW: it can be written multiple times but offers random access, much like a normal hard disk. It is organized in sectors, so you can write a small number of files at a time without rewriting the whole disk.

Picture

BD (Blu-ray Disc)

A Blu-ray Disc (BD) can be good data storage. Blu-ray capacities start at 25 GB and 50 GB and go up to 100 GB. You can find internal or external drives from $50 up to $150, and a combo drive can also record DVDs. BD media is inexpensive and easy to find in bulk. BD-RE is rewritable storage and can be used for backups. In the picture below you can see a disk box for BD-R, which is not rewritable: it can be burned only once and read multiple times.

Picture

M-Disk Verbatim

This is a write-once optical disk designed for long-term storage. Verbatim M-Disc™ optical media is marketed as the new standard for digital archival storage. Unlike traditional optical media, which use dyes that can break down over time, data stored on an M-Disc is engraved on a patented inorganic write layer, so it is designed not to fade or deteriorate.

Picture

Take great care when handling a DVD! CD, DVD and BD disks are sensitive to light, dust and scratches, and must therefore be kept in a box or envelope. Do not put the disk face down on a desk or on paper. Handle the disk carefully by its edges and do not touch the recorded side with your fingers. After you finish using the disk, put it back into its box.

What is SSD?

There is now a disruptive technology on the market for storing data. It is called a Solid State Drive (SSD) or SSD accelerator card. An SSD is a drive that does not have moving parts. It is like permanent memory storage shaped as a hard drive for convenience. A good SSD can be around 10 times faster than a normal HDD.

Picture

An SSD usually has a lower capacity than an HDD (128 GB, 256 GB, 500 GB) and is much appreciated as laptop storage or as a desktop boot drive, where you can install the operating system to improve PC performance.

An SSD reads and writes fast, but the delete operation is a little slower. After many writes the disk loses some of its performance. To counter this, optimization software can run from time to time to keep performance up.

NVMe SSD

NVMe = Non-Volatile Memory Express

An NVMe drive can be up to about 10 times faster than a SATA SSD. To make the operating system load faster, you can use an internal NVMe drive. These are small form factor SSDs that do not fit older laptops or desktops, so be sure your computer supports NVMe before you order. For a desktop it is possible to install an expansion card that supports NVMe.

Picture

External SSD

An SSD can be installed inside an enclosure that connects to a laptop using a USB cable. You can then use it as an external SSD for backups. It is very small, easy to handle and good for both laptops and desktops. With the right cable you can even connect it to your tablet. It is much better than an external HDD for backups.

Picture

What is ReadyBoost?

You can improve Windows performance using a small SSD as a secondary disk. Windows ReadyBoost technology can use up to 32 GB of the disk as cache. For this, select an empty SSD and format it as NTFS. Then open the disk properties and use the ReadyBoost tab to set up the disk as cache. This may improve performance only if your primary disk is not already an SSD.

Picture

LINUX BOOT DISK

If you have a small SSD for Linux, this is great. You can install / (root) on the SSD, formatted as EXT4, and mount it with the noatime option. Then you have a very fast boot time.

If you have more than one HDD in your computer or you use a RAID array, you can use the SSD to store the /boot partition (about 200 MB to 500 MB). This can be formatted as EXT3 or EXT4.

Use the secondary disk or RAID array to mount the / (root) partition. This allows you to store a larger amount of data on the computer or server.
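After mounting, it is worth confirming that the noatime option is really active. The sketch below parses text in the format of /proc/mounts on Linux (device, mount point, file system type, options). The function name `mount_options` and the sample line are only illustrative assumptions, not output from a real system.

```python
def mount_options(mounts_text: str, mount_point: str) -> list:
    """Return the option list for a mount point, or [] if it is not found."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, file system type, options, dump, pass.
        if len(fields) >= 4 and fields[1] == mount_point:
            return fields[3].split(",")
    return []


# Hypothetical sample line; on a real system read the file instead.
sample = "/dev/sda1 / ext4 rw,noatime,errors=remount-ro 0 0\n"
print("noatime" in mount_options(sample, "/"))
```

On a real Linux machine you could pass `open("/proc/mounts").read()` instead of the sample text.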

BOOT PARTITION

It is problematic to install the operating system boot loader on a software-based RAID. Therefore the primary disk on a Windows operating system is usually not RAID based.

On Linux you can have a very small partition for /boot (between about 256 MB and 512 MB), and the root partition "/" can then be installed on a software RAID array. However, if you create the boot partition on one of the RAID disks, this disk will have less space than the other disks. On RAID it is preferred to have the same disk size.

Picture

What is CF card?

CF is Compact Flash storage, used mainly for digital cameras. It is used less and less today, but it is an interesting storage medium that is sometimes used as a boot drive. You can use an IDE or SATA adapter to connect a small flash card inside your computer and install the operating system on it.

Picture

What is a PC card reader?

To read and write cards you can install a 3.5″ card reader in place of a floppy disk drive. Cards are the new standard for removable media on a modern PC, replacing the floppy disk and other mechanical devices. A new SD card can have a large capacity, is not sensitive to light or to mechanical damage if you drop it, and can be used to store data for a long time.

Picture

Memory Stick

This is a very popular device, used by millions for storing offline files and transporting them. There is a problem: these devices grow in capacity every year yet are very easy to lose. Therefore, if you have a memory stick, take great care not to lose it on the street. Your data could be compromised.

Picture
Memory stick: 128GB up to 512GB

Performance Computing

High Performance Computing (HPC) is a technology for building computers with higher computation power and improved storage performance. These are used for compute-intensive jobs like DML (Deep Machine Learning) and encryption. This kind of hardware is meant for enterprise computing, high-end game development PCs or AI research.

What is RAM ?

RAM = Random Access Memory. But this is old news. What is new is the capacity and frequency of new RAM chips. A computer can have a minimum of 4 GB RAM, but 8 GB is more common. An HPC machine requires a larger capacity: 2 x 16 = 32 GB, or up to 4 x 32 = 128 GB for a high-end HPC.

Picture

Corsair DDR4 - RAM

What is CPU ?

Central Processing Unit = CPU. I will not argue for the AMD or Intel brand. What is important then? Compare clock speeds only when you are deciding between two CPUs from the same family with the same number of cores. There are benchmarks that measure single-threaded performance, which is the most important figure. After that come the number of cores and the cache memory.

The more expensive the chip, the more power it has. This is logical, but only if the chips are from the same brand. The criterion does not work between different brands. As of 2020 Intel had fallen a little behind and was expected to recover in 2021. Apple is making ARM processors for the Mac, and ARM processors with a different architecture are also made for servers. This is cool, but you cannot build your own Mac, so we do not talk about it in our course.

Picture

AMD vs Intel

What is FPGA ?

The CPU is in trouble: AMD and Intel are at war over FPGA = Field Programmable Gate Array. Why? Ah ... machine learning, of course! There are 2 directions of development: one is low-cost, entry-level hardware for hobbyists and the other is for the enterprise. The main players have made these moves:
  1. Intel acquired Altera in December 2015.
  2. AMD announced the acquisition of Xilinx on October 27, 2020.
  3. Nvidia already has the Titan RTX GPU.

Picture

Experimental

Below is an affordable FPGA solution. This little stick is not a memory stick but an AI accelerator for neural network and deep learning programming. You can use a USB hub to connect multiple sticks, and they work together.

Picture

Enterprise

Below is a high-end, enterprise-class eFPGA programmable accelerator card. As you can see, the FPGA is a big player in the HPC domain. We have not yet experimented with this type of hardware due to its extreme cost. The card shown next costs about $5000 on Amazon. Do not buy!

Picture

What is GPU ?

Graphics Processing Unit = GPU. This is an acceleration card for gaming PCs. But you can also build cool applications for the GPU using special libraries. This enables DML (Deep Machine Learning) algorithms to run on the GPU, especially on Nvidia GPUs. AMD and Nvidia compete against each other in this market. Intel does not make video cards for HPC.

Picture

Titan RTX - GPU Video Card

Picture

Titan RTX - GPU Video Ports

What is AIC ?

AIC = Add-in Card. This is a new kind of storage that uses Non-Volatile Memory Express (NVMe). It is based on MLC (Multi Level Cell) memory like an SSD, except that it is much faster than a normal SSD. In the next picture you can see the brand: "Intel Optane SSD AIC Internal Solid State Drive (SSD)". Western Digital also makes a similar card called "WD Black AIC 1TB PCI-Express".

Picture

Performance: This memory card is very fast. You can install the operating system on it or use it as an accelerator card for a very fast database. It can have 400 GB, 800 GB or even 1.2 TB and costs between $350 and $850 depending on the model. When you buy one of these, make sure your motherboard supports it. You need at least one free PCI Express x4 slot. A PCI Express x8 or x16 slot will also work.

What is a HDD?

The most successful and most popular storage device today is the hard disk drive (HDD). This device is a small internal box that contains mechanical and electronic parts used for storing large amounts of data. These devices are very successful due to rapid access when reading or writing information. They are designed for desktop PCs, servers and laptops. There are 2 standard sizes: 3.5″ and 2.5″. In the next picture you can study the internal components of an HDD, but a real HDD is sealed in a case for protection. You cannot look inside the case without a screwdriver.

Picture

ANY HDD IS A SEALED BOX! THIS DISK IS OPEN FOR DEMO PURPOSE. NEVER USE A SCREWDRIVER TO OPEN A HDD EXCEPT IF YOU WANT TO BREAK IT. YOU CAN'T FIX A HDD.

Technology: these devices use one or several disk platters mounted on a spindle. The platters are easy to magnetize and demagnetize using a small coil mounted on a moving head that reads or writes data. Some HDDs use a single platter and others use 2 or 3 platters on the same axis, but new disks can have up to 8 platters to increase capacity.

Performance: Depending on the number of platters mounted on the spindle and on the rotation speed, an HDD can be faster or slower. The spin speed is measured in RPM (revolutions per minute). For example, 15,000 RPM or 10,000 RPM is used for fast, responsive server HDDs, 7,200 RPM is used for desktop computers and 5,400 RPM disks are used for laptop computers.
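One way to see why rotation speed matters is the average rotational latency: on average the head waits half a revolution before the wanted sector passes under it. The sketch below computes this for some typical spindle speeds. The function name `avg_rotational_latency_ms` is illustrative, and the result ignores seek time and transfer time.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average wait for a sector: half of one full revolution, in ms."""
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution / 2 * 1000


for rpm in (15000, 10000, 7200, 5400):
    print(f"{rpm:>5} RPM -> {avg_rotational_latency_ms(rpm):.2f} ms")
```

A 15,000 RPM disk waits about 2 ms on average, while a 5,400 RPM laptop disk waits more than 5 ms, before any data is transferred.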

Capacity: The capacity of an HDD is measured in GB (gigabytes) or TB (terabytes). 1 TB = 1000 GB. One byte = 8 bits. Usually a laptop uses a single HDD of small capacity (GB), while desktops or servers can use 2 or more HDDs of large capacity (TB).
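The unit relations can be written as a few helper functions. This is a minimal sketch using the decimal units printed on drive labels. The function names are illustrative, and `tb_to_gib` is added on the assumption that your operating system reports sizes in binary units (1 GiB = 1024^3 bytes), which is why a 1 TB disk appears smaller than expected.

```python
BITS_PER_BYTE = 8
GB_PER_TB = 1000             # decimal units, as used on drive labels
BYTES_PER_GB = 1000 ** 3
BYTES_PER_GIB = 1024 ** 3    # binary unit used by many operating systems


def tb_to_gb(tb: float) -> float:
    """Convert terabytes to gigabytes (decimal)."""
    return tb * GB_PER_TB


def gb_to_bits(gb: float) -> int:
    """Convert gigabytes to bits."""
    return int(gb * BYTES_PER_GB) * BITS_PER_BYTE


def tb_to_gib(tb: float) -> float:
    """Convert decimal terabytes to binary gibibytes."""
    return tb * GB_PER_TB * BYTES_PER_GB / BYTES_PER_GIB


print(tb_to_gb(2))               # capacity of a 2 TB disk in GB
print(gb_to_bits(1))             # number of bits in 1 GB
print(round(tb_to_gib(1), 1))    # size of a 1 TB disk in GiB
```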

Because an HDD has moving parts, it generates heat and vibrations and is subject to mechanical and thermal degradation. Today you can find large-capacity HDDs of up to 10 TB for servers, using a SATA or SAS interface, or for NAS storage.

High capacity RAID

RAID = Redundant Array of Inexpensive Disks. This is a technology used in servers to enlarge storage capacity and improve the reliability of data storage in data centers. There are multiple types of RAID configurations, optimized for speed, safety or a balance between the two.

Backup on RAID

On Windows a backup disk must be a single disk. Sometimes you can use 2 independent disks for alternating backups. On Windows NT you can specify several backup disks that are used automatically by the system and are not assigned a letter like the other disks. We do not recommend a RAID array for desktop computers, only for servers or workstations. One large HDD can be superior to a bad RAID array.

RAID Inconvenience

There are 2 ways to create a RAID array: software RAID and hardware RAID. For hardware RAID you need a dedicated card that controls several disks. New PC motherboards support RAID through the SATA controller. Software RAID is available as a feature of the operating system.

RAID Parameters

There are 2 important parameters you have to consider when you decide what kind of RAID to create:

CHUNK SIZE

To get good performance you must have a reasonable chunk size specific to your work or job.

For video editing: You want a lot of bandwidth, so you want a small chunk size. Anywhere from 512 bytes (one block) to 8 KB is a small chunk size. This allows multiple disks to contribute to the data stream at once, increasing the data bandwidth.

For a database: You want to maximize your IOPS, which ideally means sending each I/O to only one disk. So you want a large chunk size – at least 64 KB or more. That large chunk will mean that most I/Os get serviced by a single disk and more I/Os are available on the remaining disks.
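The effect of chunk size can be seen by counting how many disks a single I/O touches. The sketch below assumes simple round-robin striping, where chunk number k is stored on disk k modulo the number of disks. The function name `disks_touched` is illustrative, and real RAID implementations may lay out data differently.

```python
def disks_touched(offset: int, length: int, chunk_size: int, n_disks: int) -> set:
    """Return the set of disk indexes that one I/O request touches."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    # Round-robin striping: chunk k lives on disk k % n_disks.
    return {chunk % n_disks for chunk in range(first_chunk, last_chunk + 1)}


KB = 1024

# A 16 KB database read on a 4-disk array with a large 64 KB chunk.
print(sorted(disks_touched(0, 16 * KB, 64 * KB, 4)))   # -> [0]

# The same read with a small 4 KB chunk is spread over every disk.
print(sorted(disks_touched(0, 16 * KB, 4 * KB, 4)))    # -> [0, 1, 2, 3]
```

With the large chunk the read keeps one disk busy and leaves three free for other requests. With the small chunk all four disks work on the same request, which gives more bandwidth to that single stream.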

RAID LEVELS

RAID capacity, performance and reliability have to be balanced. Depending on your goal, one kind of RAID is better than another. Here are the most important RAID flavors:

JBOD: Just a Bunch Of Disks, known as "spanned volume" on Windows. It is the best choice for a desktop PC. It increases the size of the volume and keeps performance at its maximum. This is the most suitable option for a beginner who has many disks in their computer.

RAID-0: Known as "striped volume" on Windows. RAID-0 is good for speed, and you can use two or more hard drives. The storage capacity grows linearly with the number of drives. On Linux, RAID-0 can be created using MD-RAID, LVM or Btrfs.

Raid Picture

The major limitation of RAID-0 compared to other RAID levels is that it does not offer any redundancy. RAID-0 can be used to increase storage capacity and sequential read and write speed.

RAID-1: Known as "mirrored volume" on Windows. RAID-1 is good for data safety and increases read speed, but you can use only 2 hard drives. RAID-1 offers 100% redundancy but will not increase your write speed or disk space.

Mirrored volumes are available only on Windows professional edition and Windows server edition. They are not available on Windows home edition.

Raid Picture

Note: RAID chunk size does not affect performance for RAID-1.

RAID-5: uses 3 or more hard drives and a special algorithm to distribute data and parity across the disks. If one of the disks fails, it can be replaced and the array is rebuilt by the system. The tricky part is knowing which hard disk has failed. If you replace the wrong hard drive you will probably lose all the data.

Raid Picture
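The rebuild after a disk failure in RAID-5 relies on parity, which is commonly computed with XOR. The sketch below is a simplified illustration: it keeps the parity in one separate block, while a real RAID-5 array rotates parity across all disks. The function name `xor_blocks` and the sample data are only illustrative.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per disk, plus the parity block.
data = [b"DISK-A..", b"DISK-B..", b"DISK-C.."]
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild it from the
# surviving data blocks and the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

The same idea explains why RAID-5 survives only one failure: with two blocks missing, a single parity block is not enough to recover them.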

RAID-6: uses 4 or more hard drives and double redundancy. If any disk fails you can replace it, and even if you replace the wrong disk the data is not lost. This RAID level needs some more processing power to write to and read from the HDDs because of the redundancy algorithm. This is the most used RAID level in enterprise-class NAS or server storage.

RAID-10: You need a minimum of 4 disks to create RAID-10. Write speed will be up to 2x faster than a single disk, while read speed is up to 4x faster. RAID-10 can use 4, 6 or 8 HDDs for storing videos and pictures. RAID-10 is not good for servers.

Raid Picture
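To compare the levels described above, it helps to compute how much usable space each one leaves. The following is a minimal sketch that assumes all disks in the array have the same size. The function name `usable_capacity_tb` and the level labels are illustrative, and the formulas are the usual ones: one disk of parity for RAID-5, two for RAID-6, and half of the disks for the mirrored levels.

```python
def usable_capacity_tb(level: str, n_disks: int, disk_tb: float) -> float:
    """Usable capacity of an array of identical disks, in TB."""
    if level in ("JBOD", "RAID-0"):
        return n_disks * disk_tb              # no redundancy at all
    if level == "RAID-1":
        if n_disks < 2:
            raise ValueError("RAID-1 needs at least 2 disks")
        return disk_tb                        # every disk holds the same data
    if level == "RAID-5":
        if n_disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        return (n_disks - 1) * disk_tb        # one disk worth of parity
    if level == "RAID-6":
        if n_disks < 4:
            raise ValueError("RAID-6 needs at least 4 disks")
        return (n_disks - 2) * disk_tb        # two disks worth of parity
    if level == "RAID-10":
        if n_disks < 4 or n_disks % 2:
            raise ValueError("RAID-10 needs an even number of disks, 4 or more")
        return n_disks / 2 * disk_tb          # half the disks are mirrors
    raise ValueError(f"unknown level: {level}")


for level, disks in [("RAID-0", 4), ("RAID-1", 2), ("RAID-5", 4),
                     ("RAID-6", 4), ("RAID-10", 4)]:
    print(f"{level} with {disks} x 4 TB: "
          f"{usable_capacity_tb(level, disks, 4.0)} TB usable")
```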

Raid combination

Enterprise servers use RAID-6. However, combining RAID-6 with striping creates larger hard drive arrays that are faster than plain RAID-6. This combination is called RAID-60. You can find diagrams for different kinds of combinations on the Internet. The important thing to know is that it can be done.

Spare disks

Some RAID arrays support one or more spare disks. These disks hold no data but are kept in reserve. If one of the disks fails, the RAID is able to repair itself, disabling the defective disk automatically and rebuilding the array on the fly. This is a professional solution and is frequently used on critical servers.

Personal computer RAID

A personal computer sometimes supports RAID, but most of the time a desktop computer has a single hard disk. If you use two disks, one is for the operating system and one is for backup. If you have 4 disks, you can use disks of different sizes or of the same size.

The first disk is for the Windows system, and it can be an SSD. The second disk is of large capacity and is for backup. Disks 3 and 4 can be the same size and connected in RAID-1. Never use RAID-0 on a PC, because you risk losing data. RAID-0 can break easily on a desktop computer if you use ordinary SATA disks. Servers use a different class of hard disks: Enterprise Class, made expressly for RAID. If you purchase this kind of disk you can have RAID at home.

NAS Storage

A NAS is an external box with several HDDs installed in RAID. It is in fact a computer dedicated to providing large storage capacity. It usually connects over the network, while some external storage boxes instead connect using an eSATA (external SATA) cable or fiber optics. In the next picture you can see an example of NAS storage. Investigate carefully before you buy a NAS or external drive solution. This is not for everyone, but it can be useful for a small office to make backups or to store a large database.

Picture

Server / NAS RAID storage

RAID is usually required for servers. In the picture below we show you a rack with several mounted disks. Each disk can be extracted easily without any tool. The drives are installed in special drawers that have ventilation and a special handle that disconnects the drive, so you can pull the drive out while the server is running.

Of course you must have a good reason to pull out a disk or an entire array. Usually a defective disk has a red light that indicates it is defective and must be changed. If it does not, you can measure the temperature of the disks, and if one gets hot it must be changed. Of course this is difficult if the drawer does not have a thermometer on it.

Raid Picture


You are ready! Go back & take the exam.