Feat: complete the docs

2025-04-14 09:02:16 +08:00
commit a7c41e264b
37 changed files with 1261 additions and 0 deletions
@@ -0,0 +1,20 @@
+archlinux:
+  type: Web
+  title: ArchLinux, a simple, lightweight distribution
+  url: https://archlinux.org/
+btrfs:
+  type: Web
+  title: Btrfs, a modern copy on write filesystem for Linux
+  url: https://btrfs.readthedocs.io/en/latest/
+ct1000p3pssd8:
+  type: Web
+  title: Crucial P3 Plus 1TB PCIe M.2 2280 SSD
+  url: https://www.crucial.tw/ssd/p3-plus/ct1000p3pssd8
+wd10ezex:
+  type: Web
+  title: WD Blue PC Desktop Hard Drive - 1TB
+  url: https://www.westerndigital.com/products/internal-drives/wd-blue-desktop-sata-hdd?sku=WD10EZEX
+tufh470:
+  type: Web
+  title: TUF Gaming H470 Pro WiFi Motherboard
+  url: https://www.asus.com/tw/motherboards-components/motherboards/tuf-gaming/tuf-gaming-h470-pro-wi-fi/techspec/
@@ -0,0 +1 @@
+../../bonus/bonus.output
@@ -0,0 +1 @@
+../../q0/q0.output
@@ -0,0 +1 @@
+../../q1/q1.output
@@ -0,0 +1 @@
+../../q2/q2.output
@@ -0,0 +1 @@
+../../q3/q3.output
@@ -0,0 +1 @@
+../../q4/q4.output
@@ -0,0 +1 @@
+../../q5/q5.output
@@ -0,0 +1 @@
+../../q6/q6.output
@@ -0,0 +1 @@
+../../q7/q7.output
@@ -0,0 +1,188 @@
+#import "template/styles.typ": document
+#show: document.with(
+  title: "Memory and Storage System - HW1 FIO",
+  authors: ((
+    name: "Yi-Ting Shih (111550013)",
+    affiliation: "National Yang Ming Chiao Tung University",
+    email: "ytshih@cs.nycu.edu.tw",
+  ),),
+)
+
+= Specifications
+
+All read operations except the bonus were tested on an SSD. \
+Model: Micron CT1000P3PSSD8@ct1000p3pssd8 (Crucial P3 Plus)
+
+All write operations were tested on regular files stored on a RAID 1
+btrfs@btrfs filesystem.
+
+The tests ran on Arch Linux@archlinux with Linux kernel
+6.14.2.arch1-1.
+
+= Questions
+
+== Q0. Example
+
+=== Result
+#raw(read("code/q0.output"))
+
+== Q1. Read vs. Randread
+
+=== Result
+#raw(read("code/q1.output"))
+
+=== Question
+Is there any significant difference between read and randread? Why or why not?
+Please justify your answer in brief.
+
+=== Answer
+_read_ is significantly faster than _randread_.
+
+The SSD controller might be optimized for sequential access, for example by
+leveraging internal parallelism more effectively, or it might have lower
+command overhead per unit of data for sequential reads.
+
+== Q2. Write vs. Randwrite
+
+=== Result
+#raw(read("code/q2.output"))
+
+=== Question
+Is there any significant difference between write and randwrite? Why or why not?
+Please justify your answer in brief.
+
+=== Answer
+There is no significant difference between _write_ and _randwrite_.
+
+The SSD controller goes through the same page/block translation process
+whether the access is a sequential write or a random write. Therefore, there
+is no significant difference.
+
+Also, I used regular files on a btrfs filesystem to test the write operations.
+Writes may be absorbed by a btrfs@btrfs cache or the page cache, so the data
+might not actually be written to the SSD during the test.
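The caching effect described above can be observed outside fio by comparing a plain buffered write with one followed by `os.fsync`. This is a minimal Python sketch, not part of the original experiment; the scratch file name and the 16 MiB payload are illustrative assumptions.

```python
import os
import tempfile
import time


def timed_write(path, data, sync):
    """Write data to path and return the elapsed time in seconds.

    With sync=False the call may return once the data sits in the page
    cache; with sync=True os.fsync asks the kernel to flush it to storage.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                 # push Python's own buffer to the kernel
        if sync:
            os.fsync(f.fileno())  # request a flush to the storage device
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = os.urandom(16 * 1024 * 1024)  # 16 MiB, an arbitrary size
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "scratch.bin")  # hypothetical file name
        buffered = timed_write(target, payload, sync=False)
        synced = timed_write(target, payload, sync=True)
        print(f"buffered: {buffered:.4f}s, with fsync: {synced:.4f}s")
```

In fio itself, the comparable knobs are `direct=1` and `end_fsync=1`; whether they would change the Q2 result on this setup is untested here.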
+
+== Q3. Forward write vs. Backward write
+
+=== Result
+#raw(read("code/q3.output"))
+
+=== Question
+Is there any significant difference between forward and backward write? Why or
+why not? Please justify your answer in brief.
+
+=== Answer
+There is no significant difference between _forward write_ and _backward write_.
+
+The reason is the same as in the previous question.
+
+== Q4. Buffered read vs. Nonbuffered read
+
+=== Result
+#raw(read("code/q4.output"))
+
+=== Question
++ Is there any significant difference between buffered and nonbuffered
+  sequential read? Why or why not? Please justify your answer in brief.
++ Replace sequential read with random read. Is there any significant difference
+  between buffered and nonbuffered random read? Why or why not? Please justify
+  your answer in brief.
+
+=== Answer
+- _buffered sequential read_ is significantly faster than _non-buffered
+  sequential read_.
+- There is no significant difference between _buffered random read_ and
+  _non-buffered random read_.
+
+For sequential reads, the kernel can prefetch data into the page cache through
+_read-ahead_. This is not possible for random reads, because the offset of the
+next read cannot be predicted.
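The read-ahead argument above can be illustrated with a toy model. The sketch below is a simplification and not the kernel's actual algorithm: it assumes a cache that never evicts, a fixed prefetch window of 8 blocks, and a hypothetical file of 100,000 blocks.

```python
import random


def hit_rate(offsets, readahead=8):
    """Fraction of block reads served from a toy read-ahead cache.

    On a miss, the block and the next `readahead` blocks are loaded,
    mimicking a prefetcher that assumes access continues forward.
    The cache never evicts, which is a deliberate simplification.
    """
    cached = set()
    hits = 0
    for block in offsets:
        if block in cached:
            hits += 1
        else:
            cached.update(range(block, block + readahead + 1))
    return hits / len(offsets)


if __name__ == "__main__":
    n_blocks = 100_000           # hypothetical file size in blocks
    sequential = list(range(1000))
    rng = random.Random(0)       # fixed seed for repeatability
    scattered = [rng.randrange(n_blocks) for _ in range(1000)]
    print("sequential hit rate:", hit_rate(sequential))
    print("random hit rate:    ", hit_rate(scattered))
```

In this model almost every sequential read is a cache hit, while random reads rarely land in a prefetched window, which matches the direction of the Q4 observation.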
+
+== Q5. LBA
+
+=== Result
+#raw(read("code/q5.output"))
+
+=== Question
+Is there any bandwidth trend between these jobs? Why or why not? Please justify
+your answer in brief. You can also experiment with the size of the file.
+
+=== Answer
+The job with `offset=80%` is faster than the job with `offset=60%`, which in
+turn is faster than the rest of the jobs.
+
+This observation is purely empirical. I could not identify a theoretical
+explanation for the trend, so it needs further
+investigation.
+
+== Q6. Blocksize
+
+=== Result
+#raw(read("code/q6.output"))
+
+=== Question
+
++ Is there any significant difference between 4k and 1k read? Why or why not?
+  Please justify your answer in brief.
++ If you want to achieve the best performance in the condition above, how would
+  you modify `blocksize`? Explain it briefly.
+
+=== Answer
+- _4k read_ is significantly faster than _1k read_ in terms of bandwidth and
+  latency.
+- Bandwidth can be increased further by raising `blocksize`, up to `8m`.
+
+A larger `blocksize` is likely to help for large sequential reads.
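One possible way to reason about the block size effect is a fixed-overhead model: each request pays a constant cost plus the transfer time. The sketch below is an assumption made for illustration, not a measurement; the 20 µs overhead and the 3.4 GB/s ceiling are hypothetical placeholders.

```python
def effective_bandwidth(block_size, overhead_s, peak_bps):
    """Bytes per second when each request pays a fixed overhead.

    Model: time per request = overhead_s + block_size / peak_bps.
    """
    return block_size / (overhead_s + block_size / peak_bps)


if __name__ == "__main__":
    OVERHEAD_S = 20e-6        # assumed per-request cost, not measured
    PEAK_BPS = 3.4e9          # assumed device ceiling in bytes per second
    for label, size in [("1k", 1024), ("4k", 4096), ("8m", 8 * 1024**2)]:
        bw = effective_bandwidth(size, OVERHEAD_S, PEAK_BPS)
        print(f"bs={label:>2}: {bw / 1e6:8.1f} MB/s")
```

Under this model bandwidth rises with block size and flattens as it approaches the ceiling, which is consistent with the gain levelling off around `8m`.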
+
+== Q7. Fastest nonbuffered read
+
+=== Result
+#raw(read("code/q7.output"))
+
+=== Question
+Please explain how you achieve the fastest 1G nonbuffered read in brief.
+
+=== Answer
+According to the results of the previous questions, namely Q5 and Q6, the
+fastest non-buffered read happens when `offset=80%` and `blocksize=8m`.
+
+Furthermore, I tested the difference between `ioengine=psync`, which is the
+default for fio, and `ioengine=libaio`, which is Linux native asynchronous
+I/O. The result shows that `ioengine=libaio` is better in terms of bandwidth.
+
+For asynchronous I/O, I also tested the impact of `iodepth`. The result
+shows that `iodepth=64` reaches the peak bandwidth.
+
+Therefore, the fastest non-buffered 1G read should be the combination of
+`offset=80%`, `blocksize=8m`, `ioengine=libaio`, and `iodepth=64`.
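The combination above could be expressed as a fio command line. The following Python sketch only assembles the arguments and does not run fio. The job name, the target path, and the choice of `rw=read` are assumptions, since the actual Q7 job file is not shown here.

```python
import shlex


def fio_command(filename):
    """Build a fio command line for the Q7 settings as an argument list."""
    return [
        "fio",
        "--name=q7",               # job name chosen for illustration
        f"--filename={filename}",
        "--rw=read",               # sequential read (assumed)
        "--size=1g",
        "--direct=1",              # non-buffered (O_DIRECT) I/O
        "--offset=80%",
        "--bs=8m",
        "--ioengine=libaio",
        "--iodepth=64",
    ]


if __name__ == "__main__":
    # "/path/to/testfile" is a placeholder, not the file used in the report.
    print(shlex.join(fio_command("/path/to/testfile")))
```

Keeping the options in a list makes it straightforward to vary one parameter at a time, such as `iodepth`, when repeating the comparison.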
+
+== Bonus
+
+=== Result
+#raw(read("code/bonus.output"))
+
+=== Question
++ Compare with multiple kinds of storage devices. \
+  Is there any significant difference between read/randread on storage devices?
+  Why or why not? Please justify your answer in brief.
++ Find the specs of your own hardware that you tested on. \
+  Is your hardware running to spec? If not, could you come up with a possible
+  theory?
+
+=== Answer
+This experiment includes two storage devices: a Crucial P3 Plus M.2
+SSD@ct1000p3pssd8 (Disk 1) and a WD Blue HDD@wd10ezex (Disk 2). The rated
+sequential read rate of Disk 1 is 5000 MB/s, and the rated transfer rate of
+Disk 2 is up to 150 MB/s.
+
+The results of the experiment show that
++ Disk 1 (SSD) is significantly faster than Disk 2 (HDD).
++ Disk 1 is considerably slower than its hardware spec (about 3400 MB/s measured).
++ Disk 2 is faster than its hardware spec (about 170 MB/s measured).
+
+The underperformance of Disk 1 might be because
+- I was using Disk 1 to run my OS at the same time, which interferes with the
+  results.
+- Disk 1 had been in use for over 4 years and 50% of its capacity was occupied.
+  In contrast, Disk 2 was seldom used and none of its capacity was occupied.
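The gap between measured and rated throughput can be made explicit with a short calculation. This sketch uses only the approximate figures quoted in the answer above; the helper name `percent_of_spec` is introduced here for illustration.

```python
def percent_of_spec(measured_mb_s, rated_mb_s):
    """Measured throughput as a percentage of the rated figure."""
    return 100.0 * measured_mb_s / rated_mb_s


if __name__ == "__main__":
    # Approximate figures quoted in the answer above (MB/s).
    disks = {
        "Disk 1 (SSD)": (3400, 5000),
        "Disk 2 (HDD)": (170, 150),
    }
    for name, (measured, rated) in disks.items():
        print(f"{name}: {percent_of_spec(measured, rated):.0f}% of spec")
```

With these numbers Disk 1 reaches roughly 68% of its rated speed, while Disk 2 reaches roughly 113%.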
+
+#bibliography("bibliography.yml")