CPU, Disk, RAM, Ethernet Data Flow


Trying to get an understanding of the data flow on a modern consumer desktop computer.

  1. Looking first at a SATA port: if you want to load some bytes into RAM, does the CPU send that request to the memory controller, which then handles the request to the SATA device? Is that data then moved into RAM by the memory controller, or do the CPU's caches or registers get involved with the data at all?

  2. I assume the OS typically blocks the thread until an I/O request is completed. Does the memory controller send an interrupt to let the OS know it can schedule that thread onto the run queue again?

  3. Ethernet: assuming the above steps are complete and some bytes of a file have been loaded into RAM, does the memory get moved to the Ethernet controller by the memory controller, or does the CPU get involved in holding any of this data?

  4. What if you use a socket with localhost? Do we just do a round trip through the memory controller, or do we involve the Ethernet controller at all?

  5. Is a SATA-to-SATA storage transfer buffered anywhere?

I know that is a lot of questions; if you can comment on any of them I would appreciate it! I am really trying to understand the fundamentals here. I have a hard time moving on to the higher levels of abstraction without these details...



BEST ANSWER

Looking first at a SATA port: if you want to load some bytes into RAM, does the CPU send that request to the memory controller, which then handles the request to the SATA device? Is that data then moved into RAM by the memory controller, or do the CPU's caches or registers get involved with the data at all?

The CPU is not involved during the operation. It is the AHCI controller, a PCI DMA device, which performs the transfer.

The AHCI specification from Intel (https://www.intel.ca/content/www/ca/en/io/serial-ata/serial-ata-ahci-spec-rev1-3-1.html) is used for SATA disks. Meanwhile, for more modern NVMe disks, an NVMe PCIe controller is used (https://wiki.osdev.org/NVMe).

The OS basically writes to the registers of the AHCI, which are memory-mapped into the physical address space. This way, it can tell the AHCI what to do and tell it to write to RAM at certain positions (probably in a buffer provided/allocated by the user-mode process which asks for the data on disk). The operation is DMA, so the CPU is not really involved in moving the data.
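To make that concrete, here is a minimal sketch of what "writing the registers" looks like from a driver's side. The structure and field names are illustrative simplifications, not the real AHCI register layout (the spec linked above defines that); the point is that programming a DMA transfer is just ordinary stores to memory-mapped addresses.

```cpp
#include <cstdint>

// Illustrative subset of an AHCI port's memory-mapped registers; the real
// layout is defined in the AHCI spec, and these names/offsets are
// simplified for the example.
struct AhciPortRegs {
    volatile uint32_t clb;   // command list base address (low 32 bits)
    volatile uint32_t clbu;  // command list base address (high 32 bits)
    volatile uint32_t ci;    // command issue: set a bit to start a command
};

// Programming a DMA transfer is just ordinary stores to these mapped
// addresses; the controller then reads/writes RAM on its own.
void issue_command(AhciPortRegs* port, uint64_t cmd_list_phys, int slot) {
    port->clb  = static_cast<uint32_t>(cmd_list_phys);
    port->clbu = static_cast<uint32_t>(cmd_list_phys >> 32);
    port->ci   = 1u << slot;
}
```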

From C++, you ask for data probably using fstream, or through direct API calls to the OS via the library it provides. For example, on Windows, you can use the WriteFile() function (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile) to write to files.
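For instance, a minimal Windows sketch using WriteFile() looks like this; everything below the call is handled by the kernel and the controller:

```cpp
#include <windows.h>

int main() {
    HANDLE h = CreateFileA("example.bin", GENERIC_WRITE, 0, nullptr,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE) return 1;

    const char data[] = "hello";
    DWORD written = 0;
    // Below this call, the kernel validates the handle and hands the
    // request to the storage stack; the AHCI performs the actual DMA.
    WriteFile(h, data, sizeof data - 1, &written, nullptr);
    CloseHandle(h);
    return 0;
}
```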

Underneath, the library is a thin wrapper which makes system calls (it is the same for the standard C++ library). You can browse my answer at "Who sets the RIP register when you call the clone syscall?" for more information on system calls and how they work on modern systems.
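As a sketch of how thin that wrapper is, on Linux you can invoke the same underlying system call directly through libc's syscall() helper:

```cpp
#include <sys/syscall.h>
#include <unistd.h>

int main() {
    const char msg[] = "written via a raw system call\n";
    // This is essentially what the library wrappers do underneath:
    // a syscall instruction transfers control to the kernel's handler.
    syscall(SYS_write, 1 /* stdout */, msg, sizeof msg - 1);
    return 0;
}
```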

The memory controller is not really involved; it only routes the writes to the registers of the AHCI. Brendan's answer (the second answer below) is probably more useful for that matter, because I'm not really sure of the details.

I assume the OS typically blocks the thread until an I/O request is completed. Does the memory controller send an interrupt to let the OS know it can schedule that thread onto the run queue again?

Yes, the OS blocks the thread, and yes, the AHCI triggers an MSI interrupt on command completion.

The OS will put the process on a queue of processes which are waiting for I/O. The AHCI does trigger interrupts, using the MSI capability of PCI devices. MSI is a special capability which allows a device to bypass the I/O APIC of modern x86 systems and write a message-signaled interrupt directly to the local APIC. The local APIC then looks up the vector in the IDT, as you would expect, and makes the processor jump to the associated handler.

It is the vector number which differentiates between the devices that triggered the interrupt, which makes it easy for the OS to install the proper handler for the device (a driver's handler) on that interrupt vector.
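A toy sketch of that dispatch idea (the names and structure here are made up; real OSes are considerably more elaborate):

```cpp
#include <cstdint>

// Toy dispatch table: the OS installs one handler per interrupt vector,
// so the vector number alone identifies which device's driver to run.
using Handler = void (*)();
static Handler handlers[256];

void ahci_irq_handler() {
    // A real driver would acknowledge the controller here, then move the
    // thread blocked on this I/O back onto the scheduler's ready queue.
}

void install_handler(uint8_t vector, Handler h) { handlers[vector] = h; }

// Called from the low-level stub that the IDT entry points to.
void dispatch_interrupt(uint8_t vector) {
    if (handlers[vector]) handlers[vector]();
}
```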

The OS has a driver model which accounts for the different types of devices found in modern computers. The driver model will often be in the form of a virtual filesystem, which presents every device to the upper layers of the OS as a file. The upper layers make open, read, write and ioctl calls on the file. Underneath, the driver does complex things like triggering read/write cycles by writing the registers of the AHCI and putting processes on queues to wait for I/O.
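For example, on Linux the "everything is a file" model means you can read raw sectors from a disk with the same open/read calls used for ordinary files ("/dev/sda" is a typical device name; reading it usually requires root):

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main() {
    // Under the virtual filesystem, a block device appears as a file.
    int fd = open("/dev/sda", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    char sector[512];
    // Beneath this read(), the driver programs the AHCI and the thread
    // sleeps until the completion interrupt arrives.
    ssize_t n = read(fd, sector, sizeof sector);
    printf("read %zd bytes from the raw device\n", n);
    close(fd);
    return 0;
}
```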

From user mode (like when calling fstream's open method), it is really a syscall which is made. The syscall handler checks all permissions and makes sure that everything is okay with the request before returning a handle to the file.

Ethernet: assuming the above steps are complete and some bytes of a file have been loaded into RAM, does the memory get moved to the Ethernet controller by the memory controller, or does the CPU get involved in holding any of this data?

The Ethernet controller is also a PCI DMA device: it reads and writes RAM directly.

The Ethernet controller is a PCI DMA device. I never wrote an Ethernet driver, but I can tell you it reads and writes RAM directly. For network communications you have sockets, which act similarly to files but are not part of the virtual filesystem: network cards (including Ethernet controllers) are not presented to the upper layers as files. Instead, you use sockets to communicate with the controller. Sockets are not implemented in the C++ standard library, but they are present on all widespread platforms as OS-provided libraries that must be used from C++ or C. Socket operations are also system calls of their own.
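A minimal POSIX sketch of that socket path (the destination address is a documentation address, just for illustration):

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main() {
    // socket(), connect() and send() are each system calls of their own.
    // The kernel's network stack builds the frames and hands them to the
    // NIC driver, which queues them for the controller's DMA engine.
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return 1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(80);
    inet_pton(AF_INET, "192.0.2.1", &addr.sin_addr);  // documentation address

    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) == 0) {
        const char msg[] = "hello over TCP";
        send(fd, msg, sizeof msg - 1, 0);
    }
    close(fd);
    return 0;
}
```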

What if you use a socket with localhost? Do we just do a round trip through the memory controller, or do we involve the Ethernet controller at all?

The Ethernet controller is probably not involved.

If you use sockets with localhost, the OS will simply send the data to the loopback interface. The Wikipedia article is quite direct here (https://en.wikipedia.org/wiki/Localhost):

In computer networking, localhost is a hostname that refers to the current computer used to access it. It is used to access the network services that are running on the host via the loopback network interface. Using the loopback interface bypasses any local network interface hardware.
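In code, the client side looks identical to the remote case; only the destination address changes, and the kernel short-circuits it in software (the port number below is an assumption; something must be listening there for connect() to succeed):

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return 1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(8080);  // assumes some local server listens here
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    // The kernel routes this through the loopback interface: the bytes are
    // copied between kernel buffers in RAM and never reach the Ethernet
    // controller or the wire.
    connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
    close(fd);
    return 0;
}
```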

Is a SATA-to-SATA storage transfer buffered anywhere?

It is transferred from the first device into RAM, then transferred to the second device afterwards.

On the link I provided earlier for the AHCI it is stated that:

This specification defines the functional behavior and software interface of the Advanced Host Controller Interface (AHCI), which is a hardware mechanism that allows software to communicate with Serial ATA devices. AHCI is a PCI class device that acts as a data movement engine between system memory and Serial ATA devices.

The AHCI isn't for moving from SATA to SATA; it is for moving from SATA to RAM or from RAM to SATA. A SATA-to-SATA operation thus involves bringing the data into RAM and then moving the data from RAM to the other SATA device.
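From user space this shows up as the familiar read/write copy loop. The buffer below is the RAM staging area between the two DMA transfers (the paths are hypothetical mount points; in practice the OS page cache adds another layer of RAM buffering in between):

```cpp
#include <fcntl.h>
#include <unistd.h>

int main() {
    // Hypothetical mount points for two SATA disks.
    int src = open("/mnt/diskA/file.bin", O_RDONLY);
    int dst = open("/mnt/diskB/file.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (src < 0 || dst < 0) return 1;

    // This buffer is the RAM staging area: the first AHCI transfer DMAs
    // disk A's data into it, the second DMAs it back out to disk B.
    char buf[1 << 16];
    ssize_t n;
    while ((n = read(src, buf, sizeof buf)) > 0)
        write(dst, buf, static_cast<size_t>(n));

    close(src);
    close(dst);
    return 0;
}
```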

SECOND ANSWER

The memory controller doesn't create requests itself (it has no DMA or bus mastering capabilities).

The memory controller is mostly about routing requests to the right places. For example, if the CPU (or a device) asks to read 4 bytes from physical address 0x12345678, then the memory controller uses the physical address to figure out whether to route that request to a PCI bus, or to a different NUMA node/different memory controller (e.g. using QuickPath, HyperTransport, or Omni-Path links to other chips/memory controllers), or to its locally attached RAM chips. If the memory controller forwards a request to its locally attached RAM chips, then the memory controller also handles the "which memory channel" and timing part; and it may also handle encryption and ECC (both checking/correcting errors, and reporting them to the OS).
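A rough software caricature of that routing decision (the address ranges here are invented; real decode windows are programmed by firmware/chipset configuration):

```cpp
#include <cstdint>

enum class Route { LocalDram, PciBus, RemoteNode };

// Software caricature of the hardware address decode described above:
// the destination is picked purely from the physical address.
Route route_request(uint64_t phys_addr) {
    if (phys_addr >= 0xE0000000ull && phys_addr < 0x100000000ull)
        return Route::PciBus;      // e.g. an MMIO window for PCI devices
    if (phys_addr >= 0x1000000000ull)
        return Route::RemoteNode;  // memory homed on another NUMA node
    return Route::LocalDram;       // locally attached RAM channels
}
```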

Most devices support bus mastering themselves.

Because most operating systems use paging, "contiguous in virtual memory" often doesn't imply "contiguous in physical memory". Because devices only deal with physical addresses and most transfers are not contiguous in physical memory, most devices support the use of "lists of extents". For example, if a disk controller driver wants to read 8 KiB from a disk, then the driver may tell the disk controller "get the first 2 KiB from physical address 0x11111800, then the next 4 KiB from physical address 0x22222000, then the last 2 KiB from physical address 0x33333000"; and the disk controller will follow this list to transfer the pieces of an 8 KiB transfer to the desired addresses.
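The 8 KiB example above, expressed as the kind of extent list a driver hands to the controller (the struct layout is illustrative; real controllers define their own, e.g. AHCI calls its list a PRDT):

```cpp
#include <cstdint>
#include <vector>

// Illustrative scatter-gather entry; real controllers define their own
// layout, but the idea is the same.
struct Extent {
    uint64_t phys_addr;  // physical (or IOMMU-translated "device") address
    uint32_t length;     // bytes to transfer at that address
};

// The 8 KiB transfer from the example, as a list the controller walks
// entry by entry.
std::vector<Extent> example_extents() {
    return {
        {0x11111800, 2 * 1024},  // first 2 KiB
        {0x22222000, 4 * 1024},  // next 4 KiB
        {0x33333000, 2 * 1024},  // last 2 KiB
    };
}
```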

Because devices use physical addresses and almost all software (including the kernel and drivers) primarily uses virtual addresses, something (the kernel) has to convert the virtual addresses (e.g. from a "read()" function call) into the "list of extents" that the device driver(s) need. When an IOMMU is being used (for security or virtualization), this conversion may include configuring the IOMMU to suit the transfer; in that case it's better to think of them as "device addresses" that the IOMMU converts/translates into physical addresses (the device uses "device addresses" and not actual physical addresses).

For some (relatively rare, mostly high-end server) cases, the motherboard/chipset may also include some kind of DMA engine (e.g. "Intel QuickData Technology"). Depending on the DMA engine, this may be able to inject data directly into CPU caches (rather than only being able to transfer to/from RAM like most devices), and may be able to handle direct "from one device to another device" transfers (rather than having to use RAM as a buffer). However, in general (because device drivers need to work when there's no "relatively rare" DMA engine), it is likely that any DMA engine provided by the motherboard/chipset won't be supported well by the OS or drivers (and likely that the DMA engine won't be supported or used at all).