1. System Call Basics System calls (syscalls) are the interface for user-space programs to request services from the kernel. Examples include:
File I/O: read(), write(), open(), close(). Device Control: ioctl(). Signal Handling: kill(), signal(). 2. System Call Table and Registration Syscall Table: A table (sys_call_table) maps syscall numbers to handler functions. Architecture-Specific: x86: Defined in arch/x86/entry/syscalls/syscall_64.tbl. ARM: Defined in arch/arm/tools/syscall.tbl. Registration: Syscalls are registered at compile time using macros like SYSCALL_DEFINE (e.g., SYSCALL_DEFINE3(write, ...) for write()). For custom syscalls (rare and discouraged), you would: Add an entry to the syscall table. Define the handler using SYSCALL_DEFINE. Recompile the kernel (or use modules for dynamic insertion). 3. Flow of System Calls 1. User-Space Invocation The libc wrapper (e.g., read(), ioctl()) triggers a software interrupt (int 0x80 on x86) or uses the syscall instruction (modern x86/ARM). // User-space code fd = open("/dev/mydevice", O_RDWR); // Syscall 1: open() read(fd, buf, 100); // Syscall 2: read() ioctl(fd, MY_CMD, arg); // Syscall 3: ioctl() close(fd); // Syscall 4: close() 2. Transition to Kernel Mode Switches to kernel mode (ring 0 on x86, EL1 on ARM). Saves user-space registers (e.g., RIP, RSP, EFLAGS). Jumps to the kernel’s syscall entry point (e.g., entry_SYSCALL_64 on x86) 3. Syscall Dispatching Syscall Number: The syscall number is stored in a register (e.g., RAX on x86, R7 on ARM). Example: __NR_read (syscall number for read()). Syscall Table: The kernel uses sys_call_table (array of function pointers) to find the handler. Example: sys_call_table[__NR_read] points to sys_read(). 4. Handler Execution in Process Context Generic Steps for All Syscalls: Argument Validation: Check pointers (e.g., buf in read()) using access_ok() Copy arguments from user space with copy_from_user() or get_user() Kernel Function Execution: Perform the requested operation (e.g., read from a file, send an ioctl command) File Operations (read/write): File Descriptor Resolution: Convert fd to a struct file using fdget(). Check file permissions (FMODE_READ/FMODE_WRITE). Driver Interaction: Call the read/write method from the file’s file_operations struct. Example: For /dev/mydevice, this invokes the driver’s .read function. I/O Control (ioctl): The ioctl syscall (sys_ioctl()) calls the driver’s .unlocked_ioctl method. !../3-Resource/Platform/IOCTL in Kernel Device Drivers#3. Integrate into file_operations 5. Return to User Space: Result is stored in eax/r0, and the kernel restores user registers Execute iret (x86) or exception return (ARM) to resume user-mode execution. 4. Device File Operations Character devices (e.g., /dev/char_dev) expose operations via file_operations:
...