1. System Call Basics

System calls (syscalls) are the interface for user-space programs to request services from the kernel. Examples include:

  • File I/O: read(), write(), open(), close().
  • Device Control: ioctl().
  • Signal Handling: kill(), signal().

2. System Call Table and Registration

Syscall Table:

  • A table (sys_call_table) maps syscall numbers to handler functions.
  • Architecture-Specific:
    • x86-64: Defined in arch/x86/entry/syscalls/syscall_64.tbl.
    • ARM: Defined in arch/arm/tools/syscall.tbl.
  • Registration:
    • Syscalls are registered at compile time using macros like SYSCALL_DEFINE (e.g., SYSCALL_DEFINE3(write, ...) for write()).
    • For custom syscalls (rare and discouraged), you would:
      1. Add an entry to the syscall table.
      2. Define the handler using SYSCALL_DEFINE.
      3. Recompile the kernel. Loadable modules have no supported way to add syscalls, because sys_call_table is not exported and is mapped read-only.

3. Flow of System Calls

1. User-Space Invocation

  • The libc wrapper (e.g., read(), ioctl()) places the syscall number and arguments in registers and executes a trap instruction: syscall on x86-64, svc on ARM, or the legacy software interrupt int 0x80 on 32-bit x86.
// User-space code
fd = open("/dev/mydevice", O_RDWR);  // Syscall 1: open()
read(fd, buf, 100);                  // Syscall 2: read()
ioctl(fd, MY_CMD, arg);              // Syscall 3: ioctl()
close(fd);                           // Syscall 4: close()
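
The wrappers above are thin layers over numbered kernel entry points. As a minimal sketch, assuming Linux with glibc, the same requests can be issued through the generic syscall() function and the SYS_* constants from <sys/syscall.h>. The helper names raw_getpid() and raw_write() are hypothetical and exist only for this illustration.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our PID by syscall number, bypassing the getpid() wrapper. */
static long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Write len bytes from buf to fd by syscall number, bypassing write(). */
static long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_getpid() should give the same value as getpid(), since both end up in the same kernel handler.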

2. Transition to Kernel Mode

  • The CPU switches to kernel mode (ring 0 on x86, EL1 on ARM).
  • The user-space register state (e.g., RIP, RSP, RFLAGS on x86-64) is saved so it can be restored on return.
  • Execution jumps to the kernel’s syscall entry point (e.g., entry_SYSCALL_64 on x86-64).

3. Syscall Dispatching

  • Syscall Number:
    • The syscall number is stored in a register (e.g., RAX on x86-64, r7 on 32-bit ARM, x8 on ARM64).
    • Example: __NR_read (syscall number for read()).
  • Syscall Table:
    • The kernel uses sys_call_table (array of function pointers) to find the handler.
    • Example: sys_call_table[__NR_read] points to sys_read() (named __x64_sys_read() on recent x86-64 kernels).
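
The dispatch step can be modeled in user space as a bounds-checked array of function pointers. This is a simplified sketch, not kernel code: the DEMO_NR_* numbers, the two-argument handler signature, and every demo_* name are assumptions made for the illustration. The one detail taken from the real kernel is that an unknown syscall number yields -ENOSYS.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical syscall numbers for this model only (not real __NR_* values). */
enum { DEMO_NR_GETANSWER = 0, DEMO_NR_ADD = 1, DEMO_NR_MAX = 2 };

/* Simplified handler type; real handlers take up to six arguments. */
typedef long (*demo_syscall_fn)(long a, long b);

static long demo_getanswer(long a, long b) { (void)a; (void)b; return 42; }
static long demo_add(long a, long b) { return a + b; }

/* The "syscall table": the index is the syscall number. */
static const demo_syscall_fn demo_call_table[DEMO_NR_MAX] = {
    [DEMO_NR_GETANSWER] = demo_getanswer,
    [DEMO_NR_ADD]       = demo_add,
};

/* Look up the handler for nr and call it; reject numbers outside the table. */
static long demo_dispatch(unsigned long nr, long a, long b)
{
    if (nr >= DEMO_NR_MAX || demo_call_table[nr] == NULL)
        return -ENOSYS;
    return demo_call_table[nr](a, b);
}
```

For example, demo_dispatch(DEMO_NR_ADD, 2, 3) calls demo_add() through the table, while demo_dispatch(99, 0, 0) is rejected before any handler runs.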

4. Handler Execution in Process Context

Generic Steps for All Syscalls:

  1. Argument Validation:
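    • Check that user pointers (e.g., buf in read()) lie in the user address range using access_ok().
    • Copy arguments from user space with copy_from_user() or get_user(), which perform this check themselves and fail if the memory is not accessible.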
  2. Kernel Function Execution:
    • Perform the requested operation (e.g., read from a file, send an ioctl command)

File Operations (read/write):

  • File Descriptor Resolution:
    • Convert fd to a struct file using fdget().
    • Check file permissions (FMODE_READ/FMODE_WRITE).
  • Driver Interaction:
    • Call the read/write method from the file’s file_operations struct.
    • Example: For /dev/mydevice, this invokes the driver’s .read function.

I/O Control (ioctl):

  • The ioctl syscall (sys_ioctl()) calls the driver’s .unlocked_ioctl method.

5. Return to User Space

  • The result is stored in RAX (x86-64) or r0 (ARM), and the kernel restores the saved user registers.
  • The kernel executes sysret or iret (x86-64) or an exception return (ARM) to resume user-mode execution.
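
On Linux the value in the return register encodes both success and failure: a handler reports an error as a negative errno value, and the libc wrapper turns that into a -1 return with errno set. The sketch below models that conversion and then observes it with a real call. The names demo_wrap_result() and errno_from_bad_close() are hypothetical, and the model is an approximation of what a libc wrapper does, not glibc's actual code.

```c
#include <errno.h>
#include <unistd.h>

/* Model of a libc wrapper: raw values in [-4095, -1] are error codes. */
static long demo_wrap_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}

/* Close an invalid descriptor and return the errno that libc reports. */
static int errno_from_bad_close(void)
{
    errno = 0;
    if (close(-1) == -1)
        return errno;
    return 0;
}
```

Closing descriptor -1 fails in the kernel with -EBADF, which user code sees as a -1 return and errno == EBADF.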

4. Device File Operations

Character devices (e.g., /dev/char_dev) expose operations via file_operations:

struct file_operations {
    ssize_t (*read)(struct file *, char __user *, size_t, loff_t *);
    ssize_t (*write)(struct file *, const char __user *, size_t, loff_t *);
    long (*unlocked_ioctl)(struct file *, unsigned int, unsigned long);
    // ...
};
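
How the generic layer reaches a driver through this struct can be modeled in user space. The sketch below is an illustration, not kernel code: every demo_* name is hypothetical, the ops table is cut down to a single read method, and memcpy() stands in for copy_to_user(). The error codes mirror what vfs_read() returns when the file is not open for reading (-EBADF) or has no read method (-EINVAL).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct demo_file;

/* Cut-down, hypothetical stand-in for struct file_operations. */
struct demo_file_operations {
    ssize_t (*read)(struct demo_file *f, char *buf, size_t len);
};

#define DEMO_FMODE_READ 0x1u

/* Stand-in for struct file: open mode, ops table, and fake device data. */
struct demo_file {
    unsigned int mode;
    const struct demo_file_operations *ops;
    const char *data;
    size_t size;
    size_t pos;
};

/* "Driver" read method: copy up to len bytes from the device data. */
static ssize_t demo_dev_read(struct demo_file *f, char *buf, size_t len)
{
    size_t left = f->size - f->pos;

    if (len > left)
        len = left;
    memcpy(buf, f->data + f->pos, len);   /* stands in for copy_to_user() */
    f->pos += len;
    return (ssize_t)len;
}

static const struct demo_file_operations demo_dev_fops = {
    .read = demo_dev_read,
};

/* Generic layer: check the open mode, then call through the ops table. */
static ssize_t demo_vfs_read(struct demo_file *f, char *buf, size_t len)
{
    if (!(f->mode & DEMO_FMODE_READ))
        return -EBADF;
    if (f->ops == NULL || f->ops->read == NULL)
        return -EINVAL;
    return f->ops->read(f, buf, len);
}
```

The generic function never names the driver. It only follows the function pointer stored with the open file, which is how one read() syscall can serve every device.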

Examples:

  • Character Device Management in Kernel Drivers
  • IOCTL in Kernel Device Drivers

5. Signal Handling (Ctrl+C)

  • Ctrl+C sends SIGINT to the foreground process group of the terminal.
  • Kernel Flow:
    1. The tty layer (the line discipline, e.g., drivers/tty/n_tty.c) recognizes the interrupt character typed on the terminal.
    2. The kernel’s signal-handling code (kernel/signal.c) delivers SIGINT to the process.
    3. The process’s signal handler (if registered via signal() or sigaction()) is invoked.
  • Key APIs:
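    • send_signal(): Kernel function to queue a signal (renamed send_signal_locked() in recent kernels).
    • do_signal(): Architecture-specific function that handles signal delivery during return to user space (the name varies by architecture and kernel version).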

6. Example: Tracing the read() Syscall

  1. User-Space:
read(fd, buf, 100); // Invokes syscall via libc
  2. Kernel-Space:
    • sys_read() resolves fd to a struct file.
    • Calls vfs_read(), which invokes the driver’s .read method.
    • Driver copies data from device to kernel buffer, then to user space using copy_to_user().
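
The whole path can be exercised from user space with a device whose behavior is known. The sketch below assumes a Linux system where /dev/zero exists. It pre-fills the buffer with a nonzero pattern so that any zero bytes seen afterwards must have been written by the kernel. The helper name read_dev_zero() is hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fill buf with a nonzero pattern, read len bytes from /dev/zero, and
 * return the byte count reported by read() (or -1 if open() failed).
 */
static ssize_t read_dev_zero(unsigned char *buf, size_t len)
{
    ssize_t n;
    int fd = open("/dev/zero", O_RDONLY);   /* fd now refers to a struct file */

    if (fd < 0)
        return -1;
    memset(buf, 0xAA, len);                 /* pattern the kernel will overwrite */
    n = read(fd, buf, len);                 /* kernel copies zeros into buf */
    close(fd);
    return n;
}
```

With a 100-byte buffer, the call is expected to return 100 and leave every byte set to zero.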

7. Key Kernel APIs and Modules

APIs for Syscalls:

  • SYSCALL_DEFINE{0-6}: Define syscall handlers (e.g., SYSCALL_DEFINE3(read, ...)).
  • copy_to_user()/copy_from_user(): Safely copy data between kernel and user space.
  • get_user()/put_user(): Access single values in user memory.

APIs for Device Drivers:

  • register_chrdev(): Register a character device.
  • unregister_chrdev(): Unregister a device.
  • class_create()/device_create(): Create device nodes in /dev.

APIs for Signals:

  • kill_pid(): Send a signal to a process
  • sigaction(): User-space API to register signal handlers

8. Critical Concepts to Know

  1. Syscall Table: Architecture-specific table mapping syscall numbers to handlers.
  2. Process Context vs. Interrupt Context:
    • Syscalls run in process context (can sleep).
    • Hardware interrupts run in interrupt context (atomic).
  3. Device File Operations: file_operations struct ties syscalls to driver functions.
  4. User/Kernel Boundary: Use copy_to_user/copy_from_user to safely exchange data.
  5. Signals: Queued via send_signal() and delivered when the process returns to user mode (after a syscall or an interrupt).

References

  • Linux Kernel Documentation:
    • Syscalls: Documentation/process/adding-syscalls.rst.
    • Device Drivers: Documentation/driver-api/.
    • Signals: kernel/signal.c (implementation) and the signal(7) manual page.
  • Books:
    • Linux Device Drivers (O’Reilly).
    • Understanding the Linux Kernel (O’Reilly).