Ahmet Can Gulmez

Posted on Oct 9

The Linux Programming Interface - Memory Mappings

#c #linux #lowcode #gcc

Let's continue our tour of Linux programming with the mmap() system call.

The mmap() system call creates a new memory mapping in the calling process's virtual address space. There are two types of mapping:

File mapping: A file mapping maps a region of a file directly into the calling process's virtual memory.
Anonymous mapping: An anonymous mapping doesn't have a corresponding file. Instead, the pages of the mapping are initialized to 0.
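To make the zero-initialization of anonymous mappings concrete, here is a minimal sketch. The function name anon_mapping_is_zeroed() is my own and purely illustrative; it only relies on mmap() and munmap().

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std=c99/c11 */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: returns 1 if a fresh anonymous mapping of 'len'
   bytes contains only zero bytes, 0 if a nonzero byte is found, and
   -1 if mmap() fails (for example, when 'len' is 0). */
int anon_mapping_is_zeroed(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            zeroed = 0;
            break;
        }
    }

    munmap(p, len);
    return zeroed;
}
```

Calling anon_mapping_is_zeroed(4096) should return 1, since the kernel hands out zero-filled pages for anonymous mappings.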

The memory in one process's mapping may be shared with mappings in other processes (MAP_SHARED). This can occur in two ways:

When two processes map the same region of a file, they share the same pages of physical memory.
A child process created by fork() inherits copies of its parent's mappings, and these mappings refer to the same pages of physical memory as the corresponding mappings in the parent.
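The fork() case can be sketched as follows. This is an illustration under my own assumptions, not code from the book: value_seen_by_parent() is a hypothetical name, and the mapping is a shared anonymous one so that no file is needed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std=c99/c11 */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child stores 42 into a shared anonymous
   mapping; returns the value the parent reads afterwards, or -1 on error. */
int value_seen_by_parent(void)
{
    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0) {                 /* Child: write through the inherited mapping */
        *shared = 42;
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        result = *shared;           /* Parent: same physical page, so it sees 42 */

    munmap(shared, sizeof(int));
    return result;
}
```

Because the mapping is MAP_SHARED, the parent reads 42 even though only the child wrote to the page.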

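A mapping can instead be private (MAP_PRIVATE). Modifications to a private mapping are not visible to other processes and, for a file mapping, are not carried through to the underlying file. The kernel implements this with copy-on-write.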
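For a private file mapping, this means that a write to the mapped memory never reaches the file. Here is a small sketch of that behavior. The function name and the temporary file under /tmp are my own assumptions for the demonstration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes mkstemp() and pread() under strict -std=c99/c11 */
#endif
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: writes "hello" to a temporary file, overwrites a
   MAP_PRIVATE mapping of it with "HELLO", then rereads the file.
   Returns 1 if the file still holds "hello", 0 if it changed, -1 on error. */
int private_write_leaves_file_unchanged(void)
{
    char path[] = "/tmp/mmap_private_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                   /* File disappears once 'fd' is closed */

    int result = -1;
    if (write(fd, "hello", 5) == 5) {
        char *p = mmap(NULL, 5, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            memcpy(p, "HELLO", 5);  /* Changes only this process's private copy */
            munmap(p, 5);

            char buf[5];
            if (pread(fd, buf, 5, 0) == 5)
                result = (memcmp(buf, "hello", 5) == 0);
        }
    }

    close(fd);
    return result;
}
```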

In total, we can use mmap() in four situations:

Private file mapping
Private anonymous mapping
Shared file mapping
Shared anonymous mapping

And don't forget that the exec() family of functions replaces the process's virtual memory, so existing memory mappings are lost.

If you look at the manual pages (for example, man 2 mmap), you will see these signatures:

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
int munmap(void *addr, size_t length);
int msync(void *addr, size_t length, int flags);
void *mremap(void *oldaddr, size_t oldsize, size_t newsize, int flags, ... /* void *newaddr */);

Here:

The addr argument indicates the virtual address at which the mapping is to be located. You should normally pass NULL so that the kernel chooses a suitable address.
The length argument specifies the size of the mapping in bytes.
The prot argument is a bit mask specifying the protection to be placed on the mapping. It is either PROT_NONE or a combination of PROT_READ, PROT_WRITE, and PROT_EXEC. For example, PROT_READ | PROT_WRITE means that the mapping can be read and written but not executed.
The flags argument is a bit mask of options controlling various aspects of the mapping. It must include either MAP_PRIVATE or MAP_SHARED.
The fd and offset arguments are used only for file mappings, and offset must be a multiple of the system page size. For an anonymous mapping they are ignored, but it is good practice to set fd to -1, because some implementations require it.

Below is a simple private file mapping program, which prints the contents of a file:

#include "../linux.h"

int main(int argc, char *argv[])
{
   char *addr;
   int fd;
   struct stat sb;

   if (argc != 2 || strcmp(argv[1], "--help") == 0)
      usage_error("Wrong command-line usage");

   fd = open(argv[1], O_RDONLY);
   if (fd == -1)
      syscall_error();

   if (fstat(fd, &sb) == -1)
      syscall_error();

   addr = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
   if (addr == MAP_FAILED)
      syscall_error();

   if (write(STDOUT_FILENO, addr, sb.st_size) == -1)
      syscall_error();

   exit(EXIT_SUCCESS);
}

As can be seen above, for a file mapping (private or shared), the following steps are applied:

Obtain a file descriptor for the file to be mapped, typically via a call to open().
Use the fstat() system call to get the file length in bytes.
Pass the descriptor and the length to the mmap() system call, with NULL as the start address of the mapping.
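The three steps above can be packaged into one helper. This is only a sketch: map_whole_file() is a name I made up, and it reports errors by returning NULL instead of using the helpers from my linux.h header.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes POSIX declarations under strict -std=c99/c11 */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps the whole of 'path' read-only and stores its
   size in '*len'. Returns the mapping, or NULL on error or if the file is
   empty (mmap() rejects a length of 0). */
const char *map_whole_file(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);              /* Step 1: file descriptor */
    if (fd == -1)
        return NULL;

    struct stat sb;
    if (fstat(fd, &sb) == -1 || sb.st_size == 0) {  /* Step 2: file length */
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, (size_t) sb.st_size, PROT_READ,
                      MAP_PRIVATE, fd, 0);      /* Step 3: create the mapping */
    close(fd);                      /* The mapping stays valid after close() */
    if (addr == MAP_FAILED)
        return NULL;

    *len = (size_t) sb.st_size;
    return addr;
}
```

The caller is expected to release the region with munmap((void *) p, len) when it is done with it.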

Below is another example, this time of a shared file mapping:

#include "../linux.h"

#define MEM_SIZE 10

int main(int argc, char *argv[])
{
   char *addr;
   int fd;

   if (argc < 2 || strcmp(argv[1], "--help") == 0)
      usage_error("Wrong command-line usage");

   fd = open(argv[1], O_RDWR);
   if (fd == -1)
      syscall_error();

   addr = mmap(NULL, MEM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
   if (addr == MAP_FAILED)
      syscall_error();

   if (close(fd) == -1)       /* No longer need 'fd' */
      syscall_error();

   printf("Current string=%.*s\n", MEM_SIZE, addr);

   if (argc > 2) {            /* Update contents of region */
      if (strlen(argv[2]) >= MEM_SIZE)
         usage_error("String too long");

      memset(addr, 0, MEM_SIZE); /* Zero out region */
      strncpy(addr, argv[2], MEM_SIZE - 1);
      if (msync(addr, MEM_SIZE, MS_SYNC) == -1)
         syscall_error();

      printf("Copied \"%s\" to shared memory\n", argv[2]);
   }

   exit(EXIT_SUCCESS);
}

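In this example, the program prints the first 10 bytes of the file through a shared mapping. If a second argument is given, it zeroes those 10 bytes and copies the new string into them. Because the mapping is MAP_SHARED, the change is carried through to the file. The msync() system call is optional here, but it is good practice when the data must reach the disk: with MS_SYNC, it blocks until the 10 bytes have been written.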
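One detail that the program above takes for granted is that the file is at least as long as the mapping; touching a mapped page that lies entirely beyond the end of the file raises SIGBUS. Here is a self-contained sketch that sizes a temporary file with ftruncate() first and then checks the result with pread(). The function name and the /tmp path are my own assumptions.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes mkstemp(), pread(), ftruncate() in strict modes */
#endif
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION 10

/* Hypothetical helper: stores 'text' (truncated to REGION - 1 bytes) in a
   temporary file through a MAP_SHARED mapping. Returns 1 if pread() then
   sees the text in the file, 0 if it does not, -1 on error. */
int shared_write_reaches_file(const char *text)
{
    char path[] = "/tmp/mmap_shared_XXXXXX";    /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                   /* File disappears once 'fd' is closed */

    int result = -1;
    if (ftruncate(fd, REGION) == 0) {   /* File must be able to back the mapping */
        char *p = mmap(NULL, REGION, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            memset(p, 0, REGION);
            strncpy(p, text, REGION - 1);

            /* MS_SYNC blocks until the region has been written to the file */
            if (msync(p, REGION, MS_SYNC) == 0) {
                char buf[REGION];
                if (pread(fd, buf, REGION, 0) == REGION)
                    result = (strncmp(buf, text, REGION - 1) == 0);
            }
            munmap(p, REGION);
        }
    }

    close(fd);
    return result;
}
```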

In addition to MAP_PRIVATE and MAP_SHARED, Linux allows a number of other values to be included in the mmap() flags argument:

MAP_ANONYMOUS
MAP_FIXED
MAP_LOCKED
MAP_HUGETLB
MAP_NORESERVE
MAP_POPULATE
MAP_UNINITIALIZED

Please read the manual page (man 2 mmap) to learn what each of these flags does.

While we are on the subject of memory mappings and virtual memory layout, I want to show you some related virtual memory operations. The main system calls are:

The mprotect() system call changes the protection on a region of virtual memory.
The mlock() and mlockall() system calls lock a region of virtual memory into physical memory.
The mincore() system call allows a process to determine whether the pages in a region of virtual memory are resident in physical memory.
The madvise() system call allows a process to advise the kernel about its future patterns of usage of a virtual memory region.
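As a small taste of madvise(), here is a sketch using MADV_DONTNEED, which is not covered above. On Linux, for a private anonymous mapping, this advice discards the pages, and the next access gets fresh zero-filled pages. The function name is hypothetical and the behavior shown is Linux-specific.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and madvise() in strict modes */
#endif
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: fills one private anonymous page with 0xAB, applies
   MADV_DONTNEED, and returns 1 if the page then reads back as zeros,
   0 if the old data survived, -1 on error. */
int dontneed_resets_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t) page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);

    int result = -1;
    if (madvise(p, len, MADV_DONTNEED) == 0)
        result = (p[0] == 0 && p[len - 1] == 0);

    munmap(p, len);
    return result;
}
```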

Signatures of these system calls:

int mprotect(void *addr, size_t len, int prot);
int mlock(const void *addr, size_t len);
int mlockall(int flags);
int munlock(const void *addr, size_t len);
int munlockall(void);
int mincore(void *addr, size_t length, unsigned char *vec);
int madvise(void *addr, size_t length, int advice);
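To show how the vec argument of mincore() is used, here is a minimal sketch: one byte per page, with the least significant bit set when the page is resident. The function name and the choice of four pages are my own assumptions.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and mincore() in strict modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 4

/* Hypothetical helper: maps NPAGES anonymous pages, writes to the first
   one, and asks mincore() about residency. Returns 1 if the first page is
   reported as resident, 0 if not, -1 on error. */
int touched_page_is_resident(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = NPAGES * (size_t) page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *(volatile char *) p = 1;       /* Touching the page faults it into memory */

    unsigned char vec[NPAGES];      /* One status byte per page in the range */
    int result = -1;
    if (mincore(p, len, vec) == 0)
        result = vec[0] & 1;        /* Bit 0 set means the page is resident */

    munmap(p, len);
    return result;
}
```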

Below is a simple program that changes the protection of memory mapped by the mmap() system call.

#include "../linux.h"

#define LEN          (1024 * 1024)
#define SHELL_FMT    "cat /proc/%ld/maps | grep zero"
#define CMD_SIZE     (sizeof(SHELL_FMT) + 20)

int main(int argc, char *argv[])
{
   char cmd[CMD_SIZE];
   char *addr;

   addr = mmap(NULL, LEN, PROT_NONE, MAP_SHARED | 
               MAP_ANONYMOUS, -1, 0);
   if (addr == MAP_FAILED)
      syscall_error();

   /* Display line from /proc/PID/maps corresponding to mapping */

   printf("Before mprotect()\n");
   snprintf(cmd, CMD_SIZE, SHELL_FMT, (long) getpid());
   system(cmd);

   /* Change protection on memory to allow read and write access */

   if (mprotect(addr, LEN, PROT_READ | PROT_WRITE) == -1)
      syscall_error();

   printf("After mprotect()\n");
   system(cmd);

   exit(EXIT_SUCCESS);
}

Here, I first create a shared anonymous mapping with no access permitted (PROT_NONE). mprotect() then changes the protection of the mapping to read and write. To see this change, we can inspect the /proc/PID/maps file.
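Another way to observe the protection, without reading /proc, is to let a child process try to write to the region and see whether it is killed by SIGSEGV. This is a sketch under my own assumptions; write_faults_after_mprotect() is a hypothetical name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std=c99/c11 */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: maps one page with PROT_NONE, applies 'prot' with
   mprotect(), then lets a child write to the page. Returns 1 if the child
   was killed by SIGSEGV, 0 if the write succeeded, -1 on error. */
int write_faults_after_mprotect(int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t) page;

    char *p = mmap(NULL, len, PROT_NONE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int result = -1;
    if (mprotect(p, len, prot) == 0) {
        pid_t pid = fork();
        if (pid == 0) {
            signal(SIGSEGV, SIG_DFL);       /* Make sure the default action applies */
            *(volatile char *) p = 'x';     /* Faults unless PROT_WRITE is set */
            _exit(0);
        }

        int status;
        if (pid > 0 && waitpid(pid, &status, 0) == pid) {
            if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
                result = 1;
            else if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
                result = 0;
        }
    }

    munmap(p, len);
    return result;
}
```

With PROT_NONE or PROT_READ the child should die with SIGSEGV, while with PROT_READ | PROT_WRITE the write goes through.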

Top comments (1)

Paul J. Lucas • Oct 9

mmap has nothing to do with Linux. mmap is part of POSIX and is implemented in all POSIX-compliant operating systems, e.g., BSD and macOS.