Today I’ll be covering a critically important part of Linux that gets no airtime at all, and that’s the shared memory subsystem. The closest you’ll come to it in most cases is seeing a /dev/shm filesystem when you type df. That is a recent development, however.
The shared memory subsystem came about because of a quirk in Linux process management that makes interprocess communication (IPC) a little difficult: unless one process is directly spawned from another, two processes cannot begin communicating with each other by accessing memory or variables in a program. Sure, connections can be made via named sockets or other kinds of sockets, but that requires a communications protocol, and many times it is more trouble than it’s worth.
So, the geniuses who came up with UNIX (and by extension Linux) came up with the idea of shared memory. A program can request a chunk of shared memory, and then access it in much the same way it would normal memory. A process can also attach to an already existing chunk of shared memory created by another process. This makes interprocess communication much simpler, and any program can then communicate with any other program, as long as both know the shared memory ID they’re working with.
Usually this subsystem just hums along in the background, and most sysadmins never need to touch it. However, when it goes south, it goes south in a big way. I’ll show you how to examine it, and how to fix it.
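To make the request-and-attach idea concrete, here is a minimal sketch in Python that calls the C library's shmget, shmat, shmdt and shmctl through ctypes. The constant values are the Linux ones and are an assumption on any other platform. The helper names (attach, demo) are made up for illustration, and the two attachments inside one process simply stand in for two separate programs that both know the segment ID.

```python
import ctypes
import os

# Linux values from <sys/ipc.h>; treat them as assumptions elsewhere.
IPC_PRIVATE = 0      # let the kernel pick a fresh segment
IPC_CREAT = 0o1000   # create the segment if it does not exist
IPC_RMID = 0         # shmctl command: mark the segment for removal

# CDLL(None) exposes the symbols of the running process, including libc.
libc = ctypes.CDLL(None, use_errno=True)
libc.shmget.argtypes = [ctypes.c_int, ctypes.c_size_t, ctypes.c_int]
libc.shmget.restype = ctypes.c_int
libc.shmat.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_int]
libc.shmat.restype = ctypes.c_void_p
libc.shmdt.argtypes = [ctypes.c_void_p]
libc.shmdt.restype = ctypes.c_int
libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
libc.shmctl.restype = ctypes.c_int


def _fail(what):
    err = ctypes.get_errno()
    raise OSError(err, f"{what}: {os.strerror(err)}")


def attach(shmid):
    """Map the segment into this process and return its address."""
    addr = libc.shmat(shmid, None, 0)
    # shmat signals failure by returning (void *) -1.
    if addr is None or addr == ctypes.c_void_p(-1).value:
        _fail("shmat")
    return addr


def demo(message=b"hello"):
    """Write through one attachment and read through another."""
    shmid = libc.shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0o600)
    if shmid == -1:
        _fail("shmget")
    try:
        writer = attach(shmid)   # stands in for the creating program
        reader = attach(shmid)   # stands in for a second program
        ctypes.memmove(writer, message, len(message))
        received = ctypes.string_at(reader, len(message))
        libc.shmdt(writer)
        libc.shmdt(reader)
    finally:
        # Without this the segment would outlive the process.
        libc.shmctl(shmid, IPC_RMID, None)
    return received
```

Calling demo(b"ping") should hand back the same bytes through the second attachment. The shmctl call in the finally block is the part that matters for the rest of this article: a program that dies before reaching it leaves its segment behind.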
First, examine. You do this using the “ipcs” command.
[root@katesama]# ipcs -a

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 5963776    apache     600        1
0x00000000 5931009    apache     600        1
0x00000000 5996546    apache     600        1
0x00000000 6029315    apache     600        1
0x00000000 6062084    apache     600        1
0x00000000 6094853    apache     600        1
0x00000000 6127622    apache     600        1
0x00000000 6160391    apache     600        1

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
ipcs lists the shared memory segments, semaphore arrays and message queues that currently exist. As you can see, in this case, there are no shared memory segments in use, although there are some semaphores. Semaphores are a lot like shared memory, except that a semaphore is basically a signalling system: it holds a small counter that processes use to coordinate with each other, rather than carrying any really useful information.
Problems will occur when the shared memory fills up. You can find out how much shared memory is available by looking in the files /proc/sys/kernel/shmall and /proc/sys/kernel/shmmax: shmall determines how big ALL of the shared memory can be (counted in pages), and shmmax determines how big one chunk can be (in bytes). The tricky thing about this is that the failing call reports ENOSPC, which is the same error you’ll get when a disk is full: “No space left on device”. But if you do a df, you’ll see that the disks aren’t full.
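If you want to keep an eye on this from a script, the ipcs output is easy to pick apart. The following is a rough sketch, not a robust parser: parse_ipcs is a name I made up, and it assumes the layout shown above (a dashed section title, then a header row, then one row per object). The sample text is a shortened copy of the output above.

```python
SAMPLE = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 5963776    apache     600        1
0x00000000 5931009    apache     600        1

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
"""


def parse_ipcs(text):
    """Return {section title: [row dict, ...]} for ipcs-style output."""
    sections = {}
    rows = None      # list for the section currently being read
    columns = None   # header names of that section
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("------"):
            # A dashed line opens a new section; keep only its title.
            rows = sections.setdefault(line.strip("- "), [])
            columns = None
        elif rows is not None and columns is None:
            # The first line after a title names the columns.
            columns = line.split()
        elif rows is not None:
            # zip() tolerates rows whose trailing columns are empty.
            rows.append(dict(zip(columns, line.split())))
    return sections


if __name__ == "__main__":
    for title, found in parse_ipcs(SAMPLE).items():
        print(f"{title}: {len(found)}")
```

On a live machine you would feed it the text of ipcs -a instead of SAMPLE. A count of semaphore arrays or segments that keeps climbing for one owner is the early warning sign.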
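The two files use different units, which is easy to trip over: shmmax is in bytes, shmall is in pages. Here is a small sketch that reads them and converts both to bytes. The function name and the dictionary keys are my own, and I have added /proc/sys/kernel/shmmni (the maximum number of segments) since running out of segment IDs produces the same error.

```python
import os


def read_shm_limits(proc_dir="/proc/sys/kernel", page_size=None):
    """Read the System V shared memory limits, normalised to bytes."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")

    def read_int(name):
        with open(os.path.join(proc_dir, name)) as handle:
            return int(handle.read().split()[0])

    return {
        # shmmax: largest single segment, already in bytes.
        "max_segment_bytes": read_int("shmmax"),
        # shmall: system-wide total, counted in pages.
        "total_bytes": read_int("shmall") * page_size,
        # shmmni: how many segments may exist at once.
        "max_segments": read_int("shmmni"),
    }
```

Calling read_shm_limits() with no arguments reads the live values. The proc_dir and page_size parameters exist only so the function can be pointed at a copy of the files from another machine.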
This is an indicator that all of the shared memory has been used up. An strace of the recalcitrant process will confirm this.
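What you are looking for in the strace output is an IPC call, rather than a write, failing with ENOSPC. A quick sketch of that filter follows. It assumes strace's usual "call(args) = -1 ERRNO (text)" line format. The function name is made up and the sample lines are invented for illustration, not captured from a real run.

```python
import re

# Matches System V "get" calls that failed with ENOSPC.
IPC_ENOSPC = re.compile(r"\b(shmget|semget|msgget)\(.*\)\s*=\s*-1\s+ENOSPC\b")

# Hypothetical strace lines, written by hand for this example.
SAMPLE_TRACE = """\
openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
semget(IPC_PRIVATE, 1, IPC_CREAT|0600) = -1 ENOSPC (No space left on device)
write(1, "x", 1) = -1 ENOSPC (No space left on device)
"""


def ipc_enospc_calls(trace_text):
    """Return the names of IPC calls that hit ENOSPC, in order."""
    found = []
    for line in trace_text.splitlines():
        match = IPC_ENOSPC.search(line)
        if match:
            found.append(match.group(1))
    return found
```

A write failing with ENOSPC really is a full disk, so the filter skips it. A shmget, semget or msgget failing the same way points at the IPC limits instead.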
This does happen in the real world, primarily when httpd is repeatedly killed uncleanly and never gets the chance to clean up after itself.
How do you fix this? You do this using the “ipcrm” command.
This command allows you to delete shared memory segments that are not being used anymore (ipcrm -m followed by the shmid). The only caveat is you have to be sure that the shared memory segment truly is not in use, or you will mess something up royal, and you probably won’t have any idea what until things start crashing. How can you tell? Well, you can’t be certain. The nattch column in the ipcs output shows how many processes are currently attached to a segment, so a segment sitting at zero is a good candidate, but nothing stops a process from attaching to it later. This is where you kind of have to take an educated guess. lsof MAY help you in this case, but I’m not sure if it will tell you what you need to know. Just keep your wits about you. The worst that will likely happen is you’ll have to reboot. (I’m sure if you catch it at just the right time files could get corrupted, but that’s pretty unlikely, so I wouldn’t worry about it.)
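One way to make that educated guess a bit more systematic is to list only the segments that belong to the suspect user and have nobody attached, and to print the ipcrm commands instead of running them. This is a sketch under those assumptions: removal_candidates is a made-up name, the sample rows are invented, and it expects the column order that ipcs -m prints (key, shmid, owner, perms, bytes, nattch).

```python
# Hypothetical "ipcs -m" output, invented for this example.
SAMPLE_SEGMENTS = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 65536      apache     600        393216     0
0x00000000 98305      apache     600        393216     2          dest
0x00000000 131074     root       644        80         0
"""


def removal_candidates(ipcs_m_output, owner):
    """Build ipcrm commands for the owner's segments with nattch == 0."""
    commands = []
    for line in ipcs_m_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key and have at least six columns.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        shmid, seg_owner, nattch = fields[1], fields[2], fields[5]
        if seg_owner == owner and int(nattch) == 0:
            commands.append(["ipcrm", "-m", shmid])
    return commands


if __name__ == "__main__":
    for command in removal_candidates(SAMPLE_SEGMENTS, "apache"):
        print(" ".join(command))
```

The commands are only printed so that a human gets to look at the list first. A zero in nattch is a hint, not proof, so read the list before pasting any of it into a root shell.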
Anyway, that’s the wide world of shared memory and IPC. Yet another supremely important and completely ignored aspect of the Linux operating system. I hope you’ve found this informative.