Different kinds of files

January 15, 2009

Linux has many different kinds of files. Before getting to them, let’s start with a more basic question: what is a file?

Basically, a file is anything that can have a file descriptor associated with it.

What is a file descriptor?

Ahh. Glad you asked. Sit down, this could take a bit.

Linux is largely POSIX compliant, which means its API (Application Programming Interface) follows a set of standards originally developed for Unix. POSIX defines a set of system calls (a system call is basically a way of requesting a service from the kernel) and library calls. Anything that is POSIX compliant is going to have the same basic core API, although the standards are vague enough that there is a little wiggle room here and there.

There are some important system calls when it comes to file manipulation. Four of them are:

  • open
  • read
  • write
  • close

These system calls make up the foundations of file manipulation, although there are other calls that are just as important to do things like erasing or moving a file.

A file descriptor is a number returned from an open() system call. That’s all it is: a small non-negative integer. Once a file is opened, however, that descriptor is how you tell the kernel which open file you are trying to operate on. The descriptor is passed into any other system call that refers to that file, such as read(), write(), and close().
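To make that concrete, here is a minimal sketch in Python, whose os module exposes thin wrappers around these four system calls. The helper name roundtrip and the file name demo.txt are made up for illustration.

```python
import os
import tempfile

def roundtrip(path, payload):
    """Write payload to path, then read it back, using raw descriptors."""
    # os.open wraps open(2) and returns the descriptor: a plain integer.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)   # write(2): the descriptor says which file
    finally:
        os.close(fd)            # close(2): the kernel releases the descriptor

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))   # read(2): at most len(payload) bytes
    finally:
        os.close(fd)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(roundtrip(os.path.join(tmp, "demo.txt"), b"hello, descriptor\n"))
```

Notice that nothing but the integer travels between the calls. The kernel keeps all the real bookkeeping on its side.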

So, basically, a file is anything you can open using open().

You will find that nearly everything in Linux is a file, including network connections. These don’t appear on the filesystem, and they are created with socket() rather than open(), but you still get a file descriptor back and interact with it in nearly the same way as you do a regular file.
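Here is a small sketch of that idea in Python. It uses a connected pair of local sockets instead of a real network connection so that it runs anywhere, and the helper name echo_through_socket is made up for illustration. The point is that the ordinary write and read calls work on a socket’s descriptor.

```python
import os
import socket

def echo_through_socket(payload):
    """Send payload across a connected socket pair with plain write/read calls."""
    left, right = socket.socketpair()   # two connected sockets, no filesystem entry
    try:
        # fileno() exposes the same kind of integer that open() returns.
        os.write(left.fileno(), payload)
        return os.read(right.fileno(), len(payload))
    finally:
        left.close()
        right.close()

if __name__ == "__main__":
    print(echo_through_socket(b"ping"))
```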

There are several different types of files.

  • Regular file
  • Directory
  • Symbolic link
  • Named pipe
  • Block special file
  • Character special file
  • Socket

All of these file types ARE files, but they show up differently when listing a filesystem (the first character of the permissions shows you what kind of file it is) and more importantly behave differently when you try to operate on them.
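If you want to see that first character without squinting at ls output, here is a rough sketch in Python. The function names type_char and sample_types are made up for illustration; stat.filemode builds the same permission string that ls -l prints.

```python
import os
import stat
import tempfile

def type_char(path):
    """Return the first character ls -l would show for path."""
    # lstat, not stat, so a symbolic link is reported as a link
    # instead of being followed to its target.
    return stat.filemode(os.lstat(path).st_mode)[0]

def sample_types(tmp):
    """Create one file of several kinds under tmp and report each type character."""
    regular = os.path.join(tmp, "regular")
    directory = os.path.join(tmp, "directory")
    link = os.path.join(tmp, "link")
    fifo = os.path.join(tmp, "fifo")
    with open(regular, "w"):
        pass
    os.mkdir(directory)
    os.symlink(regular, link)
    os.mkfifo(fifo)
    return {os.path.basename(p): type_char(p)
            for p in (regular, directory, link, fifo)}

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, char in sample_types(tmp).items():
            print(char, name)
```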

A regular file (indicated by a “-”) is just that, a regular file. You can write to it, read from it, erase it, or whatever.

A directory (indicated by a “d”) is basically a file that contains a list of other files. It’s still a file, however.

A symbolic link (indicated by an “l”) is a file that points to another file, in such a way that the libraries and OS know how to follow it.

A named pipe (indicated by a “p”) is basically a FIFO (first in, first out) that is exposed on the filesystem. These are used in interprocess communication: one process can have it open for reading, for example, while another has it open for writing. A socket (indicated by an “s”) is similar to a named pipe.
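Here is a minimal sketch of a named pipe in action, written in Python. To keep it in one script it uses a writer thread rather than a second process, which is a simplification on my part; the helper name send_through_fifo and the file name demo.fifo are made up for illustration.

```python
import os
import tempfile
import threading

def send_through_fifo(payload):
    """Pass payload from a writer thread to a reader through a named pipe."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fifo")
        os.mkfifo(path)   # the FIFO now exists on the filesystem

        def writer():
            # Opening for writing blocks until a reader opens the other end.
            with open(path, "wb") as out:
                out.write(payload)

        thread = threading.Thread(target=writer)
        thread.start()
        with open(path, "rb") as inp:   # blocks until the writer has opened
            data = inp.read()           # returns once the writer closes its end
        thread.join()
        return data

if __name__ == "__main__":
    print(send_through_fifo(b"first in, first out"))
```

Each side just opens a path, exactly as it would for a regular file. The kernel handles the hand-off, and nothing is ever stored on disk.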

Character special files (indicated by a “c”) and block special files (indicated by a “b”) are both ways to interface with kernel devices. For example, /dev/null is a character special file. When you write into it, the kernel takes the bytes and dumps them into the bit bucket. Other drivers, such as the one behind /dev/tty, do different things. When you run ls -l on one of these, you’ll see a device major and device minor number where the file size would normally be. These numbers are the kernel’s way of keeping track of which driver and device the file refers to. You could rename /dev/null to /dev/Bush if you wanted to, and as long as it had the same major and minor numbers it would behave identically. The kernel doesn’t care what it’s called, only what it is.
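You can pull the major and minor numbers out yourself. The sketch below is in Python, and the function name device_numbers is made up for illustration. On a typical Linux system /dev/null should report major 1 and minor 3, but check your own machine rather than taking my word for it.

```python
import os
import stat

def device_numbers(path):
    """Return (kind, major, minor) for a device special file."""
    info = os.stat(path)
    if stat.S_ISCHR(info.st_mode):
        kind = "character"
    elif stat.S_ISBLK(info.st_mode):
        kind = "block"
    else:
        raise ValueError(f"{path} is not a device special file")
    # st_rdev packs both numbers; os.major and os.minor unpack them.
    return kind, os.major(info.st_rdev), os.minor(info.st_rdev)

if __name__ == "__main__":
    print(device_numbers("/dev/null"))
```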

Now that you understand what the different types of files are, how about a little tip on how to use them?

You probably already know about “ls”, so I’m not going to go into it. But did you know about lsof? lsof will show you all of the open files on your system, including network connections. (Remember I told you that network connections were files too? Here’s proof.)
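For a rough feel of the kind of information lsof reports, here is a sketch in Python that lists the descriptors open in the current process only. It assumes a Linux system with /proc mounted and is in no way a replacement for lsof. The function name open_files is made up for illustration.

```python
import os
import socket

def open_files():
    """Map each descriptor open in this process to what it refers to."""
    result = {}
    for name in os.listdir("/proc/self/fd"):
        try:
            result[int(name)] = os.readlink(os.path.join("/proc/self/fd", name))
        except OSError:
            # The descriptor used to scan the directory is already closed.
            continue
    return result

if __name__ == "__main__":
    left, right = socket.socketpair()
    for fd, target in sorted(open_files().items()):
        print(fd, target)
    left.close()
    right.close()
```

The two sockets show up in the listing right alongside the ordinary descriptors, with a target that starts with socket: instead of a path.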

Another useful little command is mknod. This is how you create character special and block special files (though don’t do it directly if you can avoid it; use MAKEDEV instead). This is useful to know if you somehow end up with /dev/null as a regular file. (It happens.)

And don’t forget about the simple but tried-and-true command, ln. This creates symbolic links if given the -s option, and hard links if not (but I’m not going to go into what those are right now).
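If you ever need to do the same thing from a program, here is a minimal sketch in Python of the two calls that correspond to ln and ln -s. The function name make_links and the file names are made up for illustration.

```python
import os
import tempfile

def make_links(tmp):
    """Create one hard link and one symbolic link to the same file."""
    target = os.path.join(tmp, "target")
    hard = os.path.join(tmp, "hard")
    soft = os.path.join(tmp, "soft")
    with open(target, "w") as f:
        f.write("data\n")
    os.link(target, hard)      # like: ln target hard
    os.symlink(target, soft)   # like: ln -s target soft
    return target, hard, soft

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target, hard, soft = make_links(tmp)
        print("soft is a symbolic link:", os.path.islink(soft))
        print("hard is a symbolic link:", os.path.islink(hard))
```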

Unexpectedly complex, huh? You’ll find every aspect of the Linux OS to be like that – a beguiling simplicity overlaying a fiendishly complex nest of interrelated subsystems.

It’s worth it to know all of these things, though. You never know when that kind of knowledge will come in handy.
