By Michael Reber

Inodes, hardlinks and symlinks

Linux

How exactly a Linux system stores files and directories, and how they are referenced, is not always clear.

Before we start, let's briefly look at the basics of inodes. An inode is a data structure identified by an integer, the inode number, which is unique within a file system. It holds the file's metadata: access rights (read, write, execute), owner, group, file type, file size, SELinux context and the number of «links» (directory entries) that point to the inode. The content of the file itself and the file name are stored separately and are not contained in the inode.

Displaying inodes

ls -i shows the inode number of a file. stat provides the complete metadata:

[root@rlwebp01 ~]# ls -i swissmakers-apache-dos.conf
537286221 swissmakers-apache-dos.conf

[root@rlwebp01 ~]# stat swissmakers-apache-dos.conf
  File: swissmakers-apache-dos.conf
  Size: 520 Blocks: 8 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 537286221 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2025-08-04 10:39:30.552986781 +0200
Modify: 2025-06-30 01:07:07.825337403 +0200
Change: 2025-06-30 01:07:07.825337403 +0200
 Birth: 2025-06-30 01:07:07.825337403 +0200

The file name that is displayed comes from the directory entry, not from the inode. A directory is essentially a table that maps file names to inode numbers. This separation is the prerequisite for hard links and symlinks to work at all.
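The separation of name and inode can be observed directly. The following is a minimal sketch, assuming GNU coreutils (stat -c) and a scratch directory from mktemp; the file names are made up for illustration:

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
echo "demo" > "$workdir/report.txt"

# %i prints the inode number (GNU stat).
inode_before=$(stat -c %i "$workdir/report.txt")

# Renaming only rewrites the directory entry; the inode stays the same.
mv "$workdir/report.txt" "$workdir/renamed.txt"
inode_after=$(stat -c %i "$workdir/renamed.txt")

echo "before: $inode_before  after: $inode_after"
rm -rf "$workdir"
```

Both numbers are identical, because the rename changed the directory table and left the inode alone.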

When inodes run out

When an ext4 file system is formatted, a fixed number of inodes is reserved. The default is one inode per 16 KiB of storage space. On larger systems with many small files (e.g. mail servers, global session caches, Elasticsearch clusters), this ratio is not always sufficient.

The system then reports «No space left on device», although df -h shows plenty of free space. In such cases, the next thing to check is df -i:

[root@rlwebp01 ~]# df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
devtmpfs 2008078 474 2007604 1% /dev
tmpfs 2013666 3 2013663 1% /dev/shm
tmpfs 819200 1000 818200 1% /run
/dev/mapper/rl-root 153659392 648597 153010795 1% /
/dev/sda1 524288 374 523914 1% /boot

At an IUse% of 100, the problem can only be solved by deleting files that are no longer needed or by recreating the file system with a higher inode density (mkfs.ext4 -i ...).
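Before deleting anything, it helps to know which directory holds the many files. The following is a rough sketch, not a finished tool: the helper name count_entries_per_dir and the demo directories are hypothetical, and it counts directory entries with find as an approximation of inode usage (a file with several hard links is counted once per name).

```shell
# Hypothetical helper: prints "<count> <dir>" for each direct subdirectory
# of $1, with the biggest consumer first.
count_entries_per_dir() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        # -xdev keeps find on the file system that df -i reported on.
        count=$(find "$dir" -xdev | wc -l | tr -d ' ')
        echo "$count ${dir%/}"
    done | sort -rn
}

# Small demonstration in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/sessions" "$demo/logs"
touch "$demo/sessions/a" "$demo/sessions/b" "$demo/sessions/c" "$demo/logs/x"

# Each count includes the subdirectory itself.
top=$(count_entries_per_dir "$demo" | head -n 1)
echo "$top"
rm -rf "$demo"
```

On a real system the helper would be pointed at the mount point shown by df -i, for example count_entries_per_dir /var.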

Hardlinks

A hard link is an additional directory entry that points to the same inode as an existing entry. There is no original and no copy; both names are equivalent. The inode only knows its link count, not the names that refer to it.

A hard link is created with ln:

[root@rlwebp01 blog]# echo "Swissmakers" > original.txt
[root@rlwebp01 blog]# ln original.txt hardlink.txt

[root@rlwebp01 blog]# ls -li
total 8
827116010 -rw-r--r--. 2 root root 12 May 14 15:13 hardlink.txt
827116010 -rw-r--r--. 2 root root 12 May 14 15:13 original.txt

Both entries have the same inode number, and the link count is 2. Anything written through original.txt is immediately visible through hardlink.txt, because both names refer to exactly the same data blocks.
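The equivalence of both names can be checked with a few commands. This is a minimal sketch, assuming GNU stat and a scratch directory; it reuses the file names from the example above:

```shell
workdir=$(mktemp -d)
echo "Swissmakers" > "$workdir/original.txt"
ln "$workdir/original.txt" "$workdir/hardlink.txt"

# Append through one name, read through the other.
echo "second line" >> "$workdir/hardlink.txt"
lines_via_original=$(wc -l < "$workdir/original.txt" | tr -d ' ')

# %h prints the link count (GNU stat).
links_before=$(stat -c %h "$workdir/original.txt")

# Removing one name leaves the data reachable through the other.
rm "$workdir/original.txt"
links_after=$(stat -c %h "$workdir/hardlink.txt")
remaining=$(head -n 1 "$workdir/hardlink.txt")

echo "lines=$lines_via_original links: $links_before -> $links_after first line: $remaining"
rm -rf "$workdir"
```

One caveat: tools that replace a file by writing a new one and renaming it over the old name (sed -i, for instance) create a new inode and thereby break the link.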

Restrictions

Hard links have two restrictions:

  1. File system-bound: Inode numbers are only unique within a file system, so hard links do not work across mount points.
  2. No directories: On Linux, hard links to directories are not permitted, not even for root. Otherwise, loops could be created that send find, du or backup tools into endless recursion.
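Both restrictions can be probed from the shell. The following sketch assumes GNU stat; the helper same_filesystem is hypothetical and only a rough pre-check, since it compares device numbers and does not account for special cases such as bind mounts.

```shell
# Hypothetical helper: succeeds if both paths report the same device
# number (%d in GNU stat), i.e. they appear to be on one file system.
same_filesystem() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

workdir=$(mktemp -d)
mkdir "$workdir/data" "$workdir/backup"

if same_filesystem "$workdir/data" "$workdir/backup"; then
    same_fs=yes
else
    same_fs=no
fi

# ln refuses to create a hard link to a directory.
if ln "$workdir/data" "$workdir/backup/data-link" 2>/dev/null; then
    dir_link=created
else
    dir_link=refused
fi

echo "same_fs=$same_fs dir_link=$dir_link"
rm -rf "$workdir"
```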

Where are hard links used?

The strength of hard links lies in space-saving snapshot backups. Tools like rsnapshot or rsync can create a complete directory tree for each snapshot while copying only the files that have changed. Unchanged files are added to the new snapshot as additional hard links:

[root@rlwebp01 blog]# rsync -a --link-dest=/tmp/backup/2026-05-08 /tmp/blog/ /tmp/backup/2026-05-09/

rsync checks whether the files in the source directory (/tmp/blog/) also exist unchanged in the link directory (/tmp/backup/2026-05-08). If so, it creates hard links to these files in the target directory (/tmp/backup/2026-05-09/) instead of physically copying them. To the outside world, each snapshot looks like a full backup. The actual storage requirement corresponds to the sum of the changes between the snapshots.
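Whether two snapshots really share their unchanged files can be verified by comparing inodes. The sketch below does not call rsync; it builds two tiny snapshot directories by hand (ln stands in for what --link-dest does with unchanged files) and uses a hypothetical helper, shared_files, to list what is deduplicated. It assumes a shell whose test builtin supports -ef, such as bash or dash.

```shell
# Hypothetical helper: lists files (relative paths) under snapshot $2 that
# share an inode with the same path under snapshot $1.
shared_files() {
    old=$1
    new=$2
    ( cd "$new" && find . -type f ) | sort | while IFS= read -r rel; do
        # -ef is true when both paths refer to the same device and inode.
        if [ "$old/$rel" -ef "$new/$rel" ]; then
            echo "${rel#./}"
        fi
    done
}

snap=$(mktemp -d)
mkdir "$snap/2026-05-08" "$snap/2026-05-09"
echo "unchanged" > "$snap/2026-05-08/a.txt"
echo "old" > "$snap/2026-05-08/b.txt"

# Unchanged file: hard link into the new snapshot.
ln "$snap/2026-05-08/a.txt" "$snap/2026-05-09/a.txt"
# Changed file: a real, independent copy.
echo "new" > "$snap/2026-05-09/b.txt"

shared=$(shared_files "$snap/2026-05-08" "$snap/2026-05-09")
echo "$shared"
rm -rf "$snap"
```

Only a.txt is listed, because b.txt was stored as a new file with its own inode.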

Symlinks

A symlink (symbolic link, also known as a soft link) is an independent file with its own inode. Its content consists of a path. Each time it is accessed, the kernel follows this path to the target file. Conceptually, a symlink corresponds to a shortcut under Windows or an alias on macOS.

[root@rlwebp01 blog]# ln -s /etc/redhat-release symlink-to-sysrelaese

[root@rlwebp01 blog]# ls -li symlink-to-sysrelaese
827116011 lrwxrwxrwx. 1 root root 19 May 14 15:41 symlink-to-sysrelaese -> /etc/redhat-release

The leading l in the mode string identifies the file as a symlink; the arrow points to the target. readlink returns the stored path as it is. With -f, nested symlinks and relative paths are resolved as well:

[root@rlwebp01 blog]# readlink symlink-to-sysrelaese
/etc/redhat-release

[root@rlwebp01 blog]# readlink -f symlink-to-sysrelaese
/etc/rocky-release
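The difference between the two readlink calls becomes clearer with a chain of links. This is a minimal sketch in a scratch directory, assuming GNU readlink and stat; the names first, second and target.txt are made up for illustration:

```shell
workdir=$(mktemp -d)
echo "Rocky Linux" > "$workdir/target.txt"
ln -s target.txt "$workdir/first"    # relative path, stored verbatim
ln -s first "$workdir/second"        # a symlink pointing to a symlink

# Without options, readlink returns only the stored path (one hop).
stored=$(readlink "$workdir/second")

# With -f, the whole chain is followed and an absolute path is returned.
resolved=$(readlink -f "$workdir/second")

# Without -L, stat describes the symlink itself; %s is its size in bytes.
size=$(stat -c %s "$workdir/second")

echo "stored=$stored resolved=$resolved size=$size"
rm -rf "$workdir"
```

The size equals the length of the stored path: "first" is five characters long, just as the 19 bytes in the ls output above match the length of /etc/redhat-release.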

Nice to Know

Symlinks are allowed to do practically everything that hardlinks are not:

  • Point to destinations in other file systems
  • Point to directories by default
  • Contain relative or absolute paths

The price for this: if the target is moved or deleted, the symlink points to nothing. Such «dangling symlinks» can be tracked down with find (ideally, nothing is printed):

[root@rlwebp01 blog]# find /etc -xtype l
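To see what the command reports, the situation can be reproduced in a scratch directory. A small sketch, assuming GNU find; the link names good and broken are made up for illustration:

```shell
workdir=$(mktemp -d)
touch "$workdir/target"
ln -s target "$workdir/good"       # resolves to an existing file
ln -s missing "$workdir/broken"    # target does not exist

# -xtype l matches symlinks whose target cannot be resolved.
dangling=$(find "$workdir" -xtype l)
echo "$dangling"
rm -rf "$workdir"
```

Only the broken link is printed. Appending -delete to the find call would remove the matches, so the list should be reviewed first.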

Where are symlinks used?

Practically everywhere in a modern Linux system. There are three very typical areas of application:

  • Version management in the Debian universe, via update-alternatives: /usr/bin/python points to /etc/alternatives/python, which in turn refers to /usr/bin/python3.11. A version change is essentially a symlink change.
  • systemd, to enable or disable services that start automatically at boot time: running e.g. systemctl enable httpd creates a symlink (Created symlink /etc/systemd/system/multi-user.target.wants/httpd.service → /usr/lib/systemd/system/httpd.service.)
  • Site configurations: with Nginx and Apache, a vhost can be activated simply by linking its configuration from sites-available into sites-enabled.
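The sites-available / sites-enabled pattern can be sketched without a web server. The example below only imitates the directory layout in a scratch directory; blog.conf is a hypothetical vhost file, and reloading the server afterwards is left out:

```shell
conf=$(mktemp -d)
mkdir "$conf/sites-available" "$conf/sites-enabled"
echo "# vhost configuration" > "$conf/sites-available/blog.conf"

# Enable: a relative symlink from sites-enabled back into sites-available.
ln -s ../sites-available/blog.conf "$conf/sites-enabled/blog.conf"
enabled=$(ls "$conf/sites-enabled")

# Disable: remove the link only; the configuration itself stays in place.
rm "$conf/sites-enabled/blog.conf"
if [ -f "$conf/sites-available/blog.conf" ]; then
    config_kept=yes
else
    config_kept=no
fi

echo "enabled=$enabled config_kept=$config_kept"
rm -rf "$conf"
```

The relative target keeps the link valid even if the whole configuration directory is moved or restored to a different path.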

Hardlink or symlink?

The decision can be reduced to a few criteria:

Requirement                               Hard link          Symlink
Linking across file systems               No                 Yes
Linking to directories                    No                 Yes
Remains valid after renaming the target   Yes                No
Visible as a link in ls -l                No                 Yes
Own permissions                           No (inode-bound)   Formally yes, but the target's permissions apply
File size                                 Same as original   Length of the path in bytes

Simply put: hard links for deduplicated snapshots within a volume, symlinks for everything else.

What rm actually does

rm does not delete any files. The command calls the unlink(2) system call, which removes a directory entry and decrements the link count in the inode. The storage is only released when two conditions are met:

  1. The link count has dropped to 0
  2. No process keeps the file open
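The second condition can be demonstrated with the shell itself acting as the process that keeps the file open. A minimal sketch in a scratch directory; app.log is a made-up file name:

```shell
workdir=$(mktemp -d)
echo "still here" > "$workdir/app.log"

# The shell opens the file on descriptor 3 and keeps it open.
exec 3< "$workdir/app.log"

# The name disappears and the link count drops to 0.
rm "$workdir/app.log"
if [ -e "$workdir/app.log" ]; then
    name_exists=yes
else
    name_exists=no
fi

# The data is still readable through the open descriptor.
content=$(cat <&3)

# Only closing the descriptor lets the file system release the blocks.
exec 3<&-

echo "name_exists=$name_exists content=$content"
rmdir "$workdir"
```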

This results in two types of behaviour that regularly occur in everyday server use.

Log rotation works without a service restart. As long as a process keeps the old log file open, it continues to write to it, even if the directory entry has already been removed. The space is only released once the process closes the file descriptor, for example when a SIGHUP makes it reopen its log files.

Space remains occupied after rm. If a large file is deleted while a process still holds it open, the file system does not release the space. This can be diagnosed with lsof:

[root@rlwebp01 blog]# lsof | grep deleted

In this situation, another rm does not help. Only a restart of the holding process, or a kill -HUP if the process reopens its files on SIGHUP, releases the space. Alternatively, the file can be emptied directly through its file descriptor under /proc/<PID>/fd/ with truncate -s 0, without stopping the process.
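The truncate approach can be tried out safely on a scratch file. The sketch below is Linux-specific (it relies on /proc) and assumes GNU stat and truncate; the shell plays the role of the daemon, so /proc/self/fd/3 is used where a real case would need the PID and descriptor number reported by lsof.

```shell
workdir=$(mktemp -d)
head -c 1048576 /dev/zero > "$workdir/big.log"   # 1 MiB test file

# Stand-in for a daemon: hold the log open for appending, then delete it.
exec 3>> "$workdir/big.log"
rm "$workdir/big.log"

# -L follows the /proc link to the deleted, but still open, file.
size_before=$(stat -L -c %s /proc/self/fd/3)

# Empty the file through the descriptor; the process keeps running.
truncate -s 0 /proc/self/fd/3
size_after=$(stat -L -c %s /proc/self/fd/3)

exec 3>&-
echo "before=$size_before after=$size_after"
rmdir "$workdir"
```

This works cleanly when the process writes in append mode, as most loggers do. A process that writes at a fixed offset would continue at its old position and leave a sparse file behind.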

Conclusion

In Linux, file names are only directory entries; the inode manages the actual file. This separation explains several peculiarities of everyday operations: a file system that is full despite free space, and deleted log files that continue to grow.

Hard links and symlinks build on this separation: hard links for deduplicated data within a volume, symlinks as flexible references across file systems. Both have been part of the standard toolkit of Linux administration since the early days of Unix and can be found in almost every production environment.


Michael Reber

Years of experience in Linux, security, SIEM and private cloud
