Find the Largest Files and Directories on Linux

In short
  • Quickest answer: du -h --max-depth=1 /path | sort -h — one directory at a time, follow the biggest one down.
  • Interactive: ncdu /path — browse the disk usage tree visually. Worth installing.
  • Top-50 files by size: the script lower down.
  • Disk is full but du says it isn't: a process is holding deleted files open. lsof finds them.

Why this comes up

A disk that says it's 95% full when you thought you had plenty of space is one of the most common Linux annoyances. The cause is almost always a single directory tree that grew quietly — a log directory, a download folder, a build cache, old kernels piling up in /boot, a runaway systemd journal. The fix is to find it.

Three tools do most of the work: du (built in everywhere), find (also built in), and ncdu (a small package). The recipes below are the ones experienced sysadmins reach for; the script at the bottom wraps them in something you can keep in /usr/local/sbin.

The quickest first step

When something is full, start at the root of the filesystem in question and work down. du with --max-depth=1 shows immediate subdirectories only, sorted by size:

# Subdirectories of / by total size, largest at the bottom
sudo du -h --max-depth=1 -x / 2>/dev/null | sort -h | tail -20

The flags worth knowing:

  • -h — human-readable sizes (MB, GB).
  • --max-depth=1 — only show the immediate children, not the whole tree. Repeat with --max-depth=2 to see grandchildren.
  • -x — stay on one filesystem. Important when /home or /var is a separate mount; without -x, du will descend into other filesystems too.
  • sort -h — sort the human-readable sizes correctly. sort -n alone treats "1.5G" as smaller than "200M".
  • 2>/dev/null — suppress "permission denied" noise for files du can't read.

Identify the biggest subdirectory, then repeat the command one level deeper:

# Suppose /var was the offender. Recurse into it.
sudo du -h --max-depth=1 -x /var 2>/dev/null | sort -h | tail -20

Three or four iterations almost always lead to the culprit.
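If you do that drill-down often, the loop can be automated. A minimal sketch (drill is our own name, not a standard tool; it assumes GNU du and paths without embedded newlines):

```shell
# drill: follow the largest immediate subdirectory a few levels down,
# printing each step. Run it under sudo for system paths.
drill() {
  dir=$1
  for _ in 1 2 3; do
    # After sorting, du's last line is the directory itself, so the
    # largest child is the second-to-last line.
    child=$(du -x --max-depth=1 -B1 "$dir" 2>/dev/null \
      | sort -n | tail -n 2 | head -n 1 | cut -f2-)
    if [ -z "$child" ] || [ "$child" = "$dir" ]; then
      break   # no subdirectories left to descend into
    fi
    dir=$child
    du -shx "$dir"
  done
}

drill /var
```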

Interactive: ncdu

ncdu is a curses-based interactive front-end for the same data du produces. Once you've installed it (it's in every distribution's repository under that name), you can walk the tree with arrow keys, see relative sizes as bar charts, and delete files in place. For one-off disk hunts it's faster than typing du commands.

# Scan / (skipping other mounts) and open the browser
sudo ncdu -x /

# Or scan a remote machine over SSH
ncdu -f <(ssh user@server 'ncdu -1xo- /')

The second form is a small trick: ncdu can emit its scan as a JSON stream on one machine (-o-) and read it on another (-f), so only the scan runs on the server while the interactive browsing happens on your laptop. ncdu does need to be installed on both ends; the -1 flag keeps the remote side to a single status line instead of a full-screen UI.

Finding the largest single files

du aggregates by directory. To find the largest individual files regardless of where they live, use find:

# Top 20 files anywhere under /var, by size
sudo find /var -xdev -type f -printf '%s\t%p\n' 2>/dev/null \
  | sort -rn | head -20 | awk -F'\t' '{
      size=$1; path=$2;
      if (size>=1073741824) printf "%6.2f GB  %s\n", size/1073741824, path;
      else if (size>=1048576) printf "%6.2f MB  %s\n", size/1048576, path;
      else if (size>=1024)    printf "%6.2f KB  %s\n", size/1024, path;
      else                    printf "%6d  B   %s\n", size, path;
    }'

This prints the 20 largest files under /var in human-readable form, anywhere in the tree. The -xdev flag stops find at filesystem boundaries, like du -x above.

The same with files larger than a threshold:

# Files larger than 500 MB anywhere under /
sudo find / -xdev -type f -size +500M -printf '%s\t%p\n' 2>/dev/null \
  | sort -rn
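
A popular alternative mixes files and directories in a single listing: du -a reports every file as well as every directory, and sort -rh puts the largest entries first.

```shell
# Files and directories under /var together, 20 largest entries first
sudo du -ahx /var 2>/dev/null | sort -rh | head -20
```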

A ready-to-use script

The script below packages the most common variants behind a single command. Save it as biggest.sh, make it executable and run it.

biggest.sh
#!/usr/bin/env bash
#
# biggest.sh — show the biggest files or directories under a path.
#
# Usage:
#   ./biggest.sh                  # top 20 directories under /
#   ./biggest.sh /var             # top 20 directories under /var
#   ./biggest.sh /var --files     # top 50 individual files under /var
#   ./biggest.sh /var --depth 2   # group by 2-level subdirectories
#
set -euo pipefail

PATH_ARG="/"
MODE="dirs"
DEPTH=1
COUNT=20

# First argument is the path unless it looks like a flag
if [ "$#" -gt 0 ] && [ "${1#-}" = "$1" ]; then
  PATH_ARG="$1"
  shift
fi
while [ "$#" -gt 0 ]; do
  case "$1" in
    --files) MODE="files"; COUNT=50 ;;
    --depth) DEPTH="$2"; shift ;;
    --top)   COUNT="$2"; shift ;;
    -h|--help)
      sed -n '2,10p' "$0"; exit 0 ;;
    *) echo "Unknown argument: $1" >&2; exit 2 ;;
  esac
  shift
done

if [ "$MODE" = "dirs" ]; then
  echo "Top $COUNT directories under $PATH_ARG (depth $DEPTH, one filesystem):"
  echo
  sudo du -h --max-depth="$DEPTH" -x "$PATH_ARG" 2>/dev/null \
    | sort -h \
    | tail -n "$COUNT"
else
  echo "Top $COUNT files under $PATH_ARG (one filesystem):"
  echo
  sudo find "$PATH_ARG" -xdev -type f -printf '%s\t%p\n' 2>/dev/null \
    | sort -rn \
    | head -n "$COUNT" \
    | awk -F'\t' '{
        size=$1; path=$2;
        if (size>=1073741824) printf "%8.2f GB  %s\n", size/1073741824, path;
        else if (size>=1048576) printf "%8.2f MB  %s\n", size/1048576, path;
        else if (size>=1024)    printf "%8.2f KB  %s\n", size/1024, path;
        else                    printf "%8d  B   %s\n", size, path;
      }'
fi

When du and df disagree

One classic puzzle: df says the disk is 95% full, but du -sh / only accounts for 60% of that. The usual cause is a process holding a deleted file open. Linux frees the disk space only when the last reference to a file is closed. If a long-running process opened a log file that was later rotated and unlinked, and the process kept its file descriptor, the bytes are still on disk — just invisible to du.

Find them with lsof:

# List open files that have been deleted, with sizes
sudo lsof -nP 2>/dev/null | awk '/deleted/ {print $7, $1, $2, $9}' | sort -rn | head

The fix is usually to restart the process holding the file. For systemd services, sudo systemctl restart <service> does it; for ad-hoc processes, find the PID and send it a graceful signal. See the systemd basics page for restart patterns.
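
The mechanism is easy to reproduce in a throwaway shell, no lsof required:

```shell
# Hold a file open on fd 3, write 10 MB into it, then unlink it.
tmp=$(mktemp)
exec 3>"$tmp"
dd if=/dev/zero of="$tmp" bs=1M count=10 2>/dev/null
rm "$tmp"

# du no longer sees the file, but the kernel still does: the fd entry
# in /proc is marked "(deleted)" and the 10 MB stay allocated.
ls -l /proc/$$/fd/3

# Closing the descriptor is what finally releases the space.
exec 3>&-
```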

Other places worth checking

Disk-eating offenders that come up regularly:

  • /var/log/journal — the systemd journal. Cap it with journalctl --vacuum-size=500M or by setting SystemMaxUse= in /etc/systemd/journald.conf.
  • /var/log more generally — old logs that logrotate never got around to compressing or removing.
  • /var/cache/apt or /var/cache/dnf — package caches. apt clean or dnf clean all reclaims this.
  • ~/.cache — user-level caches. Browser caches, thumbnail caches and language-server caches can all be GB-sized.
  • /boot — old kernels. On Debian/Ubuntu, sudo apt autoremove --purge removes kernels no longer needed.
  • Docker — docker system df shows space used by images and containers; docker system prune -a reclaims it (read what it will delete first).
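
A quick way to size up the filesystem suspects in one pass (prefix du with sudo for the system paths; directories that don't exist on this machine are skipped):

```shell
# Report the size of each usual suspect that exists on this system
for d in /var/log/journal /var/log /var/cache/apt /var/cache/dnf \
         "$HOME/.cache" /boot; do
  if [ -d "$d" ]; then
    du -shx "$d" 2>/dev/null
  fi
done
```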

Related reading