Find Top 404 Error Pages with Apache
A quick one-liner to find the most common pages giving 404 errors on your apache2 setup. Set this up as a shell alias to get easy access at any time.
cut -d'"' -f2,3 /var/log/apache/access.log | awk '$4==404{print $4" "$2}' | sort | uniq -c | sort -rg
Replace /var/log/apache/access.log with the path to your own Apache access log. On web hosts this may be under ~/logs/apache or elsewhere.
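One possible way to keep this handy is sketched below. It uses a shell function rather than a literal alias, since the nested single and double quotes are awkward to escape inside an alias definition, and a function can also take the log path as an argument. The name `top404` is made up for this example, and the status test is written as a comparison (`==`).

```shell
# Hypothetical helper; add it to ~/.bashrc (or your shell's startup file).
# Usage: top404 [logfile]
# Falls back to the path used in this post when no argument is given.
top404() {
    cut -d'"' -f2,3 "${1:-/var/log/apache/access.log}" |
        awk '$4 == 404 { print $4 " " $2 }' |
        sort | uniq -c | sort -rg
}
```

After reloading the shell, `top404` reads the default log and `top404 ~/logs/apache/access.log` reads a different one.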
This one-liner breaks down as follows:
- `cut` splits each line of the input (the logfile) on the `-d` delimiter, a double quote here, and returns only the fields given by `-f`
- Output is piped into `awk`, which selects lines where field 4 equals 404 and prints each one as `404 URL`
- These are then sorted with `sort`, so duplicates sit next to each other and can be counted
- `uniq -c` counts the duplicates, discards them, and prepends the count to each row that is left
- The resulting output is then sorted numerically (`-g`) and in reverse (`-r`), highest first. Remove `-r` to get the top ones at the bottom of the list
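To see the stages described above in action, here is a small sketch that runs them against a throwaway log. The three log lines are invented for illustration and assume Apache's combined log format, where the status code is the first value after the quoted request. If your `LogFormat` differs, the field numbers may need adjusting.

```shell
# Three made-up requests in combined log format (not real traffic).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /missing.html HTTP/1.1" 404 209 "-" "curl/8.0"
192.0.2.2 - - [10/Oct/2023:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 1043 "-" "curl/8.0"
192.0.2.3 - - [10/Oct/2023:13:56:02 +0000] "GET /missing.html HTTP/1.1" 404 209 "-" "curl/8.0"
EOF

# Stage 1: cut keeps the quoted request plus the status and size after it,
# so awk sees: $1=method  $2=URL  $3=protocol  $4=status  $5=size
stage1=$(cut -d'"' -f2,3 "$log")
printf '%s\n' "$stage1"

# Stage 2: keep only the 404s, printed as "404 URL".
stage2=$(printf '%s\n' "$stage1" | awk '$4 == 404 { print $4 " " $2 }')
printf '%s\n' "$stage2"

# Stage 3: group duplicates, count them, and rank highest first.
ranked=$(printf '%s\n' "$stage2" | sort | uniq -c | sort -rg)
printf '%s\n' "$ranked"
# -> 2 404 /missing.html   (the count is padded with leading spaces)

rm -f "$log"
```

The request for `/index.html` drops out at stage 2 because its status is 200, which leaves a single ranked row for `/missing.html`.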