Using awk to effectively filter Apache/Nginx access logs

Access log lines from Apache and Nginx look like:

your.domain.tld - - [15/Feb/2019:17:28:59 +0100] "GET /path HTTP/1.1" 200 16115 "" "Mozilla/5.0 (Linux; Android 5.0.1; SAMSUNG GT-I9515 Build/LRX22C) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/8.2 Chrome/63.0.3239.111 Mobile Safari/537.36" - 0.202

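Tools like cut and awk split on whitespace by default, so they don't treat the bits between brackets and quotes as single fields. The timestamp, the request and the user agent each get chopped into several pieces, which makes it harder to get at the data you need.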
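To make the problem concrete, here is a minimal sketch that feeds the example line above to awk with its default field splitting. The variable name line is just an illustration; any POSIX awk should behave the same way.

```shell
# The example log line from above, stored in a (hypothetical) shell variable.
line='your.domain.tld - - [15/Feb/2019:17:28:59 +0100] "GET /path HTTP/1.1" 200 16115 "" "Mozilla/5.0 (Linux; Android 5.0.1; SAMSUNG GT-I9515 Build/LRX22C) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/8.2 Chrome/63.0.3239.111 Mobile Safari/537.36" - 0.202'

# Default whitespace splitting cuts the timestamp and the request apart.
printf '%s\n' "$line" | awk '{ print $4; print $6 }'
# prints:
#   [15/Feb/2019:17:28:59
#   "GET
```

The timezone offset ends up in $5 and the rest of the request in $7 and $8, so there is no single field number that holds the whole request.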

GNU awk's FPAT to the rescue. Instead of describing what separates fields, it describes what a field looks like:

head /var/log/nginx/extra.log | awk '{print $5}' FPAT='[^ ]*|"[^"]*"|\\[[^]]*\\]'

This will get you the request strings from a logfile in the above example format (only the first ten lines here, because of head).

Read more about defining fields by content.