Netfilter Log Format Issues

Posted: 28th July 2010 by admin in Linux

Positives

  1. Netfilter logs are intuitive and easy to read, even for the occasional, non-expert admin.
  2. They provide much more information than, for example, ipchains, in particular about the transport protocol.
  3. They show the header of messages returned inside an ICMP packet.

Consistency Issues

  1. Most items in the log use the LABEL=value format, but:
    flags appear on their own,
    options use “OPT (abc..)”, and
    incomplete packets are flagged like this: “INCOMPLETE [12 bytes]”!
    The latter would also be much more useful if it showed the data:
    “INCOMPLETE=0050445ACAFEAFFEEFFAEFAC”.
  2. The TOS field is ripped apart into “TOS” and “PREC”. This is particularly bad because TOS is increasingly being replaced by DS and ECN.
  3. The two MAC addresses and the type-code are thrown together into one 14 byte sequence.
  4. Recursive headers are enclosed in [..] but the [..] are also used for other purposes including inside a recursive header.
  5. Having a variable number of fields makes it impossible to quickly extract specific information. For example, the following will fail miserably if the number of IP options varies:
    awk '/DPT=53 /{printf("%s %s\n", $10, $15)}' logfile
  6. The user prefix ought not to allow white space. Having a variable number of spaces (from different prefixes) makes it impossible to quickly extract specific information, see previous point.
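
One way around the positional problem in point 5 is to pick fields by label instead of by column. The following is only a minimal sketch: the sample line is made up in the style of the current netfilter output, and the helper name first_values is hypothetical. Bare flags such as DF are simply skipped, and where a label occurs twice (LEN does, in UDP records) only the first occurrence is kept.

```python
import re

# Made-up sample in the style of the current netfilter log format.
SAMPLE = (
    "fw-drop IN=eth0 OUT= MAC=00:80:8c:1e:12:60:00:10:76:00:2f:c2:08:00 "
    "SRC=192.0.2.10 DST=198.51.100.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 "
    "ID=12345 DF PROTO=UDP SPT=40000 DPT=53 LEN=40"
)

def first_values(line):
    """Map each upper-case LABEL to the value of its first occurrence."""
    values = {}
    for label, value in re.findall(r"\b([A-Z]+)=(\S*)", line):
        # setdefault keeps the first value when a label is repeated.
        values.setdefault(label, value)
    return values

fields = first_values(SAMPLE)
print(fields["SRC"], fields["DPT"])
```

This does not depend on how many fields precede SRC or DPT, but it still relies on the prefix being separated from “IN=” by a space and on no prefix containing LABEL=value look-alikes.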

Efficiency Issues

  1. The worst case netfilter log line would easily exceed 400 chars.
  2. The sheer length of the log lines typically causes wrapping over 2 or 3 lines on a 100 column xterm. This makes it extremely difficult to discriminate log records and to find individual items.
  3. The LABEL=value format with multi char labels is approximately double the size of what it needs to be.
  4. Interpreting flag bits and then printing them separately is approximately 10% as efficient as just printing the byte or word.
  5. Mapping protocol numbers to names does not belong in the kernel.
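
If the kernel only printed raw numbers, the interpretation from points 4 and 5 could be done in user space. The sketch below assumes a log that carries the protocol as a number and the TCP flags as a hex value; the tables and function names are illustrative and list only a few common entries.

```python
# Assumed lookup tables; deliberately incomplete.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}
TCP_FLAGS = [
    (0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"),
    (0x08, "PSH"), (0x10, "ACK"), (0x20, "URG"),
]

def protocol_name(number):
    """Return the protocol name, or the number as text if it is not in the table."""
    return PROTOCOLS.get(number, str(number))

def tcp_flag_names(raw):
    """Decode a hex string such as '0x012' into the names of the set flag bits."""
    value = int(raw, 16)
    return [name for bit, name in TCP_FLAGS if value & bit]

print(protocol_name(6), tcp_flag_names("0x012"))
```

A log viewer or report script would call these only for the records a human actually looks at, instead of the kernel doing the work for every logged packet.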

Parsing Issues

  1. Netfilter logs need a unique identifying mark. Recognition of netfilter logs is currently based on the hope that no other subsystem generates the fields we are using for identification.
  2. Currently, the arbitrary nature of the user prefix makes it impossible to guarantee recognition of the start of the log record.
  3. The user prefix needs to be separated from the next field by a space. It serves no conceivable purpose to have it joined up with the “IN=” field.
  4. Simple extractor commands like the one below are defeated by the variable number of fields.
    awk '/DPT=111 /{printf("%s %s\n", $10, $15)}' logfile
  5. The kernel code should not be burdened with providing a user interface by interpreting flags, protocol numbers and sub-fields.
  6. Header lengths need to be logged. Since header-option logging is an iptables option there is no way of telling if the header really did not have any options or if they were just not logged.
  7. The end of a log line needs to be reliably detectable by a unique field that is guaranteed to be present in all records. Relying on the EOL is unreliable because “newline” chars do occur in mid-record if logging on the console and they can also be introduced by cut-and-paste operations.
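
Points 2 and 3 can be illustrated with the kind of guesswork a parser has to do today. The sketch below is a heuristic, not a reliable method: it assumes the record proper starts at the first “IN=... OUT=” pair, and both sample messages are made up. The second one shows how an unlucky prefix defeats it.

```python
import re

# Heuristic: everything before the first "IN=<iface> OUT=" pair is the prefix.
RECORD_START = re.compile(r"(?P<prefix>.*?)IN=(?P<iface_in>\S*) OUT=(?P<iface_out>\S*)")

def split_prefix(message):
    """Return (prefix, rest) or None if no IN=/OUT= pair is found."""
    match = RECORD_START.match(message)
    if match is None:
        return None
    cut = match.end("prefix")
    return message[:cut], message[cut:]

# Prefix glued to IN= because the admin forgot the trailing space.
print(split_prefix("fw-dropIN=eth0 OUT= SRC=192.0.2.10"))

# A prefix that happens to contain "IN=... OUT=" is split in the wrong place.
print(split_prefix("LOGIN=x OUT= IN=eth0 OUT= SRC=192.0.2.10"))
```

The second call reports “LOG” as the prefix, which is exactly the kind of misrecognition a unique start mark would rule out.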

Proposal for an alternative log format

Actually, there probably should be two log formats:

  1. ipchains compatible, so existing applications can continue to be used.
  2. a native format that is easy to read for a moderately experienced sysop and also parsable without resorting to unreasonable measures.

This format (with all options set) aims to make netfilter log output consistent, cuts out some expansions, adds header-length (HLEN) fields and has a terminating mark “#”.

NF: USERPREFIX=userprefix IN=eth1 OUT= MAC=00808c1e1260,001076002fc2,0800 SRC=211.251.142.65 DST=203.164.4.223 LEN=100 TOS=0x00 TTL=44 ID=31526 FLAGS=0x4000 HLEN=60 OPT=072728CBA404DFCBA40253CBA4032ECBA403A2CBA4033ECBA402C1180746EA18074C52892734A200 PROTO=6 SPT=4515 DPT=111 SEQ=1168094040 ACK=0 WINDOW=32120 FLAGS=0x003 URGP=0 HLEN=40 OPT=020405B40402080A05E3F3C40000000001030300 #
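
To show that such a record can be parsed without unreasonable measures, here is a minimal sketch of a parser for the record above. The function names are hypothetical. Because FLAGS, HLEN and OPT each appear twice (once for IP, once for TCP), the fields are kept as an ordered list of pairs rather than a dictionary.

```python
# The record from above, split over several string literals for readability.
RECORD = (
    "NF: USERPREFIX=userprefix IN=eth1 OUT= "
    "MAC=00808c1e1260,001076002fc2,0800 "
    "SRC=211.251.142.65 DST=203.164.4.223 LEN=100 TOS=0x00 TTL=44 ID=31526 "
    "FLAGS=0x4000 HLEN=60 "
    "OPT=072728CBA404DFCBA40253CBA4032ECBA403A2CBA4033ECBA402C1180746EA18074C52892734A200 "
    "PROTO=6 SPT=4515 DPT=111 SEQ=1168094040 ACK=0 WINDOW=32120 "
    "FLAGS=0x003 URGP=0 HLEN=40 "
    "OPT=020405B40402080A05E3F3C40000000001030300 #"
)

def parse_record(line):
    """Return the fields between the 'NF:' mark and the '#' mark as (label, value) pairs."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "NF:" or tokens[-1] != "#":
        raise ValueError("not a complete NF record")
    fields = []
    for token in tokens[1:-1]:
        # A bare label (no '=') yields an empty value.
        label, _, value = token.partition("=")
        fields.append((label, value))
    return fields

def values_of(fields, label):
    """All values logged under a label, in order of appearance."""
    return [value for name, value in fields if name == label]

fields = parse_record(RECORD)
print(values_of(fields, "HLEN"))
```

A record that lacks the leading “NF:” or the trailing “#” is rejected outright, which is the point of having both marks.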

We can already drop existing options (--log-ip-options, --log-tcp-sequence, --log-tcp-options) but the labels need to be kept so the number of fields remains constant. In most cases we could also drop the “MAC=..” (keep the label). These measures reduce the log to a more manageable size:

NF: USERPREFIX=userprefix IN=eth1 OUT= MAC SRC=211.251.142.65 DST=203.164.4.223 LEN=100 TOS=0x00 TTL=44 ID=31526 FLAGS=0x4000 HLEN=60 OPT PROTO=6 SPT=4515 DPT=111 SEQ ACK WINDOW=32120 FLAGS=0x003 URGP HLEN=40 OPT #
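
With the labels always present, extraction by position becomes safe again. The following sketch checks that the labels of the reduced record above appear in the expected order and then reads values by index; FIELD_ORDER and by_position are hypothetical names, and the order is simply the one used in this proposal.

```python
REDUCED = (
    "NF: USERPREFIX=userprefix IN=eth1 OUT= MAC SRC=211.251.142.65 "
    "DST=203.164.4.223 LEN=100 TOS=0x00 TTL=44 ID=31526 FLAGS=0x4000 HLEN=60 "
    "OPT PROTO=6 SPT=4515 DPT=111 SEQ ACK WINDOW=32120 FLAGS=0x003 URGP "
    "HLEN=40 OPT #"
)

# Label order as used in the proposal (IP part first, then TCP part).
FIELD_ORDER = [
    "USERPREFIX", "IN", "OUT", "MAC", "SRC", "DST", "LEN", "TOS", "TTL", "ID",
    "FLAGS", "HLEN", "OPT", "PROTO", "SPT", "DPT", "SEQ", "ACK", "WINDOW",
    "FLAGS", "URGP", "HLEN", "OPT",
]

def by_position(line):
    """Return the values in field order; dropped fields come back as ''."""
    tokens = line.split()[1:-1]  # strip the 'NF:' and '#' marks
    labels = [token.partition("=")[0] for token in tokens]
    if labels != FIELD_ORDER:
        raise ValueError("unexpected field layout")
    return [token.partition("=")[2] for token in tokens]

values = by_position(REDUCED)
print(values[4], values[15])  # SRC and DPT
```

The same indices work for the full record and for the reduced one, since both carry the same 23 labels.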

Slightly re-ordering the fields, shortening the labels to single chars (while keeping the labels for empty fields) and reverting to the well-known IP:PORT notation gives:

NF: U=userprefix 211.251.142.65:4515 203.164.4.223:111 I=eth1 O M L=100 S=0x00 T=44 I=31526 F=0x4000 H=60 P=6 S A W=32120 F=0x003 U H=40 O #
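
In this compact form the two endpoints sit at fixed positions right after the prefix, so pulling them out takes only a few lines. This is a sketch that assumes the user prefix contains no white space, as demanded earlier; the function name endpoints is hypothetical.

```python
COMPACT = (
    "NF: U=userprefix 211.251.142.65:4515 203.164.4.223:111 I=eth1 O M L=100 "
    "S=0x00 T=44 I=31526 F=0x4000 H=60 P=6 S A W=32120 F=0x003 U H=40 O #"
)

def endpoints(line):
    """Return ((src_ip, src_port), (dst_ip, dst_port)) from a compact NF record."""
    tokens = line.split()
    if len(tokens) < 5 or tokens[0] != "NF:" or tokens[-1] != "#":
        raise ValueError("not a complete NF record")
    # rpartition splits at the last ':' so the port is always the final part.
    src_ip, _, src_port = tokens[2].rpartition(":")
    dst_ip, _, dst_port = tokens[3].rpartition(":")
    return (src_ip, int(src_port)), (dst_ip, int(dst_port))

source, destination = endpoints(COMPACT)
print(source, destination)
```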

Other remedies

Harald Welte has written a ULOG module which provides an interface for a user-space daemon. This facility might allow a user-space plugin to produce customized output. It also won’t have to go through the old, inefficient syslog mechanism.

Unfortunately ULOG is not part of the kernel distribution yet. But it is available from http://gnumonks.org/projects/ for those who are willing and able to patch the kernel.
