Wednesday, May 6, 2015

PSA: Linux Does RPF Checking

Twice now I've "discovered" that Linux hosts (even those that aren't doing IP forwarding) do Reverse Path Forwarding checks on incoming traffic.

Both times, this came up in the context of a multicast application, and it led to a conversation that went like this:
Application Person: Hey Chris, what's up with the network? My application isn't receiving any traffic.
Me: Um... The routers indicate they're sending it to you. The L3 forwarding counters are clicking. The L2 gear indicates it has un-filtered all of the ports between the router and your access port. Are you sure?
Application Person: My application says it's not arriving.
Me: I now have tcpdump running on your server. The traffic is arriving. Here are the packets. Do they look okay?
In the end, it turned out that the network was operating perfectly fine. The requested traffic was being delivered to the server, on the interface that requested it. It was the routing table within the Linux host that was screwed up.

RPF Checks
Reverse Path Forwarding (RPF) checking is a feature that makes sure a packet's ingress interface is the one that would be used to reach the packet's source. If a packet arrives on an interface other than the one matching the "reverse path", the packet is dropped.
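
To make the check concrete, here is a minimal sketch of the decision in Python. It is a toy model, not the kernel's code: the routing table, interface names, and function names are all made up for illustration, and the mode numbers simply mirror the rp_filter values (0 = off, 1 = strict, 2 = loose).

```python
import ipaddress

# Hypothetical routing table for illustration: (prefix, outgoing interface).
ROUTES = [
    ("0.0.0.0/0", "eth0"),
    ("10.1.0.0/16", "eth1"),
]

def reverse_path_interface(source, routes=ROUTES):
    """Return the interface of the longest-prefix route toward source, or None."""
    addr = ipaddress.ip_address(source)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None

def rpf_accepts(source, ingress_iface, mode, routes=ROUTES):
    """Toy RPF decision: mode 0 = no check, 1 = strict, 2 = loose."""
    if mode == 0:
        return True
    iface = reverse_path_interface(source, routes)
    if mode == 1:
        # Strict: the packet must arrive on the best reverse-path interface.
        return iface == ingress_iface
    # Loose: the source only has to be reachable via some interface.
    return iface is not None
```

With this table, a packet from 10.1.2.3 arriving on eth0 fails the strict check (the route back to it points at eth1) but passes the loose one.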

RPF checking usually comes up in the context of routers. It's useful to make sure that users aren't spoofing their source IPs, and is a required feature of some multicast forwarding mechanisms, to ensure that packets aren't replicated needlessly.

Linux
Linux boxes also implement RPF filters, even when they aren't routing multicast packets. (Routing multicast on Linux is not recommended anyway: the Linux PIM implementation is weak, especially in topologies that invoke turnaround-router operation.) If the host has only a single IP interface, you'll never notice the feature, because the only ingress interface is the RPF interface for the entire IP-enabled universe.

These filters get a little weird when a multi-homed Linux host is operating as a multicast sender or receiver. It's weird because the socket libraries exposed to applications let the application choose which interface should receive (or send!) the traffic. If the application chooses the "wrong" interface (according to the Linux routing table), incoming traffic will be dropped. Tcpdump will see the traffic, but the kernel drops it before it reaches the application.
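
For reference, this is roughly how an application picks the receiving interface. The sketch below uses Python's standard socket module and the IP_ADD_MEMBERSHIP option, which takes the group address followed by the address of the local interface to join on. The helper names are mine, and any group, port, or interface address you pass in is up to you; nothing here comes from the applications in question.

```python
import socket

def build_mreq(group, iface_addr):
    """Pack a struct ip_mreq: the group address, then the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_addr)

def open_receiver(group, port, iface_addr):
    """Join a multicast group on the interface that owns iface_addr."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # The kernel sends the IGMP join out of the chosen interface, whether or
    # not the routing table agrees that the source lives in that direction.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, iface_addr))
    return sock
```

If iface_addr belongs to an interface that the routing table would not use to reach the sender, the join succeeds and the packets arrive, but a strict rp_filter discards them before recv() ever returns.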

There's a lever to control this behavior:

sysctl net.ipv4.conf.eth0.rp_filter
net.ipv4.conf.eth0.rp_filter = 1

It's documented in /usr/src/linux-3.6.6/Documentation/networking/ip-sysctl.txt which says:

rp_filter - INTEGER
        0 - No source validation.
        1 - Strict mode as defined in RFC3704 Strict Reverse Path
            Each incoming packet is tested against the FIB and if the interface
            is not the best reverse path the packet check will fail.
            By default failed packets are discarded.
        2 - Loose mode as defined in RFC3704 Loose Reverse Path
            Each incoming packet's source address is also tested against the FIB
            and if the source address is not reachable via any interface
            the packet check will fail.

        Current recommended practice in RFC3704 is to enable strict mode
        to prevent IP spoofing from DDos attacks. If using asymmetric routing
        or other complicated routing, then loose mode is recommended.

        The max value from conf/{all,interface}/rp_filter is used
        when doing source validation on the {interface}.

        Default value is 0. Note that some distributions enable it
        in startup scripts.
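
The "max value" rule is easy to miss: setting the per-interface knob to 0 does nothing if conf/all is still 1. A small sketch of the rule as the documentation above describes it, assuming the usual /proc/sys layout that backs these sysctls (the function names are mine):

```python
from pathlib import Path

def read_rp_filter(name, base="/proc/sys/net/ipv4/conf"):
    """Read one rp_filter value, e.g. name='all' or name='eth0'."""
    return int((Path(base) / name / "rp_filter").read_text().strip())

def effective_rp_filter(iface, read=read_rp_filter):
    """Per the kernel doc: the max of conf/all and conf/<iface> is what applies."""
    return max(read("all"), read(iface))
```

On a live host, effective_rp_filter("eth0") reports the mode that is actually applied to packets arriving on eth0. The read parameter exists only so the rule can be exercised without touching /proc.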

CentOS, it turns out, is one of the distributions that enables the feature in strict mode right out of the box. Rather than change the RPF checking, I opted to straighten out the host routing table both times this has come up. Still, I'm not convinced that RPF checking in a host (not even an IP forwarder, let alone one running a multicast routing daemon) makes any sense. Given that user space processes are allowed to request multicast traffic on any interface they like, it doesn't make much sense for the kernel to request that traffic (via IGMP) on behalf of the application, only to throw it away when the packets arrive.

The kernel documentation above cites RFC3704. I had a read through the RFC to see if it distinguishes between recommended behavior for hosts vs. routers. It does not. The RFC only addresses router behavior. I don't think the authors intended for this behavior to be implemented by hosts at all.

Of course, IP multihoming of hosts without either per-interface routing tables or a routing protocol is almost always a bad idea that will lead to sadness. I do not recommend it.