SACK Panic – Multiple TCP-based remote denial-of-service issues (access.redhat.com)

416 points by cdingo 7 years ago | 131 comments

This is the way I block such things on my own VMs (not at work) using iptables:

    iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss ! --mss 640:65535 -j DROP

Here it is in action:

    iptables -L -n -v -t raw | grep mss  
    84719 3392K DROP       tcp  --  eth0   *       0.0.0.0/0            0.0.0.0/0            tcp flags:0x17/0x02 tcpmss match !640:65535

My settings may be a little aggressive and may block some old PPTP/PPPoE users. Perhaps 520 would be a safer low end. As a funny side note, this also blocks hping3's default settings (ping floods), as it doesn't set an MSS. It also blocks a slew of really poorly coded scanners.

For everything else at work, we are behind a layer 7 load balancer that is not vulnerable.

You may also find it useful to block fragmented packets. I've done this for years and never had an issue:

    iptables -t raw -I PREROUTING -i eth0 -f -j DROP

If you have the PoC, then feel free to first verify you can browse to https://tinyvpn.org/ then send the small MSS packets to that domain, then see if you can still browse to it. I don't care if the server reboots or crashes. Just don't send DDoS please, as the provider will complain to me.

To see the counters increase, here is a quick and silly cron job that will show you the MSS DROPs in the last minute, that I will disable after a couple days: [1]

[1] - https://tinyvpn.org/up/mss/

ilikepi 7 years ago | |

The iptables command listed in the Mitigation section of the Red Hat article lists only SYN in the tcp-flags mask section:

    iptables -I INPUT -p tcp --tcp-flags SYN SYN -m tcpmss --mss 1:500 -j DROP

If I interpret the man page correctly, the above is broader because it does not care about the presence or absence of other flags, whereas your rule explicitly requires the other listed flags to be unset. In fact, it seems like the above might be broad enough to include incoming SYN-ACK response packets that are the result of outgoing connections.

Am I understanding this correctly, and if so, do you have a thought about why they suggest this?

LinuxBender 7 years ago | | |

Theirs is just more specific. Both should mitigate the attack, but I would follow theirs instead of my example. They are certainly a better authority on the subject. If something goes wrong, much better to say "Followed vendor suggestions" than random HN poster. :-) That said, I would still use the raw table vs input.

chupasaurus 7 years ago | |

FYI, the Debian Security Team recommends setting the new sysctl net.ipv4.tcp_min_snd_mss to 536, even though they (Debian) are preserving the default kernel value of 48 for compatibility.
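For anyone who wants to check a host against that recommendation, here is a minimal sketch. The function name is made up for illustration, and the floor of 536 is simply the value recommended above, not a universal constant.

```shell
# Sketch: compare a tcp_min_snd_mss value with a floor (default 536, per the
# recommendation above). check_min_snd_mss is a hypothetical helper name.
check_min_snd_mss() {
    current=$1
    floor=${2:-536}
    if [ "$current" -ge "$floor" ]; then
        echo "ok: tcp_min_snd_mss=$current"
    else
        echo "low: tcp_min_snd_mss=$current (want >= $floor)"
        return 1
    fi
}

# The sysctl only exists on kernels that carry the fix, so tolerate its absence.
if value=$(sysctl -n net.ipv4.tcp_min_snd_mss 2>/dev/null); then
    check_min_snd_mss "$value" ||
        echo "to raise it, as root: sysctl -w net.ipv4.tcp_min_snd_mss=536"
else
    echo "net.ipv4.tcp_min_snd_mss is not available on this kernel"
fi
```

Raising the floor can break peers that legitimately use a small MSS, so test before rolling it out widely.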

LinuxBender 7 years ago | | |

That's cool. In CentOS the default is 256. The settings are a little different though.

    net.ipv4.route.min_adv_mss = 256
    net.ipv4.tcp_base_mss = 512
    net.ipv6.route.min_adv_mss = 1220

londons_explore 7 years ago | | |

It's worth noting that lwIP, the network stack used by most microcontroller based IoT devices, has a very low MSS. It's configurable, but typically defaults to 512.

hackersword 7 years ago | |

How would you do that for the INPUT table (not a VM)? Just:

    iptables -t raw -I INPUT -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss ! --mss 640:65535 -j DROP

Here [1] gives an example of ... is yours just inverting/negating the DROP rule?

    iptables -A INPUT -p tcp -m tcpmss --mss 1:500 -j DROP

[1] https://github.com/Netflix/security-bulletins/blob/master/ad...

LinuxBender 7 years ago | | |

The raw table does not contain an INPUT chain. For the raw table you would have to use PREROUTING. If you are using the default table (filter), then you can use INPUT.

So for the raw table, it would be

    iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss --mss 1:500 -j DROP

For the default (filter) table

    iptables -t filter -I INPUT -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss --mss 1:500 -j DROP

Generally speaking, if you know you are going to drop everything that matches a pattern or address, it is useful to put that in the raw table, so that malicious traffic can't spike your CPU load as easily. Every packet to the filter table will incur potentially CPU expensive conntrack table lookups. As your conntrack table gets bigger, this gets more expensive.

The reason I use the opposite method is that we know the normal range we want. Programs can also set super high values or not set an MSS at all (which is not the same as 0).

I explicitly set the interface, so that we don't match interfaces such as lo, tun, tap, vhost, veth, etc... because you never know what weird behavior some program depends on. In my example, eth0 is directly on the internet. In your systems, that might be bond0.
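As a rough sketch of the points above, the raw-table rule can be wrapped in a small function that takes the interface as a parameter and avoids inserting duplicates. The function name and its arguments are made up for illustration; the rule itself is the raw-table variant shown earlier in this comment.

```shell
# Hypothetical helper: insert the low-MSS drop rule on one interface,
# unless an identical rule is already present.
# Usage: drop_low_mss <interface> [highest MSS to drop, default 500]
drop_low_mss() {
    iface=$1
    max_bad=${2:-500}
    set -- -i "$iface" -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN \
           -m tcpmss --mss "1:$max_bad" -j DROP
    # -C exits non-zero when no identical rule exists, so -I runs only then.
    iptables -t raw -C PREROUTING "$@" 2>/dev/null ||
        iptables -t raw -I PREROUTING "$@"
}

# As root: drop_low_mss eth0, or drop_low_mss bond0 on a bonded uplink.
```

Running it twice leaves a single rule in place, which is handy if it ends up in a boot script.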

ilikepi 7 years ago | | |

It's worth noting that use of -A instead of -I in your example from [1] likely makes this rule ineffective, since it will be appended to the end of the INPUT chain. This has already been reported as an issue[2].

[2]: https://github.com/Netflix/security-bulletins/issues/4

LinuxBender 7 years ago | |

In the event anyone needs this, you can also override outbound mss in case someone is telling your client to use a low mss.

This goes in the mangle table. DO NOT use this example unless you know for sure what you are doing.

    iptables -t mangle -I POSTROUTING -o eth0 -p tcp -m tcp --tcp-flags SYN,RST,ACK SYN -m tcpmss --mss 1:100 -j TCPMSS --set-mss 1360

This example should work for most use cases, but don't do this unless you know the implications for sure. Dropping bad inbound options is easy, but outbound can get more complicated. I am just showing this in case anyone asks and I am asleep. :-) Talk to your network admins and ask what the highest MSS/MTU is that your VPNs and third-party networks support.

This may not even help, as the packet has already been generated and we are too late in the process. I just figure someone might ask. There are probably use cases where this may help (for proxies, edge firewalls, hypervisors, docker hosts, maybe)

Or just log and drop the connections, or send yourself (your app) a tcp-reset.

isodude 7 years ago | | |

I read in a note just below --set-mss in man iptables-extensions that iptables will not increase the MSS if it's already lower than what is set in --set-mss.

peterwwillis 7 years ago | |

And ipv6? Fragmentation and mss are handled somewhat differently, iirc

LinuxBender 7 years ago | | |

Good point. I don't have a place to test ipv6. You would also have to create ip6tables rules to test that. If you have a test server on ipv6, please apply the rule and link it here. Well, test first, then link here. :-)

topranks 7 years ago | | |

IP fragmentation is different in IPv6, true.

MSS is a TCP parameter, however, and operates at layer 4. Won’t matter if the protocol underneath is IPv4 or IPv6 in this case.
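Since the match is on a TCP option, the same rule can in principle be applied through ip6tables as well. The sketch below is untested on a real IPv6 host. `RUN` is a hypothetical dry-run switch that defaults to `echo`, so nothing is changed until it is set to an empty value.

```shell
# Sketch: apply the same raw-table drop rule for IPv4 and IPv6.
# RUN is a made-up dry-run switch: unset means "echo the command only".
drop_low_mss_both() {
    iface=${1:-eth0}
    for cmd in iptables ip6tables; do
        ${RUN-echo} "$cmd" -t raw -I PREROUTING -i "$iface" -p tcp -m tcp \
            --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss --mss 1:500 -j DROP
    done
}

drop_low_mss_both eth0          # prints both commands
# RUN= drop_low_mss_both eth0   # as root: actually inserts the rules
```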

isodude 7 years ago | |

How about stripping MSS instead? Like so:

    iptables -t mangle -I PREROUTING -p tcp --tcp-flags SYN SYN -m tcpmss --mss 1:500 -j TCPOPTSTRIP --strip-options mss

I experimented with a host that sends out mss 216 and the communication was still ok with the above, but not while dropping the traffic.

LinuxBender 7 years ago | | |

That's a great idea. That would certainly help if someone unwittingly had some malware that enabled this behavior on their host, but you still wanted them to reach you.

Avamander 7 years ago | |

How does one do this with nftables?

LinuxBender 7 years ago | | |

I've not done this in nftables, but this might give some ideas: [1]

[1] - https://wiki.nftables.org/wiki-nftables/index.php/Mangle_TCP...
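An unverified sketch of what an nftables equivalent might look like follows. It assumes an `inet filter` table with an `input` chain already exists, which may not match your ruleset. `RUN` is a hypothetical dry-run switch that defaults to `echo`.

```shell
# Unverified sketch of an nftables rule that drops SYN packets with a low MSS.
# Table and chain names (inet filter input) are assumptions; adjust as needed.
nft_drop_low_mss() {
    ${RUN-echo} nft insert rule inet filter input \
        tcp flags syn tcp option maxseg size 1-500 counter drop
}

nft_drop_low_mss          # prints the command
# RUN= nft_drop_low_mss   # as root: actually inserts the rule
```

Check the result with `nft list ruleset` before relying on it.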

talawahtech 7 years ago |

AWS Bulletin: https://aws.amazon.com/security/security-bulletins/AWS-2019-...

FYI if your instances are behind an Application Load Balancer or Classic Load Balancer then they are protected, but NOT if they are behind a Network Load Balancer.

A patched kernel is available for Amazon Linux 1 and 2, so you won't have to disable SACK. You can run "sudo yum update kernel" to get it, but of course you have to reboot. Updated AMIs are also available.

Amazon Linux 1: https://alas.aws.amazon.com/ALAS-2019-1222.html

Amazon Linux 2: https://alas.aws.amazon.com/AL2/ALAS-2019-1222.html

For Amazon Linux 2 the fixed kernel is kernel-4.14.123-111.109.amzn2. Looking at my instances, it looks like I have been on that version since Friday.

ones_and_zeros 7 years ago | |

Even if your instances are behind ALBs or ELBs they may not be protected if they make outbound connections to the internet.

pferde 7 years ago | | |

Is this so? Can this kernel panic also be triggered in TCP connections initiated by the victim? I can't find a conclusive mention of this anywhere.

As each direction of a TCP connection has its own MSS, it would make sense that an attacker's server could exploit this.

trollied 7 years ago |

Looks like most of the internet should be getting patched ASAP! Or not, as is usually the case :-(

I remember the original ping of death back in the 90s https://web.archive.org/web/19981206105844/http://www.sophis...

usrname 7 years ago |

Some solution: echo 0 > /proc/sys/net/ipv4/tcp_sack

https://github.com/Netflix/security-bulletins/blob/master/ad...
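Note that writing to /proc does not survive a reboot. A minimal sketch for persisting the workaround with a sysctl drop-in file is below. The function name and file name are made up, and the drop-in should be removed again once the kernel is patched.

```shell
# Sketch: persist the tcp_sack workaround with a sysctl drop-in file.
# The file name is hypothetical; the directory defaults to /etc/sysctl.d.
write_sack_dropin() {
    dir=${1:-/etc/sysctl.d}
    printf '%s\n' 'net.ipv4.tcp_sack = 0' > "$dir/99-disable-tcp-sack.conf"
}

# As root:
#   write_sack_dropin && sysctl --system
```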

href 7 years ago | |

Thanks for pointing this out. Applying this buys us time before we can properly patch all our systems. In our case this was easy to roll out in a jiffy.

I do wonder though, can anyone guess what kind of impact one might see with TCP SACK disabled? We don't have large amounts of traffic and serve mostly websites. Maybe mobile phone traffic might be a bit worse off if the connection is bad?

wademealing 7 years ago | | |

Disclaimer: I worked on the initial Red Hat article linked above.

On my personal AWS instance, less than half a percent of the traffic over the last few days hit the firewall rule that logs the error.

Most of that traffic seemed to come from China; this was possibly port probing / port scans or really old hardware accessing my server.

I would say that the iptables rule is a 'better' solution than dropping sack as you may find you use significantly more CPU/bandwidth when dealing with retransmits when not using selective acknowledgements.

topranks 7 years ago | |

The MSS change is better, won’t affect performance.

hoffie 7 years ago |

Red Hat's article on these issues also provides further explanations: https://access.redhat.com/security/vulnerabilities/tcpsack

dang 7 years ago | |

That seems to have a bit more information, so we switched to it from https://www.openwall.com/lists/oss-security/2019/06/17/5. Thanks!

loeg 7 years ago | | |

https://github.com/Netflix/security-bulletins/blob/master/ad... is the advisory by the party that discovered the issue. (Disclosure: I have met Jonathan Looney and know some of the Netflix engineering staff, but I don't work for Netflix.)

ilkkao 7 years ago | | |

The original link includes links to the patches. Fascinating how the SACK MSS problem seems to be a relatively simple situation nobody realized can occur.

bauruine 7 years ago | |

There is also an ansible playbook on the resolve tab to easily apply the net.ipv4.tpc_sack workaround on all your hosts.

pferde 7 years ago | | |

With a typo ("tpc_sack" instead of "tcp_sack") in the task name. The playbook still works, but I found it chuckle-worthy. :)

geggam 7 years ago |

Is this the same thing ?

https://www.cvedetails.com/cve/CVE-2005-0960/

Multiple vulnerabilities in the SACK functionality in (1) tcp_input.c and (2) tcp_usrreq.c OpenBSD 3.5 and 3.6 allow remote attackers to cause a denial of service (memory exhaustion or system crash).

wademealing 7 years ago | |

Similar concept, different operating system.

jkern 7 years ago |

So you can remotely cause a kernel panic on basically any system with a kernel newer than 2.6.28? That doesn't sound great.

syn0byte 7 years ago | |

No, not "any system". Besides needing SACK enabled (which is by default) you also need segment offloading and non-shite networking hardware that will respect and preserve stupid MSS fields in packets.

Pending a patch, simply disable SACK:

    echo 0 > /proc/sys/net/ipv4/tcp_sack

and/or disable segmentation offloading:

    ethtool -K eth? tso off

TCP and Checksum offloading still aren't super standard on customer grade NICs or virtual machines. I'd assume less than half of the internet's linux hosts are actually at risk.

acdha 7 years ago | | |

> TCP and Checksum offloading still aren't super standard on customer grade NICs or virtual machines.

I thought VMware shipped that at least a decade ago. Is there some specific sub-feature you had in mind? Similarly, at least Apple's consumer hardware had checksum offloading back in the early 2000s and segmentation support shipped in 10.6 (2009), so it seems like it should be relatively mainstream since they tended to use commodity NIC hardware.

foofc7c8 7 years ago | | |

I can't find the prerequisite on segment offloading anywhere. Any link on this?

oasisbob 7 years ago | | |

Isn't TSO enabled on EC2? Their bulletin implies it at least, I seem to remember the same.

dsp 7 years ago | | |

Disabling tso alone would not be enough. You also need to disable gso if you go that route.
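If you do go that route, a small sketch for turning both off and confirming the result is below. The filter function name is made up, and `eth0` is only an assumed interface name.

```shell
# Sketch: list the segmentation offloads that are still enabled.
# Reads `ethtool -k <iface>` output on stdin; the function name is hypothetical.
offloads_still_on() {
    grep -E '^(tcp|generic)-segmentation-offload: on' | cut -d: -f1
}

# As root, assuming the interface is eth0:
#   ethtool -K eth0 tso off gso off
#   ethtool -k eth0 | offloads_still_on    # no output means both are off
```

Like the /proc setting, ethtool changes are lost on reboot unless your distribution's network configuration reapplies them.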

bhauer 7 years ago |

Sorry, this is probably a bit simplistic, but I am curious: How likely is this to affect embedded devices? E.g., hardware firewalls, routers, IoT devices that all use a Linux kernel?

LinuxBender 7 years ago | |

If you are not exposing any TCP ports or reaching out directly from those devices to a malicious host, then very unlikely. Either way, it's best to check the vendor's site or open a ticket with them, if that is an option.

An alternate option would be to put the device behind another firewall or load balancer or proxy that you know is not vulnerable.

xyzzy_plugh 7 years ago | | |

I guess a question I have is then: can I hose you during a TLS handshake? If I can forge DNS, then I can DoS, right? Which makes BGP a prime target right now?

ravingraven 7 years ago | |

They are equally affected. Linux is Linux; it makes no difference what box it runs in. What is usually a mitigating factor is that embedded devices usually have a very different configuration compared to non-embedded devices (built with minimal options, not a lot of services running on them, etc.).

jandrese 7 years ago |

That BUG_ON is a full up kernel panic? That's a pretty severe issue if so.

Twirrim 7 years ago | |

Yes. In all seriousness, this is a "drop everything and patch" situation, as soon as the patches are available.

It's a little bit more involved than a ping of death, but still, relatively easy to exploit.

jandrese 7 years ago | | |

Seems like anybody with an open TCP port and SACK enabled (the default) is vulnerable.

dsp 7 years ago | |

Yes.

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.gi...

nitinics 7 years ago |

Here's the series of patches. It looks like they were applied a few hours ago.

https://lore.kernel.org/netdev/20190617.104121.1475407136257...

cmurf 7 years ago |

These are patched in linux 5.1.11, 4.19.52, 4.14.127, 4.9.182, 4.4.182.

https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.1.1...

5.0 is EOL as of 5.0.21.
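A rough sketch for comparing a kernel release string against the fixed versions listed above follows. It is only meaningful for vanilla stable kernels: distribution kernels backport fixes without changing the upstream version number (the fixed Amazon Linux 2 kernel mentioned elsewhere in this thread is 4.14.123, for example), so treat a "not patched" result as a prompt to check the vendor advisory. The function name is made up.

```shell
# Sketch: is_patched <kernel release>
# Returns 0 if the release is at or above the fixed version for its stable
# branch, 1 if it is below, 2 if the branch is not in the list above.
is_patched() {
    ver=${1%%-*}                        # drop a distro suffix such as -arch1-1
    branch=$(echo "$ver" | cut -d. -f1,2)
    patch=$(echo "$ver" | cut -d. -f3)
    patch=${patch%%[!0-9]*}             # keep only the leading digits
    case $branch in
        5.1)  min=11 ;;
        4.19) min=52 ;;
        4.14) min=127 ;;
        4.9)  min=182 ;;
        4.4)  min=182 ;;
        *)    return 2 ;;
    esac
    [ "${patch:-0}" -ge "$min" ]
}

rc=0
is_patched "$(uname -r)" || rc=$?
case $rc in
    0) echo "$(uname -r): at or above the fixed stable release" ;;
    1) echo "$(uname -r): below the fixed stable release, check your vendor" ;;
    *) echo "$(uname -r): branch not in the list, check your vendor" ;;
esac
```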

rphlx 7 years ago |

When was the last time Linux had a similar, reliably-remotely-exploitable kernel panic in the TCP/IPv4 stack? Pre-2000?

foofc7c8 7 years ago | |

Never, from what I can recall.

rphlx 7 years ago | | |

Though not as bad as Win9x it definitely had some frag-of-death/ping-of-death vulns around 1997/98. teardrop et al.

danmg 7 years ago | | |

No. Teardrop.

mkj 7 years ago |

Will any Android phones be affected, or do they not support segment offloading?

Droobfest 7 years ago | |

Android seems to be affected: https://android.googlesource.com/kernel/common/+/5d625e9b4a6...

This might be fun...

Edit: Don't know if segmentation offloading is on by default in Android, but on my default Arch kernel it is, so I wouldn't know why not.

strcat 7 years ago | | |

It is impacted, but patches being present in kernel/common doesn't imply that, since all of the upstream changes from the LTS branches are merged there.

fortran77 7 years ago |

Will there be a big performance hit if I just turn sack off (for now) until I finish running all our tests on the new kernel?

CloudNetworking 7 years ago | |

Depends on your traffic, but in general terms you might want to instead drop packets with low MSS.

ahoka 7 years ago |

Maybe this helps someone:

  [Unit]
  Description=Drop TCP packets with a low MSS (SACK Panic mitigation)

  [Service]
  Type=oneshot
  ExecStart=/sbin/iptables -I INPUT -p tcp -m tcpmss --mss 1:500 -j DROP

  [Install]
  WantedBy=sysinit.target

keyle 7 years ago |

What does OpenBSD/NetBSD do differently to not be affected by this?

rphlx 7 years ago | |

They use a different TCP/IP stack which implemented SACK without introducing this bug.

It's a Linux-specific implementation defect, not an intrinsic problem with the TCP SACK wire protocol or spec.

jontro 7 years ago | | |

They found a bug in FreeBSD too: https://github.com/Netflix/security-bulletins/blob/master/ad...

sl-1 7 years ago |

Is there any way to quickly test if any of my machines are vulnerable?

loeg 7 years ago | |

  [ "$(uname -s)" = Linux ] && echo "Vulnerable"

("CVE-2019-11479: Excess Resource Consumption Due to Low MSS Values (all Linux versions)", "CVE-2019-11478: SACK Slowness (Linux < 4.15) or Excess Resource Usage (all Linux versions).")

https://github.com/Netflix/security-bulletins/blob/master/ad...

superkuh 7 years ago | | |

I know you're being glib, but are 2.6.x kernels vulnerable? The big corps tend to define "all Linux" as all Linux that they support and that isn't end of life. As far as I read, early 3.x kernels on the Ubuntu side are not affected. Like version 12 and before.

So there's plenty of Linux being used out there that's probably not affected.

sp332 7 years ago | |

If you click the Diagnose tab, there's a script that will check your kernel versions and relevant TCP settings. https://access.redhat.com/sites/default/files/cve-2019-11477... If you're not running RedHat, the kernel detection might be too strict, but at least there's some example code for you to check your own settings.

davoti 7 years ago |

Folks,

IMO, TSO is an Intel NIC function. Does this affect others, such as those on Cavium CPUs?

thanks

shereadsthenews 7 years ago |

Looks like the issue was fixed upstream a month ago. Might have been nice to know earlier? Is this how long it takes for the distros to lurch into action?

megous 7 years ago | |

The fix was pushed just now to stable kernels.

shereadsthenews 7 years ago | | |

That's just a detail of how linux is developed. The fixing patch was mailed on May 17th and it mentions the CVE.

bloopernova 7 years ago |

Ouch.

Good luck out there, folks.

xeno56 7 years ago |

how does one apply this patch?

gravitas 7 years ago |

I'm collecting vendor links internally for work:

Red Hat / CentOS

https://access.redhat.com/security/vulnerabilities/tcpsack

https://access.redhat.com/security/cve/cve-2019-11477

https://access.redhat.com/security/cve/cve-2019-11478

https://access.redhat.com/security/cve/cve-2019-11479

Ubuntu

https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SACKPanic

https://people.canonical.com/~ubuntu-security/cve/2019/CVE-2...

Oracle Linux

https://linux.oracle.com/errata/ELSA-2019-4686.html (RHCK kernel)

https://linux.oracle.com/errata/ELSA-2019-4685.html (UEK5 kernel)

https://linux.oracle.com/errata/ELSA-2019-4684.html (UEK4 kernel)

Amazon AWS

https://aws.amazon.com/security/security-bulletins/AWS-2019-...

https://alas.aws.amazon.com/ALAS-2019-1222.html (Linux 1)

https://alas.aws.amazon.com/AL2/ALAS-2019-1222.html (Linux 2)

Debian

https://security-tracker.debian.org/tracker/CVE-2019-11477

https://security-tracker.debian.org/tracker/CVE-2019-11478

https://security-tracker.debian.org/tracker/CVE-2019-11479

SUSE / SLES

https://www.suse.com/de-de/support/kb/doc/?id=7023928

https://www.suse.com/security/cve/CVE-2019-11477/

https://www.suse.com/security/cve/CVE-2019-11478/

https://www.suse.com/security/cve/CVE-2019-11479/

CoreOS

https://coreos.com/releases/#2079.6.0

Arch

https://security.archlinux.org/AVG-983

https://security.archlinux.org/CVE-2019-11477

https://security.archlinux.org/CVE-2019-11478

https://security.archlinux.org/CVE-2019-11479

(please reply with additional vendor links if you have them)

SEJeff 7 years ago |

I guess this post beat mine? https://news.ycombinator.com/item?id=20205859

floatingatoll 7 years ago | |

Nope, it's just the random dupe that ended up getting the upvotes for whatever reasons. This happens constantly and the one that gets the upvotes has more to do with chance than any other factor.

cookiecaper 7 years ago | | |

s/chance/timing/. Social media scoring can be fickle indeed, but there are entire industries devoted to optimizing and reverse-engineering the hotness algos of various traffic-drivers. This case is probably coincidental, but it's naive to ascribe high scoring merely to luck. It looks like SEJeff posted around 12pm PDT, maybe on his way out to lunch. This post was two hours later, as everyone got back from lunch. :)