SACK Panic – Multiple TCP-based remote denial-of-service issues (access.redhat.com)
iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss ! --mss 640:65535 -j DROP
Here it is in action: iptables -L -n -v -t raw | grep mss
84719 3392K DROP tcp -- eth0 * 0.0.0.0/0 0.0.0.0/0 tcp flags:0x17/0x02 tcpmss match !640:65535
My settings may be a little aggressive and may block some old PPTP/PPPoE users. Perhaps 520 would be a safer low end. As a funny side note, this also blocks hping3's default settings (ping floods), as it doesn't set MSS. It also blocks a slew of really poorly coded scanners.
For everything else at work, we are behind a layer 7 load balancer that is not vulnerable.
You may also find it useful to block fragmented packets. I've done this for years and never had an issue:
iptables -t raw -I PREROUTING -i eth0 -f -j DROP
If you have the PoC, then feel free to first verify you can browse to https://tinyvpn.org/ then send the small MSS packets to that domain, then see if you can still browse to it. I don't care if the server reboots or crashes. Just don't send DDoS please, as the provider will complain to me.
To see the counters increase, here is a quick and silly cron job that will show you the MSS DROPs in the last minute, that I will disable after a couple days: [1]
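The linked cron job isn't reproduced in this thread, so as a rough stand-in, here is a minimal sketch of how the drop counter could be read from a script. The helper name `mss_drop_packets` is made up for illustration, and it assumes the listing was produced with `-x` (exact counters); without `-x`, iptables abbreviates large counts with K/M suffixes and the sum would be wrong.

```shell
#!/bin/sh
# mss_drop_packets (hypothetical helper): read "iptables -L -n -v -x" style
# output on stdin and print the summed packet counter (first column) of every
# DROP rule that uses the tcpmss match.
mss_drop_packets() {
    awk '$3 == "DROP" && /tcpmss/ { total += $1 } END { print total + 0 }'
}

# Example using the counter line quoted earlier in the thread. On a live host
# you would pipe in:  iptables -t raw -L PREROUTING -n -v -x
echo '84719 3392K DROP tcp -- eth0 * 0.0.0.0/0 0.0.0.0/0 tcp flags:0x17/0x02 tcpmss match !640:65535' |
    mss_drop_packets
```

To get "drops in the last minute" you would run this once a minute and subtract the previous reading; that bookkeeping is left out here.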
iptables -I INPUT -p tcp --tcp-flags SYN SYN -m tcpmss --mss 1:500 -j DROP
If I interpret the man page correctly, the above is more broad because it does not care about the presence or absence of other flags, whereas your rule explicitly requires the other listed flags to be unset. In fact it seems like the above might be broad enough to include incoming SYNACK response packets that are the result of outgoing connections.
Am I understanding this correctly, and if so, do you have a thought about why they suggest this?
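For anyone who wants to check that reading of `--tcp-flags mask comp`, here is a small sketch that models the match as a bitwise test: of the flags listed in the mask, exactly those in comp must be set. The helper `tcp_flags_match` is hypothetical (it is not part of iptables); the bit values are the standard TCP header flag bits, which is also why the earlier listing shows `flags:0x17/0x02`.

```shell
#!/bin/sh
# Bit values of the TCP flags as they appear in the TCP header.
FIN=1; SYN=2; RST=4; PSH=8; ACK=16; URG=32

# tcp_flags_match MASK COMP PACKET_FLAGS  (hypothetical helper)
# Mirrors "--tcp-flags MASK COMP": succeed when, of the flags listed in MASK,
# exactly the ones in COMP are set in the packet.
tcp_flags_match() {
    [ $(( ($3 & $1) == $2 )) -eq 1 ]
}

strict_mask=$(( FIN | SYN | RST | ACK ))   # 23 = 0x17
synack=$(( SYN | ACK ))                    # 18 = 0x12

# --tcp-flags FIN,SYN,RST,ACK SYN: matches a bare SYN but not a SYN-ACK.
tcp_flags_match "$strict_mask" "$SYN" "$SYN"    && echo "strict rule matches SYN"
tcp_flags_match "$strict_mask" "$SYN" "$synack" || echo "strict rule ignores SYN-ACK"

# --tcp-flags SYN SYN: only the SYN bit is examined, so a SYN-ACK matches too.
tcp_flags_match "$SYN" "$SYN" "$synack" && echo "broad rule matches SYN-ACK"
```

Under this model the broader rule does catch SYN-ACK replies to outgoing connections, which is consistent with the reading above.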
net.ipv4.route.min_adv_mss = 256
net.ipv4.tcp_base_mss = 512
net.ipv6.route.min_adv_mss = 1220??
The advisory here [1] gives an example of ... is yours just inverting/negating the DROP rule?
>iptables -A INPUT -p tcp -m tcpmss --mss 1:500 -j DROP
[1] https://github.com/Netflix/security-bulletins/blob/master/ad...
So for the raw table, it would be
iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss --mss 1:500 -j DROP
For the default (filter) table, it would be
iptables -t filter -I INPUT -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss --mss 1:500 -j DROP
Generally speaking, if you know you are going to drop everything that matches a pattern or address, it is useful to put that in the raw table, so that malicious traffic can't spike your CPU load as easily. Every packet that reaches the filter table will incur potentially CPU-expensive conntrack table lookups. As your conntrack table gets bigger, this gets more expensive.
The reason I use the opposite method is that we know the normal range we want. Programs can also set super high values or not set MSS at all (which is not the same as 0).
I explicitly set the interface, so that we don't match interfaces such as lo, tun, tap, vhost, veth, etc... because you never know what weird behavior some program depends on. In my example, eth0 is directly on the internet. In your systems, that might be bond0.
This goes in the mangle table. DO NOT use this example unless you know for sure what you are doing.
iptables -t mangle -I POSTROUTING -o eth0 -p tcp -m tcp --tcp-flags SYN,RST,ACK SYN -m tcpmss --mss 1:100 -j TCPMSS --set-mss 1360
This example should work for most use cases, but don't do this unless you know the implications for sure. Dropping bad inbound options is easy, but outbound can get more complicated. I am just showing this in case anyone asks and I am asleep. :-) Talk to your network admins and ask what is the highest MSS/MTU your VPNs and 3rd party networks support.
This may not even help, as the packet has already been generated and we are too late in the process. I just figure someone might ask. There are probably use cases where this may help (for proxies, edge firewalls, hypervisors, docker hosts, maybe).
Or just log and drop the connections, or send yourself (your app) a tcp-reset.
MSS is a TCP parameter, however, and operates at layer 4. Won’t matter if the protocol underneath is IPv4 or IPv6 in this case.
iptables -t mangle -I PREROUTING -p tcp --tcp-flags SYN SYN -m tcpmss --mss 1:500 -j TCPOPTSTRIP --strip-options mss
I experimented with a host that sends out mss 216 and the communication was still ok with the above, but not while dropping the traffic.
[1] - https://wiki.nftables.org/wiki-nftables/index.php/Mangle_TCP...
FYI if your instances are behind an Application Load Balancer or Classic Load Balancer then they are protected, but NOT if they are behind a Network Load Balancer.
A patched kernel is available for Amazon Linux 1 and 2, so you won't have to disable SACK. You can run "sudo yum update kernel" to get it, but of course you have to reboot. Updated AMIs are also available.
Amazon Linux 1: https://alas.aws.amazon.com/ALAS-2019-1222.html Amazon Linux 2: https://alas.aws.amazon.com/AL2/ALAS-2019-1222.html
For Amazon Linux 2 the fixed kernel is kernel-4.14.123-111.109.amzn2. Looking at my instances, it looks like I have been on that version since Friday.
As each direction of a TCP connection has its own MSS, it would make sense that an attacker's server could exploit this.
I remember the original ping of death back in the 90s https://web.archive.org/web/19981206105844/http://www.sophis...
https://github.com/Netflix/security-bulletins/blob/master/ad...
I do wonder though, can anyone guess what kind of impact one might see with TCP SACK disabled? We don't have large amounts of traffic and serve mostly websites. Maybe mobile phone traffic might be a bit worse off if the connection is bad?
In my personal AWS instance from the last few days less than half a percent of the traffic had hit the firewall rule to log the error.
Most of that traffic seemed to come from China; this was possibly port probing / portscans or really old hardware accessing my server.
I would say that the iptables rule is a 'better' solution than dropping sack as you may find you use significantly more CPU/bandwidth when dealing with retransmits when not using selective acknowledgements.
https://www.cvedetails.com/cve/CVE-2005-0960/
Multiple vulnerabilities in the SACK functionality in (1) tcp_input.c and (2) tcp_usrreq.c OpenBSD 3.5 and 3.6 allow remote attackers to cause a denial of service (memory exhaustion or system crash).
pending a patch simply disable SACK: ~$ echo 0 > /proc/sys/net/ipv4/tcp_sack
and/or disable segmentation offloading: ~$ ethtool -K eth? tso off
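After applying the workaround it is worth confirming that it took effect. Below is a minimal sketch; the helper name `sack_state` and its optional file argument are made up here (the argument only exists so the function can be tried against a scratch file).

```shell
#!/bin/sh
# sack_state [FILE] (hypothetical helper): report whether SACK is enabled
# according to FILE, defaulting to the live setting under /proc.
sack_state() {
    file=${1:-/proc/sys/net/ipv4/tcp_sack}
    if [ ! -r "$file" ]; then
        echo "unknown"      # not Linux, or /proc is not readable
    elif [ "$(cat "$file")" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

sack_state
```

Note that writing to /proc does not survive a reboot. On most distributions the usual way to persist it is a `net.ipv4.tcp_sack = 0` line in a sysctl configuration file, but check how your distribution loads those.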
TCP and checksum offloading still aren't super standard on consumer-grade NICs or virtual machines. I'd assume less than half of the internet's Linux hosts are actually at risk.
I thought VMware shipped that at least a decade ago — is there some specific sub-feature you had in mind? Similarly, at least Apple's consumer hardware had checksum offloading back in the early 2000s and segmentation support shipped in 10.6 (2009) so it seems like it should be relatively mainstream since they tended to use commodity NIC hardware.
An alternate option would be to put the device behind another firewall or load balancer or proxy that you know is not vulnerable.
It's a little bit more involved than a ping of death, but still, relatively easy to exploit.
https://lore.kernel.org/netdev/20190617.104121.1475407136257...
https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.1.1...
5.0 is EOL as of 5.0.21.
This might be fun...
Edit: Don't know if segmentation offloading is on by default in Android, but on my default Arch kernel it is, so I wouldn't know why not.
[Unit]
Description=Disable TCP SACK
[Service]
Type=simple
ExecStart=/sbin/iptables -A INPUT -p tcp -m tcpmss --mss 1:500 -j DROP
[Install]
WantedBy=sysinit.target
It's a Linux-specific implementation defect, not an intrinsic problem with the TCP SACK wire protocol or spec.
[ "$(uname -s)" = Linux ] && echo "Vulnerable"
("CVE-2019-11479: Excess Resource Consumption Due to Low MSS Values (all Linux versions)", "CVE-2019-11478: SACK Slowness (Linux < 4.15) or Excess Resource Usage (all Linux versions).")
https://github.com/Netflix/security-bulletins/blob/master/ad...
So there's plenty of Linux being used out there that's probably not affected.
IMO, TSO is a function of Intel NICs; does this affect other hardware, such as Cavium CPUs?
thanks
Good luck out there, folks.
Red Hat / CentOS
https://access.redhat.com/security/vulnerabilities/tcpsack
https://access.redhat.com/security/cve/cve-2019-11477
https://access.redhat.com/security/cve/cve-2019-11478
https://access.redhat.com/security/cve/cve-2019-11479
Ubuntu
https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SACKPanic
https://people.canonical.com/~ubuntu-security/cve/2019/CVE-2...
https://people.canonical.com/~ubuntu-security/cve/2019/CVE-2...
https://people.canonical.com/~ubuntu-security/cve/2019/CVE-2...
Oracle Linux
https://linux.oracle.com/errata/ELSA-2019-4686.html (RHCK kernel)
https://linux.oracle.com/errata/ELSA-2019-4685.html (UEK5 kernel)
https://linux.oracle.com/errata/ELSA-2019-4684.html (UEK4 kernel)
Amazon AWS
https://aws.amazon.com/security/security-bulletins/AWS-2019-...
https://alas.aws.amazon.com/ALAS-2019-1222.html (Linux 1)
https://alas.aws.amazon.com/AL2/ALAS-2019-1222.html (Linux 2)
Debian
https://security-tracker.debian.org/tracker/CVE-2019-11477
https://security-tracker.debian.org/tracker/CVE-2019-11478
https://security-tracker.debian.org/tracker/CVE-2019-11479
SUSE / SLES
https://www.suse.com/de-de/support/kb/doc/?id=7023928
https://www.suse.com/security/cve/CVE-2019-11477/
https://www.suse.com/security/cve/CVE-2019-11478/
https://www.suse.com/security/cve/CVE-2019-11479/
CoreOS
https://coreos.com/releases/#2079.6.0
Arch
https://security.archlinux.org/AVG-983
https://security.archlinux.org/CVE-2019-11477
https://security.archlinux.org/CVE-2019-11478
https://security.archlinux.org/CVE-2019-11479
(please reply with additional vendor links if you have them)
https://security-tracker.debian.org/tracker/CVE-2019-11477
Disclaimer: I work for SUSE
https://support.f5.com/csp/article/K78234183
Fedora
https://lists.fedoraproject.org/archives/list/package-announ... (F29)
https://lists.fedoraproject.org/archives/list/package-announ... (F30)
CentOS
https://lists.centos.org/pipermail/centos-announce/2019-June... (v6)
https://lists.centos.org/pipermail/centos-announce/2019-June... (v7)
https://coreos.com/releases/#2079.6.0 [edit: deleted link, OP has updated]
I just assume that stripping MSS should be enough, looking through the information that is available.
From https://en.m.wikipedia.org/wiki/Maximum_segment_size :
"To avoid fragmentation in the IP layer, a host must specify the maximum segment size as equal to the largest IP datagram that the host can handle minus the IP and TCP header sizes.[2] Therefore, IPv4 hosts are required to be able to handle an MSS of 536 octets (= 576[3] - 20 - 20) and IPv6 hosts are required to be able to handle an MSS of 1220 octets (= 1280[4] - 40 - 20)."
So it seems like one needs to clamp tcpv4 to at least 536, and tcpv6 to at least 1220. (tcpv6 is common shorthand for TCP over IPv6, similar to udpv6 and icmpv6)
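The arithmetic in that quote is easy to script if you want to derive a floor for other MTUs. A small sketch, assuming fixed header sizes with no IP or TCP options; the function name `min_mss` is just for illustration.

```shell
#!/bin/sh
# min_mss MTU IP_VERSION (hypothetical helper)
# Subtract the fixed IP header (20 bytes for IPv4, 40 for IPv6) and the
# 20-byte TCP header from the MTU. Header options are not accounted for.
min_mss() {
    case "$2" in
        4) ip_header=20 ;;
        6) ip_header=40 ;;
        *) echo "unknown IP version: $2" >&2; return 1 ;;
    esac
    echo $(( $1 - ip_header - 20 ))
}

min_mss 576 4     # smallest datagram every IPv4 host must accept -> 536
min_mss 1280 6    # minimum IPv6 link MTU -> 1220
```

The same function gives 1460 for a standard 1500-byte Ethernet MTU over IPv4, which is the value most connections advertise.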
$ ethtool -k eth0 | grep tcp-seg
tcp-segmentation-offload: on
Also, on the virtualization side, VMware VMXNET adapters support offloading for guests.
As to the specific subset; TCP Segmentation Offload. As was mentioned in the article.
Yes, I know. I was asking for clarification on the off chance that you were describing something which didn’t ship a decade ago. I first used TSO on servers in the early 2000s and by 2010 even the consumer-grade hardware I was seeing had it.
"When Segmentation offload is on and SACK mechanism is also enabled, due to packet loss and selective retransmission of some packets, SKB could end up holding multiple packets, counted by ‘tcp_gso_segs’."
Segmentation offload in linux is dependent on checksum offloads per here:
https://www.kernel.org/doc/Documentation/networking/segmenta...
I have a personal Digital Ocean (not my employer) instance that is frequently being probed for stuff (primarily Russian and Chinese IPs). Same old, same old.
I've been running with the rule for around a week just logging & dropping small MSS packets out of curiosity, but hardly seen anything worth writing home about. I was somewhat surprised. I'm curious to see how long it takes for that rule to go nuts (my shellshock rule still triggers from time to time, that had a definite curve of action)
More and more are moving away from $0.25 microcontrollers, and up to $5 SoC's running Linux, so the problem is going away gradually...
YOUR_RULE -m limit --limit 2/sec -j LOG --log-prefix="MALFORMED_MSS: " --log-ip-options --log-tcp-options --log-level 7
The bigger impact will be for users far away, with increased risk of packet loss and higher latency.
It's too easy to drop packets with very low MSS and, unless you've got specific needs (someone mentioned IoT), there's no reason not to drop packets with MSS < 536 or so. I believe Windows' smallest MTU (MSS + IP and TCP headers) size is 576 bytes, for example.
If you put each link on a line and then leave a newline between each one you should get a nice list though.
https://packetstormsecurity.com/files/15507/CA-96.26.ping.ht...
My favorite of that era was simply the working-as-designed simplicity of sneaking the Hayes modem hangup sequence into various protocols: actual Hayes modems used +++ with a time-delay to send commands such as ATH0 (hangup) but everyone else skipped that time-delay in an attempt to avoid the patent so you could disconnect any modem-connected system if you could figure out how to get it to echo "+++ATH0". Some IP stacks (e.g. Windows 95) would simply send the received ICMP payload as the response so a simple `ping -p …` would do it but people found ways to cause similar problems with sendmail, FTP, etc.
https://dl.packetstormsecurity.net/new-exploits/modem-DoS.tx...
Pop into some random channel, send "/ctcp #channel ping +++ATH0", and wait patiently... a moment or two later you would be rewarded with a flood of "signoff" messages as the users' TCP sessions to the IRC server timed out (by responding to the CTCP, they had, in effect, told their modems to hang up).
The goal, of course, was to get the highest "body count" possible from a single CTCP message.
Smurf attacks, the "ping of death", AOHell, the latest sendmail and wu-ftpd holes of the week, open proxies... the Internet was a very entertaining place for a bored teenager from the midwest back then.
Thanks for the flashback!