We improved the performance of a userspace TCP stack in Go (coder.com)

226 points by infomaniac 2 years ago | 129 comments

dpeckett 2 years ago |

Really cool to see others hacking on netstack. It's a bit of a shame it's tied up in the gVisor monorepo (and all the Bazel idiosyncrasies), but it's a very neat piece of kit.

I've actually been hacking on a similar FOSS project lately, with a focus on building what I'm calling a layer 3 service mesh for the edge. It more or less came out of my learned hatred for managing mTLS at scale and my dislike for shoving everything through an L7 proxy (insane protocol complexity, weird bugs, and you still have the issue of authenticating that you are actually talking to the proxy you expect).

Last week I shipped the first release of the userspace router. It's worth taking a look if you want to play around with a completely userspace and unprivileged WireGuard-compatible VPN server.

https://github.com/noisysockets/nsh/blob/main/docs/router.md

iangudger 2 years ago | |

If you want to use netstack without Bazel, just use the go branch:

https://github.com/google/gvisor/tree/go

go get gvisor.dev/gvisor/pkg/tcpip@go

The go branch is auto-generated, with all of the generated code checked in.

dave78 2 years ago | | |

I did this once for an experimental project and found it really difficult to keep the version of gVisor I was using up to date, since it seems like the API is extremely volatile. Anyone else had this experience? If so, is there some way around it that I don't know? Or did I just try it at a bad point in the development timeline?

raggi 2 years ago | | |

hey Ian, long time. Is there any chance y'all could swap out main so that main contains the generated code version?

I don't know the status of those export tools these days as I left the company years ago, but perhaps they could sync with a different branch.

This would help various folks quite a bit, as for example tsnet users often fall into the trap of trying to do `go get -u`, which then pulls a non-functional gvisor version.

zxt_tzx 2 years ago |

I met one of the founders of Coder.com, he's a really cool dude. It's a pity that it is a product aimed more at enterprises than individual developers, else it would have far more developer mindshare.

Unlike, say, GitHub Codespaces, running something like this on your own infra means your incentives and Coder.com's are aligned, i.e. both of you want to reduce your cloud costs (as opposed to, say, GitHub running on Azure gives them an opportunity and incentive to mark up on Azure cloud costs).

santiagobasulto 2 years ago | |

It seems like a great product. I'm wondering why they don't offer more "startup-oriented" plans. It's either Self Hosted or "Talk to sales". Is it maybe to avoid competing against GitHub Codespaces?

kylecarbs 2 years ago | | |

Founder of Coder here. Many small companies (or teams at big ones) with <=150 devs use Coder for free, just using our open-source version.

We’ve tried to align our pricing with the value of the product. In small teams the productivity gains seem to be much lower, so we target Enterprise!

wmf 2 years ago |

"Asking for elevated permissions inside secure clusters at regulated financial enterprises or top secret government networks is at best a big delay and at worst a nonstarter."

But exfiltrating data with a userspace VPN is totally fine?

I'm also wondering why not use TLS.

parhamn 2 years ago |

I don't know anything about Coder, but gVisor proliferation is annoying. It's a boon for cloud providers, helping them find another way to get a large multiple performance decrease per dollar spent in exchange for questionable security benefits. And I'm seeing it everywhere now.

weitendorf 2 years ago | |

I don't understand - what do you suggest as an alternative to gVisor?

> large multiple performance decrease per dollar spent

gVisor helps you offer multi-tenant products which can actually be much cheaper to operate and offer to customers, especially when their usage is lower than a single VM would require. Also, a lot of applications won't see big performance hits from running under gVisor, depending on their resource requirements and perf bottlenecks.

parhamn 2 years ago | | |

> I don't understand - what do you suggest as an alternative to gVisor?

Their performance documents you linked claim, vs runc: 20-40x syscall overhead, half of Redis' QPS, and a 20% increase in runtime in a sample TensorFlow script. Also google "CloudRun slow" and "Digital Ocean Apps slow"; both are gVisor.

Literally anything else.

tptacek 2 years ago | |

Are you referring to gVisor the container runtime, or gVisor/netstack, the TCP/IP stack? I see more uptick in netstack. I don't see proliferation of gVisor itself. "Security" is much more salient to gVisor than it is to netstack.

parhamn 2 years ago | | |

On the issue of abysmal performance on cloud-compute/PaaS, I'm talking about the container runtime (most PaaS is gVisor or Firecracker, no?): Cloud Run, DO, Modal, etc.

But given this article is about significantly improving gVisor's userland TCP performance, it seems like the netstack stuff causes major performance losses too.

I saw a GitHub link in another top article today, https://github.com/misprit7/computerraria, where the Readme's Pitch section feels very relevant to gVisor.

kccqzy 2 years ago | |

There are still products from cloud providers that don't use gVisor. Basics like EC2 or GCE. Sounds like you chose the wrong cloud product.

loosescrews 2 years ago | |

Can you elaborate on your concern? Is the issue that you don't trust gVisor to keep the cloud provider secure?

parhamn 2 years ago | | |

Providers managed secure shared environments for decades before ultra-inefficient wrappers and runtimes like gVisor existed.

raggi 2 years ago |

It's great to see this. I know the team went on a long journey through this, and the blog makes it almost look shorter and simpler than it was. I'm hoping one day we can all integrate the support for GSO that's been landing in gVisor too, but so far we (Tailscale) have not had a chance to look deeply into that. It was really effective for our tun and UDP interfaces though.

kylecarbs 2 years ago | |

At Coder we’re fans and users of Tailscale, so very happy to have these changes be consumed upstream as well!

ignoramous 2 years ago | |

> one day we can all integrate the support for GSO that's been landing in gVisor

Google engineers recently rewrote the GSO bit, but unlike Tailscale's, it is only for TCP.

Besides, gVisor has had "software" & "hardware" GSO support for as long as I can remember.

pantalaimon 2 years ago |

The obvious question is: How does it compare to the in-kernel TCP stack?

raggi 2 years ago | |

It's less mature, which shows up in lots of places, such as sometimes having less than ideal defaults (as in buffer sizes shown here), and bugs if you start using more fancy features (which improve over time of course).

This is approximately the case for any alternative IP stack you might pick, though: a mature IP stack is a huge undertaking, with all the many flavors of enhancements to IP and particularly TCP over the years, the high variance in platform behaviors and configurations, and so on.

In general you should only take on a dependency on a lesser-used IP stack if you're willing to retain or train IP experts in-house over the long haul, because as is demonstrated here, taking on such a dependency means eventually you'll find a business need for that expertise. If that's way outside of your budget or wheelhouse, it might be worth skipping.

syzcowboy99 2 years ago | |

gVisor's netstack is still much slower than the kernel's (and likely always will be). The goal of this userspace netstack is not to compete with the kernel on performance, but to offer an alternative that is more portable and secure.

Xelynega 2 years ago | | |

How is it more portable or secure than an API that's been stable for decades, and getting constant security fixes?

I see an explanation in their blog about avoiding TUN devices since they require elevated permissions, but why would you need a TUN device to send data to/from an application? I can't understand what their product does from the marketing material but it doesn't look like it would require constructing raw IP packets instead of TCP/UDP packets and letting the OS wrap them in the other layers.

raggi 2 years ago | | |

for some definition of portable which is deeply tied to the go runtime

jiveturkey 2 years ago |

help me understand something.

> we’d need a way for the TCP packets to get from the operating system back into Coder for encryption.

yes, this is commonly done via OpenSSL for example.

> This is called a TUN device in unix-style operating systems and creating one requires elevated permissions

waitasec, wut? sure you could use a TUN device I guess, but assuming some kind of multi-tenant separation is an underlying assumption they didn't mention in their intro, couldn't you also use cgroup'd containers? sorry if I'm not fluent in the terminology.

i'm struggling to understand the constraints that push them towards gVisor. simply needing to do encryption doesn't seem like justification. i'm sure they have very good reasons, but needing to satisfy a financial regulator seems orthogonal at best. i would just like to understand those reasons.

nynx 2 years ago |

Doesn’t creating a raw socket need elevated permissions?

tptacek 2 years ago | |

They're not creating raw sockets†. The neat thing about WireGuard is that it runs over vanilla UDP, and presents to the "client" a full TCP/IP interface. We normally plug that interface directly into the kernel, but you don't have to; you can just write a userspace program that speaks WireGuard directly, and through it give a TCP/IP stack interface directly to your program.

† I don't think? I didn't see them say that, and we do the same thing and we don't create raw sockets.

vlovich123 2 years ago | | |

So it tunnels TCP/IP over WireGuard UDP?

convolvatron 2 years ago |

is this part of the open source releases? I looked at the coder.com GitHub, but couldn't find it. I haven't written a compatible TCP, but I have written a different reliable transport in Go userspace. fairness aside, i wonder why we don't see this more often. would love to take a look

tazjin 2 years ago | |

They upstreamed their gVisor changes: https://github.com/google/gvisor/pull/10287

andrewstuart 2 years ago |

If you’re tunneling a better connection configuration, isn’t the tunnel what defines the latency?

andrewstuart 2 years ago |

I have a problem right now which is that it’s slow to copy large files from one side of the earth to the other. Is this the basis of a solution to that maybe?

392 2 years ago | |

No. Profile first. Make sure you've tried tweaking params like batch sizes.

dpe82 2 years ago | |

What do you think are the current problems contributing to your slow transfers?

andrewstuart 2 years ago | | |

Window and buffer sizes are a problem on high latency links.

raggi 2 years ago | |

not enough detail here to provide a good answer, but I can tell you explicitly that if you're using SMB you're likely not going to get good performance here even if your network stack has tons of space to overcome BDP and congestion challenges.

jijji 2 years ago |

it's a solution looking for a problem

hpeter 2 years ago | |

It's an engineering challenge and they do solve a problem, it's just not your problem :) It's a nice read anyways.

lxgr 2 years ago | |

gVisor definitely solves a problem for me: https://news.ycombinator.com/item?id=39900329

yencabulator 2 years ago |

tl;dr Increased TCP receive buffer size, implemented HyStart instead of traditional TCP slow start in gVisor's netstack, changed an in-process packet queue from drop-when-full to block-when-full.
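
For anyone wondering what the last change looks like in practice, here is a minimal Go sketch of the two queue policies using a buffered channel. This is not gVisor's or Coder's actual code; the `packet` type and all function names are hypothetical and exist only to show the difference in behavior.

```go
package main

import "fmt"

// packet is a hypothetical stand-in for whatever the queue carries.
type packet []byte

// enqueueOrDrop is the drop-when-full policy: a non-blocking send that
// reports false (and discards p) if the queue has no room.
func enqueueOrDrop(q chan<- packet, p packet) bool {
	select {
	case q <- p:
		return true
	default:
		return false
	}
}

// enqueueOrBlock is the block-when-full policy: the sender waits until the
// consumer frees a slot, applying backpressure instead of losing data.
func enqueueOrBlock(q chan<- packet, p packet) {
	q <- p
}

// countDropped pushes n packets into a queue of the given capacity with no
// consumer running and returns how many were discarded.
func countDropped(capacity, n int) int {
	q := make(chan packet, capacity)
	dropped := 0
	for i := 0; i < n; i++ {
		if !enqueueOrDrop(q, packet{byte(i)}) {
			dropped++
		}
	}
	return dropped
}

// countDelivered pushes n packets through a queue of the given capacity with
// blocking sends while a consumer drains it, and returns how many arrived.
func countDelivered(capacity, n int) int {
	q := make(chan packet, capacity)
	go func() {
		for i := 0; i < n; i++ {
			enqueueOrBlock(q, packet{byte(i)})
		}
		close(q)
	}()
	delivered := 0
	for range q {
		delivered++
	}
	return delivered
}

func main() {
	fmt.Println("dropped with drop-when-full:", countDropped(2, 5))
	fmt.Println("delivered with block-when-full:", countDelivered(2, 5))
}
```

The likely reason this matters for throughput: to TCP, a packet dropped by an in-process queue looks the same as a packet lost on the network, so the sender retransmits and backs off even though the network itself was fine.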