Episode 11 – Modern networking in Firefox with Max Inden.
A conversation with Max Inden, Staff Software Engineer at Mozilla, about modernizing Firefox’s networking stack in Rust. We cover his work on the QUIC and HTTP/3 stack — improving UDP I/O, congestion control, and overall performance — and why QUIC matters as a fast, encrypted, and evolvable transport for HTTP/3, WebTransport, and beyond.
If you like this podcast you might also like our modular network framework in Rust: https://ramaproxy.org
00:00 Intro
00:38 Introduction to Max Inden
03:27 Max Inden's Journey to Mozilla
06:32 The Role of IETF in Internet Design
09:42 QUIC and HTTP/3 in Firefox
12:27 Understanding HTTP/3 Upgrade Mechanisms
15:15 Challenges with UDP and Firefox's Networking Stack
18:15 Optimizing UDP I/O for Performance
21:36 Cross-Platform Performance Considerations
24:23 Network Drivers and Their Impact
27:18 Exploring Happy Eyeballs and Connection Strategies
30:07 WebTransport and Future of QUIC
32:56 Contributions to Firefox and Open Source
36:05 Happy Eyeballs and related
56:15 GitHub Git Hosting
57:24 QUIC Usage within Firefox
01:03:02 Closing Thoughts and Call to Action
01:06:44 Outro
Music for this episode was composed by Dj Mailbox. Listen to his music at https://on.soundcloud.com/4MRyPSNj8FZoVGpytj .
Elizabeth (Plabayo)
0:13 | 🔗
This is netstack.fm, your weekly podcast about networking, Rust and everything in between. You are listening to episode 11, recorded on the 15th of October, 2025, where Glen has a conversation with Max, Staff Software Engineer at Mozilla, where he works on their QUIC and HTTP/3 networking stack. Welcome to another week of netstack.fm. Today I will be having a conversation with Max Inden. He's a Staff Software Engineer at Mozilla, and we will talk a bit about modernizing Firefox's networking stack in Rust. Throughout this conversation, we will cover his work on the QUIC and HTTP/3 stack. We will also talk a bit about the challenges he had around UDP. Near the end, we will also look towards the future and discuss WebTransport and MASQUE. So welcome, Max, to our studio, well, the virtual studio at least. So yeah, I'm very excited to talk to you. I know you're mostly from the peer-to-peer world and those kinds of edge protocols, and it was always fascinating to work with you. So I'm happy to learn from you today about your work on QUIC and HTTP/3. I'm sure you have a lot to contribute. But before we get into the technicalities, could you maybe tell us a bit about your origins and how you got to where you are now? Sure thing. So right now I'm a software engineer at Mozilla and I'm working on Firefox's networking stack with a focus on QUIC and HTTP/3. So QUIC being, maybe in the future, the replacement for TCP and TLS, and then HTTP/3 as the third generation of the HTTP protocol. Where am I coming from? Maybe a small anecdote. I started off at a startup doing various things. Eventually I got the task to do the accounting, and two months later the startup went bankrupt. So from there I went into software engineering and joined a company called CoreOS. They were working on all things Kubernetes and a Linux distribution. And here I got in touch with the Prometheus project, Prometheus being a monitoring tool and a time series database.
And I've been working on the integration between Prometheus and Kubernetes. For those who don't know, I worked for example on the Prometheus Operator, kube-state-metrics and Alertmanager; that's work I touched in the past. I'm still involved with the project. I'm maintaining the Rust client, the Prometheus Rust client, but I'm not involved as much as I have been in the past. Yeah, from here, you mentioned I've been working a lot on peer-to-peer software, in particular on libp2p, so "library for peer-to-peer". Here I've been maintaining the Rust side of things, so rust-libp2p. libp2p in the end is a specification which is then implemented in various languages, and I've been focusing on Rust. And among the many building blocks that libp2p offers, I would say my main focus was on distributed hash tables and a somewhat unique hole punching mechanism which did not rely on any central infrastructure, but is otherwise very similar to ICE, or TURN and STUN, and these kinds of protocols. Yeah, eventually I quit the peer-to-peer world. I was already going to the IETF, the Internet Engineering Task Force, conferences, and I was very impressed by Mozilla's work there. So I quit and then really was looking around, and I thought Mozilla would be the very best option. Again, very impressed by Mozilla's involvement, both on Firefox, but also in the internet as a whole. And I was always very curious about HTTP/3 and QUIC and had quite some experience there. And so I started contributing to the Firefox implementation of those two protocols. Yep. And here I am. Yeah, here you are. Very fascinating, and thank you for sharing your journey. Now, when you were working on Prometheus, was it still from SoundCloud? Because I always find it very fascinating how a company like SoundCloud is the one that made Prometheus and then later donated it. Was it still the case that you worked there, or was it already just remotely from wherever you wanted?
So SoundCloud, yes, was the main driver of Prometheus, the project. When I joined the project, multiple other companies had already joined the open source project. So at that point, it wasn't just SoundCloud. And I was working on Prometheus through the company CoreOS. So I've never worked at SoundCloud. But I have worked with the people at SoundCloud and learned a lot from them. So I'm very happy about that relationship. Okay, very cool. And then the IETF meetings, I've never been there before. How should I imagine those meetings? What is on the agenda, are there talks? Is it mostly conversations? How are those meetings? Mm-hmm. Yeah, how do I phrase this? So usually how I introduce this is: I have never seen such a good signal-to-noise ratio, as in such well prepared and well done content. The IETF is not so much your standard tech conference trying to educate people; their goal is much rather to design the internet. And it so happens that you can sit in the same room while the internet is being designed, which is a wonderful opportunity, and has been for me, as I can see how all these experts make this wonderful thing called the internet happen. Yeah, very cool. And we all love the internet, so that's cool. And I mean, I love their work in general. I just was totally not aware that they have meetings. So that's very fascinating to learn. I can highly recommend joining them, and all of them are recorded, so if you just want to listen in you can either join remotely or you find all the recordings online. It is a bit daunting at the beginning, but I think it's well worth it to see once how this is happening. I'm very impressed. I imagine, yeah. Yeah, and of course the talks are one thing, but I imagine it's also very fascinating and interesting to talk to these people in person and just have conversations and learn from them, maybe ask questions that matter to you, which might not be covered in the talk.
So if you can make it in person, I suppose there's real value in that. Yeah, as in every conference, the hallway track is always the most important track. Yeah, okay, very cool. So now you're at Mozilla. When you arrived, what was the status of QUIC and HTTP/3 at Mozilla, and I suppose specifically in the Firefox team? Very good. So let me circle back a little bit. Google took QUIC to the IETF, right? Google QUIC to the IETF. And then multiple companies joined the design of the IETF QUIC version. And Mozilla was very involved in this, in particular Martin Thomson from Mozilla being one of the RFC authors, actually. So I don't want to say "inherit", since I don't fully own the stack. But I think I joined the Rust QUIC implementation from Mozilla when it was in a very well designed, very well done state. So all of the work that I have done so far has only been possible because the stack was so well designed before I even joined. Okay, and so QUIC is on top of UDP, HTTP/3 is on top of QUIC, and HTTP/1 and HTTP/2, which live on TCP, were already there in Firefox. The story from a client perspective and the server perspective is a bit different. You are in a browser, so I suppose you mostly or even only deal with the client side. Now, how does it work? How does Firefox, for example, decide that you go for QUIC, well, HTTP/3, or that you go for HTTP/2? Yes. Yeah. Okay. So there are two upgrade mechanisms to switch from HTTP/2 to HTTP/3. And the switch from HTTP/2 to HTTP/3 is significantly more difficult than the switch from HTTP/1 to HTTP/2. The reason being that HTTP/1 and HTTP/2 share TCP and TLS underneath, right? As in, you can upgrade an HTTP/1 connection to an HTTP/2 connection. On the other side, there is no way to upgrade an HTTP/2 connection to an HTTP/3 connection, because they don't share the underlying protocols, as in UDP versus TCP, and then QUIC on top. Does it actually ever happen?
Like, do you ever reuse the same TCP connection when switching from HTTP/1 to HTTP/2? Mm-hmm. I can't give you exact numbers here. We definitely never do it from HTTP/2 to HTTP/3, as that's impossible. And to go a little bit into the various mechanisms that we have from HTTP/2 to HTTP/3: there are two ways to do this. We can't just assume that the server speaks HTTP/3. So by default, we'll assume HTTP/1 or HTTP/2, so we will be establishing a TCP/TLS connection. The first mechanism is that we have already established a TCP/TLS HTTP/1 or HTTP/2 connection to the server, and then the server can communicate its support of HTTP/3 through an Alt-Svc header. So it's an HTTP header, it's called Alt-Svc, and in there it basically says, hey, by the way, I also speak HTTP/3. We cache this information, and on the next attempt we can then do HTTP/3 to the server. And then the alternative is a so-called HTTPS DNS record. We would query DNS, see whether the HTTPS record is available and, in that record, whether it advertises HTTP/3 support. And then, for example, we could do an HTTP/3 connection from the get-go. Okay, very cool. So I want to dive into the latter a bit later, but before we go there, I want to step back a bit, because I'm pretty certain that browsers, I'm not sure about Firefox and if it's always like that, but I do seem to recall that usually only when it's a TLS connection would they try to do HTTP/2, and otherwise they would try HTTP/1. But I suppose HTTP/3 is also usually done over TLS. I know it's possible without, but in practice it's pretty much always done over TLS; that's what QUIC does by default, and by spec even, I think. I'm sure it's possible without, but anyway, it's usually done with TLS. That does mean that in your ClientHello you also have things like the ALPN. So I suppose at that point you would also know if it's HTTP/2 or HTTP/3, because it advertises it there as well.
Yeah, let's back up a little bit. So from HTTP/1 to HTTP/2, you do the TLS handshake, right? And in the TLS handshake, in ALPN, so Application-Layer Protocol Negotiation, you then negotiate with the server whether to speak HTTP/1 or HTTP/2, right? That is over the same TCP connection. So now if we go the HTTP/3 route instead, what I would be doing is send UDP datagrams to the remote and then get a reply, right? And this way do the QUIC handshake. And given that HTTP/2 is never running over UDP, I know for sure that if this QUIC connection establishes, this is an H3 connection. That said, we still do H3 in the ALPN. Okay. Yeah, actually, what I was saying totally doesn't make sense, because you would be opening on TCP and you would do the TLS handshake. Yeah, at that point, I mean, does it ever happen? Would it be legal for a server, when I'm connecting over TCP and I do my ClientHello, and there I can still say I also speak HTTP/3, to return HTTP/3, so that maybe I just drop the connection immediately and try to restart? Is that a thing, or maybe that's not a thing? So I'm not aware of a standardized upgrade from H2 to H3 over ALPN in the TCP/TLS ClientHello, well, TLS handshake. Maybe someone is using this. I'm fairly certain Firefox would simply ignore it. So the two upgrade mechanisms are higher up in the stack, which is the Alt-Svc HTTP header on HTTP/2, Yeah, I mean, I imagine, yeah. or in DNS with an HTTPS record. Okay, and then something I wonder is, you were saying it caches it, which makes sense. Does it mean it is also cached on disk, and how long does it retain this information? And is there a way as a user to, I don't know, override it? Because I sometimes wonder what happens if you're a webmaster and for some reason you're changing your stack and you no longer, I don't know, support whatever is cached there for the user.
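Editor's note: the Alt-Svc upgrade path described above can be sketched as a small check on the header value. This is a simplified illustration, not Firefox's implementation: the function name `advertises_h3` is hypothetical, and the sketch ignores parameters such as `ma` (max age) and does not validate the advertised authority.

```rust
/// Returns true if an Alt-Svc header value advertises HTTP/3 ("h3").
/// Hypothetical helper for illustration; parameters like `ma` are ignored.
fn advertises_h3(alt_svc: &str) -> bool {
    // The special value "clear" withdraws all previously advertised alternatives.
    if alt_svc.trim() == "clear" {
        return false;
    }
    alt_svc
        .split(',') // one entry per alternative service
        .filter_map(|entry| entry.split(';').next()) // drop parameters such as ma=86400
        .filter_map(|alt| alt.split('=').next()) // keep the protocol id before '='
        .any(|proto| proto.trim() == "h3")
}

fn main() {
    let header = r#"h3=":443"; ma=86400, h2=":443""#;
    println!("server advertises h3: {}", advertises_h3(header));
}
```

A client that sees `true` here could remember the alternative and attempt QUIC on a later connection, falling back to TCP if that attempt fails.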
So yeah, there's a couple of questions at once, I suppose, but the first thing is how is this cached, for how long, and how is this cache maybe, I don't know, busted. Yeah, unfortunately I can't give you the details on how Firefox in particular caches this; I have never touched this part of Firefox. In the ideal case, as the server operator, you would never have to mind Firefox's cache infrastructure here. We'll probably touch later on on Happy Eyeballs. And basically the idea here is, let's say you run a server and you have in the past been advertising and running an HTTP/3 endpoint on that server. And let's say Firefox caches that. On the next attempt, Firefox will, or might, depending on the cache policy, attempt an HTTP/3 connection. But if that fails, it will fall back to HTTP/2. So as the server operator, you don't need to know about the details here. Okay, very cool. And then of course, it was a whole story how Google managed to pull this off together with many partners, because we had to switch from TCP to UDP. There were also issues with middleboxes, but okay, all that got resolved. There is a difference, because now with QUIC you're doing a lot of the work in user space. It might move eventually to kernel space, but for now it's, I believe, mostly in user space. That means that there are fewer optimizations that the OS can do for you upfront, and that you at the application layer, because in the end, well, it's not like we live in a beautiful OSI layer world, but at least the people making the QUIC library, and maybe the ones above, or maybe all of them, need to be a bit smarter in how you, I don't know, optimize your stream of data as much as possible. There's the fact that you need to worry about congestion control, et cetera. So I believe you did some work there to improve the UDP I/O massively. So can we maybe discuss that a bit? Yeah, for sure.
So on user space and kernel space: yeah, the initial implementations, I think, have all been in user space, simply as you can evolve much faster. You can ship updates as part of the application. Now, I guess there's a debate of whether you want every application to ship with its own QUIC stack. That's a different story, probably worth going into. Now that doesn't mean there aren't any kernel QUIC stacks. For example, the Microsoft one, so MsQuic, can run in Windows kernel space, which is quite impressive. And there are various operating-system-provided QUIC stacks which run in user space. So that would be, for example, Apple's Network framework, which has a QUIC stack. And as far as I know, this runs in user space, but is provided by the operating system. For us, the Firefox QUIC stack is in user space. And yeah, this means that previously we could rely, for better or worse, on the kernel's TCP stack, right? So we didn't have to implement any congestion controllers and so on; we were simply sending bytes down and getting bytes back. And now we implement our entire transport stack, apart from UDP and IP itself, in user space. So that's a significantly larger undertaking. Okay. So from here, the thing you were hinting at: I have been working a lot on UDP I/O. So how do we send UDP datagrams from user space to kernel space very quickly? And then on the way back, how do we read UDP datagrams back? And this is relevant, as we just discussed, because QUIC is now running in user space. So we travel back and forth between kernel space and user space all the time. That transition by itself is in the ballpark of, let's say, one microsecond. So that adds up. But then in addition to that, if we transfer very small UDP datagrams, it's a lot of transfers. The static, fixed costs per transfer are very high.
So on the internet, you can assume an MTU, maximum transmission unit, of around, yeah, it really depends, but below 1,500 bytes, right? So if we were sending and receiving UDP datagrams one at a time between user space and kernel space, that's a lot of overhead. And roughly one year ago, Firefox was still doing that, and if you took a CPU profile of Firefox, you would see that Firefox spends a significant amount of time allocating memory. So doing small memory allocations of, unsurprisingly, around 1,500 bytes, and then sending those down to the operating system and vice versa. So spending a lot of time in those syscalls. And what I have been doing the last couple of months is switching this single-datagram send and receive, where we were mostly using sendto and recvfrom, the POSIX APIs, to modern operating-system-specific syscalls. So for example, on Linux right now, we use sendmsg and recvmsg with GSO and GRO. And on Windows, for example, WSASendMsg. What these APIs or syscalls allow us to do is pass multiple datagrams to the operating system and receive multiple datagrams from the operating system. And so the fixed costs per system call are amortized not over one datagram, but over multiple datagrams. And this has an impact both on the syscall itself and also, for example, on our memory allocation behavior, as we can now make one allocation for multiple datagrams at once. So for example, we have a long-lived 64-kilobyte buffer, and then pass that to the operating system to receive datagrams. And overall, this is a CPU optimization. So you won't see this optimization everywhere in Firefox. For example, if you're not CPU bound, as in your CPU is not busy, you will not see a throughput improvement. But in case we hit the CPU limit in Firefox, like in very CPU-bound benchmarks, we see an up to 4x improvement in throughput.
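Editor's note: the amortization argument above can be made concrete with a little arithmetic. The sketch below only counts datagrams and syscalls; it performs no I/O. The 1,200-byte segment size and the batch size of 32 are illustrative assumptions, not values taken from Firefox, and both function names are hypothetical.

```rust
/// Number of datagrams needed to carry `payload_len` bytes when each
/// datagram holds at most `segment_size` bytes (ceiling division).
fn datagram_count(payload_len: usize, segment_size: usize) -> usize {
    assert!(segment_size > 0, "segment size must be non-zero");
    (payload_len + segment_size - 1) / segment_size
}

/// Number of send syscalls when up to `batch` datagrams are handed to the
/// operating system per call (batch = 1 models one sendto per datagram).
fn syscall_count(datagrams: usize, batch: usize) -> usize {
    assert!(batch > 0, "batch size must be non-zero");
    (datagrams + batch - 1) / batch
}

fn main() {
    let payload = 1024 * 1024; // 1 MiB to transfer
    let datagrams = datagram_count(payload, 1200); // assumed segment size
    println!("datagrams: {}", datagrams);
    println!("syscalls, one datagram per call: {}", syscall_count(datagrams, 1));
    println!("syscalls, 32 datagrams per call: {}", syscall_count(datagrams, 32));
}
```

With these assumed numbers the per-call fixed cost is paid roughly 30 times less often, which is why the gain only shows up when the CPU, not the network, is the bottleneck.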
Now these are very artificial benchmarks, but it does show that the move to those multi-message APIs has a significant impact. Yeah, I can imagine, that's very fascinating. Now, do you see massive differences across all the different platforms, because Firefox supports quite a large range of platforms? Is there a noticeable difference there, given that you are now relying more on advanced syscalls? So the bug reports definitely increased. That's my main signal. No, I'm kidding, of course. This has been quite a journey, as in the implementation itself was probably two months of work. Debugging the various platforms and the various setups people had was probably the remaining eight months of work. As in, people run very old operating systems on very austere hardware, with maybe a VPN and maybe some antivirus which really thinks that it should be terminating the TLS connection and doing its own QUIC work. So there have been a lot of pain points around that. In terms of metrics: Firefox runs in very heterogeneous environments, as in you have some engineers running an M4 Mac on a, I don't know, two-gigabit up and down connection. And then you have someone with a very spotty mobile network on some very old Android phone, probably running something in the background. So it's very hard to measure this, and we have not succeeded thus far in getting a clear signal of how much of an impact this work has. Only in artificial benchmarks and on individual machines have we seen a larger impact, but not on the large Firefox population. And there we are very restricted in terms of the signals that we get; Mozilla takes data collection in Firefox very seriously, as in the privacy around it. And so we are very limited in what we can see out there. Yeah, I imagine, and yeah, it will always be tricky anyway, even if you have more metrics, I imagine. That said, we talked a bit about different platforms, but what about the drivers?
How much do they factor into your support and your bug reports? Is there something like spotty network drivers, especially in the Linux realm, or are network drivers okay? I'm asking because I used to do a lot of audio, like heavy audio processing and having to play audio in all kinds of manipulated ways for video games. And depending on the platform, Android and Linux especially, you had to be a bit lucky that the user had good enough drivers, because there were a lot of them which had spotty drivers. But I'm not sure how much that plays a role in the network world. Yes, very much. Not so much on Linux. For example, I had bug reports where someone on Windows said, if I enable this and that feature, this crashes my network driver, which is very scary. And we immediately rolled back that feature because we didn't know what kind of impact that has on the whole Mozilla Firefox population. Linux has been reasonably smooth. Now that said, there are various optimizations, for example segmentation offloading. The idea being that I send a very large UDP datagram to the kernel, and then ideally the kernel forwards that to the network card, and then the network card splits that UDP datagram into smaller pieces. And the network card ideally has very specialized hardware to do that. And that requires driver support, and I don't think we have that across all users or all devices. And so there we are missing some driver support. But the real problems, like "I'm crashing a network driver", those I've only seen on other platforms, not on Linux. Okay, that's both surprising and happy to hear. Yeah, maybe another story, a fun story here for a podcast. At some point we had someone report that a website called fosstodon.org wasn't loading quickly enough. That was actually a Mozilla employee, which was of course very helpful, as I could go back and forth a lot quicker. It turns out I had...
just landed in Firefox Nightly an optimization which is called URO on Windows. So it's basically segmentation offloading on the receive path. And this person was running Windows on ARM, which generally doesn't have the best support. And after a lot of back and forth, I basically just went ahead and bought the exact same device in the exact same color and tried to debug this. And after two weeks, I actually figured this out: I needed a small Linux tool, so I installed WSL, Windows Subsystem for Linux, and that actually triggered the bug. So when you have, for example, WSL enabled, but you're not using it, as in you only have it enabled, and then you do a URO read on the UDP socket on Windows on ARM, then it will not give you the segment size, and that kind of defeats the whole idea of URO, as in, then it doesn't work at all. Yeah, so that was a great find. Unfortunately, that meant, for example, that we can't do that optimization on Windows, which is very unfortunate. And there is no way to detect that the Linux subsystem, WSL, is enabled? There are ways to detect this. But that does add questions like, for example, when do I check? As in, do I check on every syscall? That is probably very expensive. Do I check every 10 seconds? Do I only check on startup? And then in addition to that, we also had reports of URO not working in other scenarios, and that was on x86-64. So, for example, we couldn't just discriminate based on ARM and just say, on Windows on ARM, we don't do this optimization. So right now, Firefox unfortunately does not do URO on Windows, which is very sad to see because, yeah, Firefox mostly reads, as in it downloads, so it could really benefit from a faster receive path with URO. Yeah, so far no resolution to this. We have a reproducer, we're in touch with Microsoft, but we haven't been able to resolve it.
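Editor's note: to illustrate why a missing segment size defeats receive offloading, here is a minimal sketch of what the receiver has to do with a coalesced buffer. It models the idea only and does not call any Windows or Linux API; the function name `split_segments` and the sizes used are hypothetical.

```rust
/// Splits a coalesced receive buffer back into individual datagrams.
/// With offloaded receives the OS hands over one large buffer plus the size
/// of each segment; without that size the datagram boundaries are unknown
/// and the buffer can only be treated as a single unit.
fn split_segments(buf: &[u8], segment_size: Option<usize>) -> Vec<&[u8]> {
    match segment_size {
        // All segments have `size` bytes, except possibly a shorter last one.
        Some(size) if size > 0 => buf.chunks(size).collect(),
        // No usable segment size: boundaries cannot be recovered.
        _ => vec![buf],
    }
}

fn main() {
    let coalesced = [0u8; 2500]; // stand-in for several datagrams received at once
    for (i, datagram) in split_segments(&coalesced, Some(1200)).iter().enumerate() {
        println!("datagram {}: {} bytes", i, datagram.len());
    }
    println!("without segment size: {} unit(s)", split_segments(&coalesced, None).len());
}
```

Treating the whole buffer as one QUIC datagram would be wrong, since packet boundaries matter for decryption, which is why the optimization cannot be used when the segment size is not reported.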
And so I had a conversation in the past, I believe it was with Carl Lerche, mostly about Tokio, in episode five of this podcast. And we also discussed io_uring a bit, and while he was happy about it for the file system, he didn't really see much use of it for networking, and even more so he was very unhappy about the Windows variant of it. Is that also what you see at Firefox? I suppose you don't make any use of that at all, or do you? Yeah, actually this is funny, because this is the question I always get when I talk about fast UDP. So io_uring is definitely a very popular topic, which I guess is very cool to see. Multiple things. We're not going to use io_uring anytime soon, because it's very Linux specific and it moves us from readiness-based to completion-based polling. So it is very intrusive to our architecture. Given that it's only Linux, and I believe it's even disabled right now on Android, I'm not sure about that, it's a very small Firefox population and probably not the lowest-hanging fruit in terms of how we can help Firefox users have a faster browser. On the Windows side, I'm simply not familiar with whether they have any ring-buffer-based syscall. Yeah, I believe they have, but apparently it's a mess, but let's not go in there then. So in that... yeah, yeah. Yeah. About all of these optimizations that we did: I mentioned earlier that when we did single-UDP-datagram transfers to the operating system, we saw that significantly, in particular in our CPU profiles. But now, given that all of this has landed in Firefox and Firefox in many cases does multi-datagram syscalls, our main bottleneck is actually TLS crypto, which is a great thing to have. Yeah, very cool. And it's definitely something I want to touch on, but I'm afraid that we're not going to have enough time for that.
What I do want to do is go back a bit in our conversation, because we briefly discussed the Alt-Svc option of upgrading, but you also mentioned the DNS upgrade. And in episode seven of this podcast, we talked with Dirkjan. It was mostly about rustls, but he's also the person behind Quinn, and he also works on Hickory DNS, now also in the context of the Let's Encrypt work, where they want to start adopting Hickory DNS. So is that kind of upgrade path in Firefox waiting for you to start adopting Hickory DNS first, or are you already doing it now in a different way, so that upgrade path is already being executed? Yes, okay. So, yeah, as you said, there are two ways to upgrade to HTTP/3, right? Either via Alt-Svc or DNS HTTPS records. And then there are, at least relevant for us right now, three ways to get an HTTPS record, as in, down this tree: two ways to upgrade to HTTP/3, and then for DNS HTTPS, three different ways to do it. The first way is to use the operating system's stub resolver, not via getaddrinfo, which doesn't support HTTPS records, but via similar, operating-system-specific syscalls that then give us the HTTPS record. And that's very hard. As far as I know, for example, we currently fail on Windows 10 to get the HTTPS record via the operating system's stub resolver. And so this really isn't ideal. Then the second way to get the HTTPS record for us is via DoH, so DNS over HTTPS. And here we are not restricted by any operating system syscalls, but we're simply restricted by the rollout of DoH, as in not every Firefox instance, for various reasons which we can go into, does DoH, so DNS over HTTPS, but instead uses the standard operating system stub resolver. And now there's a third way, which we haven't done yet, but which we are looking into, which is to implement our own DNS stub resolver.
And for that, for example, we might be using Hickory DNS, as that's in Rust. We're very much in favor of increasing the amount of Rust in Firefox, and it seems well done. And so maybe we could base it on that work. Okay, now I do wonder, you keep saying DNS stub resolver, and why is that? I mean, because in the end, the HTTPS record is just a regular record, right? Like an A record and a quad A record. So what is the "stub" part in that sentence signifying? Ah, OK. So in the DNS architecture, as a consumer, you usually have some recursive resolver, right? And that you get over DHCP or any other configuration setup that you have. And then the operating system usually provides syscalls to talk to that recursive resolver. And so the operating system in that case acts as a stub resolver, as in the very end of the resolver chain. And so for example, if you do a getaddrinfo call, then you basically ask the operating system to act as a stub resolver to reach out to your recursive resolver. And then the recursive resolver reaches out to the designated resolvers or serves you cached content. And yeah, it is as you said, HTTPS records are just like A and quad A records. Just that most operating systems don't provide syscalls to actually get those HTTPS records. Yeah, which I suppose, once you have your own DNS resolver, I mean, you can pretty much resolve any record you want. Correct, yeah, but it does come with a lot of challenges, so... Yeah, it's not as easy as just... Okay, would you still try to use the system DNS configuration first, or would you always use, I don't know, user settings in Firefox, or by default some privacy-aware DNS resolvers as authorized by Firefox? I mean, what kind of resolver address would you use there? Hmm. Yeah, this is a very big topic. So I think there are two things here. Which resolver do you use? And then how do you communicate with that resolver? Right.
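Editor's note: a stub resolver that builds its own queries can ask for any record type, including HTTPS (type 65 in RFC 9460). The following is a minimal sketch of encoding such a question in DNS wire format; it only builds the bytes and sends nothing. The function name `build_https_query` is hypothetical, and the sketch skips checks a real resolver would need, such as the 255-byte limit on the full name.

```rust
const TYPE_HTTPS: u16 = 65; // HTTPS resource record type (RFC 9460)
const CLASS_IN: u16 = 1; // Internet class

/// Builds a DNS query message asking for the HTTPS record of `name`.
/// Returns None if a label is empty or longer than 63 bytes.
fn build_https_query(id: u16, name: &str) -> Option<Vec<u8>> {
    let mut msg = Vec::new();
    msg.extend_from_slice(&id.to_be_bytes()); // transaction id
    msg.extend_from_slice(&0x0100u16.to_be_bytes()); // flags: recursion desired
    msg.extend_from_slice(&1u16.to_be_bytes()); // one question
    msg.extend_from_slice(&[0u8; 6]); // answer, authority, additional counts
    for label in name.trim_end_matches('.').split('.') {
        if label.is_empty() || label.len() > 63 {
            return None;
        }
        msg.push(label.len() as u8); // length-prefixed label
        msg.extend_from_slice(label.as_bytes());
    }
    msg.push(0); // root label terminates the name
    msg.extend_from_slice(&TYPE_HTTPS.to_be_bytes());
    msg.extend_from_slice(&CLASS_IN.to_be_bytes());
    Some(msg)
}

fn main() {
    let query = build_https_query(0x1234, "example.com").expect("valid name");
    println!("{} bytes: {:02x?}", query.len(), query);
}
```

The resulting bytes could be sent over UDP to the configured recursive resolver; parsing the response, retries and truncation handling are where most of the real work lies.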
So for example, some people really want to use their own resolver, as in they have, what's it called, a Pi-hole, or they want to, for example, filter DNS requests and this way filter out advertisements. Some others... Hmm. Yeah, and you also have the family plans of Cloudflare, or you even have a European version for the safety of children. I mean, there are all kinds of reasons why people might, but I suppose those will already have been configured, as far as I know, in the system. So I would think that you use those if you use the system default, no? Yeah. Yes, if we do a getaddrinfo, then we would be using those. And then if we were using our own stub resolver, we would also be using those. But for example, if we use DoH, in the long-term future we would, for example, be able to discover those, as there's the designated resolver discovery protocol standardized, but we don't implement that yet. Or we use whatever the user configures in Firefox to be used for DoH, so DNS over HTTPS. Okay yeah, I see it here, RFC 9462, Discovery of Designated Resolvers. I was totally not aware of that one. I need to read up on that. Yeah, we don't implement it yet, but I would, for example... Then Mozilla has contracts with various other DNS providers, DoH DNS providers, where those contracts enforce certain privacy properties of those DNS resolvers, to legally bind them to adhering to Mozilla's privacy policies. Okay, very interesting. On an OS level, can the OS, or some kind of security software on the system, actually enforce a resolver, which maybe is enforced by the corporation? Because I know on both, I mean, you may know that it used to be in kernel space, but now you can do it in user space, and also on Macintosh you can do it in user space since, I don't know, five years ago, where you can...
intercept all the DNS resolve calls and all those things. But I wonder, if you use your own DNS resolver, I wonder if those still will in the end go via there. I would think not, but then it kind of defeats the security point there. Yeah, so if we implement our own stub resolver, we would adhere to whatever the system provides us for the recursive resolver. So we would be talking to those. In the case of DoH, in the end, that's just an HTTPS secure connection to some remote. So I guess it's a lot harder for security software to intercept. Okay. Yeah. So it was still intercept, I guess. Now some would argue that's by design, some would argue that's a big problem. Yeah, it depends, I guess. I mean, it also... Yeah, anyway, that's a hairy topic. I'm not going to go in there. Now, I do find the HTTPS record, and these are the last kind of questions I want to ask about it because I want to move on. But I find it, first of all, I wonder why they named it HTTPS, because I feel this contains so much data that it's probably useful beyond HTTP or TLS-like. Do you know the reason why they call it HTTPS? Or is it just because that was the use case at that point and they didn't really give too much thought to it? I was not part of it when this was designed, so I don't know the backstory even. No, I understand, but... Yeah, because I mean it contains things like ALPN, so you can specify the protocol versions. And then it even includes IPv4 hints, IPv6 hints. Do you know what those are used for? Ah, the hints. So for example, say I connected to wikipedia.org and I did A and quad-A lookups in the past and I have those cached. Now on the next connection attempt I will race, and this is related to Happy Eyeballs, I will race an A, a quad-A and an HTTPS DNS query to whatever resolver I have. And let's say the HTTPS record comes back first. Now, what do I do? I don't have a definitive answer on what IP to connect to, but I have those hints.
And so I can compare those hints to my cached results. And if they match, I can then establish a connection before A and quad-A come back. Okay. And what if it doesn't match? What if the hints are something different than what you're aware of? So it's basically just a small performance tuning. Hmm. This is beyond my expertise. I assume we would be waiting for A and quad-A, or either one. So we would wait for a definitive answer in that case. And are those then just related to, like, is it a hint saying, despite the fact that I might have more IP addresses in my A or quad-A records, these are the IP addresses for which these advertised ALPNs work? Is that how it works, or how do I have to interpret it? I assume the cache optimization is the main thing, as in, these are some valid IP addresses, in case you already have them cached, go ahead. But beyond that, sorry, I don't know, I... Yeah, I can't give you a good answer here. Yeah, yeah, that's okay. That is totally fair. Now in episode six, with Daniel about curl, we talked for the first time about Happy Eyeballs, because they implemented v2 just some weeks or months ago, at least it shipped some weeks ago. Back then it was some weeks ago, now it's like a month ago, I guess. And then, I mean, I was also telling the audience, for example, in Rama, the network framework that we develop and maintain, we only have Happy Eyeballs v1. So it was nice to discover Happy Eyeballs v2 with Daniel. As far as I know, Firefox already supports Happy Eyeballs v1 and v2, and that was always around IPv4 and IPv6 and how you prioritize those, also to give a bit of a boost to IPv6 adoption. But then we had a conversation in the past, Max, where you mentioned that now there's a Happy Eyeballs v3, which again surprised me.
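The hint-versus-cache comparison from the start of this exchange could be sketched roughly as follows. This is only an illustration of the idea as Max describes it, not Firefox's actual logic: the function name `usable_hints` and the sample addresses (taken from documentation ranges) are assumptions, and an empty result stands for "wait for the definitive A or quad-A answer".

```python
from ipaddress import ip_address


def usable_hints(hints: list[str], cached_addresses: list[str]) -> list[str]:
    """Return the HTTPS-record address hints that also appear in cached A/AAAA results.

    Hypothetical helper: an empty list means there is nothing to connect to early.
    """
    # Parse addresses so that different spellings of the same IPv6 address compare equal.
    cached = {ip_address(address) for address in cached_addresses}
    return [hint for hint in hints if ip_address(hint) in cached]
```

With a cache holding `192.0.2.7`, a hint of `192.0.2.7` would allow an early connection attempt, while a hint of `198.51.100.9` would not.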
It seems like I keep getting more and more versions behind. And there it's about the fact that it's no longer even just about IPv4 and IPv6, but also the fact of, do I support QUIC and so HTTP/3, or do I prefer TCP, so HTTP/2? Am I correct there that that's the main addition to v3 over v2, or am I missing something? Yeah, I think it's fair to say that the goal of v3 is to drive QUIC adoption. As in, previously we were trying to drive IPv6 adoption, right, racing IPv4 and IPv6, and now we are racing, in addition, QUIC and TCP, and the goal, probably not for everyone, but at least for us, is to drive up QUIC adoption here. To back up a little bit, so... Yeah, as you said, v1 and v2 mostly just care about IPv4 and IPv6, a lot of versions here. And now since then, we got a lot more dimensions to the connectivity matrix, right, if you span this up as a matrix. And so we got TCP versus QUIC, for example, in there. We, of course, have IPv4 and IPv6. We had that previously. Now we have, for example, different DNS records beyond A and quad-A. We now also have HTTPS records. We have the ability, for example, to do ECH, so Encrypted Client Hello, or not do ECH. So all of these lead to a cardinality explosion, right? As in, they multiply those dimensions. And at this point, it's actually really important that you're smart about how you connect. And this is the whole work around Happy Eyeballs. It's still being standardized at the IETF, as far as I know it's still a draft, and we're looking into it. It's our next priority in the networking team. We haven't done it yet. I believe this is one of the most impactful things for Firefox users. There's a big difference in which of those protocols you use. For example, when you do more QUIC versus TCP plus TLS, you save one round trip. And that is significant in the web space. Like, web experience is very latency sensitive.
As in, it very much matters whether your connection establishes in 10 milliseconds or 20 milliseconds. Because per website, you're probably doing like 10, 20, 30 connections in total. And so these really add up. And so Happy Eyeballs is very important, especially in the web context. But I do wonder if it's solving a problem or you are looking for a problem to solve, because let's say you, yeah, I mean, I will maybe back up my claim a bit here. Because in the end, as long as the web servers, but maybe that's a big assumption, implement the specs correctly, it would mean that you connect to a new website, and it's over HTTP/2, because you know that's a safe assumption when you go over TLS, and then it will say in the Alt-Svc header, hey, I support HTTP/3. So at that point you can just cache it. You don't have to reconnect, because it's not like HTTP/2 is bad. And anyway, a browser makes many connections to the same server, even in the same tab. So you could basically just cache that, and then for the next time, okay, you can just do HTTP/3. And that way you would also drive up adoption, and the same goes for those HTTPS records. So I wonder what the need is to do it in advance, because it's not like you have to drop that HTTP/2 connection, I mean, it's a valid connection if it works. And I don't think there will ever be a web server which only does HTTP/3, because that will leave out so many people. So I'm sure they will always support HTTP/2 anyway. It's not like you're going to be in a situation where you try to connect over TCP and it's not going to work. I imagine for a very long time, if not forever, it's going to still accept TCP connections anyway. So then I wonder what you are really saving there, because you were talking about an extra round trip, but I wasn't really following why that was. Yeah, okay. Okay. Let's walk through this. This is a really good question. So first off, let's say we do HTTP/2 always on the first connection, right?
And with TCP plus TLS, with TLS 1.3, connection establishment is always two round trips. And then after that, we can send requests, right? On QUIC, it's one round trip, and then we can send requests. So you just cut the connection establishment latency in half. And connection establishment, when you load a website, that's the number one driver, right, of the time to first byte. It's probably going to be your connection establishment that has the highest impact there. So on the first attempt. Yeah, that's only for the first attempt. Now people are quite repetitive in their web surfing behavior, right? So you have a good point. Why do we care so much about... Mm-hmm. Okay, but that's for a new website, right? Yeah, that's the first attempt. Yeah. About the first attempt. Now, I would still argue the first attempt is important, but let's say people just go to Wikipedia, Google, and something, something, and so it's going to be cached anyway. Okay, cool. So we have that cached entry, right? We got an Alt-Svc back, so we do know the remote supports HTTP/3, right? Okay, what do we do now? So on the next connection attempt do I... You're suggesting basically just do HTTP/3, right? But what if the user changed networks in between and the network no longer supports UDP? Or we previously connected over IPv6 and the user switched networks and it no longer works over IPv6 because it's an IPv4-only network. Or, yeah, this is more strange, but let's say the website only offers QUIC over certain paths, as in it has load balancing, or some geolocation-based load balancing in between. And so not all PoPs, as in points of presence, support HTTP/3. So then again, we don't know. And so we could just do the HTTP/3 connection, wait till it's established or not, and then fall back to HTTP/2, right? But that is on the order of seconds.
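The round-trip arithmetic above can be written down in a few lines. This is a back-of-the-envelope sketch using only the numbers from the conversation (two round trips for TCP plus TLS 1.3, one for QUIC); the function and constant names are made up for the example, and it ignores session resumption, 0-RTT, and DNS lookup time.

```python
TCP_TLS13_ROUND_TRIPS = 2  # TCP handshake, then TLS 1.3 handshake
QUIC_ROUND_TRIPS = 1       # transport and TLS handshake combined


def time_to_first_request_ms(rtt_ms: float, handshake_round_trips: int) -> float:
    """Delay before the first HTTP request can be sent on a fresh connection."""
    return rtt_ms * handshake_round_trips
```

On a path with a 10 ms round-trip time, this gives 20 ms for TCP plus TLS 1.3 against 10 ms for QUIC, which is the "10 milliseconds or 20 milliseconds" difference mentioned earlier.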
As in, we don't know for sure whether the HTTP/3 connection simply takes a very long time to establish, as in we have a very high latency, or whether we are, for example, in a UDP black hole where all of a sudden all of this is blocked. So what Happy Eyeballs basically suggests instead is: great, you have that information that you were able to connect over HTTP/3 in the past. You can take in various information, like whether your interface changed, for example. So then you would be more careful. Or it didn't change and the environment is still the same. Then you would be less careful. And then we would try that HTTP/3 connection. And after, I would have to double check, but I think after 50 milliseconds, we would kick off the HTTP/2 connection as well and then race the two. And this way we, yes, it's a trade-off. We do consume more resources, but it probably leads to a better web surfing experience. Does that make sense? Hmm. Yeah, I mean, that's fair enough. And anyway, on modern systems, I would be surprised if you feel that. Now that said, would there ever be a point? I mean, let's say a browser tab maybe makes three or four connections. And let's say you are doing Happy Eyeballs and both of them work. Would there be any benefit in using them both, so that you have to make fewer new connections? Because I mean, you could also say, I always just take the first and then drop the other ones. But if you do that four times, you're also dropping connections. Do you also just use them anyway if you're planning to make multiple connections anyway, or is it another thing? Mm-hmm. Yeah, so we only do the multi… I forgot what we call this. Yeah, we only do multiple connections to the same origin within the same tab context or website context on HTTP/1, because HTTP/1 only allows us to do one request at a time, right? So there we do up to six connections so that we can do six requests at a time.
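The staggered race described above (try HTTP/3 first, start HTTP/2 after a short head start, keep whichever succeeds first) might look like the following sketch. It is not Firefox's or any draft's actual algorithm: the connection attempts are simulated with `asyncio.sleep`, the names `simulated_connect` and `race` are invented for this example, and the 50 ms head start is the figure Max says he would have to double check.

```python
import asyncio


async def simulated_connect(name, takes_s, fails=False):
    """Stand-in for a real connection attempt (hypothetical helper)."""
    await asyncio.sleep(takes_s)
    if fails:
        raise ConnectionError(f"{name} attempt failed")
    return name


async def race(preferred, fallback, head_start_s=0.05):
    """Start `preferred`; start `fallback` too if it has not succeeded within the head start.

    Both arguments are zero-argument callables returning a coroutine.
    Returns the result of the first attempt that succeeds.
    """
    first = asyncio.create_task(preferred())
    done, _ = await asyncio.wait({first}, timeout=head_start_s)
    if first in done and first.exception() is None:
        return first.result()  # preferred attempt won within its head start

    second = asyncio.create_task(fallback())
    # If the preferred attempt already failed, only the fallback is left to wait for.
    pending = {second} if first.done() else {first, second}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
        raise ConnectionError("all connection attempts failed")
    finally:
        for task in pending:
            task.cancel()  # drop the attempt that lost the race
```

In the UDP black hole case, the simulated HTTP/3 attempt never finishes in time and the HTTP/2 attempt wins after roughly the head start plus its own handshake time, instead of after a multi-second timeout. The cost is the extra connection attempt, which is the resource trade-off mentioned above.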
On HTTP/2, on the other hand, given that it allows us to do multiple requests at once, even though it does have head-of-line blocking, so that's a problem, but that's a tangent we can go into later. On HTTP/2 we only do one connection. And then on HTTP/3, that's kind of the ideal. We have streams on the transport layer, so we don't have head-of-line blocking, so there we only ever need one connection. And I would even argue that, at least in the HTTP/3 case, multiple connections are harmful, because those connections, at least in the Firefox case, will use multiple congestion controllers. And the more information one congestion controller has, as in it is aware of every byte that's going in and out, the better it can make decisions. And so a single connection with one congestion controller is much better than two connections with two congestion controllers, under the assumption that you're running a protocol without head-of-line blocking. Is it, by the way, always one UDP socket per QUIC connection, or do you use multiple? Some servers, or I would assume many servers, run multiple QUIC connections over the same UDP socket. In Firefox, we use one UDP socket per connection. So it's a one-to-one mapping. And the main reason here is that for every UDP socket that we allocate, we get a new IP address and port on the machine. And that means we get a new NAT mapping on the NAT. And that means that we get some additional privacy, as you can no longer associate connections from the same Firefox instance. Let's say Firefox would use the same socket for all QUIC connections. Then it would not get a new NAT mapping. And then the servers out there, if they colluded, could match by port and IP, and this way could match which connection belongs to which Firefox instance. Yeah, but I was wondering the other way around, because you can say a socket can serve multiple connections or one connection. You do one-to-one because it's privacy-aware.
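The effect of allocating one UDP socket per connection can be seen with plain sockets. The sketch below is only an illustration on the loopback interface, with an invented function name; it shows that each newly bound UDP socket receives its own local port from the operating system, which is what would give each QUIC connection its own NAT mapping in the setup described above.

```python
import socket


def local_ports_for_connections(count: int) -> list[int]:
    """Open one UDP socket per (hypothetical) connection and return their local ports."""
    sockets = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free ephemeral port
            sockets.append(sock)
        # All sockets are open at the same time, so no two can share a port.
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        for sock in sockets:
            sock.close()
```

A server that multiplexes many QUIC connections over one socket would instead present a single address and port pair for all of them.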
But I wonder, is it ever a performance improvement if you do multiple sockets for a single connection? Or does it make any difference? Mm. I have never heard of this. So there is the whole idea of socket-per-core architectures on the server side, but then you would really also make sure that every connection is pinned to a particular core. I've never heard of anyone optimizing by using multiple sockets. That would also mess with a lot of servers, as there's this notion of a path in QUIC, and that is dependent on the port. So that wouldn't quite work. Related to this, there is an effort to add multipath capabilities to QUIC. It's called the multipath QUIC draft at the IETF. I think it's still a draft, as far as I know. And so that explores this realm a little bit more, but in general, no, always one. The socket to connection mapping is one connection, never use multiple sockets. Okay, and then I have one last Happy Eyeballs question, and that is, okay, so if you don't know anything, you have no cached information, then I suppose you would try to prioritize QUIC and IPv6, let's say. But let's say you have statistics or you have cached data from previous connections to that server for that user, maybe even on that network, however you want to map this, and you know, okay, this likes HTTP/2 and IPv4 on the specific network interface or for this specific website. Do you then next time in your Happy Eyeballs prioritize first the one that we know works over the maybe more ideal solution, or do you always try to, yeah, push adoption forward and always prioritize the one that you would like to have? I am the wrong person to talk to here. I believe we have various heuristics in place that go beyond always using QUIC and IPv6, but I don't know the details. Okay, that's cool. I will do some research myself.
As that's the beautiful thing about Firefox, you can read the source code nicely, and recently they moved to GitHub. Did that have any impact on the development space, or, I don't know, some issues? Yeah. Hmm. Yes, so the Rust QUIC stack has always been on GitHub, and that was definitely a very low barrier to entry for me to contribute, as I've been maintaining a lot of open source projects in the past and that has mostly always been on GitHub. I'm very familiar with the platform. Unfortunately Mozilla has a lot of custom tooling, and the reason there being not that Mozilla likes to have so much custom tooling, but that Mozilla has been around for quite some time. This code base is very old, and so back then this tooling didn't exist. I think Mozilla should move more towards not invented here, and one great step, I believe, has been the recent move to host the main repository on GitHub. So I think that's a really good step. Okay, very cool. And then soon we're going to start a wrap-up, but we already touched on QUIC. And so I like QUIC a lot, I've used it. I have not personally used HTTP/3 a lot as an implementer myself, but I have used QUIC a lot, because I like it a lot where I want to establish my streams. And I know I can use a lot of streams over one QUIC connection. And recently we had a conversation, episode nine, about gRPC with Lucio Franco, and there it's on HTTP/2. And that's also the reason why gRPC is on HTTP/2, because you have access to all those different streams, and so HTTP/3 also allows that. Is there, besides HTTP/3 and besides WebTransport, which I want to talk about next, another use case of QUIC in Firefox, or are those just the main two drivers for now within Firefox? Yeah, so right now Firefox only uses QUIC in the context of HTTP/3. So only below HTTP/3. That's both for, for example, standard HTTP/3 GET, POST and so on, right? WebTransport, which you mentioned, which we will go into.
And then, for example, DoH, so DNS over HTTPS, which then could use HTTP/3 and QUIC and so on. Firefox right now does not use QUIC for anything else. There have been explorations, not much on the Mozilla side I think, but in general, to run RTP over, sorry, to run the SCTP part of WebRTC over QUIC instead. So there are explorations around this. I am not aware of any larger deployments using QUIC itself. There are many projects that run QUIC, but not in a web-scale standardization. Okay, and then so, yeah, HTTP/3 is built on top of QUIC, but another exciting one is, so I find it very fascinating how WebTransport, I think by now it is way past the draft, or maybe still in a draft, I don't know, but at least it's pretty established by now, even though it's still very new. A lot of people are still discovering WebSockets. So yeah, I mean, I see it all the time on blog posts and news. It's not that established yet, even WebSockets, even though it's pretty old. And then I wonder how long it will take for WebTransport to be adopted. But that's not what I want to talk about. What I do want to talk about is, were there any challenges in supporting WebTransport once you already have QUIC? Or was it fairly trivial? I mean, as trivial as implementing an RFC is, let's say. Mm-hmm. Yeah. So just as a disclaimer, I did not write the WebTransport implementation in Firefox. That said, I've worked a lot on it and did a bunch of bug fixes around it. WebTransport is the rather... So when you run WebTransport on top of HTTP/3, note there is also a way to run WebTransport over HTTP/2. One of them is still in draft. If you run it over HTTP/3, that's a very shallow wrapper, basically exposing all the QUIC capabilities to a user on top of HTTP/3. So there is not a lot of complexity. It is tricky, as in every network protocol is tricky, but it is mostly a shallow wrapper around QUIC. Okay, very cool.
And so as far as I understood from a past conversation with you, it is still in development, right? Or can people already use WebTransport within Firefox? I'm not even sure how it works from the JavaScript side. I've never really, to be honest, touched WebTransport from JavaScript. Yeah, so it's still a draft. Firefox has been supporting WebTransport for quite some time at this point, but only a draft of the hopefully-to-be RFC. We're currently on draft four. There is draft 11. As in, you can already use it, and draft four is already fully usable, and there are people out there using WebTransport, even though adoption isn't that large. But there are various features that we need to add to our WebTransport implementation to catch up. So for example, recently, in the IETF there has been the decision to introduce a flow control mechanism. And so that we need to add to our WebTransport implementation. There has been the decision to add an ALPN-like mechanism. I think they even call it ALPN. So application-layer protocol negotiation, to negotiate what you're running on top of WebTransport. Yeah, so these features are in the latest draft, but not yet in Firefox, but we are planning to catch up. Okay, very exciting. I have a feeling I will have to invite you again once WebTransport becomes more established, and I might even bundle that together with a guest we had in episode 4. We talked with Delaney Gillilan, who is of course the creator of Datastar, it's an SSE web framework, but he also has plans to make something called Darkstar, and there he would really want to make use of WebTransport. And I have a feeling maybe inviting both of you for a long episode could give some fireworks, so I do want to attempt that in the future. To be continued.
Now we are reaching the end of this episode, but I would like to give you the opportunity, if you want, to say, okay, Glen, we didn't discuss this topic, even though we have to keep it short. Or you say, yeah, I want to plug this information. Now is your moment to shine. Yeah, okay, let me think. I mean, we discussed a lot of course. I mean, there might be stuff where you say, okay, I'm doing this at Firefox, or maybe I'm doing this work and you really should check out this, or, I don't know, whatever people need to look up or learn more about or do differently, maybe some advice, it doesn't really matter. Yes, okay. So I would give a call for next steps and collaboration. In general, as I said earlier, I can highly recommend any work happening within the IETF. I'm very impressed by the organization, especially by its openness. You can't take it for granted that you can simply buy a ticket and then help design the internet. That is very impressive. And the fact that everyone can just participate in those working groups, I think, is wonderful. And the fact that they let me sit in the back and just listen in is a wonderful thing. So I can highly recommend that. Then, yeah, for any listener that would like to get more involved, I can only speak from the Firefox side. Now there are other browsers that are open source and they're willing to have contributors. On the Firefox side, our QUIC stack, in case you're interested in that, is entirely written in Rust. It is on GitHub. So for example, if people would like to get more involved in this, I'm happy to help people get started and land their first contribution, which would then eventually land in Firefox and then help millions of people out there surf the web better. So that would be around QUIC, for example, or MASQUE, like a new proxy protocol, or around congestion control, around various I/O optimizations.
There is an unlimited amount of opportunity to optimize our Rust stack. There's things like MTU discovery, maybe our DNS stub resolver, and so on and so on. So a lot of, in my eyes, very exciting work. Yeah, very cool. That is a very great call to action, and I hope more people contribute to Firefox, because it's one of the last standing different kinds of web engines of the web. And that matters all the more because there used to be a lot more variety, but now a lot of the different ones have morphed into Chromium shells. And so I'm very happy, and that's also why every day I use Firefox, because I think more people should use a greater variety of web engines, because it's not healthy for any, I mean, even if they do the best job in the world, Google, and I believe a lot of those folks are really good at heart and they do their best, it's not healthy if one player drives the specifications. You need different players, just like you need different commercial vendors. Competition is healthy, different opinions are healthy, and so I'm very grateful for all the work you do at Mozilla, Max. Yeah. Thank you for this podcast. I've been a listener before this episode. Yeah, very cool, and we hope you continue to enjoy it. So I want to thank you for participating today, and I will talk to you again soon.
Elizabeth (Plabayo)
1:06:44 | 🔗
Netstack.fm is brought to you by Plabayo, building secure, open, and resilient infrastructure with Rust, protocols, and purpose. This show is also made possible by Rama, the open source networking framework. Plabayo offers service contracts and welcomes sponsorships to keep building and supporting its ecosystem. The theme music of this podcast was composed by DJ Mailbox. If you enjoyed this episode, don't forget to subscribe on your favorite podcast platform and leave a five-star review. It really helps others discover the show. Thanks for tuning in. We'll see you next time for the next handshake.