Protocol Shorts: MITM Proxies and Transparent L4 Interception

Season 1 · Episode 31 · 16:13

Show notes

In this second "Protocol Shorts" episode, we look at man-in-the-middle proxies from the transport layer up. The episode explains how HTTP proxies, HTTP CONNECT, and SOCKS5 differ, why they all assume a proxy-aware client, and what changes when a transparent layer 4 proxy is inserted by the operating system instead.

From there, we dig into protocol detection from the first bytes on the wire and into the BridgeIo abstraction in Rama: a way to relay and inspect stacked handshakes incrementally instead of terminating every protocol upfront.
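
As a rough illustration of what "protocol detection from the first bytes on the wire" can look like, here is a minimal sketch in Rust. It is not Rama's detection code: the `Detected` enum and the `detect` function are hypothetical names made up for this example, and the checks only cover the SOCKS5 greeting, a TLS ClientHello record, and a handful of HTTP/1.x methods.

```rust
/// Protocols this sketch can tell apart (illustrative, not exhaustive).
#[derive(Debug, PartialEq, Eq)]
pub enum Detected {
    Socks5,
    Tls,
    Http,
    Unknown,
}

/// HTTP/1.x request lines start with a method token followed by a space.
const HTTP_METHODS: [&[u8]; 9] = [
    b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE ",
    b"OPTIONS ", b"PATCH ", b"CONNECT ", b"TRACE ",
];

/// Guess the protocol from the first bytes peeked off a new TCP flow.
/// Hypothetical helper, for illustration only.
pub fn detect(prefix: &[u8]) -> Detected {
    match prefix {
        // SOCKS5 greeting: version 0x05, a method count >= 1,
        // then at least that many method bytes.
        [0x05, n, methods @ ..] if *n >= 1 && methods.len() >= usize::from(*n) => {
            Detected::Socks5
        }
        // TLS record: content type 0x16 (handshake), major version 0x03,
        // minor version, two length bytes, then handshake type 0x01 (ClientHello).
        [0x16, 0x03, _, _, _, 0x01, ..] => Detected::Tls,
        // Plain HTTP/1.x: one of the known method tokens.
        _ if HTTP_METHODS.iter().any(|m| prefix.starts_with(*m)) => Detected::Http,
        _ => Detected::Unknown,
    }
}
```

A real proxy would peek these bytes without consuming them, so that the detected protocol handler still sees the full stream, and would have to cope with prefixes that are too short to decide yet.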

Learn more

Rama

If you like this podcast you might also like our modular network framework in Rust: https://ramaproxy.org

Chapters

  • Intro
  • Understanding Proxies: The Basics
  • Diving Deeper into Proxy Types
  • Layer 4 Proxies: A New Approach
  • Challenges of Transparent Proxies
  • Bridging Conversations: A New Insight
  • Example: HTTPS request within a SOCKS5 tunnel
  • Layer 4 Proxies and Protocol Reconstruction
  • Outro

Netstack.FM

Music for this episode was composed by Dj Mailbox. Listen to his music at https://on.soundcloud.com/4MRyPSNj8FZoVGpytj

Transcript

Plabayo BV
Most networking diagrams show a simple story: a client talks to a server. But in reality, there is often a third party sitting quietly in the middle: a proxy. Sometimes it just forwards traffic, but sometimes it actively inspects it, modifies it, or enforces policies. That kind of proxy is called a man-in-the-middle proxy, or MITM. And recently, while adding layer 4 proxy support to the Rama framework, I realized once again that building one of these correctly is a lot more subtle than it first appears.

This week is another Protocol Shorts episode, the second one in this series. The first one was on January the 20th and was about HTTP as an application bus. You can find the links on our website.

Elizabeth (Plabayo)
Netstack.FM is brought to you by Plabayo, building secure and resilient infrastructure with Rust, protocols, and purpose. This show is also made possible by Rama, the open source networking framework. Plabayo offers service contracts and welcomes sponsorships to keep building and supporting its ecosystem.

Plabayo BV
In order to understand why building one of these L4 proxies correctly is a little more subtle than it first appears, we first need to look at the two proxy types that most people interact with.

The most common proxy people know is an HTTP proxy. In this setup, a client sends a request to a proxy and the proxy forwards the request. This works fine for plain HTTP traffic, but HTTPS breaks this model completely, and thus HTTP CONNECT was born. CONNECT basically asks the proxy to open a pipe to the destination. As long as the pipe exists, the proxy mostly just forwards encrypted bytes. It is important to understand, in the context of HTTP proxies, that the HTTP CONNECT method is only used by the client when connecting to a proxy for TLS-encrypted traffic. For plain-text traffic this wasn't needed and isn't used.

There is another proxy protocol that takes this idea one layer lower: SOCKS5. This comes with a small handshake. The handshake allows the client to use a method. The method defines whether or not authentication is needed and, in case of authentication, what kind. The most common, similar to HTTP proxies, is username and password. Once the server and client agree on the method, we get to the command, and in the context of our proxy story we will focus only on the CONNECT command. It operates very similarly to an HTTP CONNECT request, meaning it also contains the destination address to which the proxy is expected to connect. Once the outbound connection is made by the proxy, it is expected to forward bytes, just like the HTTP proxy does for TLS-encrypted traffic.

The main differences between a SOCKS5 proxy and an HTTP proxy are, first of all, that SOCKS5 is one layer lower and, secondly, that the SOCKS5 handshake is always done regardless of the traffic: regardless of whether it is HTTP traffic or not, and regardless of whether it is going over TLS or not. That is different from an HTTP proxy, because an HTTP proxy only uses HTTP CONNECT in case of TLS-encrypted traffic. An HTTP proxy is typically also only used for proxying HTTP traffic, while for a SOCKS5 proxy it's a lot more natural to transport non-HTTP traffic as well.

Now, both of these proxies share one important assumption: the client knows it is using a proxy. And that turns out to be a problem. Layer 4 proxies solve this differently. Instead of the client connecting to the proxy, the operating system intercepts the connection. Typically this means that when a client requests a socket from the operating system for an outbound connection, instead of handing out an actual socket for that purpose, the OS delivers the traffic to a socket or interface on which your L4 proxy is listening for this kind of TCP flow. In this instance, the client is not aware that there is even a proxy in the middle to begin with. As such, L4 proxies are at times also called transparent proxies.

However, this introduces a new challenge: the proxy no longer knows what protocol it is receiving. To keep the scope of this podcast narrow enough, let's focus for now only on TCP. And to understand better what I mean, let's discuss some examples.

Example number one is where a client is making an HTTP request to example.com. The client will resolve the example.com domain via DNS and will request from the OS a socket for this outbound connection, using the destination IP address that it resolved. Instead of receiving a regular socket from the OS, it will instead open an interface to the proxy. The client will not notice, but the proxy will. And the only thing the proxy will see is a destination IP address. It does not know what server is exactly on that IP address, unless it knows about the IP address, but in most cases it won't. Neither does it know what protocol is used by the client connecting to this destination.

So the proxy must detect protocols from the very first bytes of a connection. For HTTP, this means that it can detect the header of the HTTP request, meaning the method and then a path. For TLS, it will be able to see the first bytes looking like a client hello. And if it wants to be sure, it can peek the entire client hello, given this is still in plain text. For SOCKS, it will see the SOCKS5 version followed by a method, followed by some more data in case it wants to be even more sure. And this can work for anything. The beauty in all this is that each protocol starts with its own unique set of bytes. So far I haven't found any collision in this, at least not in the protocols which are widely used in the wild.

While implementing this in Rama, we ran into an architectural problem. Prior to explicitly supporting L4 proxies in Rama, we already had man-in-the-middle support, in the context of HTTP and SOCKS5 proxies. In this traditional man-in-the-middle flow, the server accepts an incoming connection explicitly requested by the client to the proxy, as an HTTP proxy or a SOCKS5 proxy. It is at this point that the proxy terminates the protocol, then also terminates the inner protocol, and so on. Meaning, in case it's an HTTPS request, it will terminate the TLS connection, it will then terminate the HTTP server accept, and only then, when it receives the first HTTP request, will the proxy make an actual outbound connection. That means it has to do a TCP handshake outbound, a TLS handshake outbound, and finally send the incoming first request to the outbound side and then send the response back to the client. And this works, but it becomes extremely complicated when protocols stack.

The insight we ended up with is that most protocols are actually just conversations. The client speaks and the server replies. So instead of terminating the whole protocol first, we can bridge the conversation while it happens. BridgeIo is a small abstraction that connects two I/O streams, the client I/O and the server I/O, and it forwards bytes between them. But the proxy can still inspect messages, intercept parts of the handshake, terminate protocols if ...

Imagine a client wishing to make an HTTPS request over a SOCKS5 tunnel. What will happen in our flow? Well, first the client will want to establish a connection to the SOCKS5 proxy server. However, given we have an L4 proxy in the middle, it will actually mean that the client opens a TCP flow to our L4 proxy. At first, our L4 proxy will not know what the bytes it receives contain. But very quickly, from the first bytes, it will be able to detect that the incoming bytes are meant for a SOCKS5 proxy, and it will see that these bytes contain the header data sent as the first part of a SOCKS5 handshake initiated by the client. In case we are not dealing with authentication, it will send those bytes to the destination SOCKS5 proxy, and it can do so because it receives the actual destination address from the OS or the interface. If all is well, it will also receive the reply to this first part of the SOCKS5 handshake, and it once again relays this to the client. The client will then continue the handshake, depending on what exact flow of SOCKS5 you are dealing with here.

But eventually, if all is well, we have established a SOCKS5 connection on both the ingress and egress sides, meaning we established a SOCKS5 connection between the client and our L4 proxy, and from our L4 proxy to the actual destination SOCKS5 proxy. At that point, we will once again continue to receive more bytes, and given that in our example we were dealing with an HTTPS request, it will mean that the next part is TLS. Once again, at first we don't know what we are dealing with next, but very quickly, using the first bytes, we can detect from the TLS client hello, which is the first part of the TLS handshake, again initiated by the client, that we are dealing with TLS traffic.

At that point we will start our TLS interception flow. Whereas in our traditional man-in-the-middle flow we would at this point already have established our TLS connection with the client, in the new flow we are instead first replaying or reusing the client hello to establish a connection ourselves to the server. And once, and only once, this TLS connection is established with the egress server over the egress SOCKS5 proxy tunnel will we continue the TLS handshake between the client and the L4 proxy.

By doing this, we serve both sides according to their natural flow, and even more, we can at each step respect the actual preferences of each side. For TLS specifically, that means we can mirror the original TLS certificate from the server when we forge our own man-in-the-middle TLS certificate specific to that server flow. We can also mirror settings such as the negotiated ALPN, the negotiated TLS version, etc. All of this we can use to keep both sides as equally configured as possible. This is very cool and very different from how we used to do man-in-the-middle flows.

Layer 4 proxies are fascinating because they operate before the application layer even exists. They start with nothing but raw bytes and must reconstruct the protocol stack from there. And implementing this in Rama led us to the BridgeIo abstraction, which lets us inspect and relay protocols one handshake part at a time.

In a future episode we will dive deeper into TLS interception and Encrypted Client Hello. And in case you are new to proxies, we will also link the intro-to-proxies chapters from the Rama book in the show notes. In those chapters you will be able to learn more about all the kinds of proxies that exist, and they will also link you to examples found in the Rama codebase, so you can play with these kinds of proxies yourself and see how they might be implemented in the wild. We encourage all of you to play with these technologies in order to get a better understanding of them yourself.

And this concludes our episode on man-in-the-middle proxies. See you next week for another week of Netstack.FM.

Elizabeth (Plabayo)
The theme music of this podcast was composed by DJ Mailbox.

If you enjoyed this episode, don't forget to subscribe on your favorite platform and leave a five-star review. It really helps others discover the show. Thanks for tuning in. We'll see you next time for the next handshake.
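
The bridging idea from the transcript, forwarding bytes between a client stream and a server stream while still being able to look at them, can be sketched with nothing but the Rust standard library. This is not Rama's BridgeIo API. The function name `relay_once`, its signature, and the 4096-byte buffer size are assumptions made for this example, and it uses blocking I/O for brevity.

```rust
use std::io::{self, Read, Write};

/// Forward one chunk of bytes from `from` to `to`, letting `inspect`
/// look at the chunk before it is relayed.
/// Returns the number of bytes relayed; 0 means `from` reached end of stream.
/// Hypothetical helper, for illustration only.
pub fn relay_once<R, W, F>(from: &mut R, to: &mut W, mut inspect: F) -> io::Result<usize>
where
    R: Read,
    W: Write,
    F: FnMut(&[u8]),
{
    let mut buf = [0u8; 4096];
    let n = from.read(&mut buf)?;
    if n > 0 {
        // The proxy sees the bytes first (e.g. to detect or log a handshake step)...
        inspect(&buf[..n]);
        // ...and then passes them on unchanged to the other side.
        to.write_all(&buf[..n])?;
        to.flush()?;
    }
    Ok(n)
}
```

Calling `relay_once` once from the client side to the server side, and once in the other direction, corresponds to relaying one step of a conversation such as the first SOCKS5 greeting and its reply. A real bridge would also need to buffer partial messages and decide per step whether to relay, rewrite, or terminate.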
