Posts posted by Roy

  1. Update

    For the last couple of days, I've been trying to whitelist the Rust+ service. Unfortunately, this was harder than I expected since it uses the client's source IP to establish the TCP connection. However, I believe I have it pretty much working.

     

    With that said, I'm setting up scripts to automatically update our whitelisted services and this is nearly finished.

     

    I've updated the Sydney POP with the up-to-date filters and just want to make sure things run stably for a day or so. Afterwards, we will start expanding the filters to new POPs.

     

    Thank you.

  2. 27 minutes ago, Saizy said:

    How long did this take you & potentially other developers?

    To get most of the modifications completed, probably around 7 - 8 hours (I spent 6 hours straight on this last Friday IIRC). However, I completely overlooked something for production due to the nature of Anycast and had to redo parts of the code, which took another hour or two. Monitoring logs and trying to figure out the small bugs probably took 3 - 4 hours since debugging with XDP is a pain in the ass (you have to print debugging messages to the Linux trace pipe and you can't use user-space functions that would typically make things easier).

     

    Monitoring outbound services is a pain since a lot of them use CDNs and come from multiple source IPs. Therefore, you want to whitelist ASNs instead; my back-end script automatically looks up prefixes for each ASN.
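    For illustration, here's roughly how a prefix-based whitelist check works once an ASN has been resolved to its announced prefixes. This is a minimal Python sketch, not the actual back-end script; the prefix values are examples only.

```python
import ipaddress

# Hypothetical prefixes resolved from a whitelisted ASN by a back-end script.
WHITELISTED_PREFIXES = [
    ipaddress.ip_network("155.133.224.0/19"),  # example prefix only
    ipaddress.ip_network("162.254.192.0/21"),  # example prefix only
]

def is_whitelisted(src_ip: str) -> bool:
    """Return True if the source IP falls inside any whitelisted prefix."""
    ip = ipaddress.ip_address(src_ip)
    return any(ip in prefix for prefix in WHITELISTED_PREFIXES)
```

    A real implementation would refresh the prefix list periodically, since ASN announcements change over time.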

     

    As for the back-end services, probably around 3 - 4 hours since that was written in Go and I'm not fully experienced in Go yet.

     

    Other than Dreae helping me with the per CPU map issue here, I was able to complete everything else.

     

    EDIT

    Monitoring the handshake process for every game we host servers for probably took around 1 - 2 hours as well.

  3. Hey everyone,

     

    As some of you may know, I've been working to hard-code filters into our current packet processing software (Compressor V1). This is a temporary solution until Compressor V2 is completed and its main purpose is to mitigate recent (D)DoS attacks aimed towards our network. You can read more about this here which explains why I'm doing this.

     

    So far, these new filters are deployed on our Sydney POP and after correcting a couple issues, it appears game server traffic is flowing through this POP smoothly (I haven't heard of any complaints in five or so days). Therefore, I believe we're ready to expand these filters to POPs in other locations.

     

    This morning, I set up an application I made recently to handle all the services we need whitelisted. Each POP will grab the whitelisted services from this backbone every x seconds. I plan on giving Technical Administrators and Directors access to modify these service lists, along with appropriate training, so I'm not the only one making these changes since that would be a burden.

     

    I do want to prepare everybody for the deployment. Please read below.

     

    Deployment Plan

    The deployment plan is pretty simple. I'm going to add these filters to the rest of our Asia POPs (Tokyo and Singapore) and give it one to two days to ensure there are no reported issues. Afterwards, I will deploy these new filters to our Europe POPs (Frankfurt, Paris, London, and Amsterdam). Finally, we'll start deploying to North America after the Europe POPs are proven stable.

     

    I may give it an additional one to two days to try to capture more outbound traffic and make adjustments to our service whitelists. With that said, I need to figure out how to support Rust+ with the new filters since that's important for Rust servers now, I guess. This will likely add some delays.

     

    Things May Break

    Since we're whitelisting all outbound connections/services made by the game server, there's a high chance we've missed some things initially. I have all global/GFL-specific services whitelisted. However, there's a chance the game server is making outbound connections to a specific service in that game or game mode that isn't on the whitelist yet. If you notice something in the server breaking, please report it to the Server Manager and they'll go up the chain if it's related to networking.

     

    I'm doing the best I can to monitor outbound game server traffic and whitelist the correct services. However, monitoring outbound traffic from 20+ game servers is very time-consuming and there's a high chance I've missed some things.

     

    That's basically it. I just wanted to thank you for your understanding and patience regarding this matter. I understand some of you may be frustrated if a specific service breaks. However, this is the only way we can fully secure our network.

     

    If you have any questions, please feel free to reply.

     

    Thank you.

    After talking to @Dreae and reviewing the XDP program's code, Dreae told me that the validation map being an LRU per CPU map would cause issues when reading values from the map within the XDP program itself, since that is unsupported at the moment (you can only read per CPU maps from user space, like I am doing with the handshake LRU per CPU map). Therefore, I've converted the validation map to a single map and used a different function Dreae told me about to update the hits count. This was most likely the issue and why the hits count was 0 (it was reading the value from the wrong CPU). Since the hits and expiration time values were at 0, this was most likely causing the timeout issues after 20 seconds or so of loading in. I used per CPU maps for performance advantages. However, the validation map doesn't perform many reads/updates, so converting it to a single map isn't a big deal in my opinion.
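    To illustrate why the per-CPU map misbehaved: each CPU holds its own copy of the value, so a correct user-space read has to aggregate across all CPUs, while reading a single slot (e.g. the wrong CPU's) can return 0 even though packets were counted. A toy Python model of the difference (not the actual BPF code; the counts are made up):

```python
# Model of a per-CPU map entry: one counter slot per CPU.
# Packets hash to different CPUs, so hits are scattered across slots.
per_cpu_hits = {0: 0, 1: 7, 2: 3, 3: 0}  # hypothetical counts on a 4-CPU box

def total_hits(slots: dict) -> int:
    """Correct user-space read: sum the value from every CPU's slot."""
    return sum(slots.values())

def buggy_read(slots: dict, cpu: int) -> int:
    """Reading only one CPU's slot -- returns 0 if that CPU saw no packets."""
    return slots[cpu]
```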

     

    The whitelist map will remain a LRU per CPU map for performance advantages (this is the map that needs to be spread out to multiple CPUs anyways). We only modify values on this map within the XDP program and read values within the user space which is fine.

     

    I've applied this update to the Sydney POP. I will continue to monitor the logs to ensure I see nothing else suspicious. This issue should be resolved. Thanks to Dreae for all the help :)

  5. Hey everyone,

     

    As some of you may know, our Sydney POP has new filters I've applied to the packet processing software we're using. I've hard-coded these filters into the software as a temporary solution to mitigate (D)DoS attacks we've been receiving recently. Dreae and I are working on Compressor V2 (or whatever name we go with) as a permanent solution. However, this will take time to create.

     

    The past few days I've been monitoring traffic on the Sydney POP and there have been a few issues that I've mostly fixed. However, I'm currently trying to track down one last issue. The side effect of this issue appears to be that when you connect to the server, you'll time out around 20 or so seconds after the initial connection (the validation map timeout). I received a report regarding this, tried connecting using the forwarding server I have in Sydney, and actually witnessed it myself. Unfortunately, I didn't have logging enabled at the time and, no matter what I do, I can't reproduce the issue this time around after enabling logging for validations and handshakes. I've been monitoring the log all day and haven't seen many occurrences of this since it rarely happens.

     

    I do have a suspicion about what the issue might be. However, it appears to be a bug with BPF or something, because it's reporting that the hits member (indicating how many packets the client has sent during the validation period) on the validation map is 0, even though there is never a time it should be set to 0.

     

    I'm making this thread to make users aware of this issue if they're routing through the Sydney POP (more than likely anybody closer to Sydney, AUS). You can confirm you route through this POP by performing a traceroute or MTR to any IP on the Anycast network (92.119.148.0/24).

     

    If you experience the side effects from this issue, please post on this thread with the time it happened and private message me your public IP address (you can get it from here).

     

    This issue is the only thing preventing me from deploying these filters to more POPs at the moment. Unfortunately, the only work-around for players right now is to keep reconnecting until it puts you on the handshake map. This usually works after the first or second retry, at least for me.

     

    Thank you.

  6. Last Update

    I just wanted to provide one last update regarding this issue. As stated in my last reply, the cause of this issue was more than likely the NYC DC having an outage that affected our NYC POP (where traffic was flowing through). Unfortunately, in cases like these, there's not much we can do. However, I'm going to look into BIRD and see if it's possible to shut BIRD down if the network is detected as down. This way, if there's an outage of some sort, the POP will stop announcing the network and, therefore, no traffic should flow through it.

     

    With that said, we have a plan for Compressor V2 that'll result in outbound traffic from the game server (other than traffic to Steam or back to other players) not having to go through the Anycast network. This'll be a lot more reliable, but it won't be ready until Compressor V2 (or whatever name we go with) is done.

     

    In the meantime, I'm going to be looking into the BIRD option.

     

    Thank you!

  7. Update

    Unfortunately, I was asleep when all of this was going down. This was likely not due to any filters put in place recently on certain POPs. Services weren't able to connect due to not being able to resolve the needed host names. It's hard to say what the exact cause was since I wasn't here. However, our NYC POP's DC was having a network issue around the time this started occurring. I know much of our services' traffic routes through the NYC POP, and this probably included DNS.

     

    With that said, there was a (D)DoS attack at this time as well (what a surprise), but it only lasted a couple of minutes and the traffic wasn't being forwarded to the game server machine based on the graphs I've seen. The CPU on the NYC POP only spiked to 60% from this, and we saw around 500 - 600 Mbps inbound from the attack. The outage also occurred a bit after the attack.

     

    I've advised our staff to, next time, resolve the host names and replace them with the IPs. This should allow us to see whether it was only a DNS issue or whether the POP (probably NYC) was having actual networking issues and all needed traffic (including DNS) was trying to flow through it.

     

    If this happens again, the following troubleshooting steps should be attempted:

    • Resolve host names and replace them with IPs, indicating whether this is only a DNS issue or not.
    • If it's not only a DNS issue, go into the web machine (where database traffic and so on is coming from) and perform an MTR to the network to see which POP the web server routes through.
    • Connect to the affected POP and do generic troubleshooting. My guess is the NYC POP wouldn't even be connectable due to the DC outage.

     

    Thanks.

    This is likely due to the filters I've put in place for SRCDS servers regarding the temporary solution I proposed in this post. As of right now, any traffic that isn't from source port 27005 is blocked. I was thinking a lot of people would use the 27005 source port since it's the default, and while a lot of people do have it set as the source port (you can check via the clientport command in-game), their routers are doing something weird and changing the source port per connection (a source port mapping method). I underestimated the number of players this affects.

     

    As for the VPN working, it probably is routing to one of the POPs that doesn't have these filter rules applied (3/4 of the US POPs don't have this set at the moment and some in Europe as well). In order to get a VPN/forwarding server working correctly that is forwarding to one of the POPs where the filters are applied, you'd need an SNAT rule that sources out as port 27005 so the forwarding server uses that port to communicate with the POP/game servers. This is what I did as a temporary solution for other players having the same issue and I can certainly do it for you as well if you let me know which servers you're wanting to connect to.

     

    Anyways, I plan to revert this specific filter change on all POPs later today. I'm going to be looking into implementing handshake whitelisting to the network which I was hoping would wait until Compressor V2, but with the recent attacks and this source port issue, it looks like I'm going to have to figure something out (I know what needs to be done and I can do it, it's just a pain in the ass).

     

    I apologize for the inconvenience. It should be resolved later today when I'm not as burnt out/busy with my job.

     

    Thanks!

  9. 10 hours ago, JGuary551 said:

    Roy I love what you are doing! im completely amazed by everything you have said, this is the perfect post to let everyone know (even the ones that bitch and complain but will somehow still find a way to LOL) cant wait for this to be finished and you can relax on a beach retired xD 
     

    Oh and you forgot Ark survival xDDD

    Lol, I'll do Ark eventually. I'd imagine the Unreal Engine games use the same type of handshake, so it shouldn't be that difficult, I hope.

     

    5 hours ago, License to Kill said:

    Love your work. I know i havent been around long but i am amazed by all the staff and the community it gives that We are one feeling and its great.

    Keep up all the great work.

     

     

    Glad you're liking it here :)

  10. Hey everyone,

     

    I'm creating this thread to store my notes regarding findings with filtering rules that I plan to implement into our Anycast network when Dreae and I complete Compressor V2. I also address recent (D)DoS attacks on the network and things we're implementing in the future to mitigate these attacks.

     

    I am making this thread public to educate others, or perhaps be educated myself by somebody who may have looked into this before. If this is of interest to you, please feel free to reach out or post a reply :) The more help with these findings, the better.

     

    Our Problem

    We've been getting continuously hit by (D)DoS attacks recently, and since we don't have many filtering rules applied at the moment (besides generic SRCDS hardening rules), all malicious traffic is forwarded to the game server machines, creating a single point of failure. While there is no way to completely remove this single point of failure because game servers simply can't be "Anycasted", we can apply filtering rules to drop as much malicious traffic as possible at the POP level.

     

    Why This Isn't As Simple As People Think

    I've seen some people complain recently about the network and how it should be easy for us to find a new hosting provider or get better (D)DoS protection. While I understand the frustrations from these users, I just wanted to briefly explain why this is a lot more complicated than it looks. It's simple, actually: we own our Anycast network (we have our own IPv4 block and ASN). This network sits in front of all our game servers and is responsible for forwarding traffic to them. This network has been responsible for a lot of the success we've had in the past year as well, and it's really the thing I'm most interested in on GFL (on a technical level). Therefore, removing the network isn't really an option and, I will admit, I would probably lose a lot of motivation if it were.

     

    Since we own the network, we also need to implement processing and filtering software. As of right now, we use Compressor, a project made by @Dreae (one of the smartest people I've seen in networking and network programming). Unfortunately, Compressor V1 doesn't include in-depth filtering rules at the moment. This is our main issue. With that said, since Compressor V1 doesn't connect to a backbone to handle game server connections (it relies on a config file on each POP), it's also not easy to spin up many POP servers due to the maintenance involved. These are all things that will be tackled with Compressor V2 (read below).

     

    To conclude, adding protection to our network is a lot harder than it looks. Thankfully, this is something Dreae and I are very interested in. I'm still pretty new to this area as well, but I've had a lot of success the past few months with network programming and so on. Therefore, we're doing everything we can to protect the network. However, Compressor V2 will take some time to make. Keep in mind, hosting companies pay developers a ton of money to implement filtering rules and so on for (D)DoS protection. Dreae and I are both doing this for free and open-source.

     

    I do understand the frustration regarding downtime from malicious attacks, but I just wanted to go over why it's a lot harder than people think to upgrade our network capacity and (D)DoS protection. This is still nothing compared to what we used to see back in 2014 - 2016. We had servers getting nulled daily for 4 - 12 hours at a time back in those days (a null route is when a hosting provider sends all traffic destined to a specific IP to a blackhole, usually via BGP). Thankfully, servers are very hard to null route on an Anycast network since we're using the overall network capacity of multiple data centers.

     

    Temporary Solution

    Since Compressor V2 won't be available for some time, I'm implementing a temporary solution in hopes of blocking this malicious traffic at the POP level. Due to the nature of Compressor and XDP, there's no real way to perform packet captures at the POP level. Technically, you could use bpf_trace_printk() (a BPF helper function) to print to /sys/kernel/debug/tracing/trace_pipe within the XDP program, but the output looks horrible, only three arguments are supported, and performance would be highly impacted. However, you may do packet captures on the game server machine. One thing to note is that the format is IPIP (basically, an outer IP header of 20 bytes is added), which is sometimes hard to read with something like Wireshark or to capture with tcpdump. For example, I still haven't found a way to filter by the inner IP header's source/destination in Wireshark or tcpdump. I'm pretty sure it's possible with tcpdump, though. I just need to find out how.
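    As a rough illustration of what the IPIP format means for captures: the packet starts with the 20-byte outer IPv4 header, and the inner (original) IPv4 header follows immediately, so the inner source/destination addresses sit at fixed byte offsets (assuming neither header carries IP options). A small Python sketch of pulling the inner addresses out of a raw capture buffer:

```python
import ipaddress
import struct

OUTER_IP_LEN = 20  # outer IPv4 header added by IPIP encapsulation

def inner_addresses(pkt: bytes):
    """Return (src, dst) of the inner IPv4 header in an IPIP packet.

    Assumes neither header carries IP options (IHL == 5).
    """
    inner = pkt[OUTER_IP_LEN:]
    # Inner IPv4 header: source at bytes 12-15, destination at bytes 16-19.
    src, dst = struct.unpack_from("!4s4s", inner, 12)
    return str(ipaddress.ip_address(src)), str(ipaddress.ip_address(dst))
```

    The same offsets suggest a tcpdump byte-offset filter could work here (something like `ip[32:4]` for the inner source address, since it starts 32 bytes into the IP data), though I'd verify that against a real capture first.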

     

    Anyways, I hard-coded these filters into Compressor V1.

     

    The challenge I've mostly faced with implementing these filters is trying to whitelist Steam traffic. I was able to do so after some time and multiple packet captures. I had to do this with each game we host under the network at the moment because some games use TCP to communicate with Steam while others use UDP. It's a mess in my opinion, but I believe I got it figured out and I've tested this 5 times per game with all successful attempts.

     

    As of right now, I'm waiting for our POP hosting provider to fix our BGP issue, which is causing a roadblock with deploying the temporary solution. I'm hoping to move away from this provider eventually or at least get more hosting providers in the future (read below) so we aren't relying on just our current provider (their support has been horrible in my experience). Since we've got our own ASN, we're considered multi-homed and can find more hosting providers. I've already done this when we set up our Hivelocity POP in NYC a few months back (it was discontinued due to unrelated issues and pricing). Therefore, I know what I need to do for the future.

     

    This temporary solution should stop forwarding TCP/UDP floods along with reflection attacks to our game server machines. There's still a chance a POP could be overloaded and, if it is, only traffic routing to that POP will be affected. From what I've seen, the recent attacks are just typical UDP floods; there's nothing special about them.

     

    It's also important to get the TC BPF program I made here working on GS12 so we don't have another single point of failure. I'm still waiting for our hosting provider to look into why certain upstreams are filtering traffic spoofed as our Anycast network (needed for the TC BPF program to work properly). I will request an update on this tomorrow. All other machines are using this program successfully.

     

    Permanent Solution

    The permanent solution to this issue will come with Compressor V2. While Compressor V2 will support both a whitelist and blacklist approach, we (GFL) will be taking a whitelist approach to all game server traffic (client => game server) which is a lot safer in my opinion.

     

    The goal is to make it so no malicious traffic will ever be forwarded to the game server machines. If this is the case, the (D)DoS protection will primarily rely on network capacity and resources. The plan is to heavily expand our network by getting 60 - 100+ POPs along with 2 - 3 solid new hosting providers (I've sent many emails to hosting providers in the past and continue to do so). With Compressor V2, we're planning to automate literally everything. Therefore, all new POPs will be set up automatically with Compressor V2 using API scripts, etc. This shouldn't be too difficult, and we'll just need a template for the BIRD config for BGP.
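    As a sketch of the "template for the BIRD config" idea, automated POP provisioning could render per-POP values into a config template along these lines. The template is loosely modeled on BIRD 1.x syntax and every field value here is made up for illustration; it is not a working config:

```python
# Hypothetical BIRD config template for announcing the Anycast prefix.
BIRD_TEMPLATE = """\
router id {router_id};

protocol bgp upstream {{
    local as {local_asn};
    neighbor {neighbor_ip} as {neighbor_asn};
    export where net = {anycast_prefix};
}}
"""

def render_bird_config(router_id, local_asn, neighbor_ip, neighbor_asn,
                       anycast_prefix):
    """Fill the template with one POP's BGP session details."""
    return BIRD_TEMPLATE.format(
        router_id=router_id,
        local_asn=local_asn,
        neighbor_ip=neighbor_ip,
        neighbor_asn=neighbor_asn,
        anycast_prefix=anycast_prefix,
    )
```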

     

    With that said, we plan to implement a monitoring system that will check the resource usages (CPU and network) at each POP and location. If a location or POP is found with high resource usages, we'll automatically spin up temporary POPs to load balance the traffic. Most hosting providers load-balance traffic between POPs at each location such as our current provider (a round-robin method). This is so we're not wasting money on POPs we don't necessarily need while attacks aren't going on, but if there's a long (D)DoS occurring, temporary POPs will be spun up to absorb the attack.

     

    Filtering Structure For Compressor V2

    I'm going to share an image of my idea for implementing filtering rules and modules (for whitelisting clients after validating a game server handshake) into Compressor V2:

     

    2775-05-23-2020-nbC0Gn6e.png

     

    I won't go into detail on this since that's out of this thread's scope. However, I will provide more detail later and I'll likely post something here.

     

    Note that malicious traffic will be dropped via XDP-native (one of the fastest hooks in the Linux networking path besides maybe DPDK). I had a discussion on the XDP Newbies Mailing List here. David Ahern (a super intelligent guy) confirmed that XDP-native is still a lot faster than XDP-generic even on the virtio_net driver (what we'll be using since we're going to have a VPS as each POP server). Beforehand, I was under the assumption XDP-native would only be useful if the hosting provider offloaded packets off their cluster's NIC directly to the VPS. This is not the case and there's actually a separate XDP mode for this (XDP_FLAGS_HW_MODE) along with only one NIC driver being supported.

     

    Rate Limiting And Sent/Received Ratio Thresholds

    The first line of defense we'll have on the network is rate limiting and sent/received ratio thresholds.

     

    The first part is pretty self-explanatory. We're going to be limiting the number of packets per second (and bytes per second) a source IP can send to our network. If an IP is found sending more than the thresholds, we'll add it to the XDP program's blacklist map to have its traffic dropped via XDP-native for a certain amount of time (probably 30 minutes or something, since it's definitely malicious traffic).
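    A toy model of that logic, assuming a hypothetical packets-per-second threshold (this is a sketch of the idea, not the XDP implementation, which would use BPF maps):

```python
PPS_LIMIT = 1000          # hypothetical packets-per-second threshold
BLOCK_SECS = 30 * 60      # blacklist duration (30 minutes)

counters = {}             # src_ip -> (window_start, packets_in_window)
blacklist = {}            # src_ip -> time the block expires

def handle_packet(src_ip: str, now: float) -> str:
    """Count a packet and return 'drop' if the source is over the limit."""
    if blacklist.get(src_ip, 0) > now:
        return "drop"
    start, count = counters.get(src_ip, (now, 0))
    if now - start >= 1.0:          # start a new one-second window
        start, count = now, 0
    count += 1
    counters[src_ip] = (start, count)
    if count > PPS_LIMIT:
        blacklist[src_ip] = now + BLOCK_SECS
        return "drop"
    return "pass"
```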

     

    The second part is going to be a ratio of the number of packets a source IP sends to the number it receives. Empty UDP floods not targeting a specific service will likely not receive a UDP response from our game servers. Therefore, we'll start blocking the source IP after a certain threshold (e.g. 200 sent packets per response). We'll also have to find a way to exclude cases where a game server is down, for example.
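    A similarly simplified model of the sent/received ratio check (the threshold comes from the example above; the structure and names are hypothetical):

```python
RATIO_THRESHOLD = 200  # e.g. 200 packets sent to us per response we send back

stats = {}  # src_ip -> [packets_received_from_ip, responses_sent_to_ip]

def record_inbound(src_ip: str):
    stats.setdefault(src_ip, [0, 0])[0] += 1

def record_outbound(dst_ip: str):
    stats.setdefault(dst_ip, [0, 0])[1] += 1

def over_ratio(src_ip: str) -> bool:
    """True if the source has sent far more packets than it got back."""
    sent, received = stats.get(src_ip, (0, 0))
    # Treat zero responses as one to avoid division by zero.
    return sent / max(received, 1) > RATIO_THRESHOLD
```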

     

    Note - Legitimate traffic based on the whitelisted handshakes (explained below) will be accepted before the blacklist map drops traffic. This makes it so an attacker can't spoof a random source IP on each packet, possibly hit the IP address of somebody on the game server, and impact them. We'll still have fair rate limits applied to users on the game servers as well. Therefore, spoofing a certain player's IP will still not be able to take down the network. It may impact that client, though (if it does, they'll have to change their public IP). I'd be surprised to see an attacker go that far (I haven't seen it before), but better to be prepared :)

     

    Cached Packet Types

    Attacks targeting a specific service can be damaging (e.g. the A2S_INFO query). I've done a lot of pen-testing, and it's super easy to take down a server by sending many A2S_INFO requests, for example, if the game supports these queries (all of our game servers do). Since the server replies to these requests, it's usually easy to make the server use all of its resources by sending a high amount of low-throughput packets (e.g. the A2S_INFO request payload only needs to contain the bytes 0xFF 0xFF 0xFF 0xFF 0x54). To prevent this, we will be caching these specific packets. This'll make it so attackers can't target a specific service within our game servers and cause it to use all of its resources. The load will be distributed throughout the POPs instead.
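    A minimal model of the caching idea: recognize the A2S_INFO query by its payload prefix and answer from a cached response instead of forwarding to the game server. The cached payload here is a placeholder (real responses carry the server info after a response header), and the function names are made up:

```python
# A2S_INFO queries start with these five bytes (from the post above).
A2S_INFO_QUERY = b"\xff\xff\xff\xff\x54"

# Placeholder cached reply, refreshed periodically from the real server.
cached_response = b"\xff\xff\xff\xff\x49..."

def process(payload: bytes):
    """Answer A2S_INFO queries from the POP's cache; forward everything else."""
    if payload.startswith(A2S_INFO_QUERY):
        return ("reply_from_pop", cached_response)
    return ("forward_to_server", payload)
```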

     

    Note - Some games support caching the A2S_INFO query via an extension. I haven't been able to find a working extension yet, but theoretically, if it was compiled correctly, it would be a decent defense. It's still better to cache the packet at the POP level, though, since it'll allow you to distribute the load throughout all the POPs instead of having the game server respond to each query with the cached response. From my findings, you can still take down the game server even if you use the extension.

     

    Note - Compressor V1 currently caches the A2S_INFO response as well. However, this is hard-coded into Compressor V1 and with Compressor V2, you will have the ability to cache certain packets easily by adding them via a form in the panel. More info on the plan to implement this can be found here.

     

    How To Whitelist Outbound Game Server Traffic?

    Since we're taking a whitelist approach with Compressor V2 along with a module system, the first thing I thought about is how we're going to whitelist traffic that the game server sends out (e.g. Steam traffic, API requests from the game servers, MySQL connections, etc). Due to how we're going to have our Docker containers and network namespaces set up, outbound traffic the server sends will be encapped in FOU and will be processed on the POP by the two TC BPF programs I made here. This does NOT include traffic from the game server going back to the players, since that goes through a separate route and device and is not encapped with FOU (e.g. the veth pair/bridge connecting the network namespace to the main host).

     

    FOU is similar to IPIP, but it includes an outer UDP header of 8 bytes as well as the outer IP header (20 bytes). The outer UDP header's source and destination port will represent the FOU port. This supports UDP and TCP for the inner headers.
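    In concrete terms, that means a FOU-encapsulated packet carries 28 bytes of extra headers in front of the original packet: the 20-byte outer IP header plus the 8-byte outer UDP header whose ports carry the FOU port. A sketch of stripping that encapsulation (assuming no IP options in the outer header):

```python
OUTER_IP_LEN = 20   # outer IPv4 header
OUTER_UDP_LEN = 8   # outer UDP header; its ports represent the FOU port
FOU_OVERHEAD = OUTER_IP_LEN + OUTER_UDP_LEN  # 28 bytes total

def fou_decap(pkt: bytes) -> bytes:
    """Strip the outer IP + UDP headers, returning the inner packet."""
    return pkt[FOU_OVERHEAD:]
```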

     

    Therefore, on the POP server, for any valid incoming FOU requests, we'll add the inner IP header's destination address to a BPF map serving as the XDP-native program's whitelist for a certain period of time (let's say 45 seconds). We don't want to permanently whitelist these IPs; at least, that's not an approach I'd like, in case a malicious attacker spoofs an IP that's on the whitelist.
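    The time-limited whitelist described above can be modeled as a map of address to expiry time: entries are refreshed on each valid FOU packet and simply ignored once stale. The 45-second TTL comes from the post; everything else is a hypothetical sketch of the logic, not the BPF map code:

```python
WHITELIST_TTL = 45.0  # seconds an entry stays valid without being refreshed

whitelist = {}  # inner destination IP -> expiry timestamp

def whitelist_destination(ip: str, now: float):
    """Add/refresh a whitelist entry when a valid FOU packet is seen."""
    whitelist[ip] = now + WHITELIST_TTL

def whitelist_active(ip: str, now: float) -> bool:
    """Entries silently expire instead of being whitelisted permanently."""
    return whitelist.get(ip, 0.0) > now
```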

     

    Ez pz :)  (not all that easy in programming terms, though, due to certain checks we'll need to implement, etc)

     

    Whitelisting Game Server Traffic

    Now this is when things get interesting :) I've been doing plenty of packet captures just trying to understand the handshake process for players connecting to our game servers. I've been using Wireshark to inspect these packet captures, but using tcpdump to record the actual captures on our game server machines. I still need to do a lot more digging into the Steam networking library to confirm everything below, but here's what I have so far.

     

    Also, here's a screenshot of all my packet captures!

     

    2744-05-25-2020-sg8mwcDg.png

     

    Quite a bit 😄

     

    SRCDS Games

    When a client connects to standard SRCDS game servers (in our case, FOF, GMod, CS:S, TF2, and CS:GO), they're usually using 27005 as the default source port. This can be changed by adding +clientport xxxxx to your launch options. However, there is rarely ever a need to change it.

     

    When I first connected to my test CS:S server, I sent a request with 0xFF 0xFF 0xFF 0xFF 0x71 as the header bytes:

     

    2745-05-25-2020-UyV5MQas.png

     

    Afterwards, the server sends back a response with 0xFF 0xFF 0xFF 0xFF 0x41 set:

     

    2746-05-25-2020-iV48uUHb.png

     

    Then, the client sends a response that differs per game, I believe (0xFF 0xFF 0xFF 0xFF 0x6B for CS:S, though):

     

    2747-05-25-2020-9q8H2llh.png

     

    Finally, the server sends back a response with 0xFF 0xFF 0xFF 0xFF 0x42 set:

     

    2748-05-25-2020-SxabioFf.png

     

    While I haven't inspected the actual data yet (I plan on doing so), I believe the 0xFF 0xFF 0xFF 0xFF 0x42 request the game server sends back to the client is an indicator that the client is legit and valid. Therefore, with our module system in Compressor V2 and utilizing TC ingress and egress filters, I believe we can whitelist traffic when the server responds to the client with the first five bytes being 0xFF 0xFF 0xFF 0xFF 0x42.
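    Under that theory, a TC egress module would watch server-to-client payloads and whitelist the client's address when the final handshake reply goes out. A simplified sketch: the byte signature is taken from the captures above, while the function and variable names are hypothetical:

```python
# Final server reply observed in the SRCDS handshake captures.
CONNECTION_ACCEPTED = b"\xff\xff\xff\xff\x42"

whitelisted_clients = set()

def on_egress(dst_ip: str, payload: bytes):
    """Whitelist the client when the server sends the handshake completion."""
    if payload.startswith(CONNECTION_ACCEPTED):
        whitelisted_clients.add(dst_ip)
```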

     

    I will dig more deeply into this later to confirm as mentioned above. However, that's my theory as of right now.

     

    Rust

    Rust is pretty easy to detect as well. It uses the RakNet networking library, which I'm not that familiar with yet, but I do want to look into it more in the future.

     

    The first request the client sends contains a single header byte of 0x05:

     

    VjQj3rUb1S.png

     

    Afterwards, the server sends a response with a single header byte of 0x06:

     

    flJIrUl4hi.png

     

    Then, the client responds back with a single header byte of 0x07:

     

    VP9x8ydJvp.png

     

    And finally, the server responds back to the client with a single header byte of 0x08:

     

    3n03G7mZCZ.png

     

    Note - I blocked out irrelevant information since I performed this packet capture on a public Rust server. Therefore, I didn't want anybody to get my public IPv4 address.

     

    Anyways, I believe we can whitelist legit Rust player IPs based on the destination IP of the response the server sends with 0x08 set as the single header byte.

     

    I still need to look into the RakNet library to confirm this specific response means the connection is valid and accepted.
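    The observed 0x05 → 0x06 → 0x07 → 0x08 exchange could be tracked per client as a tiny state machine, whitelisting once the server's 0x08 reply is seen. This sketch is based only on the captures above (RakNet's actual semantics still need confirming, as noted), and all names are hypothetical:

```python
# Expected first-byte sequence from the captures, alternating client/server.
EXPECTED = [0x05, 0x06, 0x07, 0x08]

progress = {}  # client IP -> how many handshake steps have been seen

def observe(client_ip: str, first_byte: int) -> bool:
    """Advance the client's handshake state; True once 0x08 completes it."""
    step = progress.get(client_ip, 0)
    if step < len(EXPECTED) and first_byte == EXPECTED[step]:
        progress[client_ip] = step + 1
    else:
        progress[client_ip] = 0  # out-of-order packet resets the handshake
    return progress.get(client_ip, 0) == len(EXPECTED)
```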

     

    Other Games

    Other games I will be looking into in the near future for this include:

     

    • Killing Floor 1/2.
    • Left 4 Dead 1/2 (they should be using the standard SRCDS methods, but I just want to confirm).
    • Arma 3.
    • Red Orchestra 2.
    • DayZ.
    • FiveM.
    • RedM.
    • And more.

     

    Conclusion

    I know this is a very long thread, but I hope some people find the discoveries interesting. I also hope this clarifies some of the recent statements regarding our network and shows that Dreae and I are doing everything we can to strengthen our (D)DoS protection.

     

    If you have any questions, feel free to ask!

     

    Thank you.

  11. It turns out XDP-native breaks the AF_XDP side of Compressor V1 (A2S_INFO caching in our case). I made a thread on the XDP Newbies mailing list with the following:

     

    2765-05-22-2020-sFYIDgL8.png

     

    I made a test AF_XDP project here to see if it made any difference using the latest recommended AF_XDP code from XDP-Tutorial (Compressor V1 uses outdated AF_XDP code and an older version of LibBPF). This is the biggest issue right now: even when Compressor V1 does have RX queue #1 loaded with AF_XDP, the program isn't sending A2S_INFO packets back to the client. I know the socket is receiving traffic from the XDP redirect map function, so I think the AF_XDP functions currently used to send data back out are outdated. Therefore, I will work to update this.
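For context, the packets the XDP program redirects to the AF_XDP socket are standard A2S_INFO queries. A userspace sketch of that match, assuming the classic query format (header bytes plus the "Source Engine Query" string):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard A2S_INFO query: 0xFF 0xFF 0xFF 0xFF 0x54 ('T') followed by
   the string "Source Engine Query" and a null terminator. */
static const uint8_t a2s_info_query[] = {
    0xFF, 0xFF, 0xFF, 0xFF, 0x54,
    'S', 'o', 'u', 'r', 'c', 'e', ' ',
    'E', 'n', 'g', 'i', 'n', 'e', ' ',
    'Q', 'u', 'e', 'r', 'y', '\0'
};

/* Returns true if the UDP payload is an A2S_INFO request; these are the
   packets the XDP program redirects to the AF_XDP socket so the cache
   can answer them without touching the game server. */
static bool is_a2s_info_request(const uint8_t *payload, size_t len)
{
    return len >= sizeof(a2s_info_query) &&
           memcmp(payload, a2s_info_query, sizeof(a2s_info_query)) == 0;
}
```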

     

    I've been working hard on this for the last day or so and hope I can get this resolved by the end of this weekend (assuming this isn't an issue with Vultr's NIC drivers).

     

    Thanks!

  12. Hey everyone,

     

    I just wanted to let everyone know I've added a new POP to the Anycast network in Seoul, South Korea. I've also made a pull request to Compressor V1 that adds support for XDP-native here, and the new POP is currently running with XDP-native. The added POP and XDP-native support should add more capacity to our network to handle (D)DoS attacks, etc. With that said, I plan on adding filtering rules once I figure out which traffic I want to whitelist on the network.

     

    @Dreae and I are still working on Compressor V2 and I recently completed the FOU Wrapping/Unwrapping TC BPF programs here which were needed for outbound server traffic and reporting to the Steam Master Server. We're making great progress so far :) Once this is finished, we'll be highly expanding our network.
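For anyone curious what FOU wrapping involves: it's simply the inner IP packet encapsulated behind an outer IP + UDP header. Here's a hedged userspace sketch of filling the outer UDP header; the struct and port values are illustrative only and not Compressor V2's actual code (the TC BPF version would instead grow the packet and write headers with helpers such as bpf_skb_adjust_room() and bpf_skb_store_bytes()):

```c
#include <arpa/inet.h> /* htons(), ntohs() */
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for the kernel's struct udphdr. */
struct udphdr_sim {
    uint16_t source;
    uint16_t dest;
    uint16_t len;   /* header + payload length, network byte order */
    uint16_t check;
};

/* FOU (Foo-over-UDP) prepends an outer IP + UDP header to the inner IP
   packet; this fills in the outer UDP header for an inner packet of
   inner_len bytes. */
static void fou_build_udp_hdr(struct udphdr_sim *udp, uint16_t sport,
                              uint16_t dport, uint16_t inner_len)
{
    udp->source = htons(sport);
    udp->dest   = htons(dport);
    udp->len    = htons((uint16_t)(sizeof(*udp) + inner_len));
    udp->check  = 0; /* UDP checksum is optional over IPv4 */
}
```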

     

    If you experience any issues, please let me know!

     

    Thanks!

  13. None of our POP servers are down.

     

    Would you be able to perform a traceroute or MTR to 92.119.148.19? In Windows Command Prompt, you may execute the following to perform a traceroute to the IP:

 

    tracert 92.119.148.19

     

    Please provide me the results when done.

     

    Thanks.

  14. I just received another email from Toke:

     

    2693-05-13-2020-EWccYkcb.png

     

    It looks like the XDP maintainer (Jesper) was also having similar issues and made these pull requests just last week (very recent):

     

    https://github.com/xdp-project/xdp-tutorial/pull/123
    https://github.com/xdp-project/xdp-tutorial/pull/124

     

    Interesting, but I'm glad I'm not the only person having these issues and that this seems to be an issue with BPF itself. It shouldn't be this difficult to match payload data in my opinion, lol.

     

    Thanks!

    Just an update on this: BPF really doesn't like the payload-matching functionality I'm trying to implement with for loops. No matter what I do, I cannot get it working, and I've tried asking experienced developers such as @Dreae along with making a thread on the XDP Newbies mailing list. Even some of the experienced kernel devs from the mailing list aren't sure. They've stated the BPF verifier isn't convinced the code is safe (even though it is from our perspective).

     

    Toke from the XDP Newbies mailing list suggested this:

     

    Quote

    Use a matching algorithm that doesn't require looping through the packet byte-by-byte as you're doing now. For instance, you could have a hash map that uses the payload you're trying to match as the key with an appropriate chunk size.

     

    I've asked for clarification this morning. I searched the BPF helper functions list here, but wasn't able to find any hashing function that supports XDP programs. There are some for BPF programs that use the sk_buff structure; however, that structure is typically available in TC programs. With that said, those hashing functions operate on the entire packet, not just the payload, unfortunately.
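To illustrate Toke's suggestion, here's a userspace sketch of building the fixed-size, zero-padded key that a BPF program would pass to bpf_map_lookup_elem() against a BPF_MAP_TYPE_HASH of known payload chunks, replacing the byte-by-byte loop the verifier rejects. The chunk size is an arbitrary choice of mine, not anything Toke specified:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 32 /* arbitrary; must match the BPF map's key size */

struct payload_key {
    uint8_t data[CHUNK_SIZE];
};

/* Build a fixed-size, zero-padded key from the first CHUNK_SIZE bytes of
   the payload. In the BPF program, this key would be handed to
   bpf_map_lookup_elem(); the verifier is happy because the copy has a
   constant upper bound instead of a payload-length-dependent loop. */
static void build_payload_key(struct payload_key *key,
                              const uint8_t *payload, size_t len)
{
    memset(key, 0, sizeof(*key));
    memcpy(key->data, payload, len < CHUNK_SIZE ? len : CHUNK_SIZE);
}
```

The trade-off is that matching becomes exact on a fixed-size chunk rather than a free-form pattern anywhere in the payload.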

     

    Hopefully I can find some sort of workaround, because having dynamic payload matching in an XDP program would be amazing, and I feel it's a necessity for making a firewall that blocks (D)DoS attacks either automatically (based on patterns) or via manually entered filtering rules.

     

    Thanks.

  16. Hi everyone,

     

    I was thinking about creating some C tutorials utilizing eBPF, XDP, and TC. However, first, I wanted to see if anybody had interest in this. You can give this article a read to understand (e)BPF and XDP more. There are many other resources online that can be found via Google.

     

    Basically, you can make XDP/TC programs for fast packet processing. Unfortunately, the BPF verifier is quite strict and can be a pain when trying to implement more complex features (such as payload matching in my case).
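As a taste of what such a tutorial would cover, here's the parsing style XDP forces on you, sketched in userspace: every read must be preceded by an explicit bounds check against data_end, or the verifier rejects the program. The simulated return codes and the drop-all-UDP policy are just for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Simulated XDP verdicts (the real ones are XDP_DROP and XDP_PASS from
   <linux/bpf.h>). */
enum { XDP_DROP_SIM = 1, XDP_PASS_SIM = 2 };

/* XDP-style parse of a raw frame: each read is preceded by an explicit
   bounds check against data_end, exactly as the BPF verifier requires.
   This toy policy drops all IPv4 UDP traffic. */
static int filter_packet(const uint8_t *data, const uint8_t *data_end)
{
    /* Ethernet header: 14 bytes. */
    if (data + 14 > data_end)
        return XDP_PASS_SIM;

    uint16_t eth_proto = (uint16_t)((data[12] << 8) | data[13]);
    if (eth_proto != 0x0800) /* not IPv4 */
        return XDP_PASS_SIM;

    /* IPv4 header: at least 20 bytes. */
    const uint8_t *ip = data + 14;
    if (ip + 20 > data_end)
        return XDP_PASS_SIM;

    if (ip[9] == 17) /* IPPROTO_UDP */
        return XDP_DROP_SIM;

    return XDP_PASS_SIM;
}
```

Forgetting a single one of those bounds checks is the most common way the verifier rejects a beginner's program, which is why a tutorial would drill this pattern early.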

     

    If anybody is also interested in seeing benchmarks with XDP, TC, and other tools such as IPTables, you may view this article or refer to the following image:

     

    numbers-xdp-1.png

     

    Thanks.
