Roy - Posted December 23, 2020

Hi everybody,

Due to recent (D)DoS attacks against GFL/HG's CS:GO Jailbreak server, I've made some upgrades to our Anycast network, applied filters in certain locations, and more. The following changes have been made:

- Replaced and upgraded our LA, Miami, and Toronto POP servers. We went from one core to two cores on these servers.
- Applied filters to the upgraded POPs above, along with our Chicago POP.
- Disabled kernel debugging within Compressor on our NYC POP, which was causing higher CPU usage when an attack hit this specific POP. I honestly forgot kernel debugging was enabled; I had initially enabled it to debug the newer filters in case things broke.
- Added support for Rust+ on the POPs mentioned above and our NYC POP. Rust+ should now work for players routing through these POPs.
- Additionally, allocated new IPs for HellsGamers.

With that said, these attacks were mostly UDP floods and typical reflection attacks. GSK filtered most of them, but some of the malicious traffic coming in through our NTT carrier was being forwarded to the game server machines, creating a single point of failure. If the attacker finds ways through the filters, I'll be looking into the following:

- Setting up a BGP community with GSK so that when GSK detects an attack, it withdraws NTT from the announcements on our POPs with other hosting providers. This would result in all traffic going through GSK on the servers under attack.
- Finally implementing global IP list support into my more in-depth filters, which would make applying the in-depth filters to our NA POPs more acceptable and likely to happen. I haven't done it so far because it's more complex and I want to work on Bifrost instead.
- Spinning up more POP servers to absorb the attack if the malicious traffic isn't being forwarded.
- Pushing Bifrost forward (I wish, but this will take time).

Bifrost Update

I just wanted to provide a small update on Bifrost.
There are currently two roadblocks in building the forwarding aspect of the program:

1. Figuring out how to replicate NFTables/IPTables-like forwarding with regard to source port mapping in BPF (via BPF maps).
2. The max jump sequence limit is far too low for the amount of complexity we plan to implement into Bifrost if we want to write it all within XDP (which would be the more straightforward and fastest way).

In regards to issue #1, I had a plan I shared in the #coding channel on the GFL Discord server. I was planning to build it out with the following structure:

Quote

    Structures
    -----------
    forward_map {
        uint32_t bindaddr;
        uint16_t bindport;
        uint8_t protocol; // 0 = ALL.
        uint32_t destaddr;
        uint16_t dstport;
    }

    connection_info {
        uint32_t saddr;
        uint32_t bindaddr;
        uint8_t protocol;
        uint16_t port;
    }

    connection {
        struct connection_info info;
        uint16_t srcport;
        uint64_t count;
        uint64_t lastseen; // If not needed, will probably remove to improve performance.
    }

    BPF Maps
    -----------
    Forward Map
    - BPF_MAP_TYPE_ARRAY
    - Key Size = uint32_t
    - Value Size = sizeof(struct forward_map)
    - Max Entries = 1
    * This map should be pinned to allow external BPF programs to read/write.

    Connection Map
    - BPF_MAP_TYPE_HASH
    - Key Size = sizeof(struct connection_info)
    - Value Size = sizeof(struct connection)
    - Max Entries = 100000
    * The connection_info represents the source address (u32), bind address (u32), protocol (u8), and port (u16). The key could be declared something like:

        uint128_t key = (uint128_t) iph->saddr << 0 |
                        (uint128_t) forward->bindaddr << 32 |
                        (uint128_t) iph->protocol << 64 |
                        (uint128_t) port << 72;

    Port Outer Map
    - BPF_MAP_TYPE_ARRAY_OF_MAPS
    - Key Size = uint32_t
    - Value Size = uint32_t
    - Max Entries = 1
    * The outer port map will hold the inner UDP and TCP port maps for a specific bind address.
    TCP Port Inner Map (ONE MAP PER BIND ADDRESS)
    - BPF_MAP_TYPE_ARRAY
    - Key Size = uint32_t
    - Value Size = sizeof(struct connection)
    - Max Entries = 65535

    UDP Port Inner Map (ONE MAP PER BIND ADDRESS)
    - BPF_MAP_TYPE_ARRAY
    - Key Size = uint32_t
    - Value Size = sizeof(struct connection)
    - Max Entries = 65535

I found a more recent BPF function called bpf_map_push_elem(), which, to my understanding, pushes a value into a BPF map under a random key (the key in this case being the port number). It operates as LRU (Least Recently Used), meaning that if the map is full, it removes the least recently updated value, which is perfect for our needs. Unfortunately, this function doesn't return the key after insertion, which is needed for what I need to do.

Quote

    long bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags)

    Description
        Push an element value in map. flags is one of:

        BPF_EXIST
            If the queue/stack is full, the oldest element is removed to make room for this.

    Return
        0 on success, or a negative error in case of failure.

In regards to issue #2, I was running into this limitation with my XDP Firewall when trying to raise the max filters above 100. There's no doubt in my mind we'll run into this limitation with Bifrost. Thankfully, however, raising the limit here should not be a big deal. We'll just need to rebuild the kernel with the raised limit (and run custom kernels until/unless the kernel developers raise this limit natively, which has happened in the past). The reason the kernel developers implement these limitations is to prevent long BPF programs from impacting performance; Bifrost, however, won't impact performance in this case.

If you run into any issues, please let me know. Thank you.
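To illustrate the 128-bit connection key from the plan above, here's a minimal userspace sketch of the bit packing. This is just an illustration: it uses GCC/Clang's `__uint128_t` extension, and the helper name `make_conn_key` is made up for the example.

```c
#include <stdint.h>

/* Pack source address, bind address, protocol, and port into one
 * 128-bit connection key, mirroring the connection_info layout
 * quoted above. Relies on the GCC/Clang __uint128_t extension. */
static __uint128_t make_conn_key(uint32_t saddr, uint32_t bindaddr,
                                 uint8_t protocol, uint16_t port)
{
    return (__uint128_t)saddr << 0 |
           (__uint128_t)bindaddr << 32 |
           (__uint128_t)protocol << 64 |
           (__uint128_t)port << 72;
}
```

Each field can be recovered by shifting back down, which is what makes this layout workable as a hash map key.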
Roy - Posted December 24, 2020

An Update On Bifrost

I am currently recompiling the 5.10.1 kernel to increase the max jump sequences to 126K (from 8192) on a one-core VPS. Hopefully this doesn't fail at the very end, because if it does, I'll jump out the window. It's already taking really long as is!

Anyways, I believe Dreae and I have a plan. It turns out, after further testing, that the bpf_map_push_elem() function is only available for the map type BPF_MAP_TYPE_STACK. When compiling with BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_HASH, I'd always receive the following when executing the program:

    36: (85) call bpf_map_push_elem#87
    cannot pass map_type 2 into func bpf_map_push_elem#87
    processed 36 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1

Here, map_type was either 1 or 2 depending on the map used. The program compiled using the stack map type, but I could never get bpf_map_push_elem() to work with bpf_map_peek_elem() (I was hoping this would grab the last inserted key) anyway. Therefore, we're left to do everything manually within an LRU hash map (we want to find a random available port). After discussions on Discord, I believe we found a loop that will suit (possibly with slight modifications). The below would probably be the best way to approach this:

    uint64_t leasttime = UINT64_MAX;
    uint32_t leastkey = 0;

    // Note: the loop counter must be wider than 16 bits; a uint16_t
    // counter would make i <= 65535 always true and never terminate.
    for (uint32_t i = 1; i <= 65535; i++)
    {
        uint32_t key = i;
        uint64_t *val = bpf_map_lookup_elem(&port_map, &key);

        if (!val)
        {
            // Free port found; take it immediately.
            leastkey = i;
            break;
        }
        else if (*val < leasttime)
        {
            // Otherwise track the least recently used entry.
            leasttime = *val;
            leastkey = i;
        }
    }

    bpf_map_update_elem(&port_map, &leastkey, &now, BPF_ANY);

I just figured I'd be transparent about this all.
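To sanity-check that loop outside the kernel, here's a minimal userspace sketch of the same manual-LRU port selection, with the BPF map stood in for by a plain array. The names (`pick_port`, `port_map`) and the tiny port range are made up for the example:

```c
#include <stdint.h>

#define NUM_PORTS 8 /* small for illustration; Bifrost would scan the full port range */

/* 0 means the port is free; otherwise the value is a last-seen
 * timestamp, standing in for the BPF LRU hash map values above. */
static uint64_t port_map[NUM_PORTS + 1];

/* Return a free port if one exists; otherwise evict the
 * least-recently-seen entry (manual LRU), exactly as in the
 * BPF loop above. */
static uint16_t pick_port(uint64_t now)
{
    uint64_t leasttime = UINT64_MAX;
    uint16_t leastkey = 0;

    for (uint32_t i = 1; i <= NUM_PORTS; i++)
    {
        if (port_map[i] == 0) /* free slot: take it immediately */
        {
            leastkey = (uint16_t)i;
            break;
        }

        if (port_map[i] < leasttime) /* otherwise track the oldest entry */
        {
            leasttime = port_map[i];
            leastkey = (uint16_t)i;
        }
    }

    port_map[leastkey] = now; /* claim (or refresh) the chosen slot */
    return leastkey;
}
```

Once every port is claimed, the port with the smallest timestamp gets recycled first, which is the behavior we'd otherwise have gotten for free from bpf_map_push_elem() with BPF_EXIST.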
The custom kernel I'm building is still compiling, but I'll let you know the results. Assuming we don't hit the BPF stack limit of 512 bytes or any other limitations, I believe we should be good to go if we can get the forwarding aspect working with BPF/XDP.

Thanks!
Roy - Posted December 25, 2020

Another Update

I spent most of the day compiling the 5.10.1 Linux kernel. Initially, I tried raising the max jump sequence to 126K here and recompiling the kernel. However, I then started running into buffer size issues regarding iproute2:

    root@test02:/home/roy/iproute2/ip# ./ip link set ens18 xdpgeneric obj /home/dev/HelloWorld/xdp_bpf_push.o section xdp_prog
    Log buffer too small to dump verifier log 33554432 bytes (11 tries)!
    Error fetching program/map!

It turns out the buffer was too small for the BPF error. I tried changing the max log size to an unsigned 64-bit integer, with UINT64_MAX as the value, and compiling iproute2 on my own. Unfortunately, I had no luck; although the log size was higher than the original, it still wasn't enough. Therefore, I had to load the XDP program using libbpf in C, which allowed me to see the error:

    60: (85) call bpf_map_lookup_elem#1
    61: (15) if r0 == 0x0 goto pc+38
    R0_w=map_value(id=0,off=0,ks=2,vs=24,imm=0) R6=pkt(id=0,off=26,r=34,imm=0) R7_w=inv(id=176463) R8_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R9_w=inv58821 R10=fp0 fp-8=mm?????? fp-32=??????mm fp-40=mmmmmmmm
    62: (79) r1 = *(u64 *)(r0 +0)
    R0_w=map_value(id=0,off=0,ks=2,vs=24,imm=0) R6=pkt(id=0,off=26,r=34,imm=0) R7_w=inv(id=176463) R8_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R9_w=inv58821 R10=fp0 fp-8=mm?????? fp-32=??????mm fp-40=mmmmmmmm
    63: (3d) if r1 >= r7 goto pc+3
    R0_w=map_value(id=0,off=0,ks=2,vs=24,imm=0) R1_w=inv(id=0) R6=pkt(id=0,off=26,r=34,imm=0) R7_w=inv(id=176463) R8_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R9_w=inv58821 R10=fp0 fp-8=mm?????? fp-32=??????mm fp-40=mmmmmmmm
    BPF program is too large. Processed 1000001 insn
    processed 1000001 insns (limit 1000000) max_states_per_insn 4 total_states 11772 peak_states 11772 mark_read 2

The BPF program was too large for the verifier, as indicated by "BPF program is too large". The check was occurring here.
I had to raise the BPF_COMPLEXITY_LIMIT_INSNS constant, which was declared here. I raised this to 100 million (instead of 1 million). Sadly, I again ran into a max jump sequence limitation, so I had to raise my already increased max jump sequence limit to 126 million instead of 126K.

I also found out how to recompile the Linux kernel into deb files without cleaning everything, which is what had been making the compilation take 4 - 6 hours even with 4 cores. I just had to use the make bindeb-pkg -j 4 command instead of make deb-pkg -j 4. This resulted in the compilation taking only around 40 minutes or so to generate those deb files instead of 4 - 6 hours.

Within my last attempt, I was able to load the BPF/XDP program successfully and also checked /sys/kernel/debug/tracing/trace (or trace_pipe) to ensure the program was working properly:

    root@test02:/home/dev/BPF-Loader# ./loader
    libbpf: Kernel error message: virtio_net: Too few free TX rings available
    XDP-Native may not be supported with this NIC. Using SKB instead.

    root@test02:/home/dev/BPF-Loader# ip link
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
        link/ether 1a:c4:df:70:d8:a6 brd ff:ff:ff:ff:ff:ff

    root@test02:/home/dev/BPF-Loader# ip link set ens18 xdpgeneric obj /home/dev/HelloWorld/xdp_bpf_push.o section xdp_prog
    Note: 16 bytes struct bpf_elf_map fixup performed due to size mismatch!
    root@test02:/home/dev/BPF-Loader# ip link
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdpgeneric qdisc fq_codel state UP mode DEFAULT group default qlen 1000
        link/ether 1a:c4:df:70:d8:a6 brd ff:ff:ff:ff:ff:ff
        prog/xdp id 15 tag 3b09307435253d95 jited

    root@test02:/home/dev/BPF-Loader# ip link set ens18 xdpgeneric off
    root@test02:/home/dev/BPF-Loader#

From the trace pipe:

    <idle>-0 [007] d.s. 429.892795: bpf_trace_printk: Using port 1 with 50331914
    <idle>-0 [007] d.s. 429.937832: bpf_trace_printk: Using port 1 with 50331914
    <idle>-0 [007] d.s. 429.982814: bpf_trace_printk: Using port 1 with 50331914
    <idle>-0 [007] d.s. 430.027811: bpf_trace_printk: Using port 1 with 50331914
    <idle>-0 [007] d.s. 430.073012: bpf_trace_printk: Using port 1 with 50331914
    <idle>-0 [007] d.s. 430.073291: bpf_trace_printk: Using port 1 with 50331914

This shows the program is working. Here's the sample XDP program I made:
    #include <linux/bpf.h>
    #include <linux/bpf_common.h>
    #include <inttypes.h>
    #include <linux/if_ether.h>
    #include <linux/ip.h>
    #include <linux/tcp.h>
    #include <linux/in.h>

    #include "/home/dev/XDP-Firewall/libbpf/src/bpf_helpers.h"

    #define likely(x) __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    #define htons(x) ((__be16)___constant_swab16((x)))
    #define ntohs(x) ((__be16)___constant_swab16((x)))
    #define htonl(x) ((__be32)___constant_swab32((x)))
    #define ntohl(x) ((__be32)___constant_swab32((x)))
    #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    #define htons(x) (x)
    #define ntohs(x) (x)
    #define htonl(x) (x)
    #define ntohl(x) (x)
    #endif

    struct connection
    {
        uint64_t lastseen;
        uint64_t count;
        uint32_t clientaddr;
        uint16_t srcport;
    };

    // Maps the client's source address to its assigned port.
    struct bpf_map_def SEC("maps") connection_map =
    {
        .type = BPF_MAP_TYPE_LRU_HASH,
        .key_size = sizeof(uint32_t),
        .value_size = sizeof(uint16_t),
        .max_entries = 65535
    };

    // Maps an assigned port to its connection details.
    struct bpf_map_def SEC("maps") port_map =
    {
        .type = BPF_MAP_TYPE_LRU_HASH,
        .key_size = sizeof(uint16_t),
        .value_size = sizeof(struct connection),
        .max_entries = 65535
    };

    SEC("xdp_prog")
    int xdp_prog_func(struct xdp_md *ctx)
    {
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;

        struct ethhdr *eth = data;

        if (eth + 1 > (struct ethhdr *)data_end)
        {
            return XDP_DROP;
        }

        if (eth->h_proto != htons(ETH_P_IP))
        {
            return XDP_PASS;
        }

        struct iphdr *iph = data + sizeof(struct ethhdr);

        if (iph + 1 > (struct iphdr *)data_end)
        {
            return XDP_DROP;
        }

        if (iph->protocol == IPPROTO_TCP)
        {
            struct tcphdr *tcph = data + sizeof(struct ethhdr) + (iph->ihl * 4);

            if (tcph + 1 > (struct tcphdr *)data_end)
            {
                return XDP_DROP;
            }

            uint64_t now = bpf_ktime_get_ns();

            uint16_t *sport = bpf_map_lookup_elem(&connection_map, &iph->saddr);

            if (sport)
            {
                struct connection *conn = bpf_map_lookup_elem(&port_map, sport);

                if (conn)
                {
                    if (conn->clientaddr == iph->saddr)
                    {
                        bpf_printk("Using port %" PRIu16 " with %" PRIu32 "\n", *sport, iph->saddr);

                        conn->lastseen = now;

                        return XDP_PASS;
                    }
                }

                // Stale mapping; remove it and assign a new port below.
                bpf_map_delete_elem(&connection_map, &iph->saddr);
            }

            // Look for available ports.
            uint16_t port = 0;
            uint64_t smallest = UINT64_MAX;

            for (uint32_t i = 1; i <= 64000; i++)
            {
                uint16_t tmp = (uint16_t)i;
                struct connection *conn = bpf_map_lookup_elem(&port_map, &tmp);

                if (!conn)
                {
                    port = tmp;
                    break;
                }
                else if (conn->lastseen < smallest)
                {
                    smallest = conn->lastseen;
                    port = tmp;
                }
            }

            if (port > 0)
            {
                // New entry.
                bpf_map_update_elem(&connection_map, &iph->saddr, &port, BPF_ANY);

                struct connection conn = {0};
                conn.clientaddr = iph->saddr;
                conn.lastseen = now;
                conn.srcport = port;

                bpf_map_update_elem(&port_map, &port, &conn, BPF_ANY);
            }
        }

        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";

Now I'm able to actually start working on Bifrost's forwarding aspect in XDP without running into limitations, and if I do run into any more, I know how to raise them and recompile the kernel.
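For reference, the rebuild workflow described above looks roughly like this. This is a sketch, not an exact recipe: the limits themselves are compile-time constants patched by hand in the verifier sources before building, and paths and flags may differ per setup.

```shell
# After patching the verifier limits in the kernel source tree:

# Build straight to .deb packages without cleaning the tree first --
# incremental rebuilds finish in ~40 minutes instead of 4 - 6 hours.
make bindeb-pkg -j 4

# (make deb-pkg -j 4 cleans and rebuilds everything from scratch.)

# Install the generated packages from the parent directory, then reboot.
sudo dpkg -i ../linux-image-*.deb ../linux-headers-*.deb
```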
Roy - Posted December 25, 2020

Update On New POPs

For players routing through our new POPs announced above (LA, Toronto, and Miami), there was a high chance you weren't seeing servers properly within the server browser (e.g. sometimes they'd show, sometimes they wouldn't). This was because, for some reason, these new servers in particular have eight RX queues instead of two like the rest of our POP servers. It would make more sense for us to have only two RX queues, since we have two cores. It doesn't make sense to me, but we just modified Compressor's config to attempt to set up AF_XDP sockets on eight RX queues instead of two; it won't error out if an RX queue isn't available.

Sample on Toronto POP:

    xxxxx@xxxxx:~/compressor-private# ls -la /sys/class/net/ens3/queues/
    total 0
    drwxr-xr-x 18 root root 0 Dec 25 02:08 .
    drwxr-xr-x  5 root root 0 Dec 25 02:08 ..
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-0
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-1
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-2
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-3
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-4
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-5
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-6
    drwxr-xr-x  3 root root 0 Dec 25 02:08 rx-7
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-0
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-1
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-2
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-3
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-4
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-5
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-6
    drwxr-xr-x  3 root root 0 Dec 25 02:08 tx-7

Sample on Singapore POP (same package as Toronto POP):

    xxxxx@xxxxx:~# ls -la /sys/class/net/ens3/queues/
    total 0
    drwxr-xr-x 6 root root 0 Dec 25 02:11 .
    drwxr-xr-x 5 root root 0 Dec 25 02:11 ..
    drwxr-xr-x 3 root root 0 Dec 25 02:11 rx-0
    drwxr-xr-x 3 root root 0 Dec 25 02:11 rx-1
    drwxr-xr-x 3 root root 0 Dec 25 02:11 tx-0
    drwxr-xr-x 3 root root 0 Dec 25 02:11 tx-1

Doesn't make sense to me, but it should be resolved at least.
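If you'd rather not eyeball ls output per machine, counting the rx-* entries under /sys/class/net/&lt;iface&gt;/queues/ does the trick. A small C sketch (count_rx_queues is a made-up helper for illustration; point it at the NIC's queues directory):

```c
#include <dirent.h>
#include <string.h>

/* Count entries whose names start with "rx-" in a queues/ directory,
 * e.g. /sys/class/net/ens3/queues/. Returns -1 if the directory
 * cannot be opened. */
static int count_rx_queues(const char *queues_dir)
{
    DIR *d = opendir(queues_dir);
    if (!d)
        return -1;

    int count = 0;
    struct dirent *ent;

    while ((ent = readdir(d)) != NULL)
    {
        if (strncmp(ent->d_name, "rx-", 3) == 0)
            count++;
    }

    closedir(d);
    return count;
}
```

Something like this could feed Compressor's config directly, so the number of AF_XDP sockets always matches whatever RX queue count the host actually hands us.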