Roy

IPIP Tunnel Bug We Experienced Last Night


Hey everyone,

 

I just wanted to make a knowledge base article regarding an issue one of our game servers (GMod TTT #3) ran into last night. This is related to our Anycast setup and IPIP tunnels on our game server machine. I'm hoping this educates some people and perhaps I can refer back to it again if we run into the same issue.

 

How Our Setup Works

I'd like to briefly go through how our setup works at the moment. As some of you know, we operate an Anycast network. When a client sends traffic to our Anycast network, it is routed to the closest POP based on the AS-PATH and/or BGP hop count. From there, Compressor (the packet processing software created by @Dreae that runs on our current POPs) forwards the traffic to our game server machines via IPIP, based on the forwarding rule for the specific IP address (e.g. the Anycast IP). IPIP (IP-in-IP) is an encapsulation protocol supported by Linux networking that adds an outer IP header to a standard packet, wrapping the original packet inside it. For more information on IPIP, please read here.

 

On our game server machines, we need to set up IPIP tunnels/endpoints that our game servers bind to. Each IPIP tunnel has a remote IP, which typically represents the Anycast IP address, and a single /32 assigned to the interface itself, which represents the internal IP used for NAT. Since we utilize Docker containers for our game servers, we need to add the IPIP tunnel on the host and then set the link's network namespace to the Docker container's network namespace. We use Docker Gen to do this automatically when a Docker container (game server) starts up, and our configuration can be found here.
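
Roughly, the steps above look like the sketch below. This is not our actual Docker Gen template. The tunnel name and remote IP are the ones that come up later in this post, while the internal address 10.2.0.5 and the container PID 12345 are made-up placeholders, and the function names are hypothetical. By default it only prints the commands; set DRY_RUN=0 and run it as root to actually apply them.

```shell
#!/bin/sh
# Sketch only: create an IPIP tunnel on the host and hand it to a container.
# DRY_RUN=1 (the default) prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# attach_ipip <tunnel name> <remote IP> <internal IP> <container PID>
attach_ipip() {
    name=$1 remote=$2 internal=$3 pid=$4
    # Create the tunnel endpoint on the host.
    run ip tunnel add "$name" mode ipip remote "$remote"
    # Move the link into the container's network namespace (selected by PID).
    run ip link set dev "$name" netns "$pid"
    # Assign the internal /32 and bring the link up inside that namespace.
    run nsenter --net="/proc/$pid/ns/net" ip addr add "$internal/32" dev "$name"
    run nsenter --net="/proc/$pid/ns/net" ip link set dev "$name" up
}

# Example (placeholder values): attach_ipip ipip01 92.119.148.99 10.2.0.5 12345
```

The address is added after the move because a link loses its addresses when it changes network namespace.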

 

Our Issue And Fix

Last night, each time GMod TTT #3 started up, no IPIP tunnel was attached. At first, I thought it was a standard bug we've had where we just need to restart the server a few times. However, after trying for a few minutes, it became pretty obvious there was another issue. From here, I tried adding the IPIP tunnel manually by doing ip tunnel add ipip01 mode ipip remote 92.119.148.99. This resulted in the following error:

 

add tunnel "tunl0" failed: File exists

 

This typically means a tunnel with the same remote IP already exists. However, there were no tunnels that we could see at the time. I tried a different remote IP and that worked, which strongly suggested that a tunnel with the original remote IP still existed somewhere.

 

At first, I thought maybe it was in another Docker container somehow (e.g. we configured another server on the same machine to use the same remote IP), so I outputted everything from the ip netns list command into a file (since each Docker container is in its own network namespace), performed a loop, and did ip netns exec <id> ip a and ip netns exec <id> ip tunnel list to see if I could find any. I had no luck with this unfortunately.
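
The loop I described looks roughly like the sketch below. It is a reconstruction rather than the exact commands I ran, and the function name is made up. Note that ip netns list only shows namespaces registered under /var/run/netns, so anything not linked there is invisible to this loop.

```shell
# Sketch: list addresses and tunnels in every namespace known to `ip netns`.
scan_named_netns() {
    # `ip netns list` may print "name (id: N)", so keep the first field only.
    ip netns list | awk '{print $1}' | while read -r ns; do
        echo "== $ns =="
        ip netns exec "$ns" ip -br address show
        ip netns exec "$ns" ip tunnel list
    done
}

# Example: scan_named_netns > netns-scan.txt
```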

 

I then started reading the manual page for ip netns (e.g. man ip netns) and found the following information:

 

Quote

It is possible to lose the physical device when it was moved to netns and then this netns was deleted with a running
process:

           $ ip netns add net0
           $ ip link set dev eth0 netns net0
           $ ip netns exec net0 SOME_PROCESS_IN_BACKGROUND
           $ ip netns del net0

and eth0 will appear in the default netns only after SOME_PROCESS_IN_BACKGROUND will exit or will be killed. To prevent
this the processes running in net0 should be killed before deleting the netns:

          $ ip netns pids net0 | xargs kill
          $ ip netns del net0

 

I've seen this before when we were running machines on mainline kernels. What would happen is, the network namespace would be removed while the game server was still attached to the IPIP tunnel within the network namespace. We'd simply just kill the game server process (which was usually done automatically by Docker) and then check the host machine for the IPIP tunnel via ip link list and then delete it via ip link del <name>. However, this was ALSO not the case (it was one of the first things I checked since it has happened before).
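
For reference, that earlier check on the host amounts to something like this sketch. The helper name is hypothetical, and the parsing assumes the usual one-line format of ip -o link show (index, then the name, optionally followed by @parent). The fallback tunl0 device created by the ipip module shows up in this list too, so don't mistake it for a stale tunnel.

```shell
# Sketch: print the names of IPIP links visible in the current namespace.
list_ipip_links() {
    # -o prints one link per line, e.g. "7: ipip01@NONE: <POINTOPOINT,...".
    ip -o link show type ipip | awk -F': ' '{print $2}' | cut -d@ -f1
}

# Example: review the list, then remove a stale link by hand:
#   list_ipip_links
#   ip link del <name>
```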

 

This had me really confused and I actually tried making a thread on ServerFault here. I wasn't expecting this thread to get any replies since IPIP tunnels and these kinds of issues have little to no documentation (and as expected, it didn't receive any), so I continued investigating a bit later after doing some other things.

 

I knew the tunnel was somewhere, but I didn't know where. I tried looking at the Linux kernel source code (specifically IPIP), but couldn't find anything relevant. I was beginning to think we'd have to restart the machine, but I didn't want to do that because GS14 (the machine) has an issue where if we reboot it via the standard reboot command on Linux, it won't come back up. This is due to a setting that isn't properly configured in the BIOS, and we still need to schedule a time to go in and change it (this requires downtime). I really wanted to avoid a reboot if possible because we'd have to restart all the game servers (plus additional downtime while waiting for the machine to come back online) and contact our hosting provider, since I'm sure we'd need a hard reboot via KVM. However, I wasn't sure what else we could do. It seemed like a Linux bug (I still think it is, as you'll see below).

 

After searching some more, I found this thread. This was pretty overwhelming to me at first because I didn't understand how Linux network namespaces work at a low level (it's something I'm learning though!). From what I gathered, Linux network namespaces are technically mounts on the file system. I didn't go in-depth on this because I just wanted to find a solution as soon as possible, so I read everything fast (I still plan on reading everything in-depth later on because I'm interested in this), and I won't be going into the details here since that's out of this KB's scope.

 

Anyways, I ended up running the following Bash script:

 

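# Walk every process directory under /proc
find /proc/ -mindepth 1 -maxdepth 1 -name '[1-9]*' | while read -r procpid; do
        # Check each open file descriptor of that process
        find "$procpid/fd" -mindepth 1 | while read -r procfd; do
                # A file system type of nsfs means the descriptor refers to a namespace
                if [ "$(stat -f -c %T "$procfd")" = nsfs ]; then
                        # Print the namespace inode and the /proc/<pid>/fd/<fd> path
                        stat -L -c '%20i %n' "$procfd"
                fi
        done
done 2>/dev/null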

 

This basically outputs every namespace that a process holds open as a file descriptor, along with its inode ID and its path on the file system (which is in /proc/x/fd/y). I put all the output from this command into a file called namespaces.txt. From here, I ran the following command:

 

while read -r inode reference; do
    # Entering fails for anything that isn't a usable network namespace, so those are skipped
    if nsenter --net="$reference" ip -br address show 2>/dev/null; then
            printf 'end of network %d\n\n' "$inode"
    fi
done < namespaces.txt

 

I didn't bother removing any duplicates by performing sort -k 1n | uniq -w 20 because I just wanted to search EVERYTHING. This outputted all the interfaces in each network namespace it could enter. In that output, I was able to locate a tunnel inside a namespace with the GMod TTT #3 internal IP. Bingo! I then took that namespace's inode ID and looked it up in the namespaces.txt file, which included the path on the file system.
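
If you do want one entry per namespace, plus a quick way to go from an inode ID back to its path, something like this sketch works on namespaces.txt. The function names are made up, the inode in the example is a placeholder, and I'm using awk here rather than the sort | uniq pipeline mentioned above.

```shell
# Sketch: helpers for the "<inode> <path>" lines stored in namespaces.txt.

# Keep only the first path seen for each inode (one line per namespace).
dedupe_namespaces() {
    sort -n -k1,1 "$1" | awk '!seen[$1]++'
}

# Print every /proc/<pid>/fd/<fd> path that refers to the given inode.
paths_for_inode() {
    awk -v inode="$1" '$1 == inode { print $2 }' "$2"
}

# Example (placeholder inode):
#   dedupe_namespaces namespaces.txt
#   paths_for_inode 4026532334 namespaces.txt
```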

 

I read the second reply which stated how to execute commands in a network namespace based off of the file system path. So I used the following command:

 

nsenter --net=/proc/x/fd/y ip link list

 

Where x and y were integers representing the process ID and file descriptor ID I got from the namespaces.txt file based off of the inode ID. From here, I saw the IP tunnel with GMod TTT #3's remote IP (92.119.148.99) and the interface name. I simply executed the same command, but ran ip link del <interface name> to remove the IPIP tunnel. So for example:

 

nsenter --net=/proc/x/fd/y ip link del <interface name>

 

This resolved the issue and the server was able to start up normally again :)
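
If this happens again, the search could probably be condensed into something like the sketch below. It is untested against a real stuck tunnel, the function name is hypothetical, and it assumes the usual ip tunnel show output where each line contains "remote <IP>". It only reports matches; deleting the link is left as a manual step.

```shell
# Sketch: find which saved namespace paths hold a tunnel to a given remote IP.
# Usage: find_tunnel_ns <remote IP> <namespaces file>
find_tunnel_ns() {
    remote=$1
    while read -r inode reference; do
        # nsenter fails for stale paths and non-network namespaces; skip those.
        tunnels=$(nsenter --net="$reference" ip tunnel show 2>/dev/null </dev/null) || continue
        # The trailing space keeps 1.2.3.9 from matching 1.2.3.99.
        if printf '%s\n' "$tunnels" | grep -qF "remote $remote "; then
            echo "$inode $reference"
        fi
    done < "$2"
}

# Example: find_tunnel_ns 92.119.148.99 namespaces.txt
# Then, after checking the name with `ip link list` in that namespace:
#   nsenter --net=/proc/x/fd/y ip link del <interface name>
```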

 

This was definitely one of the stranger issues I've seen with Linux network namespaces so far, and I believe it was some sort of Linux bug because this namespace should have shown up in the standard list obtained by the ip netns list command (it didn't, and that was one of the first things I tried).

 

To conclude, I hope this helps others resolve this issue in the future if I'm not available and we're having a very similar issue. I also hope others outside the community running into the same issue find this because like I said, there is little to no documentation on the Internet regarding these issues.

 

Thank you!


I hate everything about Anycast.

 

Thanks ❤️

 



Just now, Aurora said:

I hate everything about Anycast.

 

Thanks.

Blame Linux network namespaces! It has nothing to do with the Anycast network itself :( 


Just now, Roy said:

Blame Linux network namespaces! It has nothing to do with the Anycast network itself :( 

 

While I concede that referring to it as Anycast is a bit of a misnomer, it's much easier for others to understand that I'm bitching about the same old stuff (tm) if I just say anycast :)

 



Just now, Aurora said:

 

While I concede that referring to it as Anycast is a bit of a misnomer, it's much easier for others to understand that I'm bitching about the same old stuff (tm) if I just say anycast :)

 

Understandable haha
