
Roy

Recent Anycast Events + BGP Adjustments + Chicago Machine


Hey everyone,

 

I just wanted to address a couple of events with our Anycast network, along with BGP adjustments and a possible hosting provider in Chicago.

 

Physical Machines Downtime

Our physical machines went down around 11:30 PM CST. This was due to a security upgrade being performed on the machines, which resulted in downtime. Thankfully, the machines were still accessible, but a Docker script (Docker Gen) that handled the Anycast GRE tunnels crashed, which left the servers unable to bind to a LAN IPv4 address correctly.

 

This was quickly resolved by simply restarting the service, and we won't be performing any further upgrades to these machines without a scheduled maintenance event. Normally, these types of upgrades don't take the game servers offline, which is why we pushed this one without an announcement in the first place. However, this specific Docker script depended on certain packages that were being upgraded, so it crashed.

 

Los Angeles And Chicago PoPs Downtime

Some users may have noticed the servers going offline or timing out earlier. This was more than likely due to the PoP server you were routing to having an offline Compressor service. The Los Angeles and Chicago PoPs were affected by this. We had some outdated settings in the Compressor service's config file, which prevented the service from coming up successfully. I've updated the Compressor service configs on all the PoPs and reloaded the daemon, so this specific issue shouldn't happen again.

 

Physical Machines => Anycast Routing + Telia

Some time last week our physical machines started routing to Chicago instead of Dallas. This resulted in +20ms latency overhead to the network. Vultr had Dallas network maintenance the same day this started occurring and we opened a ticket with them. However, unfortunately they haven't been responsive (as usual, I guess).

 

 

Until early this morning, this was still an ongoing issue. In the thread I listed above, I mentioned that I adjusted all the PoPs to only announce to GTT (AS3257) and NTT (AS2914). This would be ideal considering we've had major issues with providers like Telia (example), and it'd be easier to manage one or two (good) providers over the network. Yesterday, as a test, I allowed all providers on the Dallas PoP to see if my home network would go from San Antonio => Chicago to San Antonio => Dallas (as it should be). Oddly enough, it started preferring a path to Dallas via Telia (AS1299), and the route was more optimized than the routes with Telia we had before (e.g. fewer hops and better latency on the hop with Vultr). I made a support ticket a couple of months ago addressing the issues with Telia, along with Vultr's Telia hop having issues (e.g. +10ms latency compared to a hop before it that was also in Dallas). They said they were going to be making changes, but at that point I was just frustrated and didn't want to deal with Telia anymore. Anyway, it looks like Vultr finally took action and heavily improved the routes to their network through Telia.

 

When I allowed Telia on the Dallas PoP, my home network and the physical machines (after some time) started routing to Dallas, TX through Telia, which was great (it resolved the main issue). However, when I performed a traceroute from multiple locations around the world via Ping.PE, I noticed many locations (even in Europe) preferring Telia's path to Dallas, TX. Therefore, I decided to start announcing to Telia on all America and Europe PoPs, which resolved the sub-optimal routing issues that I mentioned before. I'll be making these changes to our Asia PoPs as well since Telia seems decent there.
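
To make the announcement policy described above easier to follow, here is a minimal Python sketch of it as a lookup. It is only an illustration of the logic, not the actual BIRD configuration. The ASNs are the ones named in this thread; the PoP-to-region mapping (including the "amsterdam" and "tokyo" entries) and the function name are hypothetical.

```python
# Upstream ASNs named in this thread.
GTT, NTT, TELIA = 3257, 2914, 1299

# Hypothetical PoP -> region mapping, for illustration only.
POP_REGION = {
    "dallas": "america",
    "chicago": "america",
    "los-angeles": "america",
    "amsterdam": "europe",  # placeholder Europe PoP
    "tokyo": "asia",        # placeholder Asia PoP
}

# Every PoP announces to GTT and NTT.
BASE_UPSTREAMS = frozenset({GTT, NTT})
# Regions where Telia is currently allowed as well (Asia is planned, not done).
TELIA_REGIONS = frozenset({"america", "europe"})


def allowed_upstreams(pop: str) -> frozenset:
    """Return the set of upstream ASNs a PoP announces to."""
    region = POP_REGION[pop]
    if region in TELIA_REGIONS:
        return BASE_UPSTREAMS | {TELIA}
    return BASE_UPSTREAMS


for pop in POP_REGION:
    print(pop, sorted(allowed_upstreams(pop)))
```

Extending Telia to the Asia PoPs would then just mean adding "asia" to `TELIA_REGIONS` in this model.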

 

I honestly hope it stays resolved, but considering we've had this issue four times already since the network was released, it probably won't. I will be mentioning a probable solution to this below, though.

 

Chicago Hosting Provider

I've been on the hunt for hosting providers who can host PoP servers for Scorch Host and found one with pretty good machines in Chicago, which I think would do well for GFL. Somebody from a hosting provider's Discord server recommended this provider to me, and after talking to the owner via DMs, they seem VERY promising.

 

They will be offering the following after an upgrade to their Chicago facility, which is 2 - 3 weeks away:

 

  • Intel i7-9700K @ 3.6 GHz (not overclocked).
  • 32 GBs of RAM.
  • 500 GBs NVMe.
  • 30 TBs of Bandwidth.
  • $119.00/m (to start, possible discounts as we order additional machines).
  • Chicago, IL.

 

This is around the same price as our i7-7700K machine with Nexril right now. However, we would get the i7-9700K, which has 8 cores (instead of 4 with the i7-7700K), and a 500 GBs NVMe drive (instead of a standard SSD).

 

Unfortunately, overclocking isn't possible with the machine at the moment, but they might offer it in the future (I am still discussing these questions with the hosting provider). I still want to test single-threaded performance with the i7-9700K at normal clock speeds, and I will be doing this at some point later. With that said, we should be able to increase the bandwidth limit if needed.

 

I think we're going to try buying one machine from them in the future (after we cancel GS06) and see how it goes. If it goes well, we'll be moving to them. This will technically save us money and also give us better-performing machines. I've also gone onto their Discord server and taken a look at their announcements. These include very professional reports of network issues they've experienced in the past, which was nice to see as well.

 

On top of all of the above, this should resolve the terrible routing issues we've been experiencing in Dallas, TX. I've noticed Vultr's Dallas location has terrible routing unfortunately, but I haven't seen/heard of any issues with Vultr's Chicago location. Obviously, we won't have Vultr for too much longer since we're getting our own ASN and Scorch Host will be building out its own optimized network (stronger machines, better provider(s), and stronger (D)DoS protection).

 

ASN + Scorch Host

Although acquiring the ASN has been a very long (and sometimes frustrating) process, we are close to it. We are awaiting the next steps from our provider's RIPE NCC, which should be sometime this or next week. Scorch Host is also doing well. I'm currently trying to design the website, and @Dreae has been doing a good job at developing Compressor's web-sided panel to handle everything (AKA Cockpit).

 

I apologize for the inconveniences and downtime and I hope you understand.

 

If you have any questions/concerns, please feel free to reply to the thread!

 

Thank you for your time.



In addition to the announcement, I have two other things to bring up.

 

Physical Servers Routing Issue

Although there have been no changes to the BGP setup we have through BIRD, the physical servers started routing back to Chicago via NTT this morning. Obviously, this is becoming frustrating. I'm not sure why it is preferring a longer AS-Path/hop count (17 hops through NTT to Chicago compared to 10 hops through Telia to Dallas), but whatever. I'll try to bring this up to Vultr and hope they'll respond. My home network is still routing to Dallas through Telia. It seems they're performing maintenance on the network as well (I've noticed higher overall latency at times on the Dallas PoP).
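
One thing worth keeping in mind here: the hop count in a traceroute is not the same thing as the BGP AS-path length, and in the standard BGP decision process local preference is compared before AS-path length. A network along the way can therefore pick a "longer" route if its own policy prefers that upstream. The Python sketch below is a heavily simplified model of that ordering. All of the values are made up for illustration (64512 is a private-use placeholder ASN), and it is an assumption that something like this is happening; the actual policies of the networks involved aren't known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    local_pref: int  # higher wins; set by the receiving network's own policy
    as_path: tuple   # sequence of AS numbers the announcement traversed


def best_route(routes):
    """Simplified BGP best-path: highest local-pref, then shortest AS path.

    Real BGP has more tie-breakers after these (origin, MED, eBGP over
    iBGP, IGP cost, router ID, ...), which this sketch leaves out.
    """
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# Hypothetical routes as seen by some network between the servers and the PoPs.
via_telia = Route("Dallas via Telia", local_pref=100, as_path=(1299, 64512))
via_ntt = Route("Chicago via NTT", local_pref=200, as_path=(2914, 64512, 64512))

print(best_route([via_telia, via_ntt]).name)
```

In this model the NTT route wins despite its longer AS path because its local preference is higher. With equal local preference, the shorter Telia path would be selected instead.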

 

AMD Processor With New Provider

Our new hosting provider will also be providing the AMD Ryzen 7 3700X @ 3.6 GHz (8 cores, 16 threads) and AMD Ryzen 9 3900X @ 3.8 GHz (12 cores, 24 threads). These beat the Intel i9-9900K and Intel i7-9700K in almost everything besides single-threaded performance in the benchmarks I've seen (unfortunately, single-threaded performance matters the most with the servers we're hosting). Those benchmarks show a 6% decrease in single-threaded performance for the AMD processors compared to the Intel ones. However, it is still quite early and it's possible this might change with future benchmarks. If the Intel i7-9700K is still more powerful than the AMD Ryzen 9 3900X in regards to single-threaded performance, I plan to order the first machine with the configuration in the original announcement (Intel i7-9700K, etc.). This would be used for servers that require more single-threaded resources. If things go well with the provider, we may be able to get a second machine with the following specs for around $149/m - $180/m:

 

  • AMD Ryzen 9 3900X @ 3.8 GHz (12 cores, 24 threads).
  • 32 or 64 GBs of DDR4 RAM (depending on what we run on the server, Source Engine servers don't take much RAM).
  • 1000 GBs NVMe.
  • 60 TBs of Bandwidth.
  • $149.00/m - $180.00/m (Depending on the bandwidth price which is TBD).
  • Chicago, IL.

That would be a great deal in my opinion considering it's almost equivalent to 3 machines in regards to CPU power. We should be able to handle the same amount of servers as three regular machines (4-core processors) for only $149.00/m - $180.00/m assuming we can control the RAM, space, and bandwidth usage.
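
As a rough check on the numbers above, the per-core cost can be worked out directly. This is only a sketch: it uses physical core count as a crude proxy for capacity and ignores clock speed, single-threaded performance, RAM, storage, and bandwidth. The prices and core counts are the ones quoted in this thread; the function names are just for illustration.

```python
# Rough cost-per-core comparison using the figures quoted in this thread.
# Assumption: physical core count is used as a crude proxy for capacity.


def cost_per_core(monthly_price: float, cores: int) -> float:
    """Monthly price divided by physical core count."""
    return monthly_price / cores


def equivalent_machines(cores: int, baseline_cores: int = 4) -> float:
    """How many baseline (4-core) machines a core count corresponds to."""
    return cores / baseline_cores


i7_9700k = cost_per_core(119.00, 8)      # 8 cores at $119.00/m
ryzen_low = cost_per_core(149.00, 12)    # 12 cores, low end of the range
ryzen_high = cost_per_core(180.00, 12)   # 12 cores, high end of the range

print(f"i7-9700K:      ${i7_9700k:.2f} per core")
print(f"Ryzen 9 3900X: ${ryzen_low:.2f} - ${ryzen_high:.2f} per core")
print(f"3900X vs 4-core machines: {equivalent_machines(12):.0f}x")
```

At the low end of the price range the 3900X machine works out cheaper per core than the i7-9700K machine, while at the high end the two are roughly on par, so the final bandwidth price matters for how good the deal is.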

 

EDIT

According to this benchmark website, the AMD CPUs beat both the Intel i9-9900K and Intel i7-9700K in regards to single-threaded performance.

 

Thanks!

45 minutes ago, Roy said:

EDIT

According to this benchmark website, the AMD CPUs beat both the Intel i9-9900K and Intel i7-9700K in regards to single-threaded performance.

 

Thanks!

I would wait a little bit before trusting the PassMark benchmarks for those CPUs. Since they just released, the number of samples for them is still very low.

 

9900K

YIOqqgH.png

 

vs

 

3900X

l3lypLF.png


1 hour ago, Vauff said:

I would wait a little bit before trusting the PassMark benchmarks for those CPUs. Since they just released, the number of samples for them is still very low.

 

9900K

YIOqqgH.png

 

vs

 

3900X

l3lypLF.png

Agreed, I'm going to keep an eye on this as time goes on. I've also heard that website is somewhat unreliable. Therefore, I'll be looking at benchmarks from other sources as well.

 

Thanks!
