

Gnax Network down??




Posted by universal2001, 03-20-2006, 08:26 AM
Quite a lot of our servers are unreachable. Anyone having the same issue?

Posted by cyb, 03-20-2006, 08:29 AM
My server is also unreachable...

Posted by GoTek-JP, 03-20-2006, 08:29 AM
Thank god it's not just me. I received a page at 6:56 AM from hyperspin telling me the servers are down... I can log in via console, but it's like the gateway/router is not responding. I've opened a ticket with them.

Posted by Apoc, 03-20-2006, 08:33 AM
Yes, the GNAX network is down for the most part. They indicated it shouldn't take too long before it is back up.

Posted by toma1708, 03-20-2006, 08:33 AM
Down for me too, it looks like a router problem.

Posted by tracphil, 03-20-2006, 08:33 AM
I opened a ticket as well. I pasted an MTR for the interested: http://pastebin.com/612240 TP

Posted by universal2001, 03-20-2006, 08:33 AM
This is like the first big one we've had since being with GNAX.. I do hope everything will be ok...

Posted by yaax, 03-20-2006, 08:34 AM
My server there is also unreachable, and even their APC reboot URL is down. According to traceroute, the problem is on some of their internal network routers. I also opened a support ticket, but after more than 30 minutes no one has answered...

Posted by Milovan, 03-20-2006, 08:35 AM
There is a network problem and guys at GNAX are already on it. Some of our servers are down as well, though some haven't been affected at all.

Posted by PersianImmortal, 03-20-2006, 08:46 AM
So does this disprove this claim of theirs:

Posted by Apoc, 03-20-2006, 08:50 AM
Nobody is able to make any claim about that at this stage as nobody (not even we) has been informed as to what the exact problem is. The most recent update we received from them is that they are almost sure what the problem is but cannot release any information about this until they are 100% sure that is the actual problem.

Posted by universal2001, 03-20-2006, 08:50 AM
We've never had a network wide issue like this... does anyone have any ETA when it is coming back up??? :O

Posted by Apoc, 03-20-2006, 08:57 AM
As of the past 5 minutes everything should be back up. SolidHost/DEHE customers may refer to http://www.dehe.com/showthread.php?t=583 for more information (more information will be posted as soon as we hear from GNAX what the problem was).

Posted by tracphil, 03-20-2006, 09:00 AM
No. Everything is not back up. My rack is still unreachable.

Posted by Apoc, 03-20-2006, 09:04 AM
My apologies about that, it also appears that some of our servers are still down. The majority of them came back up though.

Posted by Apoc, 03-20-2006, 09:25 AM
Just to keep everyone informed: There are/were two separate problems:
- A network issue. This has been fully resolved.
- Heating issues. In one part of the datacenter there is a heating issue which caused some servers to shut themselves off to prevent overheating. GNAX is doing the best they can to get this resolved.
I do not know the cause of this problem but I suspect they had multiple A/C units failing at the same time, or something like that.

Posted by myusername, 03-20-2006, 09:29 AM
Wow, that's a neat trick to turn themselves off like that, and all at the identical moment according to my logs.

Posted by PersianImmortal, 03-20-2006, 09:33 AM
I suspect we will never know the true story. Anyway thankfully my site is back up.

Posted by sailor, 03-20-2006, 09:44 AM
the true story is posted on our forum for our customers.

Posted by swijaya0101, 03-20-2006, 10:04 AM
my server is still down ... anyone else?

Posted by cyberultra, 03-20-2006, 10:35 AM
One was down and is back up now. About one hour of downtime.

Posted by timdorr, 03-20-2006, 11:44 AM
For those that are lazy: http://www.tranxactglobal.com/forum/...9&postcount=42 Also, I'm showing no breaks on our network graphs and everything's still humming along, so I'm guessing this was only in DC2, correct?

Posted by RyanD, 03-20-2006, 03:14 PM
ASO, yep, didn't harm any of us up on the 17th

Posted by timdorr, 03-20-2006, 03:46 PM
Yeah, I was just in there about an hour ago repairing a server that had its superblock eaten and its DRAC fail. They said it's only DC2, so us 17th-ers were fine. I also got a spot reserved for some cage space in the new DC.

Posted by RyanD, 03-20-2006, 06:26 PM
Maybe we'll be neighbors. I've got a garage full of racks and other equipment waiting to go in... Jeff, hurry up!

Posted by ub3r, 03-21-2006, 04:36 AM
I haven't been able to access any of our gnax servers, or gnax.net itself, for about an hour now. Here's the traceroute: I can access it if I ssh to one of our servers via my server, which is at layered tech, savvis. I did file a support ticket; I'm still waiting for a reply there.

Posted by Paul, 03-21-2006, 07:52 AM
We're still seeing a 'routing loop' when trying to access Level3 public resolvers. I called and they told me they have 'Cisco Techs' looking at the routers. Part of the trace is posted below:
 1 209.51.133.49 (209.51.133.49) 12.947 ms 0.465 ms 0.436 ms
 2 209.51.137.2 (209.51.137.2) 16.624 ms 0.590 ms 0.519 ms
 3 209.51.149.106 (209.51.149.106) 0.626 ms 0.357 ms 0.368 ms
 4 * * atl-core-a-tgi2-1.gnax.net (209.51.149.105) 0.590 ms
 5 209.51.149.106 (209.51.149.106) 33.119 ms 0.458 ms 0.672 ms
 6 atl-core-a-tgi2-1.gnax.net (209.51.149.105) 0.411 ms 0.782 ms 0.804 ms
 7 209.51.149.106 (209.51.149.106) 9.800 ms 0.590 ms 0.312 ms
 8 atl-core-a-tgi2-1.gnax.net (209.51.149.105) 0.319 ms 0.816 ms 0.909 ms
 9 209.51.149.106 (209.51.149.106) 0.736 ms 0.668 ms 0.440 ms
As I posted this, one of our servers went down again; I'm sure this is just them fixing the switches that got hit by the breaker. Question: do you not have environmental sensors in your power room to let you know when the heat is getting dangerously close to tripping the breaker? Regardless, good show on getting on this quickly.
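For anyone who wants to spot this kind of loop automatically, here is a rough Python sketch. It assumes traceroute output in the format shown in the trace above, with each responding hop's address in parentheses; the function names `hop_addresses` and `find_loop` are made up for illustration and are not part of any tool. It simply flags the first address that shows up twice in the path.

```python
import re

def hop_addresses(trace_text):
    """Return the first parenthesised IPv4 address found on each hop line, in order."""
    addresses = []
    for line in trace_text.strip().splitlines():
        match = re.search(r"\((\d{1,3}(?:\.\d{1,3}){3})\)", line)
        if match:  # lines with no address (e.g. all timeouts) are skipped
            addresses.append(match.group(1))
    return addresses

def find_loop(addresses):
    """Return (first_index, repeat_index) for the first repeated address, or None."""
    seen = {}
    for index, address in enumerate(addresses):
        if address in seen:
            return seen[address], index
        seen[address] = index
    return None

# First six hops of the trace posted in this thread.
TRACE = """\
 1 209.51.133.49 (209.51.133.49) 12.947 ms 0.465 ms 0.436 ms
 2 209.51.137.2 (209.51.137.2) 16.624 ms 0.590 ms 0.519 ms
 3 209.51.149.106 (209.51.149.106) 0.626 ms 0.357 ms 0.368 ms
 4 * * atl-core-a-tgi2-1.gnax.net (209.51.149.105) 0.590 ms
 5 209.51.149.106 (209.51.149.106) 33.119 ms 0.458 ms 0.672 ms
 6 atl-core-a-tgi2-1.gnax.net (209.51.149.105) 0.411 ms 0.782 ms 0.804 ms
"""

addresses = hop_addresses(TRACE)
loop = find_loop(addresses)
if loop:
    first, repeat = loop
    # Index + 1 equals the hop number here because every line carried an address.
    print(f"hop {repeat + 1} revisits {addresses[repeat]} (first seen at hop {first + 1})")
```

A repeated address is only a hint, not proof, of a loop, but a path that keeps bouncing between the same two addresses as it does here is a fairly strong one.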

Posted by Paul, 03-21-2006, 08:35 AM
Seems outbound to Level3 resolvers is now fixed. I take back the question regarding the sensors, as you run Liebert units that most likely have SNMP or alarm wire installed and monitoring. It just seems to be a case of getting someone there who understands the HVAC system before the breakers tripped.

Posted by sailor, 03-21-2006, 08:46 AM
We do not have sensors in this room; however, our new DC will have a completely automated system that monitors all of that live. We were not fixing a router from yesterday's issue, by the way. This was a very complex issue that took 3 different Cisco engineers 4 weeks to catch. There were actually 2 different problems with the loops in the past month. One was fixed with an upgrade and a bug fix on our Avaya ANS platform, and this one will hopefully be fixed now as well. It was a known problem but could not be pinpointed until this morning. It had to do with the maximum number of routes the TCAM in a 6500 can handle by default. We surpassed this amount and it was dropping routes, about 6000 to be exact. We increased the limit in software this morning and rebooted all the routers, and hopefully this will be the final solution.
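To picture the failure mode described here, a toy Python sketch may help. It is a deliberate simplification, not how a 6500 actually manages its TCAM: it assumes a table with a fixed number of slots and treats every route beyond that limit as simply not installed. The capacity and prefix counts are hypothetical numbers picked for illustration, and `install_routes` is a made-up name.

```python
def install_routes(prefixes, capacity):
    """Split prefixes into those that fit in a fixed-size table and those left out."""
    installed = prefixes[:capacity]
    dropped = prefixes[capacity:]
    return installed, dropped

# Hypothetical numbers, chosen only to keep the example small.
capacity = 1000
prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1006)]

installed, dropped = install_routes(prefixes, capacity)
print(len(installed), "installed,", len(dropped), "dropped")
```

In this simplified model, traffic to any of the left-out prefixes has no forwarding entry, and raising `capacity` above the number of prefixes brings the dropped count to zero, which is roughly the idea behind raising the limit in software.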

Posted by toma1708, 03-21-2006, 12:12 PM
Hi, The connectivity issues are still present, please post an update when you have time. Catalin


