Data Center's Canada DOWN




Posted by Chasem, 02-09-2012, 10:42 AM
Wouldn't it be nice to have an update?

Posted by leckley, 02-09-2012, 10:56 AM
Would be nice if you actually took the time to post some relevant information.....

Company Name ... would be an excellent start.

Posted by Chasem, 02-09-2012, 10:56 AM
The company name is Data Center's Canada...........

Posted by leckley, 02-09-2012, 10:58 AM
Quote:
Originally Posted by Chasem
The company name is Data Center's Canada...........
My apologies!

I saw your other posts and assumed you were trying to bring up someone else based on your response to the Continuum outage thread.

Posted by Chasem, 02-09-2012, 10:59 AM
Sorry to be snappy... Frustration sets in fast when customers are calling you, and it's not your fault.

Posted by OpenLeaf, 02-09-2012, 11:01 AM
I've been calling into their NOC for updates since about 7am ... our servers started to have problems about 3:45am then at 6am we went into a complete outage situation. Up until then we were troubleshooting our own gear ... I called into the NOC at about 7am and they said there was a peering problem with or between Rogers and Bell ... the extent of the issue is unclear but it certainly is an issue for DCC (and us as well).

I've called in several times and the last update I got was that they were trying to re-route ... I've noticed since that services worked from my home Rogers connection but nowhere else (that was about 9am), then about 10 min ago I couldn't connect from Rogers anymore but can from the US ... so obviously there are still routing issues.

Any other customers of DCC have updates from their end they'd like to share?

I have many peeved off clients calling me directly now ... so the more info the better.

ETA would be nice.

Posted by OpenLeaf, 02-09-2012, 11:32 AM
I just called for an update and the mailbox seems full now ... we've been up and down since just before 4am ... this is insane.

Anyone have a recent status?

Posted by matador, 02-09-2012, 11:55 AM
Same here, been up & down, but mostly down since 3:30 ET.

Quote:
Originally Posted by OpenLeaf
I just called for an update and the mailbox seems full now ... we've been up and down since just before 4am ... this is insane.

Anyone have a recent status?

Posted by Chasem, 02-09-2012, 12:02 PM
And the first order of business after the servers are up and running.... Going on 7 hours of downtime now.

Posted by OpenLeaf, 02-09-2012, 12:33 PM
I'm at the data centre now - there are about 12 or more clients here I think ... DCC staff just provided an update to one of the guys saying Bell injected a whole bunch of bad routes ... No word on why it's not resolved though.

After almost 18 years of managing BGP connections for ISPs I've never seen this ... This should be solvable quickly ... And we should be getting updates.

Posted by Chasem, 02-09-2012, 12:37 PM
Thanks for the update OpenLeaf. Maybe you can help them fix it? haha

I just saw the Uptime Robot Alert, servers started having issues at 1:14AM.

Pretty mind boggling that it's actually over 10 hours of service outage.

Posted by Chasem, 02-09-2012, 01:40 PM
Any word on the disaster...?

Posted by rawdigits, 02-09-2012, 02:08 PM
The DCC people should have a customer communication plan in place for issues like this.

The fact that I cannot reach anyone for an update (nor their website/email because their DNS is also hosted entirely internally) is bad news. Redundancy doesn't just mean multiple connections.

I started having connectivity problems around 2am with full loss at 5am. According to my contract I am entitled to one day of service for each hour of downtime, so we are at roughly 11 days of service compensation so far.

Can anyone out there confirm that this is going to be resolved today?
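The compensation estimate in the post above can be sanity-checked with a little arithmetic. The following is a minimal sketch, assuming the contract credits one day of service per complete hour of downtime and that the times quoted in the post and the post's own timestamp share a time zone. The function name `sla_credit_days` and the rounding rule are hypothetical, not taken from any DCC contract, and whether the clock starts at the first connectivity problems or at full loss would depend on the contract wording.

```python
from datetime import datetime


def sla_credit_days(outage_start, outage_end, days_per_hour=1):
    """Return days of service credit for an outage window.

    Assumption: only complete hours of downtime earn credit.
    """
    if outage_end < outage_start:
        raise ValueError("outage_end precedes outage_start")
    full_hours = int((outage_end - outage_start).total_seconds() // 3600)
    return full_hours * days_per_hour


# Times taken from the post above (9 February 2012).
first_problems = datetime(2012, 2, 9, 2, 0)   # "around 2am"
full_loss = datetime(2012, 2, 9, 5, 0)        # "full loss at 5am"
time_of_post = datetime(2012, 2, 9, 14, 8)    # post timestamp, 02:08 PM

credit_from_full_loss = sla_credit_days(full_loss, time_of_post)
credit_from_first_problems = sla_credit_days(first_problems, time_of_post)
print(credit_from_full_loss, credit_from_first_problems)
```

Under these assumptions the credit is 9 days when counted from full loss and 12 days when counted from the first connectivity problems, so the "roughly 11 days" quoted above falls between the two readings.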

Posted by OpenLeaf, 02-09-2012, 02:14 PM
I have a VoIP client claiming they are back up - but we need to verify still.

Posted by rawdigits, 02-09-2012, 02:29 PM
Here is a very fresh press release (yesterday). Anyone want to speculate on whether this change went badly and is related:

[missing]

The forum won't let me post a link because I'm new ... Google for the terms:

"gtt data centers canada" and click the first link

Posted by OpenLeaf, 02-09-2012, 02:37 PM
Now ... that IS fascinating ... my traceroutes from the US were showing hops with GT in the DNS name ... I immediately thought GroupTelecom or something like that ... but now that we appear to be back online I don't see those hops any more ...

We are still checking - some customers seem up but this happened earlier today too, so we're not counting our chickens quite yet ...

Posted by Chasem, 02-09-2012, 02:47 PM
We are STILL down... *sigh*

Posted by OpenLeaf, 02-09-2012, 02:47 PM
Another interesting issue ... since this is a BGP issue - why has one of my access ports to the access switch at DCC bounced twice ... seems like someone is rebooting switches ... but what would access layer devices have to do with it ... I still think they are doing stuff ...

We are not out of the woods ...

Posted by Chasem, 02-09-2012, 03:17 PM
If I was in Ontario I would be pulling all my equipment right now.

Unfortunately I'm in Alberta...

Posted by matador, 02-09-2012, 03:21 PM
So you're done with things?

I'm sure OpenLeaf, my tech, or someone local could assist with that process.

I know this sucks, but we can only hope things will be better whenever it's finally back up.

Quote:
Originally Posted by Chasem
If I was in Ontario I would be pulling all my equipment right now.

Unfortunately I'm in Alberta...

Posted by Chasem, 02-09-2012, 03:25 PM
Quote:
Originally Posted by matador
So you're done with things?

I'm sure OpenLeaf, my tech, or someone local could assist with that process.

I know this sucks, but we can only hope things will be better whenever it's finally back up.
Thanks for the support, I have a business partner going there tonight to start offloading data.

I chose the absolute wrong place to host Telecom hardware. Phones being down for 11+ hours doesn't exactly work out.

It was my mistake for not being geodiversified.

Posted by OpenLeaf, 02-09-2012, 03:30 PM
We have 2 sites and a VPS for DR name services in the US ... today though may have cost us our largest client - but in the end we too need to beef up the other DC and get even more redundancy - at least our corp services and email were up and running so that we could work today ... very frustrating - but the last place we left over a year ago had a 26 hour outage (4 hour planned maintenance that went wrong) and no updates ... I think this will hopefully be a learning experience for EVERYONE that colos and for the colo providers themselves ...

Especially if this was due to a new upstream being brought online today (as per the press release) ... I'll hold judgement until I see the incident report and the rebate.

Posted by Chasem, 02-09-2012, 04:28 PM
Up and running now... Hopefully the compensation balances out the customers I lost... heh

Posted by OpenLeaf, 02-09-2012, 05:17 PM
Interesting ... their website is not up ... everything else seems fine but their website is still not accessible.

Posted by mcianfarani, 02-09-2012, 05:20 PM
They are back up, I just got home from the facility. Was like a movie scene - cars everywhere, people everywhere... Glad it's back up. Looking forward to getting a formal RFO.

Posted by OpenLeaf, 02-10-2012, 04:04 PM
RFO came out - I'm not too happy with it, as it seems to suggest this was not a problem that fits the SLA criteria and therefore does not qualify for financial reimbursement.

Posted by matador, 02-10-2012, 04:16 PM
OpenLeaf,

I never got the RFO, would be interested to see this.

Posted by OpenLeaf, 02-10-2012, 04:20 PM
I think you have to open a ticket with support or the NOC to get the report. If you send a request they will send it over.

Posted by rawdigits, 02-13-2012, 11:34 AM
OpenLeaf,

Can you clarify what in the RFO makes the SLA not apply?

Posted by OpenLeaf, 02-13-2012, 12:07 PM
There was a note at the end "No physical hardware or fiber failures occurred during this outage, the issue lay outside the network edges." ... it concerned me.

Hoping to clarify today with them ... by now everyone else should have gotten the incident report if you were affected ... did you get it?


