RockMyWeb.net down

Posted by madebyglue, 01-06-2013, 09:39 AM
RockMyWeb has been down for just over 3 hours now and I have had zero response from support. They also haven't tweeted since November and (afaik) they have no separate support site outside of their network.

Does anyone know what is going on? I have emailed the owner (Devon) directly too and again had no response yet (that was only about 25 mins ago though to be fair).

This is not the first large outage they've had since I've been a customer and I'm curious about people's opinions regarding their network stability.

In particular - why EVERYTHING seems to go down. I can understand individual machines going down due to hardware or whatever, but when they have an outage all of their own sites seem to go offline too.

This seems odd to me.
Posted by zed173, 01-06-2013, 10:11 AM
It's all down because their DNS (both nameservers) is in the same data center. There have been a couple of hiccups in the last few days leading up to this; I just hope they resolve it soon. Also, this affects Cloud3k (same company, sells VPS).
Posted by F-DNS, 01-06-2013, 10:12 AM

Quote:

Originally Posted by madebyglue

RockMyWeb has been down for just over 3 hours now

Down @ 10:30am GMT to be precise

Quote:

Originally Posted by madebyglue

I have emailed the owner (Devon) directly too and again had no response yet (that was only about 25 mins ago though to be fair).

If that's to a rockmyweb or cloud3k email address it won't get there right now. Besides their web sites, both their nameservers are hosted at South Bend, so they're down too... No DNS = No MX = No email

http://www.intodns.com/rockmyweb.net

Quote:

Originally Posted by madebyglue

In particular - why EVERYTHING seems to go down. I can understand individual machines going down due to hardware or whatever, but when they have an outage all of their own sites seem to go offline too.

As above, they have no redundancy in their DNS setup so if South Bend goes down so do their sites etc.
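
A minimal sketch of the check described above, assuming you have already looked up the IP address of each nameserver. The hostnames and addresses below are hypothetical documentation-range values, not RockMyWeb's real ones, and the /24 grouping is only a rough proxy: nameservers in different /24s can still share one building or one router.

```python
import ipaddress

def shared_networks(nameservers, prefix=24):
    """Group nameserver names by the /prefix network their IP falls in."""
    groups = {}
    for name, ip in nameservers.items():
        # strict=False lets us pass a host address and get its enclosing network
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups.setdefault(str(net), []).append(name)
    return groups

def has_network_diversity(nameservers, prefix=24):
    """True if the nameservers are spread over more than one network."""
    return len(shared_networks(nameservers, prefix)) > 1

# Hypothetical example: both nameservers in the same /24 (single point of failure)
same_site = {"ns1.example.net": "192.0.2.10", "ns2.example.net": "192.0.2.11"}
# Hypothetical example: nameservers in two unrelated networks
split_site = {"ns1.example.net": "192.0.2.10", "ns2.example.net": "198.51.100.5"}

print(has_network_diversity(same_site))
print(has_network_diversity(split_site))
```

If the first case applies, losing that one network takes out DNS for every domain that depends on it, which in turn means no MX lookups and no email.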
Posted by madebyglue, 01-06-2013, 10:20 AM
Thanks for the replies guys. I (naively perhaps) assumed that a hosting company would not make a simple mistake like that, so I hadn't checked DNS.

Coming up to 4 hours downtime now.

I would hope that Devon is aware of his entire network being offline by now so perhaps he'll come on here and respond (being that he also knows his support address is offline too).
Posted by F-DNS, 01-06-2013, 10:24 AM
One bit of good news - There's some chatter about this on LET too. Apparently 24khost is hosted at RockMyWeb and he has a Gmail address for Devon... Fingers crossed!

EDIT: Phone number from WHOIS is +1.7083747625 Anyone in the USA want to give it a try?
Posted by madebyglue, 01-06-2013, 10:47 AM
Thanks, I'm in the UK but just called the number. Answerphone.

We're now well into the 5th hour of downtime.

Does anyone know anything about this company? Number of staff etc.? The answerphone message sounds like it was recorded in somebody's bathroom with a $2 mic.
Posted by madebyglue, 01-06-2013, 10:51 AM
Also, just want to point out that their whois says Michigan where it's currently 9:51am so it's not like everyone is asleep either.
Posted by F-DNS, 01-06-2013, 10:52 AM
They're not a "fly by night" if that's what's worrying you. Devon's been a member here since 2005 (and he was online here late yesterday, so let's assume he's fit and well too!)
Posted by madebyglue, 01-06-2013, 10:58 AM
Yeah, that's where I was headed. Thanks for clarifying.

The lack of communication however is still worrying. I can only assume that they have nobody actively working on getting things back up else they would at least be monitoring WHT and/or sending a quick tweet.
Posted by jj@24khost, 01-06-2013, 11:00 AM
They do have people working on it. From where I am it looks like a router issue, though we still don't have confirmation. I believe hands are on their way to the facility.
Posted by F-DNS, 01-06-2013, 11:03 AM

Quote:

Originally Posted by madebyglue

I can only assume that they have nobody actively working on getting things back up else they would at least be monitoring WHT and/or sending a quick tweet.

Or he's in the datacenter pulling his hair out over a naughty router. A tracert to our Cloud3K VPS that we use for monitoring other places (rather ironic) stops short within the datacenter but not at the host node's IP. That phone number looks to be a T-Mobile cellphone - He may have turned it off so he can concentrate.

Yeah I know - You're the pessimist and I'm the optimist LOL
Posted by F-DNS, 01-06-2013, 11:04 AM

Quote:

Originally Posted by sosolabs

They do have people working on it. From where I am it looks like a router issue, though we still don't have confirmation. I believe hands are on their way to the facility.

Thanks for that
Posted by madebyglue, 01-06-2013, 11:11 AM
lol

I think my optimism escaped after about hour 3. I've already had one client call me for the 3rd time (had nothing to tell him, still), give me some abuse and then cancel his contract, so I'm not in the most forgiving of moods.
Posted by F-DNS, 01-06-2013, 11:20 AM
If anyone else calls you can tell them there's progress. I'm getting pings from an IP further in now, so maybe it's a router coming back up and figuring out what to do.

Off topic, but just to lighten the mood, when Soso posted...

Quote:

Originally Posted by sosolabs

I believe hands are on their way to the facility.

I cringed.

Back in the distant past in my schoolboy years, and because my name's Andy, I got the nickname AndyPandy

More recently someone asked me if I'd ever fancied being a hands-on tech in a datacenter. I said "No way! I'll end up being called AndyPandyHandy"!

</offtopic>
Posted by F-DNS, 01-06-2013, 11:21 AM
They're back up!
Posted by zed173, 01-06-2013, 11:26 AM
Still very much down from here...

2 13 ms 12 ms 12 ms bas1-burlington02_lo0_SYMP.net.bell.ca [64.230.200.192]
3 12 ms 12 ms 12 ms dis16-hamilton14_3-1-0_100.net.bell.ca [64.230.59.32]
4 28 ms 30 ms 29 ms bx6-chicago23_POS0-2-0-0.net.bell.ca [64.230.186.198]
5 29 ms 207 ms 211 ms te4-1.ccr01.ord09.atlas.cogentco.com [154.54.11.29]
6 228 ms 223 ms 203 ms te3-5.mag02.ord01.atlas.cogentco.com [154.54.29.193]
7 29 ms 29 ms 29 ms te0-5-0-3.mpd22.ord01.atlas.cogentco.com [154.54.45.213]
8 57 ms 202 ms 203 ms te3-2.ccr01.sbn01.atlas.cogentco.com [154.54.25.61]
9 30 ms 30 ms 30 ms 38.104.216.162
10 * * * Request timed out.
11 * * * Request timed out.
Posted by F-DNS, 01-06-2013, 11:29 AM
Give it time...

2 <1 ms <1 ms <1 ms border6.po2-bbnet2.chg.pnap.net [64.94.32.75]
3 3 ms 3 ms 4 ms giglinx-44.border6.chg.pnap.net [69.25.148.66]
4 3 ms 3 ms 4 ms . [206.212.240.22]
5 4 ms 4 ms 4 ms 67.214.182.75
6 4 ms 4 ms 4 ms www.rockmyweb.net [67.214.182.214]

Posted by madebyglue, 01-06-2013, 11:31 AM

Quote:

Originally Posted by F-DNS

They're back up!

need a drink now...
Posted by jj@24khost, 01-06-2013, 11:31 AM
all seems to be up now.
Posted by zed173, 01-06-2013, 11:44 AM
There's still an issue, see traceroute:

2 12 ms 12 ms 13 ms bas1-burlington02_lo0_SYMP.net.bell.ca [64.230.200.192]
3 12 ms 12 ms 12 ms dis16-hamilton14_3-1-0_100.net.bell.ca [64.230.59.32]
4 31 ms 29 ms 29 ms bx6-chicago23_POS0-2-0-0.net.bell.ca [64.230.186.198]
5 28 ms 34 ms 28 ms te4-1.ccr01.ord09.atlas.cogentco.com [154.54.11.29]
6 29 ms 28 ms 28 ms te3-5.mag02.ord01.atlas.cogentco.com [154.54.29.193]
7 29 ms 29 ms 28 ms te0-5-0-3.mpd22.ord01.atlas.cogentco.com [154.54.45.213]
8 173 ms 203 ms 203 ms te3-2.ccr01.sbn01.atlas.cogentco.com [154.54.25.61]
9 30 ms 30 ms 30 ms 38.104.216.162
10 33 ms * 33 ms 206.212.240.81.gw.colostore.net [206.212.240.81]
11 * * * Request timed out.
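
For anyone comparing several traces like the one above, here is a rough sketch that pulls out the last hop that answered, which is usually the quickest way to see how far into the provider's network packets are getting. It assumes Windows-style tracert output where each line ends in either a bare IP or a bracketed one; the function name and the shortened sample are illustrative only.

```python
import re

# Matches a trailing address, either "[1.2.3.4]" or a bare "1.2.3.4"
ADDR = re.compile(r"\[?(\d{1,3}(?:\.\d{1,3}){3})\]?\s*$")

def last_responding_hop(tracert_output):
    """Return the IP of the last hop that replied, or None if none did."""
    last = None
    for line in tracert_output.splitlines():
        line = line.strip()
        # Skip blank lines and hops that never answered
        if not line or "Request timed out" in line:
            continue
        match = ADDR.search(line)
        if match:
            last = match.group(1)
    return last

# Shortened sample based on the final hops of the trace above
sample = """
8 173 ms 203 ms 203 ms te3-2.ccr01.sbn01.atlas.cogentco.com [154.54.25.61]
9 30 ms 30 ms 30 ms 38.104.216.162
10 33 ms * 33 ms 206.212.240.81.gw.colostore.net [206.212.240.81]
11 * * * Request timed out.
"""

print(last_responding_hop(sample))
```

Here the last reply comes from the colostore.net gateway, so the trace reaches the provider's edge and dies somewhere behind it.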
Posted by F-DNS, 01-06-2013, 11:50 AM

Quote:

Originally Posted by zed173

There's still an issue, see traceroute:

That's just due to the IPs being re-announced. Some places will pick it up quicker than others. Our monitors as far away as mainland Europe show them up, so it's just a matter of patience now.
Posted by F-DNS, 01-06-2013, 12:06 PM

Quote:

Originally Posted by zed173

There's still an issue, see traceroute:

A tracert from them back to the first IP in your tracert gets out of the building and as far as Level(3) in Chicago before it fails.

2 edge-a.colostore.com (67.214.180.225) 0.415 ms 0.373 ms 0.340 ms
3 10.smart-dns.net (206.212.241.10) 66.069 ms 66.007 ms 65.965 ms
4 ae6-220.edge2.Chicago2.Level3.net (4.28.67.129) 4.840 ms !N 4.779 ms !N *
Posted by ExpertWebHostNET, 01-06-2013, 12:39 PM
Largest outage in their Chicago facility.

Still my box is unpingable from almost every location worldwide.
Posted by zed173, 01-06-2013, 12:47 PM
Mine is still totally unreachable as well. More disturbing was the brief time I was able to access vgrid it wouldn't accept my username and password.
Posted by F-DNS, 01-06-2013, 01:09 PM
When I checked earlier our node was fine. However we're now seeing packet loss from a FEW locations, which might mean the router (if it was that) is still unstable.

Just-Ping also show packet loss from a few random locations: http://just-ping.com/index.php?vh=ro....net&c=&s=ping!

The fact that Devon hasn't posted here yet suggests to me that he's still working on this. I'm pretty sure he'll be here with an update once he gets on top of things.
Posted by zed173, 01-06-2013, 01:27 PM
Yeah, there are definitely some routing issues (at the very least). I'm more concerned about why Vgrid wouldn't accept my account login; hopefully it's because some backend system was simply unreachable and not something worse.
Posted by zed173, 01-06-2013, 02:10 PM
Ok, so now I got into VPSGrid but I am not able to start my VPS:

COMMAND OUTPUT
Failure. Could not access physical device, contact provider.

I'm starting to wonder if this was more than just a router meltdown.
Posted by ExpertWebHostNET, 01-06-2013, 02:16 PM
My vps seems to be back online.

But the total downtime as reported is: 07 hour(s), 21 minute(s) and 25 second(s).
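
To put that figure in context, here is a small sketch that converts the reported downtime into a monthly uptime percentage. It assumes a 31-day month (January) and that this was the only outage in the month; neither assumption comes from the provider's SLA, which isn't quoted in this thread.

```python
def downtime_seconds(hours, minutes, seconds):
    """Convert an h/m/s downtime report into total seconds."""
    return hours * 3600 + minutes * 60 + seconds

def monthly_uptime_percent(down_seconds, days_in_month=31):
    """Uptime as a percentage of the month, given total seconds of downtime."""
    month_seconds = days_in_month * 24 * 3600
    return 100.0 * (1 - down_seconds / month_seconds)

# Figure reported by the monitor above: 07h 21m 25s
down = downtime_seconds(7, 21, 25)
uptime = monthly_uptime_percent(down)
print(f"{down} seconds down, {uptime:.2f}% uptime for the month")
```

Under those assumptions a single outage of this length brings the month to about 99.01% uptime, well short of a typical "99.9%" claim.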
Posted by devonblzx, 01-06-2013, 02:46 PM
Sorry for not responding sooner; this has been quite a hectic day. This is the largest outage we have had in several years, and it doesn't help that we have several more servers than we had a few years ago.

We are working hard to restore service to the remaining customers, most servers should be back online. This outage only affected our Chicago Metro clientele.

My official statement if you did not receive the email:

Quote:

Dear Valued Customers, A power failure in our Chicago Metro datacenter caused our primary rack of servers to go black. Our secondary rack was unaffected by the power failure but was affected by a network outage due to our main router being in the other rack.

The power supply was fixed around 11AM EST, which restored service to our network and our secondary rack. Since then we have been working to restore service to the servers in our primary rack. At this time we have restored service to 95% of our customers; however, we are still working to restore data from a few servers and are working hard to get everything done as soon as possible.

Thank you for your patience and I do apologize for any inconvenience this has caused you. We do work hard to minimize damages from hardware issues and downtime, however, when the power feed to our rack is cut, it is beyond our control.

Posted by zed173, 01-06-2013, 04:21 PM
Hi Devon,

Do you have any ETR for the remainder of us that are still offline? I understand that stuff happens, just wondering about a general timeline?

Thanks!
Posted by devonblzx, 01-06-2013, 05:59 PM
All servers have been restored.

http://bit.ly/ZcKIEP
Posted by zed173, 01-06-2013, 06:13 PM
All back now!
Posted by nixcom, 01-07-2013, 01:04 AM
Kind of strange... they're supposed to stay online with their failover setup, no?