

Crucial Paradigm Server Down - Now Over 15 Hours




Posted by OnlineWebSales, 05-15-2010, 11:57 AM
Here is the current response:

You are receiving this email as you currently have an account on the server mentioned in this announcement.

Current Situation:

As you may already be aware, s307 has experienced an extended outage for the last few hours. On initial inspection it appeared that the server had a hardware-related issue that was preventing it from booting. We quickly ran tests on the server's core components in an attempt to track down the issue. After some time it became apparent that the server hardware was in working order, but the filesystem on the server was corrupted. We attempted to repair the filesystem; however, after a considerable amount of time it has become apparent that the corruption is beyond repair.


What is being done?

We have started the process of restoring accounts from our daily backups. The first step in this process is creating an entire image of the existing server for disaster recovery purposes. The next stage will involve rebuilding the server configuration, and then finally restoring accounts from our daily backups.


When can I expect my site to be back up and running?

We expect the process of restoring all accounts will take approximately 24-48 hours. Rest assured we will be performing this task as quickly as possible, as we fully understand the need for your sites to be up and running as soon as possible.


Can I request SLA credit?

Yes, you may request SLA credit due to this outage. Please contact us AFTER we have you back up and running, so your situation can be fully assessed. You will need to include the primary domain of your reseller account, and the time frames in which your site was down.


During this process please keep in mind our support team will be under significant load, so please keep ticket submissions to a minimum, and please do not open multiple tickets regarding the same issue.


Please rest assured we are working to resolve this issue as quickly as possible. We understand the importance of your data's security, as well as the need to have your sites up and running as quickly as possible.

Posted by mdzidic, 05-15-2010, 04:29 PM
I'm also affected with this issue (s307)...

Posted by OnlineWebSales, 05-16-2010, 09:38 AM
05/15/2010 UPDATE INFORMATION

Hello,

Sorry for all the inconvenience caused. We understand your concern. We are in the process of restoring the accounts. It will take around 24 hours for the entire process to be finished.

Thank you for your understanding.

Regards,
Joby

Posted by OnlineWebSales, 05-17-2010, 07:20 AM
Monday Update - After 60 Hours Down Now

The OS reload has been completed and we are now restoring all domains from the backup in alphabetical order of usernames. As of now, 40% is completed. We are unable to give an accurate time frame for completing this task, since it depends on the size of the domains that are yet to be restored. However, it will be finished within hours. Thank you for your understanding.

Posted by OnlineWebSales, 05-17-2010, 08:22 PM
Down since Friday at 7:00pm CST. Here is the latest from Crucial Paradigm regarding the system being down. Certainly not a happy camper at this point.

The initial imaging of the server for disaster recovery purposes took twice as long as expected.
Restoring backups from both the image of the server and the daily backups, to bring the most up-to-date data to your account, took slightly longer than expected.
A drive in the RAID 5 array holding the backups for s307 failed. As a result, the array is degraded and running a lot slower than it usually would, so restoring from backups is again taking longer than expected.

NOTE: Per the last email we sent, we will be providing SLA credit upon request.  Please wait until your account is up and running before you submit the SLA request.  SLA requests should be submitted to Accounts & Billing, and should include your main domain name along with the start and end time of the outage you experienced.


Please accept our sincere apologies for any inconvenience caused, and rest assured we are working as quickly as possible to restore your accounts.

Kind Regards,

Aaron Weller
Crucial Paradigm
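
For readers wondering why a single failed drive slows the whole backup array: in RAID 5, a block that lived on the failed drive has to be recomputed from every surviving block in the same stripe, including the parity block. The sketch below shows that XOR reconstruction in miniature. The drive count and block contents are made up for illustration and say nothing about the provider's actual setup.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-drive array: three data blocks plus parity.
data = [b"acct", b"mail", b"site"]
parity = xor_blocks(data)

# Simulate losing the drive that held the second data block.
# Reading that block now means reading all three surviving drives.
surviving = [data[0], data[2], parity]
rebuilt = xor_blocks(surviving)

print(rebuilt)  # b'mail'
```

Every read that touches the missing drive turns into reads on all the remaining drives plus the XOR work, which is consistent with the slower restores described in the update.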

Posted by watco, 05-19-2010, 10:58 AM
My outage lasted for 85 hours.

I noticed my email not working on Friday around 5pm MST. I think they acknowledged the outage on Saturday afternoon or so.

I *cannot* believe that it took that long to get a server back online. Something pops, you put in spare parts and you get that machine back up and running.

The SLA refund for this apparently is $5, which is more an insult than a compensation or customer retention strategy. I mean, if I go to my clients and say "sorry that your site was unavailable over the weekend, here's $5 to make it up to you" that's not going to go over very well.

This is not the first time their servers have experienced trouble. I've got some other accounts with them (on servers that weren't affected by this outage) and they've had their share of outages, too.

Add to that the lousy support they've offered over the last while and I'll definitely be moving all my accounts away.
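
To put the 85 hours in perspective, here is a rough calculation of what that outage does to a month's availability. It assumes availability is measured over the calendar month of May 2010; the provider's actual SLA measurement window is not stated in this thread.

```python
import calendar


def monthly_availability(outage_hours, year, month):
    """Return availability for the month as a percentage."""
    days_in_month = calendar.monthrange(year, month)[1]
    total_hours = days_in_month * 24
    return 100.0 * (total_hours - outage_hours) / total_hours


# 85 hours down in May 2010 (31 days, 744 hours in total).
availability = monthly_availability(85, 2010, 5)
print(f"{availability:.2f}%")  # prints 88.58%
```

Under that assumption the month comes out well below the 99% range that shared hosting plans typically advertise.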

Posted by OnlineWebSales, 05-19-2010, 10:14 PM
I think the worst part of it is the fact that Crucial P does a lot of advertising (which is how I found them) on this forum but did not acknowledge our comments here, did not post an announcement about the outage, has removed their U.S. Forums from their site and made no public announcement on their site concerning this incident.

When I submitted a ticket after 8 hours of being down their response was "We noticed the problem...". After 8 hours I certainly hope that you have noticed the problem and are doing something about it.

This incident went against everything advertised on their website. The following direct from their site:

Extreme Backups - We keep 7 backups of all your files, including a local RAID copy, daily local, weekly local, monthly local, daily offsite, weekly offsite, and monthly offsite backups.

We know two things - the backups were not a daily backup and that the backups failed due to corrupt files.

Standby Servers
We keep spare servers on-line of all CPU configurations. If a server were to experience a hardware failure, we would turn a key, grab the handle on the drive, pull it out, and insert it into an identical standby CPU. We would then reboot the second machine and the server would be up and running again in a matter of minutes.


We also know this is not the case. Down 85 hours and compensated $5.00. What about the compensation for the end user or the customer that picked up his files and went elsewhere? I know Crucial will rely on their terms of service, which translates to: keep your own files, it's not our fault.

Posted by LeftToDie, 05-20-2010, 09:48 AM
Quote:
Originally Posted by 334online
I think the worst part of it is the fact that Crucial P does a lot of advertising (which is how I found them) on this forum but did not acknowledge our comments here, did not post an announcement about the outage, has removed their U.S. Forums from their site and made no public announcement on their site concerning this incident.

When I submitted a ticket after 8 hours of being down their response was "We noticed the problem...". After 8 hours I certainly hope that you have noticed the problem and are doing something about it.

This incident went against everything advertised on their website. The following direct from their site:

Extreme Backups - We keep 7 backups of all your files, including a local RAID copy, daily local, weekly local, monthly local, daily offsite, weekly offsite, and monthly offsite backups.

We know two things - the backups were not a daily backup and that the backups failed due to corrupt files.

Standby Servers
We keep spare servers on-line of all CPU configurations. If a server were to experience a hardware failure, we would turn a key, grab the handle on the drive, pull it out, and insert it into an identical standby CPU. We would then reboot the second machine and the server would be up and running again in a matter of minutes.


We also know this is not the case. Down 85 hours and compensated $5.00. What about the compensation for the end user or the customer that picked up his files and went elsewhere? I know Crucial will rely on their terms of service, which translates to: keep your own files, it's not our fault.
You should post at http://www.webhostingtalk.com.au/

I am pretty sure the Australian-based customers would be interested to know, since it seems they now target Australian customers more than international ones.

They also had a similar issue with their VPS server earlier this year, with long delays in restores etc.

Posted by templ33, 05-20-2010, 10:16 PM
Quote:
Originally Posted by 334online

This incident went against everything advertised on their website. The following direct from their site:

Extreme Backups - We keep 7 backups of all your files, including a local RAID copy, daily local, weekly local, monthly local, daily offsite, weekly offsite, and monthly offsite backups.

We know two things - the backups were not a daily backup and that the backups failed due to corrupt files.

Standby Servers
We keep spare servers on-line of all CPU configurations. If a server were to experience a hardware failure, we would turn a key, grab the handle on the drive, pull it out, and insert it into an identical standby CPU. We would then reboot the second machine and the server would be up and running again in a matter of minutes.


We also know this is not the case. Down 85 hours and compensated $5.00. What about the compensation for the end user or the customer that picked up his files and went elsewhere? I know Crucial will rely on their terms of service, which translates to: keep your own files, it's not our fault.
Let's see, it sounds like you are running on some shared or VPS environment and expect 100% uptime. Then when something occurs, like the server filesystem becoming corrupted, you are upset because it takes time to restore the entire server.

Hate to say it, but the SLA and Terms of Service are your legal agreement with the hosting company. If you don't agree with the compensation or feel you should be entitled to more, either look at other hosting companies or purchase liability insurance for these cases.

If you need 100% uptime, mirror your site across two or three datacenters and use a third-party DNS provider.
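
A minimal sketch of the failover idea behind that advice: keep an ordered list of mirrors and point traffic at the first one that passes a health check. The hostnames and the `is_healthy` callback are hypothetical, and a real setup would depend on whatever failover features the chosen DNS provider offers.

```python
from typing import Callable, Iterable, Optional


def pick_active_mirror(
    mirrors: Iterable[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first mirror that passes the health check, or None."""
    for host in mirrors:
        if is_healthy(host):
            return host
    return None


# Hypothetical mirrors in three datacenters, in order of preference.
MIRRORS = ["dc1.example.com", "dc2.example.com", "dc3.example.com"]

# Pretend the primary is down; a real check might be an HTTP probe.
down = {"dc1.example.com"}
active = pick_active_mirror(MIRRORS, lambda host: host not in down)

print(active)  # dc2.example.com
```

The health check is passed in as a function so the selection logic can be tested without touching the network.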

Posted by OnlineWebSales, 05-20-2010, 10:45 PM
I certainly don't expect 100% uptime from ANY hosting company but I do expect a company to be able to do what they advertise. The backups that were restored were OLD as dirt and not a DAILY backup as shown in their advertisements.


