

Netriplex/Uberbandwidth Asheville, NC power outage




Posted by pubcrawler, 02-24-2011, 03:20 PM
Netriplex/Uberbandwidth in Asheville, North Carolina, had a power outage affecting some lucky customers this morning.

Looks like approximately 20 minutes, from 9:50AM to 10:10AM.

All seems back to normal.

Dealing with the joyous experience of rebuilding the MySQL tables that crashed in the process.

Posted by FastServ, 02-24-2011, 03:29 PM
Unless you crashed extremely hard (filesystem damage, etc.), this might help speed up your recovery:

http://djlab.com/2009/06/mysql-how-t...all-databases/
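
For what it's worth, the general idea behind that kind of script is just to walk every database and run CHECK TABLE / REPAIR TABLE on anything that reports a problem (mysqlcheck --all-databases --auto-repair does roughly the same from the shell). A rough sketch with placeholder credentials, not necessarily the article's exact approach:

Code:
# Rough sketch: check every table and repair the ones that report problems.
# Credentials are placeholders; REPAIR TABLE applies to MyISAM tables, not InnoDB.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()

cur.execute("SHOW DATABASES")
databases = [row[0] for row in cur.fetchall()
             if row[0] not in ("information_schema", "performance_schema", "mysql")]

for db in databases:
    cur.execute(f"SHOW TABLES FROM `{db}`")
    for (table,) in cur.fetchall():
        cur.execute(f"CHECK TABLE `{db}`.`{table}`")
        for _, _, msg_type, msg_text in cur.fetchall():
            if msg_type == "status" and msg_text != "OK":
                print(f"Repairing {db}.{table}: {msg_text}")
                cur.execute(f"REPAIR TABLE `{db}`.`{table}`")
                cur.fetchall()  # drain the REPAIR result set before the next query

conn.close()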

Posted by pubcrawler, 02-24-2011, 03:59 PM
Thanks for the link FastServ.

Good MySQL command there... running it across our many large tables now. Slow, but easier, I think, than how we were doing the checks and repairs before.

Posted by TQ Mark, 02-26-2011, 08:13 PM
What was the nature and extent of the outage? We have servers there and didn't notice any issue.

Posted by pubcrawler, 02-26-2011, 08:40 PM
An email about the power outage was circulated to some customers --- I saw it posted online elsewhere. It said:

"Please be advised that the Netriplex AVL01 data center facility team has reported that the UPS on Power Bus 1 momentarily dropped the critical load impacting some customers in our AVL01 datacenter. Our records indicate that some or all of your infrastructure is being powered by this Bus. This power Bus currently serves approximately one fifth of the customers in our AVL01 data center.

Facilities management is currently investigating this issue and has called our UPS service & support vendor to come onsite to investigate this issue with our electrical team. Further information is not known at this time...."

Unsure if there was a follow-up email, as Netriplex has been rather poor about coming clean and wrapping things up after an issue.

(I like the hands-on staff @ Netriplex, but have lost patience with their monthly issues.)

My opinion is that their standing as a top-tier provider is bogus, no matter how much press and PR they push claiming otherwise. This was another outage during a non-maintenance window --- actually during East Coast prime time (9-10AM). Unacceptable.

For the record, I never received an email about the outage, even though my server there was affected (and I am known to them as per other posts and a small pile of tickets).

I'm actually within someone else's rack, and they were totally unaware of the outage (unsure why).

Just too much trouble at Uberbandwidth now for me to run a business out of their facility. A shame, because the rates there can be affordable and the facility has lots of promise.

Unsure where in the leadership stack the problem is. But if it were my company, I'd be looking for someone to own up to the issues and figure out how to stop having them at this pace, or be shown the door.

Posted by Rogean, 02-27-2011, 03:17 AM
They sent 4 emails; this was the last one.

Quote:
This update is to inform you that as of 12:56PM EST our UPS service & support vendor found the cause of the power incident and has resolved the issue. A shorted cell in a battery within the battery string was to blame, and has now been replaced and tested. Because our Bus 1 UPS is approaching its expected battery life, all batteries were slated for replacement this coming September. We are currently in process of escalating this replacement. Bus 2 and Bus 3 UPS systems are manufactured by a different vendor and have dual battery strings, making them far less vulnerable to a single battery failure. A full RFO with an action plan for upgrading Bus 1 Power infrastructure will be sent within 3 business days.

Posted by pubcrawler, 02-27-2011, 03:44 AM
Thanks Rogean.

Well this power outage was totally preventable = engineering deficiency.

Cells go bad *all the time* in today's batteries. Tends not to totally kill a battery, typically. Nor does a single battery typically drag down your power.

This facility, UberCenter/UberBandwidth, isn't old enough to be having failing batteries.

This may be part of the problem:
"Uber Center exercises its generators biweekly and for 1-2 hours per exercise at full load in real-time tests using a 50+ item checklist. Yes, it costs us more time and money, but it provides our customers with 100% uptime."

If they are doing that and also testing on battery power, then they are shocking the batteries with 1-2 hours of draw-down and rapid recharging every test cycle.

" but it provides our customers with 100% uptime. And that’s what the Uber Center is known for."

You can read the rest of their blog boasting here:
http://blog.uberbandwidth.com/?p=17

A shame, because Uber looks good on paper.

Posted by dotHostel, 02-27-2011, 05:21 AM
Quote:
Originally Posted by pubcrawler
"Uber Center exercises its generators biweekly and for 1-2 hours per exercise at full load in real-time tests using a 50+ item checklist.
They exercise the generators, not the UPS.

Posted by pubcrawler, 02-27-2011, 05:38 AM
Nothing would surprise me with their facility at this point.

Fascinating that a comprehensive, 50-point, 1-2 hour generator test wouldn't be testing the transfer switches and the actual load, including the UPS systems.

So they crank up the gen sets and run them at an 80% load setting? That doesn't seem like much of a test, really; certainly not an actual test of the entire power system.

They should have been doing comprehensive monitoring of their UPS systems; they would have detected unequal cells and been able to spot-replace one-off batteries.
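
(For illustration only, and obviously not anything Netriplex actually runs: spotting a weak or shorted cell from periodic per-battery voltage readings can be as simple as flagging outliers against the string average. The voltages below are made up.)

Code:
# Hypothetical per-battery float voltages from one UPS string (made-up numbers).
STRING_VOLTS = [13.6, 13.5, 13.7, 13.6, 11.2, 13.5, 13.6, 13.6]  # 12 V VRLA blocks

def flag_weak_batteries(volts, tolerance=0.5):
    """Return indexes of batteries deviating from the string average by more
    than `tolerance` volts - a likely weak or shorted cell."""
    avg = sum(volts) / len(volts)
    return [i for i, v in enumerate(volts) if abs(v - avg) > tolerance]

avg = sum(STRING_VOLTS) / len(STRING_VOLTS)
for i in flag_weak_batteries(STRING_VOLTS):
    print(f"Battery #{i + 1}: {STRING_VOLTS[i]:.1f} V vs string average {avg:.1f} V - replace it")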

Batteries are far more likely to fail prematurely than an industrial diesel generator. There are 80-100 year old diesel generators that have been running 24/7 out there making power in communities all over the world.

Posted by dotHostel, 02-27-2011, 05:44 AM
Quote:
Originally Posted by pubcrawler
There are 80-100 year old diesel generators that have been running 24/7 out there making power in communities all over the world.
The issues with data center generators usually happen because they are not running 24x7.

Posted by pubcrawler, 02-27-2011, 05:53 AM
Yes, stop and start cycles *can* be hard on diesels, when not properly maintained. There are million mile driven diesels out there - some with well over 500k miles between oil changes (tractor and tractor trailer).

Diesels can be hard to get started but rather idiot proof to maintain.

Someone needs to incorporate power generation for a datacenter within a datacenter compound. There are universities, hospitals, etc. with their own generation facilities.

Uber makes me sad. They have the potential and some smart folks. Just way too much preventable breakage. Someone take 10 of those generator test steps and apply them to the UPS systems.

Posted by dotHostel, 02-27-2011, 06:17 AM
Not that simple. Data centers must spend a lot of money on diesel and preventive maintenance just to keep the generators ready.

Quote:
... one problem that affects diesel fuel is microorganisms such as fungi and bacteria. These are living creatures, and literally billions of them actually live and multiply in your diesel fuel. These microorganisms and fungi colonies grow into long strings and form large masses of globules. They appear slimy, and are usually black, green or brown in color. This living contamination can be found throughout practically all diesel fuel. The fungi and microorganisms utilize your diesel fuel as their main source of energy, and as they feed and multiply, they chemically alter your fuel, producing byproducts, acids and common sludge. Where they cling, they hold these acids and other waste products against the metal and other surfaces of your fuel system. This all results in damage to your fuel system and the clogging of your fuel filters. The metal components of your fuel system, including the fuel tank itself, the expensive high pressure fuel pump, the injector tips and fuel lines, and connecting hardware can corrode to a point where they must be replaced. The symptoms are easy to spot. Your fuel filters can clog, the engine will begin missing and making excessive noise. You'll notice rough idling, there may be a loss of power, the engine may stall, and you may notice a marked increase in heavy black exhaust smoke.
Quote:
Someone needs to incorporate power generation for a datacenter within a datacenter compound. There are universities, hospitals, etc. with their own generation facilities.
There are some initiatives in this direction, such as the Syracuse University data center. http://www.syr.edu/greendatacenter/GDC_facts.pdf

Posted by dotHostel, 02-27-2011, 06:48 AM
OFF-TOPIC - Interesting read:

http://www.debugamericalatina.com/ba...in-diesel.html

Quote:
No one knows when they receive contaminated diesel, but once contaminated diesel enters the fuel system, it is very difficult to eradicate.

...

Microscopic in size, they can develop into a mat easily visible to the naked eye very quickly. A single cell, weighing only one millionth of a gram can grow to a biomass of 10 kilograms in just 12 hours, resulting in a biomass several centimetres thick across the fuel/water interface.
and
http://www.dieselcraft.co.uk/test_kits.htm


Quote:
FACT: ALL fuel producers admit that diesel fuel is inherently unstable. This instability causes diesel fuels to form sludge and or insoluble organic particulates. Both asphaltene compounds (sludge) and particulates may contribute to build up in injectors and particulates can clog fuel filters plus add to the service issues common to diesel engines.

FACT: Diesel fuel contamination problems have two different areas to consider, biology and chemistry. On the biology side is "Fuel Bugs" and on the chemistry side is "Asphaltenes". Thinking you have a biological problem and treating it with a biocide when in fact you have a chemical problem will not solve the problem.
"Asphaltenes" aka diesel sludge is the most common chemical problem and the most misdiagnosed problem in diesel fuel. Asphaltenes are brown and slimy and resemble algae. BUT Asphaltenes are not algae. NO ALGAE GROWS IN DIESEL FUEL. The natural chemical process that goes on in diesel fuel as it ages creates Asphaltenes. The asphaltene molecules will tend to precipitate out of the fuel over time and settle on the bottom of the tank. Once picked up by the fuel pump filters clog and engines stop.
Those that call diesel sludge algae are misinformed and not knowledgeable on the subject and are mis-diagnosing the problem.
"Fuel Bugs" aka bacteria and fungus, primarily Cyanobacteria, in diesel fuel are the other problem but less prominent that Asphaltenes.
Most diesel users have very little knowledge of this costly problem. There are over 100 types of Fuel Bugs that can live in diesel fuel. Fuel Bugs feed on the oil in the fuel and use the water in the fuel for their oxygen supply. They grow in your fuel at different rates and can easily cost thousands of dollars in damage to each contaminated vehicle.

FACT: University of Idaho scientists have conducted tests to determine the timeline and percentage of degradation of stored diesel fuel #2. The results of this testing were that the petroleum diesel fuel #2 degraded 26% after 28 days of storage. See: Petroleum and Environmental Engineering Services
Masoud Mehdizadeh, Ph.D. http://www.fueltechinc.com/diesllf.htm

This is a direct result of the early-stage fuel clustering passing through the filtration systems and into the combustion chamber. These clusters cause greater difficulty as they increase in size, failing to burn correctly, thereby exiting the system as unburned fuel in the form of smoke. This problem is exacerbated as the clusters eventually reduce the fuel flow to the point of clogging the filters.

Filtration does not solve the core issue.

Posted by pubcrawler, 02-27-2011, 02:24 PM
Diesel isn't as big a problem as many of the reports out there suggest, including the ones quoted above.

While everything posted is indeed true, the problems vary greatly with diesel. The issues with fuel vary based on:
1. Low-sulphur seasonal blends (the fuel is more prone to going bad quickly)
2. Storage temperature of the fuel
3. Tank construction materials and their interaction with the diesel
4. Water drain-off and tank cleaning maintenance

I routinely fire up diesel-powered gear on 5+ year old fuel without any problem (aside from, say, a bad battery in the unit that starts it).

Bringing 24/7 gen sets to the datacenter is the next obvious step. The infrastructure is already in place, notably for natural gas generation. On-site generation should eliminate the need for massive and costly UPS units, which are a very weak link in overall operations. Smaller, higher-quality UPS designs could be implemented, or perhaps almost avoided entirely with redundant gen sets. Storage mechanisms like ultracapacitors are slowly eroding some use of batteries, and I suspect they are going to show up (more widely) in datacenters soon.

Good to see the shared piece from Syracuse University. That sort of implementation is where the industry leaders are headed --- especially in smaller tier markets and where nearby space is plentiful.

Posted by TQ Mark, 02-27-2011, 03:17 PM
pubcrawler, have you considered getting A+B power feeds? There is a reason why most datacenters offer it.

Posted by pubcrawler, 02-27-2011, 03:26 PM
A+B power feeds sound reasonable, but I've always thought of them as a feature for failures at the server level (i.e. a bad power supply) --- not as a workaround for bad power from a provider.

It certainly would be a prudent decision (A+B power), but considerably more costly (higher-end servers with dual PSUs, typically 2U and larger - more space rental for less density, additional power cost, another long-term monitoring issue).

Is anyone aware of, say, a 1U A+B-feed power aggregation/distribution unit? Bring A+B power into the PDU, then run just a single power cable out to the existing servers? Obviously, the idea is to work around facility power issues, not the occasional server-level PSU failure.

Posted by Dougy, 02-27-2011, 03:27 PM
Quote:
Originally Posted by pubcrawler
Yes, stop and start cycles *can* be hard on diesels, when not properly maintained. There are million mile driven diesels out there - some with well over 500k miles between oil changes (tractor and tractor trailer).

Diesels can be hard to get started but rather idiot proof to maintain.

Someone needs to incorporate power generation for a datacenter within a datacenter compound. There are universities, hospitals, etc. with their own generation facilities.

Uber makes me sad. They have the potential and some smart folks. Just way too much preventable breakage. Someone take 10 of those generator test steps and apply them to the UPS systems.
Even though they always break down, I love our Ford E-450's in the ambulances.. diesel powahhhhhhh


For what it is worth, dirty diesel is a bad excuse. Dupont Fabros here in NJ circulates their fuel through some filtration setup every week to make sure their fuel is nice and clean.

Posted by freethought, 02-27-2011, 06:14 PM
Quote:
Originally Posted by pubcrawler
A+B power feeds sound reasonable, but I've always thought of them as a feature for failures at the server level (i.e. a bad power supply) --- not as a workaround for bad power from a provider.

It certainly would be a prudent decision (A+B power), but considerably more costly (higher-end servers with dual PSUs, typically 2U and larger - more space rental for less density, additional power cost, another long-term monitoring issue).

Is anyone aware of, say, a 1U A+B-feed power aggregation/distribution unit? Bring A+B power into the PDU, then run just a single power cable out to the existing servers? Obviously, the idea is to work around facility power issues, not the occasional server-level PSU failure.
Any decent provider should be able to give you A+B feeds that are at least diverse at the distribution level, if not also at the UPS level.

If you want to feed a server with a single PSU off dual feeds, then APC have a range of 1U ATS boxes for various voltage types that can take diverse feeds and switch between them fast enough that they won't drop the critical load if there is a problem on one of the feeds: http://www.apc.com/products/family/i...CountryCode=us

Posted by dotHostel, 02-27-2011, 06:23 PM
Quote:
Originally Posted by Dougy
For what it is worth, dirty diesel is a bad excuse. Dupont Fabros here in NJ circulates their fuel through some filtration setup every week to make sure their fuel is nice and clean.
Bad excuse for what?

Posted by pubcrawler, 02-27-2011, 07:45 PM
@freethought

Thanks for the link to the APC transfer switches. I *figured* such a thing existed, just had never used them before or had customer requirements for such a device.

Sounds like a solution, especially when we roll more gear out in a single rack.

Wondering what sort of additional amp draw these devices will add (if anything significant). Obviously, there is also the monthly cost of a second power drop.

Posted by freethought, 02-27-2011, 07:54 PM
No problem, happy to help

I don't have any data on how reliable these things are, as you are introducing a potential single point of failure onto either path. APC probably have a whitepaper on it though.

Posted by pubcrawler, 02-27-2011, 08:06 PM
@freethought,

It's typically cost-prohibitive to eliminate every point of failure. This is a good bandaid for unreliable power, though. *VERY APPRECIATED*

This is the second time in less than a year that we've been bitten by a random power drop, in two different data centers.

The big deal for us is that MySQL isn't happy about having the power dropped out from under it. It requires checks and rebuilds, and we have some huge tables that take more time to check than I have to wait.

We run multiple locations in a hot-hot mode, but that also brings data synchronization issues. It means we have to continue engineering ever more complex replication, checksums and other hacks to deal with such an outage. Nice to be able to pull such a thing off, but it's very susceptible to failure.
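
To give an idea of the kind of checksum hack I mean (a rough sketch with made-up hostnames and credentials, not our actual tooling), comparing table checksums between the two sites after an incident looks roughly like this:

Code:
# Rough sketch: compare CHECKSUM TABLE results between two hot-hot sites.
# Hostnames, credentials and the table list are placeholders.
import mysql.connector

SITES = {
    "site_a": {"host": "db.site-a.example", "user": "check", "password": "secret", "database": "app"},
    "site_b": {"host": "db.site-b.example", "user": "check", "password": "secret", "database": "app"},
}
TABLES = ["orders", "customers", "sessions"]

def table_checksums(cfg, tables):
    """Return {table: checksum} for one site using MySQL's CHECKSUM TABLE."""
    conn = mysql.connector.connect(**cfg)
    cur = conn.cursor()
    sums = {}
    for t in tables:
        cur.execute(f"CHECKSUM TABLE `{t}`")
        _, checksum = cur.fetchone()
        sums[t] = checksum
    conn.close()
    return sums

a = table_checksums(SITES["site_a"], TABLES)
b = table_checksums(SITES["site_b"], TABLES)
for t in TABLES:
    print(f"{t}: {'OK' if a[t] == b[t] else 'MISMATCH - needs resync'}")

Anything beyond that (row-level repair, replaying binlogs) gets complicated fast, which is exactly the problem.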

It's been a long year since we rolled out to multiple colocation facilities. Based on the experience so far, we preferred having servers at our office; at this point we've had far better uptime there than at the remote colocation facilities.

It seems to me that there are more failures each year, industry-wide.

The industry needs some independent uptime auditing with public reporting. It really would sort out the good facilities from the mediocre ones, and perhaps show a pricing correlation (i.e. better define what you get for the dollars spent).
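
Even something as simple as an external probe that polls customer-facing endpoints and publishes the resulting percentages would be a start. A toy sketch (the URLs and the polling interval are arbitrary examples):

Code:
# Toy external uptime probe: poll a few endpoints and report availability.
import time
import urllib.request

TARGETS = ["http://customer1.example.com/health", "http://customer2.example.com/health"]
results = {url: {"up": 0, "total": 0} for url in TARGETS}

def probe_once():
    for url in TARGETS:
        results[url]["total"] += 1
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                if resp.status == 200:
                    results[url]["up"] += 1
        except Exception:
            pass  # timeouts, connection errors and HTTP errors all count as downtime

for _ in range(5):  # in practice this would run indefinitely on a schedule
    probe_once()
    time.sleep(60)

for url, r in results.items():
    print(f"{url}: {100.0 * r['up'] / r['total']:.3f}% uptime over {r['total']} checks")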

Posted by Dougy, 02-27-2011, 10:12 PM
Quote:
Originally Posted by dotHostel
Bad excuse for what?
It was mentioned before that sludge buildup in diesel can cause issues.

Posted by Henrik, 03-01-2011, 03:10 PM
Quote:
Originally Posted by Dougy
It was mentioned before that sludge buildup in diesel can cause issues.
It would be more the engines themselves, then. Diesel doesn't go sludgy that easily. You have issues with condensed water and such too, but that's a separate issue.

(Talking from "farm experience" here )

Posted by ChrisGragtmans, 03-03-2011, 02:07 PM
Hello WHT Community,
Although we usually try to maintain our focus on providing the best possible service to our customers, and avoid entering into discussions on online forums, I believe that it is appropriate to respond to this thread. Like any datacenter in the industry, our business is a work in progress, and we are constantly striving to better ourselves and avoid events such as the subject of this thread. We are working with each of our customers on an individual basis to make sure that any issues that arose as a result of this event are rectified.

The tone of this thread is rather unfortunate, because we match and work to exceed what the best in our industry do. We are a SAS 70 Type II certified facility, and all critical systems have third party service contracts with minimum semi-annual preventative maintenance. I won’t speak to the specific criticisms outlined above, but I would like to request that current clients speak with us directly rather than immediately jumping to public forums. We do everything in our power to respect our clients’ privacy, and if you reference the confidentiality disclaimer at the bottom of Netriplex emails, you’ll see that we ask the same of you.

Rather than speculating online, I’d like to ask you to please contact us with your questions. The information is here, and we would be happy to share it with you. Thank you all for your business, and we will continue to take every step possible to be the best in the industry.

Chris Gragtmans
Interactive Marketing Manager, Netriplex

Posted by pubcrawler, 03-03-2011, 02:32 PM
Welcome to the community Chris.

I've posted most of this and other commentary on the various outages and other snafus at your facility in recent times.

It's important that customers and potential customers know ahead of time how a facility operates and what its issues are, and, in the case of an outage, what is going on. It's a great cost and aggravation to move into a facility, run into many issues, inevitably bear the cost of massive redundancy workarounds, and eventually bear the cost of moving the gear yet again to another facility.

Your company hasn't been forthcoming about problems (in Asheville) and, in the case of outages, has been less than responsive (like when your network was blackholed and totally offline a few months ago, including your own website). The reference to the confidentiality disclaimer is a tad chilling, and I have some real questions about why you mentioned it.

Please read my posts on Netriplex here, and let's work on discussing the various matters offline over email. There is also another matter you should be aware of that I haven't posted publicly.

Posted by xbxbxc, 03-03-2011, 06:23 PM
I can't believe a provider would insinuate that its clients shouldn't express their feelings about service levels on this forum. That is a very threatening remark to make, and I would have to consider removing my equipment from there.

I am referring to "We do everything in our power to respect our clients’ privacy, and if you reference the confidentiality disclaimer at the bottom of Netriplex emails, you’ll see that we ask the same of you." posted above by ChrisGragtmans, a relatively new member with next to no posts.

Posted by andrewipv4, 03-14-2011, 02:38 AM
Quote:
Originally Posted by pubcrawler
Cells go bad *all the time* in today's batteries. Tends not to totally kill a battery, typically. Nor does a single battery typically drag down your power.
You missed the part where they said it was a shorted cell, not just that it was bad. If it were merely bad, the whole string would drop a couple of volts - not a life-or-death issue. If it shorts, it could very feasibly break the entire link, as UPS batteries are typically wired in series to form 400 to 500 volts of DC. I understand that you're ready to throw them under the bus, but just know that this failure scenario as depicted is quite possible.
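
To put rough numbers on that (illustrative figures only, not Netriplex's actual string configuration):

Code:
# Illustrative UPS battery string arithmetic (assumed figures, not Netriplex's specs).
blocks_per_string = 40   # 12 V VRLA blocks wired in series
volts_per_block = 12.0
cells_per_block = 6      # lead-acid cells are roughly 2 V each

string_voltage = blocks_per_string * volts_per_block   # 480 V DC
cell_voltage = volts_per_block / cells_per_block       # ~2 V

print(f"String voltage: {string_voltage:.0f} V DC")
print(f"A single weak cell only costs ~{cell_voltage:.0f} V "
      f"({cell_voltage / string_voltage:.1%} of the string)")

The voltage loss itself is trivial; the point is that every block sits in the one series path, so a single failed cell can take the whole string down with it, as described above.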

Posted by andrewipv4, 03-14-2011, 02:44 AM
Quote:
Originally Posted by pubcrawler
Your company hasn't been forthcoming about problems (in Asheville) and, in the case of outages, has been less than responsive (like when your network was blackholed and totally offline a few months ago, including your own website).
Do you have any threads or online resources that reference this blackholing?

Quote:
The reference to the confidentiality disclaimer is a tad chilling and I have some real questions about why you mentioned that.
I agree. I think every company would prefer to keep outages completely quiet and out of public view. But to chastise a customer for discussing one publicly is quite absurd.

Quote:
Originally Posted by ChrisGragtmans
The tone of this thread is rather unfortunate, because we match and work to exceed what the best in our industry do.
That's pretty easy to type, but much more difficult to do. It's a rather bold claim, and since you said so publicly, perhaps you'd care to back that up by publicly outlining your testing procedure for UPS battery strings, if any.


