

Infolink/Serverpronto




Posted by Mormegil, 06-25-2010, 09:29 AM
Double threat. My server went down twice this morning, and the system logs indicate a hard drive issue. The server's been up and running for nearly four years with no significant issues, and this kind of thing happens. When it went down again, though, I went to open a trouble ticket only to find that their website wouldn't load. Now I can't ping or resolve serverpronto.com or infolink.com.

And the waiting game begins...

Posted by lemon09, 06-25-2010, 09:32 AM
Can't access my server either. Their website never loads.

Posted by Mormegil, 06-25-2010, 09:40 AM
It usually works for me. Either way, they did have some sort of network issue a few years ago that caused an outage similar to this. It's always troubling when your hosts become unreachable, though.

Posted by Mormegil, 06-25-2010, 09:45 AM
Infolink's network seems to be coming back up.

Posted by stablehost, 06-25-2010, 09:46 AM
Both of my serverpronto servers went down, one came up fine, one now has a corrupted file system.

Not good.

Posted by Mormegil, 06-25-2010, 09:48 AM
Not good at all...

I'm wondering if they haven't experienced some sort of power outage. A networking issue wouldn't cause downed servers or corrupted filesystems.

Posted by psalzman, 06-25-2010, 09:49 AM
I'm having the same issue. My machine, running ESXi, rebooted twice this morning. One of the times, several of the VMs seemed to have 'disappeared'. Now it appears to just be down hard, and I'm having the same issue with none of the InfoLink sites resolving in DNS.

Sounds like a power failure...

Posted by psalzman, 06-25-2010, 09:54 AM
Crap - so your machines are back online? Hopefully mine will be soon... I'm worried about those disks

Posted by Mormegil, 06-25-2010, 10:12 AM
I'm up again.

Posted by psalzman, 06-25-2010, 10:16 AM
You're pretty lucky, it seems. I can't get to the infolink servers or my colo. Traceroutes are also dropping off at their edge router, coming in from Cogent.

Posted by King_Arthur, 06-25-2010, 10:56 AM
I can't access my colo server or any of the infolink resources:

Code:
traceroute myserver.mysite.us
traceroute to myserver.mysite.us (64.251.29.88), 30 hops max, 60 byte packets
 1  192.168.0.1 (192.168.0.1)  0.316 ms  0.241 ms  0.209 ms
 2  cpe-075-189-144-001.nc.res.rr.com (75.189.144.1)  14.193 ms  14.408 ms  14.764 ms
 3  66.26.44.53 (66.26.44.53)  14.334 ms  14.547 ms  14.513 ms
 4  ge-2-3-0.rlghncpop-rtr1.southeast.rr.com (24.93.64.164)  14.595 ms  14.584 ms  14.769 ms
 5  ae-3-0.cr0.dca10.tbone.rr.com (66.109.6.80)  21.842 ms  21.790 ms  21.771 ms
 6  ae-2-0.pr0.dca10.tbone.rr.com (66.109.6.169)  21.981 ms  20.953 ms  20.924 ms
 7  te0-7-0-7.mpd21.iad02.atlas.cogentco.com (154.54.13.157)  21.054 ms  20.231 ms  19.128 ms
 8  te0-1-0-4.ccr21.iad02.atlas.cogentco.com (154.54.31.93)  19.086 ms  19.230 ms te0-2-0-4.ccr21.iad02.atlas.cogentco.com (154.54.31.97)  19.201 ms
 9  te7-2.mpd01.atl01.atlas.cogentco.com (154.54.1.225)  29.140 ms te8-1.mpd01.atl01.atlas.cogentco.com (154.54.28.198)  29.087 ms te0-2-0-4.ccr21.dca01.atlas.cogentco.com (154.54.1.77)  20.258 ms
10  te3-1.ccr01.atl01.atlas.cogentco.com (154.54.28.202)  28.976 ms te8-2.ccr01.atl01.atlas.cogentco.com (154.54.1.169)  27.987 ms te7-3.ccr02.atl01.atlas.cogentco.com (154.54.28.57)  27.950 ms
11  te3-4.ccr01.mia01.atlas.cogentco.com (154.54.24.162)  41.744 ms te4-1.ccr01.mia01.atlas.cogentco.com (154.54.7.145)  41.670 ms  41.655 ms
12  vl3512.na21.b015452-0.mia01.atlas.cogentco.com (66.250.14.182)  42.122 ms  48.623 ms  48.771 ms
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *
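
For anyone collecting traces like the one above, the useful detail is the last hop that still answers (here hop 12, a Cogent router in Miami). A minimal sketch, assuming Linux-style traceroute output in the same format as above; the function name `last_responding_hop` is made up for illustration, and the sample is trimmed to the last few hops of the trace:

```python
import re

def last_responding_hop(traceroute_output: str):
    """Return (hop_number, hostname, ip) of the last hop that answered, or None."""
    last = None
    for line in traceroute_output.splitlines():
        # An answering hop looks like: "<n>  <host> (<ip>)  <rtt> ms ...".
        # The header line and "* * *" timeout lines do not match this pattern.
        m = re.match(r"\s*(\d+)\s+(\S+)\s+\((\d+\.\d+\.\d+\.\d+)\)", line)
        if m:
            last = (int(m.group(1)), m.group(2), m.group(3))
    return last

sample = """\
traceroute to myserver.mysite.us (64.251.29.88), 30 hops max, 60 byte packets
11  te3-4.ccr01.mia01.atlas.cogentco.com (154.54.24.162)  41.744 ms
12  vl3512.na21.b015452-0.mia01.atlas.cogentco.com (66.250.14.182)  42.122 ms  48.623 ms  48.771 ms
13  * * *
14  * * *
"""

hop = last_responding_hop(sample)
print(hop)
```

A trace that ends at the upstream carrier's router only shows that the path breaks somewhere past that point; it does not say whether the cause is routing, power, or filtering.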

Posted by King_Arthur, 06-25-2010, 02:43 PM
So my box became accessible again at 11:18:05AM EST. Based on mail logs and the server's last-log I have put together the following assumption. There was a massive power failure (so they don't have UPS/Genset?) and likely due to the utility/power failure they had network issues:

My busy mail server stopped talking at 04:12:49AM EST.
Server shows system boot at 07:39AM EST
Mail server talked briefly at 08:46:31AM EST
Server shows system boot at 09:31AM EST
Mail resumes talking and I can get to the server at 11:18:05AM EST

So based on my calculations that puts the outage at just over 7 hours.
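
The "just over 7 hours" figure can be checked from the first and last timestamps in the list. A small sketch, assuming both timestamps fall on the posting date (2010-06-25) and are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the list above; the date is an assumption
# (the day of the post), and both are treated as the same time zone.
FMT = "%Y-%m-%d %I:%M:%S%p"
mail_stopped = datetime.strptime("2010-06-25 04:12:49AM", FMT)
mail_resumed = datetime.strptime("2010-06-25 11:18:05AM", FMT)

outage = mail_resumed - mail_stopped
print(outage)  # 7:05:16
```

Note that this measures the span from first failure to final recovery; the server was briefly up in between, so actual downtime was somewhat shorter.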

I've not yet received a mass ticket from Infolink/ColoPronto explaining the problem. Going to try and hit their sites now and see what has been said.

Posted by psalzman, 06-25-2010, 02:47 PM
here's what I got from them:
---

Your server is back online.

This morning one of our data centers experienced multiple major failures of critical power systems, as follows:

* At approximately 0500 Eastern Time, we experienced a total loss of utility power, our backup generators and UPS systems functioned as designed, and this data center was functioning as it should.
* At approximately 0700 Eastern Time, the local utility restored power and our building was automatically switched back to utility power.
* At approximately 0710 Eastern Time, the local utility power dropped yet again, at this juncture one of our backup generators experienced a failure in both of its two starter motors. Our UPS systems functioned as designed, but exhausted their battery capacity, dropping power to most of the systems in this data center.
* At approximately 0730 Eastern Time, the local utility restored power again.
* At approximately 0845 Eastern Time, the local utility power dropped yet again, we were in the process of changing the starters on the generator, and at 0930, we were able to re-start our emergency generator.
* Although our local utility claims to have corrected their power issues which precipitated this issue, we are still running on generator power until we can perform a coordinated and monitored cut-over back to utility power. This is scheduled to happen at 1430 Eastern Time, as we want to ensure the technicians from the utility are on-site for this event, and will remain on-site to monitor this situation moving forwards.
* All UPS Systems are back online, and our emergency generator is functioning as it should be.

We apologize profusely for this issue, and can assure you we wish it did not happen. We have been working on an increased staffing level since this incident occurred to ensure we are doing everything possible to address any outstanding issues.

Please do contact us and let us know if you are experiencing any issues regarding this.

Again, a profuse apology,

The ServerPronto Team
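
The notice can be lined up against the boot times King_Arthur reported earlier in the thread (07:39 and 09:31). A rough sketch, assuming the provider's "approximately" times are close enough to compare and that both sets of times use the same clock; the variable names are illustrative only:

```python
from datetime import datetime

def t(hhmm: str) -> datetime:
    """Parse a 24-hour HHMM time (date part is irrelevant here)."""
    return datetime.strptime(hhmm, "%H%M")

# Power-restoring events from the provider's notice (approximate times).
power_back = {"utility restored": t("0730"), "generator restarted": t("0930")}
# System boot times from the customer's last-log, as posted in the thread.
boots = [t("0739"), t("0931")]

lags = []
for (event, when), boot in zip(power_back.items(), boots):
    lag = int((boot - when).total_seconds() // 60)
    lags.append(lag)
    print(f"{event} at {when:%H:%M}, server booted {lag} min later")
```

Both observed boots fall within ten minutes after a power-restoring event in the notice, which is consistent with the power-failure explanation.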

Posted by kaisersouse, 06-25-2010, 02:54 PM
You guys are lucky. I'm still down and the ticket I put into them is unanswered.

Unlike a lot of the "haters" I see in this forum, I really like ServerPronto. They've never done me wrong, and in the 4 years I've been with them this is the first real major issue I've ever had.

I will say though: If I had PAYING customers...I'd be >>pissed<< right now.

Posted by King_Arthur, 06-25-2010, 03:31 PM
You make a good point kaisersouse. Being a new ColoPronto customer (been on-line less than a month) I was wondering if I should be kicking myself right now. My primary mail server plus multiple sites/services run off the box in Miami, but if you haven't had any problems for 4 years until now maybe it will be ok

Hope your box comes back on-line and isn't suffering from hardware failure.

Glad I went w/ a RAID-1 on my box

Posted by psalzman, 06-25-2010, 03:34 PM
When I first signed up with them I was down every week, sometimes multiple times a week. They had a faulty power strip, they said, but also said they replaced it --- repeatedly. Tho in the last year I haven't had any issues. I use colopronto.

Posted by psalzman, 06-25-2010, 03:37 PM
Also, RAID-1 isn't going to fully protect you from power issues. It'll help, but you can still get what's called Silent Data Corruption from flaky, or unclean, power. ZFS, from Sun (er, Oracle), helps protect against this... but unfortunately it's not always practical and isn't really supported on much beyond Solaris and, I believe, FreeBSD.
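
The protection being described comes from storing a checksum with the data and verifying it on read. A toy sketch of that idea only, not of how any particular filesystem implements it; SHA-256 and the sample data are arbitrary choices for illustration:

```python
import hashlib

def checksum(block: bytes) -> str:
    """Checksum recorded alongside a block when it is written."""
    return hashlib.sha256(block).hexdigest()

block = b"mail spool contents" * 100
stored_sum = checksum(block)  # saved at write time

# Simulate silent corruption: one bit flips on disk, no I/O error is raised.
corrupted = bytearray(block)
corrupted[42] ^= 0x01
corrupted = bytes(corrupted)

intact_ok = checksum(block) == stored_sum      # clean read verifies
detected = checksum(corrupted) != stored_sum   # corrupted read is caught
print(intact_ok, detected)
```

A plain mirror keeps two copies but, without a checksum, has no way to tell which copy is the good one when they disagree.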

Posted by kaisersouse, 06-25-2010, 03:52 PM
Quote:
Originally Posted by King_Arthur
You make a good point kaisersouse. Being a new ColoPronto customer (been on-line less than a month) I was wondering if I should be kicking myself right now. My primary mail server plus multiple sites/services run off the box in Miami, but if you haven't had any problems for 4 years until now maybe it will be ok

Hope your box comes back on-line and isn't suffering from hardware failure.

Glad I went w/ a RAID-1 on my box
The biggest problems I had I brought on myself. That's the thing a lot of people don't get about SP...if you break it, it's YOUR fault. If you want them to fix it...you pay them to fix it. If you want to fix it yourself...have at it.

They get to charge so little money for their service because they aren't tossing cash out the window by pandering to those sysops who like to blow things up and then blame everyone else until it gets fixed for free.

"You break it, you buy it" hahaha

Posted by psalzman, 06-25-2010, 03:56 PM
Sometimes things just break though. I'd really like to see them start offering a console server product. That would be real nice... I should say a remote-KVM product, actually.

Posted by kaisersouse, 06-25-2010, 03:58 PM
Oh yes absolutely...things can just go wrong.

And last I knew they did offer such a service. Contact their customer service dept (or tech) and ask...because last I knew they did offer such a thing.

Posted by King_Arthur, 06-25-2010, 04:09 PM
Quote:
I'd really like to see them start offering a console server product. That would be real nice... I should say a remote-KVM product, actually.
psalzman, when I signed up I inquired into their KVM solution and I was told they will supply an IP/KVM for $25/2 hours. My guess is that it is a small/portable Raritan model.

Posted by psalzman, 06-25-2010, 04:11 PM
oh... nice... I had no idea. =)

Posted by King_Arthur, 06-25-2010, 04:13 PM
Quote:
Also, RAID-1 isn't going to fully protect you from power issues. It'll help, but you can still get what's called Silent Data Corruption from flaky, or unclean, power. ZFS, from Sun (er, Oracle), helps protect against this... but unfortunately it's not always practical and isn't really supported on much beyond Solaris and, I believe, FreeBSD.
I know RAID-1 will not protect against OS-level data corruption that can be caused by power failure, but I used a RAID card w/ battery backup, so any queued changes are written upon power restore, which helps some. Hopefully two disks will reduce downtime risks if power loss is frequent, i.e., if one of the disks fails due to frequent power loss, the other can sustain the system until I can ship a replacement.

Posted by King_Arthur, 06-25-2010, 04:17 PM
Chat transcript regarding the KVM, it's actually $25/24 hours:

Arthur: [15:32] is there a cost associated with the KVM
[15:32] and/or a cost for reboots
[15:33] or drive replacements etc
SP-Sales: [15:33] kvm is $25/24 hours access
[15:33] reboots... (is anyone charging for this, because we dont)
Arthur: [15:33] you would be surprised
SP-Sales: [15:33] Ha!!!!
[15:33] you made me laugh

Posted by psalzman, 06-25-2010, 04:21 PM
nice...

Yeah, they've been really good about not charging for reboots. A while back I had to use them quite a few times for that, and expected to be charged... I think they have the cost assigned in their system to handle 'abuse' of it.

Posted by anomaly65, 06-26-2010, 10:20 AM
Customer of colopronto for a few years now. I don't expect any facility to withstand the repeated ups and downs from FP&L (guess the F no longer stands for "Florida" :-) ).
They've been very good about getting no-cost reboots done and ($25/24hour) remote KVM setups done in short order, even during "non bankers" hours.
For the price, or more, it's difficult to beat their service.

Posted by cresci, 06-26-2010, 04:22 PM
Strange. We are what, 3 blocks away from them, and in our building there was no FP&L outage at all. They should be on the same power grid(s) as us; plus it's a power grid for hospitals and courthouses, so it should come back in the quickest way possible - downtown Miami is a place that can't really stop.

Also, if I'm not wrong, Cogent's mia01 pop is in the exact same building as Infolink (100 N Biscayne). If Cogent's router was responding, then there was power to Cogent, but not to them?

Just my 2 pennies.

Posted by (Stephen), 06-26-2010, 07:58 PM
Quote:
Originally Posted by iptelligent
Strange. We are what, 3 blocks away from them, and in our building there was no FP&L outage at all. They should be on the same power grid(s) as us; plus it's a power grid for hospitals and courthouses, so it should come back in the quickest way possible - downtown Miami is a place that can't really stop.

Also, if I'm not wrong, Cogent's mia01 pop is in the exact same building as Infolink (100 N Biscayne). If Cogent's router was responding, then there was power to Cogent, but not to them?

Just my 2 pennies.
Does anyone know where their servers are? Could it have been the 22nd Street location?
Given the mention of a couple of years, I have a feeling it is that location.

However, we are in CoreSite and did not experience FP&L issues either. (They are in a warehouse/retail center behind the CoreSite location, and were there before CoreSite, when it was WilTel.)

We actually have one server there as a monitoring station, and it is still reporting false positives due to routing problems right now.


