

troubleshooting DNS nightmare




Posted by durangod, 11-19-2013, 01:31 PM
I hope this is posted in the right place. I searched for DNS and found threads all over, so I took my best guess.

I am not even sure where to begin. This is so confusing: just when it starts to make some sense, a monkey wrench gets thrown in and you're back to square one. I will do my best to explain.

Issue:

I am a reseller, and my clients are telling me that their sites are not accessible ("page not found"). I checked with my host and also checked the server from WHMCS, and the server appeared to be up and running.

So of course the next step is to tell the client to clear their cache, or that it must be a local DNS issue, or that they need to contact their ISP, because the server is up and running and the site is available both through a proxy and without one.

This has been happening for over a week now. Then it started happening to me as well. I also have several personal sites, and I started getting page not found on them. When I confirmed that the server was still up and running, I knew something was not right.

I thought maybe the server was banning me, so I checked, and my host agreed to clear all the IPs from the system for me. I also checked my site to be sure I was not banned for some reason in .htaccess. I still couldn't get access, so I checked the server firewall: nothing found. I called my ISP, they reset my modem, and it seemed to work.

But then a few hours later, here we go again: page not found. I went through the whole process again but didn't call my ISP, because all they would do is reset my modem again, which basically just gives me a different IP.

So I reset my modem myself and got a new IP, and no dice. Then I changed DNS manually through the TCP/IP settings window and voilà, there was the site.

So now I know it has to be some kind of DNS blocking, so I looked into that. My host told me that one of my IPs was found on the list today. It seems my third-party email client failed to sign on a few times while checking webmail, so the server thought I was a spammer, I guess, and blocked me.

So now I have been checking mail manually instead of every 20 minutes, and I also turned off SpamAssassin just in case. I also forwarded all my webmail for each cPanel into one notification email, so I only check one mailbox per cPanel rather than all of them. I just wanted to be sure nothing was hitting the server too much while I troubleshoot this.

So yesterday I went to a site of mine and got page not found, so I went to the root site and it was up and running. I was like, WTH, how can a root site be up and running while an add-on domain on the same cPanel gives me page not found? So I looked at the site using the Hide My Ass proxy lookup and mouse lookup, and the site showed on both.

I was like, WTH is going on here, why am I being blocked by IP? So I got out of the proxy and just changed my DNS, and all sites were up and fine.

So I'm starting to think it's a regional DNS server that is blocking me for some reason, but why would it block me? I'm just doing normal things during the day. And so I'm thinking that I am narrowing in on the culprit, when...

I get an email from my client that his website is down. I take a look, the server is up, I walk him through the steps, and he cannot get access no matter what we do. Then I contact a friend of mine in Texas, and he can see the site fine, and I can see the site fine. But another friend in Chicago cannot, and friends in Florida and California cannot either. So that blows the idea of a regional DNS server being the issue.

So now I am totally upside down. I don't know what's happening or where to even look anymore. Right now I am changing DNS servers and IP at least 10 times a day just to get access, and the whole time the server is up and running just fine.

I will probably lose a $300-a-year client over this, as I have no idea who to call. Is there a number to call to check a regional DNS server? I am just so lost I can't think straight.

Does anyone have any idea where to even start looking? It is happening to me and my clients all over the United States. It's not local, yet some can see the sites and some can't, and I don't know why.

Here is the kicker: if I wait about half a day it all goes back to normal, with no proxies and no alternate DNS, everything totally accessible, and then we start all over again.

P.S. One thing I did discover is that a block at any point along the data path gives you the same page not found, so it could be DNS, a firewall, anything at all. Which makes it all the more confusing.
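Since the browser shows the same error whichever stage fails, it can help to test the stages one at a time. The following is only a minimal Python sketch of that idea, not a complete diagnostic: the function name diagnose and its resolve/connect parameters are made up for illustration, and it only separates "the name did not resolve" from "the name resolved but the TCP connection failed".

```python
import socket


def diagnose(host, port=80, timeout=5.0,
             resolve=socket.getaddrinfo,
             connect=socket.create_connection):
    """Report which stage fails first: the DNS lookup or the TCP connect.

    `resolve` and `connect` default to the real socket functions; they are
    parameters only so the logic can be exercised without a network.
    """
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # The name could not be turned into an address at all.
        return f"DNS lookup failed: {exc}"

    address = infos[0][4][0]  # first address the resolver returned
    try:
        conn = connect((address, port), timeout)
    except OSError as exc:
        # The name resolved, so the block is somewhere after DNS.
        return f"DNS ok ({address}), but TCP connect failed: {exc}"

    conn.close()
    return f"DNS ok ({address}) and TCP connect ok"


if __name__ == "__main__":
    # Domain taken from this thread; any hostname works here.
    print(diagnose("durangodaves.com"))
```

If this reports a DNS failure while the same check passes after switching the machine to a different DNS server, that points at name resolution rather than at a firewall or the web server.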

Posted by vtechpk, 11-19-2013, 01:43 PM
During the downtime, run a traceroute. If there is a problem reaching the website you entered, a message such as "Destination Unreachable" may appear.

If you see much larger than normal hop times in any of the first hops that the traceroute shows, that indicates there may be a network issue with your Internet Service Provider. If the first hops appear normal and you see higher times in the last hops, or if the trace fails to complete, there may be a problem with the data center or server.
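As a rough illustration of reading a trace this way, here is a small Python sketch that parses Windows tracert output and reports which hops stayed silent and where the largest latency increase occurs. The function names are hypothetical, the "biggest jump" rule is just one simple way to apply the advice above, and the sample lines are a subset of the first trace posted later in this thread.

```python
import re

# One hop line of Windows tracert output: hop number, three probes
# ("<1 ms", "17 ms" or "*"), then the host text or "Request timed out."
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+ ms|\*)\s+){3})(.*)$")


def parse_tracert(text):
    """Return one dict per hop line: hop number, reply times in ms, host text."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m is None:
            continue  # header, blank line, or a wrapped hostname fragment
        # A "*" probe yields an empty string here and is dropped.
        times = [int(t) for t in re.findall(r"<?(\d+) ms|\*", m.group(2)) if t]
        hops.append({"hop": int(m.group(1)), "times": times,
                     "host": m.group(3).strip()})
    return hops


def summarize(hops):
    """Summarize silent hops, the largest latency jump, and whether the last hop replied."""
    silent = [h["hop"] for h in hops if not h["times"]]
    answered = [h for h in hops if h["times"]]
    biggest_jump = None  # (hop number, increase in best-case ms)
    for prev, cur in zip(answered, answered[1:]):
        jump = min(cur["times"]) - min(prev["times"])
        if biggest_jump is None or jump > biggest_jump[1]:
            biggest_jump = (cur["hop"], jump)
    return {
        "reached_last_hop": bool(hops) and bool(hops[-1]["times"]),
        "silent_hops": silent,
        "biggest_jump": biggest_jump,
    }


SAMPLE = """\
 1 <1 ms 1 ms 1 ms 192.168.0.1
 2 17 ms 17 ms 17 ms albq-dsl-gw49.albq.qwest.net [67.42.200.49]
 5 * * * Request timed out.
 6 57 ms 56 ms 56 ms vlan51.ebr1.Denver1.Level3.net [4.69.147.94]
13 60 ms 60 ms 60 ms 108.170.0.29
14 * * * Request timed out.
15 94 ms 101 ms 100 ms blade.neh27.com [198.15.78.58]
16 94 ms 100 ms 99 ms Razor.NorthXpro.com [198.15.78.59]
"""

if __name__ == "__main__":
    print(summarize(parse_tracert(SAMPLE)))
```

A hop that shows only "* * *" while later hops still answer usually just means that router does not reply to the probes. What matters more is whether the final hop answers.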

Posted by durangod, 11-19-2013, 01:51 PM
Thanks, that is a start, i will do that for a day or so when this happens and try to find out where the issue is, that helps alot thanks so much...

Posted by foobic, 11-19-2013, 02:16 PM
First thing is to check your DNS. Run the online checker at intodns.com and see what errors it flags up.

Posted by chadlnc, 11-19-2013, 03:41 PM
Who is the domain you use for NS registered with?

Posted by durangod, 11-19-2013, 08:39 PM
Here is one that is doing this right now. I get a page not found in all browsers.

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Users\xxxxxx>tracert durangodaves.com

Tracing route to durangodaves.com [198.15.78.59]
over a maximum of 30 hops:

  1    <1 ms     1 ms     1 ms  192.168.0.1
  2    17 ms    17 ms    17 ms  albq-dsl-gw49.albq.qwest.net [67.42.200.49]
  3    17 ms    72 ms    24 ms  albq-agw1.inet.qwest.net [71.222.249.129]
  4    26 ms    48 ms    26 ms  dvr-brdr-02.inet.qwest.net [67.14.24.114]
  5     *        *        *     Request timed out.
  6    57 ms    56 ms    56 ms  vlan51.ebr1.Denver1.Level3.net [4.69.147.94]
  7    57 ms    57 ms    57 ms  ae-2-2.ebr2.Dallas1.Level3.net [4.69.132.106]
  8    57 ms    57 ms    56 ms  ae-72-72.csw2.Dallas1.Level3.net [4.69.151.141]
  9    56 ms    57 ms    56 ms  ae-71-71.ebr1.Dallas1.Level3.net [4.69.151.138]
 10    57 ms    57 ms    57 ms  ae-1-8.bar1.Phoenix1.Level3.net [4.69.133.29]
 11    57 ms    56 ms    56 ms  ae-0-11.bar2.Phoenix1.Level3.net [4.69.148.114]
 12    57 ms    57 ms    59 ms  PHOENIX-NAP.bar2.Phoenix1.Level3.net [4.28.82.138]
 13    60 ms    60 ms    60 ms  108.170.0.29
 14     *        *        *     Request timed out.
 15    94 ms   101 ms   100 ms  blade.neh27.com [198.15.78.58]
 16    94 ms   100 ms    99 ms  Razor.NorthXpro.com [198.15.78.59]

Trace complete.

Now one note: last week we took this off my custom DNS name and dedicated IP and put it on a temporary DNS just to see if we could track down the problem, so if you run a DNS report you will see the two DNS entries. That's why.

However, this site, http://www.mycompanyaffiliate.com/, which is an add-on domain on the same cPanel as the one above, comes up without any issues.

And that's what blows my mind, because it's not the site itself doing this. I don't have anything in durangodaves.com that auto-bans anyone at all.

Posted by foobic, 11-19-2013, 09:23 PM
http://intodns.com/durangodaves.com

The problem appears to be with the nameserver IPs 108.170.18.178 and 179, which are not authoritative for the domain. Those IPs came from your zone files - they're A records for ns1 and ns2 under the domain durangodaveshosting.com.

Posted by durangod, 11-20-2013, 12:28 AM
Yes, we were testing. My host said he wanted to move my reseller account and all my sites over to northxpro for a week to see if things were better, so I had to shut down my dedicated IP and my custom NS for the time being.

What does not make sense to me is that I have not found anything that would cause the site to be down only sometimes. Please correct me if I am wrong, but if the zone records were the issue, wouldn't that mean the durangodaves site would never work as it is now?

That is what is so puzzling. It works great most of the time, just like the other domains, and then all of a sudden, for no reason I can think of, some of them just stop for 6 hours or so and then come back up. But it's only some of them; the others work great. Out of the 10 domains, one or two (and not always the same ones) will get page not found while the others are fine, but when I change DNS in my Windows TCP/IP settings to use an open DNS like Google, they are back.

It is really strange..

And this is without changing anything DNS wise on the host server.

There has to be something else going on, right? Or am I totally wrong on that? Thanks.

Posted by vtechpk, 11-20-2013, 12:42 AM
http://prntscr.com/25enya

Edit the DNS and change this to:

ns1.northxpro.com
ns2.northxpro.com

The issue will be resolved.

Posted by durangod, 11-20-2013, 12:44 AM
OK, I will give that a try, but I also wanted to post one more just to compare the two.

This site has been up all day, and now I get a page not found even though nothing on the server has been changed: www.americanpatriotleague.com

So I ran a world view check (image attached). You can see Denver and Phoenix on there, which is where my traceroute goes through, but they can get the site and I cannot. People around the world can see it fine, but at times people all over the USA cannot. It's this on-and-off behavior that messes with me.

Here is the traceroute:


Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Users\xxxxxx>tracert americanpatriotleague.com

Tracing route to americanpatriotleague.com [198.15.78.59]
over a maximum of 30 hops:

  1    <1 ms     1 ms     1 ms  192.168.0.1
  2    17 ms    20 ms    17 ms  albq-dsl-gw49.albq.qwest.net [67.42.200.49]
  3    17 ms    16 ms    16 ms  albq-agw1.inet.qwest.net [71.222.249.129]
  4    26 ms    26 ms    26 ms  dvr-brdr-02.inet.qwest.net [67.14.24.114]
  5    26 ms    28 ms    27 ms  63.146.26.134
  6    56 ms    57 ms    56 ms  vlan51.ebr1.Denver1.Level3.net [4.69.147.94]
  7    57 ms    57 ms    57 ms  ae-2-2.ebr2.Dallas1.Level3.net [4.69.132.106]
  8    57 ms    57 ms    57 ms  ae-72-72.csw2.Dallas1.Level3.net [4.69.151.141]
  9    57 ms    59 ms    57 ms  ae-71-71.ebr1.Dallas1.Level3.net [4.69.151.138]
 10    78 ms    57 ms    58 ms  ae-1-8.bar1.Phoenix1.Level3.net [4.69.133.29]
 11    57 ms    57 ms    57 ms  ae-0-11.bar2.Phoenix1.Level3.net [4.69.148.114]
 12    57 ms    58 ms    57 ms  PHOENIX-NAP.bar2.Phoenix1.Level3.net [4.28.82.138]
 13    81 ms    76 ms    78 ms  108.170.0.29
 14     *        *        *     Request timed out.
 15    99 ms   101 ms    96 ms  blade.neh27.com [198.15.78.58]
 16    97 ms   101 ms    99 ms  Razor.NorthXpro.com [198.15.78.59]

Trace complete.

So how is the traceroute able to see the path when I cannot? It makes no sense to me. I get page not found, but if I wait a while or change my DNS in Windows it will be OK. And as you can see, the world view is OK.

Posted by foobic, 11-20-2013, 12:46 AM
The DNS misconfiguration you've got commonly causes exactly the symptoms you're describing. If you "shut down" a nameserver so it's no longer authoritative for your domain then you need to remove all references to it, otherwise resolvers will (intermittently) try to use it, and fail to resolve your domains.
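A toy model may make the "intermittently" part easier to picture. The Python sketch below is only an illustration under simplifying assumptions: each resolver picks one listed nameserver at random and does not retry, whereas real resolvers retry other servers and cache answers. The nameserver lists are assumptions based on the names mentioned in this thread, not a dump of the real records.

```python
import random


def simulate_lookups(delegated, authoritative, resolvers, seed=0):
    """Each resolver picks one listed nameserver at random; the lookup
    succeeds only if that server really is authoritative for the zone."""
    rng = random.Random(seed)  # seeded so repeated runs give the same picks
    outcome = {}
    for resolver in resolvers:
        picked = rng.choice(delegated)
        outcome[resolver] = (picked, picked in authoritative)
    return outcome


if __name__ == "__main__":
    # Assumed for illustration: two working nameservers plus two stale
    # entries that still appear in the records but no longer answer.
    delegated = [
        "ns1.northxpro.com",
        "ns2.northxpro.com",
        "ns1.durangodaveshosting.com",
        "ns2.durangodaveshosting.com",
    ]
    authoritative = {"ns1.northxpro.com", "ns2.northxpro.com"}
    users = ["Texas", "Chicago", "Florida", "California"]
    for who, (server, ok) in simulate_lookups(delegated, authoritative, users).items():
        print(f"{who}: asked {server} -> {'resolved' if ok else 'FAILED'}")
```

With stale entries in the list, which users fail depends only on which server their resolver happened to ask, which would fit the pattern of some locations seeing the site while others do not.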

Posted by durangod, 11-20-2013, 12:55 AM
holy moly ok thanks that makes sense

Thank you so much, foobic, awesome. I am calling my host now. We need to go back to my custom NS and my dedicated IP anyway, so I will tell him it's time to put it back. Thanks so much everyone, you have all been AWESOME!!!

Posted by durangod, 11-20-2013, 01:14 AM
One more quick question on this: does this mean that I should go into my cPanel for those domains and delete or edit the A records so it will resolve, or does this all have to be done by my host at the server root level?

Posted by foobic, 11-20-2013, 02:24 AM
If you have access to edit zone files through cPanel you can do that (my recollection is that the zone editor was only in WHM from reseller-level up but I may be wrong).

And it probably doesn't matter too much which nameserver you use - the main thing is to have a consistent set of records at the registrar and in your zone files, for both the hosted domain and the nameserver domain(s). Intodns is good at pointing out inconsistencies. Then longer term if you want to improve reliability try to get a second nameserver in a different location - perhaps your host already offers this.
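The consistency check itself is simple to express. As a minimal sketch in Python, assuming you have already copied the nameserver names from the registrar and from the zone file by hand (the function name and the example values are hypothetical, and nothing here queries DNS):

```python
def ns_mismatches(registrar_ns, zone_ns):
    """Compare the nameserver names set at the registrar with the NS
    records in the zone and report any name present on only one side."""
    def normalize(names):
        # DNS names are case-insensitive and may carry a trailing dot.
        return {n.strip().lower().rstrip(".") for n in names}

    reg, zone = normalize(registrar_ns), normalize(zone_ns)
    return {
        "only_at_registrar": sorted(reg - zone),
        "only_in_zone": sorted(zone - reg),
    }


if __name__ == "__main__":
    # Example values only, loosely based on the names in this thread.
    report = ns_mismatches(
        ["NS1.northxpro.com", "ns2.northxpro.com"],
        ["ns1.durangodaveshosting.com.", "ns2.northxpro.com."],
    )
    print(report)
```

Both lists coming back empty means the two sides agree. Anything listed on only one side is a record to fix or remove, and the same comparison applies to the nameserver domain's own records.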

Posted by durangod, 11-20-2013, 02:45 AM
Thanks. Yeah, I'm looking at my OpenSRS now and I see what you mean. I'm sure I can change it from my WHM or cPanel, but I have to ask my host first. Once, when I did a transfer, I went in and changed the DNS from WHM thinking it would speed things up, and they got really upset with me, saying to never, ever make any changes directly to the zone. So I never touch it without asking them first. They said it will resolve on its own without touching that. I don't believe that, but I'm just following their orders.

Thanks so much. All the domains are fine in this regard; no other record has issues. The only one is the hosting domain. I think its A records are the problem here: they are causing DNS to try to use it for the other sites when it's not valid. The sad thing is that I had a gut feeling about this 5 days ago, because it was the only thing that was even remotely out of whack, but since it said this:

The DNS/Zone information on this page will have no effect because your nameservers need to be set to use our nameservers.

for the hosting NS, I ignored it since we had planned to go back. Crazy, crazy, crazy.



You're a lifesaver, you all are. Thanks again.


