dns problems: multiple sites, dcs, subnets, and roaming laptops

xSauronx

Lifer
Jul 14, 2000
19,582
4
81
I started work at a company a few months ago that has 6 sites, each with its own DC [handling dns/dhcp] and subnet. 4 of the sites are directly connected via MPLS circuits, 2 are using site to site vpns.

there is wireless at each site, but the wireless clients all join the same dhcp scope as the wired network local to each site [there are 30 - 50 clients at each site getting dhcp addresses, very few of them wireless]. the dhcp leases are set at the default of 8 days.
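
since each site has its own subnet, an ip alone is enough to tell which site a laptop is currently at. a minimal sketch in python, assuming made-up site names and subnets (the real ranges aren't given in this thread, so `SITE_SUBNETS` is purely hypothetical):

```python
import ipaddress

# Hypothetical site names and subnets; replace with the real per-site ranges.
SITE_SUBNETS = {
    "site-a": ipaddress.ip_network("10.1.0.0/24"),
    "site-b": ipaddress.ip_network("10.2.0.0/24"),
}


def site_for_ip(ip, subnets=SITE_SUBNETS):
    """Return the name of the site whose subnet contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, net in subnets.items():
        if addr in net:
            return name
    return None
```

for example, `site_for_ip("10.2.0.57")` would return `"site-b"` with the placeholder ranges above.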

the problem: several users with laptops might be at multiple sites during the week. when we have to help them remotely, dns is not always up to date, so we have to track down the user's current ip some other way. it's a pain and i want it fixed properly, but i'm not sure of the best practices for this. when they travel they are typically only on the wifi at the other site; at their home site they have a dock and are wired/wireless.

an MSP in the past set DNS scavenging to be aggressive, with the timers set to 12 hours each. but dhcp is at the default of 8 days.
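
to put numbers on that mismatch, here is a rough sketch. it assumes a record becomes eligible for scavenging once the no-refresh and refresh intervals have both passed, and that a dhcp client first tries to renew at half the lease time. it ignores any separate schedule on which clients re-register their own records, so treat it as a simplification and not a model of what windows actually does. the function name `scavenge_gap` is just made up for illustration.

```python
from datetime import timedelta


def scavenge_gap(lease, no_refresh, refresh):
    """Compare scavenging eligibility with the first DHCP renewal.

    Returns (stale_after, first_renewal, at_risk), where at_risk is True
    when a record could be scavenged before the lease is first renewed.
    """
    # A record is eligible for scavenging after both intervals elapse.
    stale_after = no_refresh + refresh
    # Assumption: the client first attempts renewal at 50% of the lease.
    first_renewal = lease / 2
    return stale_after, first_renewal, stale_after < first_renewal


stale, renewal, at_risk = scavenge_gap(
    lease=timedelta(days=8),
    no_refresh=timedelta(hours=12),
    refresh=timedelta(hours=12),
)
print(stale, renewal, at_risk)
```

with the current settings the record is eligible for scavenging after 1 day while the first renewal is at 4 days, so under these assumptions the record can disappear long before dhcp refreshes it.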

what id like is dns always, or mostly always up to date so i can easily help/keep up with mobile users

also, these mobile users sometimes use a vpn through an asa 5510, but the asa hands out addresses to vpn clients itself. typically this happens at night and isn't a concern to me, but it can come up that someone on the vpn during the day might need help.

whats the best way to manage all of this?

/in the meantime i am trying to find out why a DC kept dropping out of dns, and why it cant write to dns, but i am sort of getting somewhere with that.
 

imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
set dns manual

Yeah no...

Sounds like the clients can't update their DNS. Is DHCP being handed out from Windows boxes? The clients should take out a DNS entry with write permissions assigned to the computer account. Does a DCdiag report the domain as healthy? If the DCs are dropping out of DNS you likely have serious corruption issues. What are the DNS Event logs telling you?
 

Mushkins

Golden Member
Feb 11, 2013
1,631
0
0
If it's happening the way you describe, this honestly doesn't sound like a DNS cache issue or lingering leases. Each office (and thus each wireless network) is on its own subnet, which means a laptop gets a brand new lease and a brand new set of DNS servers from the DHCP server every single time it switches offices, wiping out the old information. This is a server issue, not a client PC issue.

My first guess would be that whatever's going on with that DC dropping out of DNS is the root cause here, and the office mobility problems are merely a symptom, especially if the people running into those problems are visiting the office with the bum DC somewhere in the mix. Their most recent DNS entry, associating that PC name with its latest spot on the network, is probably updating on that bum DC and not propagating to the other DNS servers, causing duplicate entries. I'd focus on getting that issue resolved and see if your mystery connectivity issues go with it.
 

xSauronx

Lifer
Jul 14, 2000
19,582
4
81
Yeah no...

Sounds like the clients can't update their DNS. Is DHCP being handed out from Windows boxes? The clients should take out a DNS entry with write permissions assigned to the computer account. Does a DCdiag report the domain as healthy? If the DCs are dropping out of DNS you likely have serious corruption issues. What are the DNS Event logs telling you?

lots of things, ill get a better update later. one DC is dropping out. after talking to the boss today that DC didnt install properly [a year before i was here]. there wasnt even a sub-folder in the _msdcs subdomain for it until i deleted the CNF entry with ADSI edit.

Before deleting the CNF entry i restarted NETLOGON to try and write the SRV records [no site folder, no SRV records, it was not cool] and it wouldnt work. Deleting the CNF and another restart of NETLOGON at least got those all created, which did sync to the other DCs.

That troublesome DC cannot write itself back into DNS, i didnt have much time to dig in today and didnt think to see if it has synced down DNS entries from the rest of the domain but a quick vpn and a look-see shows

error event id 4011
The DNS server was unable to add or write an update of domain name $DCNAME in zone $DOMAIN.com

yadda yadda

SecErr: DSID-03150BB9, problem 4003 (INSUFF_ACCESS_RIGHTS)

i did some digging the other day but got distracted by eh, work. now, what WAS happening was that the DC was getting removed from dns DAILY.

i will log into my notes tomorrow and put up a timeline of what i have done and when. the DC is not getting deleted daily now, but it is bitching that it cant write its own record. I changed security on one level of DNS the other day because whenever it got deleted users couldnt pull reports from that site, since...they couldnt find that site. I had to stop it somehow.

it has dns entries from today from DHCP clients at other sites, so some portion of dns is working right...just not the important part where a DC can manage its own record. not great. I am pretty sure i undid that security change i made but ill have to check in the morning.

Sounds like the clients can't update their DNS. Is DHCP being handed out from Windows boxes? The clients should take out a DNS entry with write permissions assigned to the computer account.

Yes, each site has a DC doing DNS and DHCP for the site.
Looking at a few DNS entries for DHCP clients the computer account has write permissions to the entry.

i have dcdiag reports already, one from the troubling DC and one from the main site i work at. Ill dig into those again tomorrow. Lots of fails/errors from the problem DC, but i dont think the working one had the same/as many issues. I tried to spend some time wednesday and thursday troubleshooting through the logs but didnt really get far.

Except: DNS is still set in win2k configuration. That is, _msdcs folder is a subdomain of $DOMAIN.com, instead of its own zone. I have found the MS KB article for updating that to server 2003+ level but i didnt want to go through those steps with the DNS issues I am having right now, for fear of making things worse.
 

drebo

Diamond Member
Feb 24, 2006
7,034
1
81
Check your intersite topology links in AD Sites and Services. It sounds like AD thinks there's a full mesh but that there might not actually be. In a topology like yours, you probably want to manually create your links anyway to make sure the "hub" is the correct place.

Also, 8 day DHCP leases are unnecessary. 1 day tops.

As long as you're not running any Server2k DCs or Exchange 2000, you can safely upgrade your domain and forest functional levels to 2003 native without issue.
 

xSauronx

Lifer
Jul 14, 2000
19,582
4
81
yeah i know 8 days is too long, especially with the 12 hour scavenging periods.

Essentially, i had no lincolnton folder in _sites, and no SRV records for the lincolntondc. I created the record manually the other day, restarted netlogon on that dc, and the SRV records and _sites folder showed up on ALL DC/DNS servers

BUT the A record did not, even when i manually went to NTDS settings and said "replicate from here [a dc with the a record] to this site". I disabled scavenging over the weekend. The problem DC showed this error for a little while: http://technet.microsoft.com/en-us/library/cc735774(v=ws.10).aspx
Event Details
Product: Windows Operating System
ID : 800
Source: Microsoft-Windows-DNS-Server-Service
Version: 6.0
Symbolic Name: DNS_EVENT_ZONE_BAD_PRIMARY_SERVER
Message: The zone %1 is configured to accept updates but the A record for the primary server in the zone's SOA record is not available on this DNS server. This may indicate a configuration problem. If the address of the primary server for the zone cannot be resolved DNS clients will be unable to locate a server to accept updates for this zone. This will cause DNS clients to be unable to perform DNS updates.

so when I created the A record on my site dc, the problem dc didn't get the sync of that A record either, even though ALL DNS SERVERS got that site's folder and SRV records added to their _sites folders. bizarre. The problem dc [lincolntondc] did get the record eventually, so the error hasn't come up again. But only 3 of 7 DNS servers have that A record.
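
to keep track of which servers have the record and which don't, something like the sketch below could summarize the answers once they have been collected, for example by querying each server by hand with `nslookup <name> <server>`. the server names and addresses in the example are placeholders, and both function names are made up for illustration.

```python
def servers_missing_record(answers):
    """answers maps a DNS server name to the set of IPs it returned
    for one hostname. Return the servers that returned nothing."""
    return sorted(server for server, ips in answers.items() if not ips)


def distinct_answers(answers):
    """Return the distinct non-empty answer sets.
    More than one entry means the servers disagree with each other."""
    return {frozenset(ips) for ips in answers.values() if ips}


# Placeholder data: three hypothetical servers queried for one hostname.
answers = {
    "dc1": {"10.1.0.5"},
    "dc2": set(),
    "dc3": {"10.1.0.5"},
}
print(servers_missing_record(answers))
print(len(distinct_answers(answers)))
```

running the same comparison again after forcing replication would show whether the missing A record is spreading to more servers or staying stuck on the same ones.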

/grumble

i havent had to manually setup sites and services before, ill have to do some reading before i change anything in that configuration and run it by my boss. same goes for changing the DNS structure. Functional level is at 2003, but the dns structure is not: http://support.microsoft.com/kb/817470
 

hoboville

Junior Member
Nov 10, 2013
9
0
0
The zone %1 is configured to accept updates but the A record for the primary server in the zone's SOA record is not available on this DNS server. This may indicate a configuration problem. If the address of the primary server for the zone cannot be resolved DNS clients will be unable to locate a server to accept updates for this zone. This will cause DNS clients to be unable to perform DNS updates.

I believe that is your problem, the A record for the root DC may either be invalid, or your DNS settings aren't allowing the changes to replicate from the primary DC. This is probably why you are having to manually update the records at the other sites.

I would check the basic DNS settings to see if they are allowed to receive dynamic updates from the root DC. If they are, then you'll need to check the authentication status of the root DC as the secondary DCs see it. That could involve anything from certificates to Kerberos.
 

xSauronx

Lifer
Jul 14, 2000
19,582
4
81
secure updates are allowed, but that last bit will give me something to consider, though i do not recall seeing errors that would have led me there, i may have overlooked them.

I had seen some suggestions for a similar issue [disappearing DC A record] that included turning DNS into non-AD integrated, seeing if it would catch up and properly sync all records, then turning AD integration on again. not wild about the idea [sounds a little drastic?], as I havent done it before, so i dont know if there are any gotchas.