Mysterious year-long problem with network share: solve this and you are God :)


imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
Dynamically configured computers do show up in forward lookup zone (A record), but not in the reverse lookup zone. In the DHCP server, the options for dynamic updating of DNS are enabled.

Any idea why no entries in the reverse lookup zone are created?

All devices with static IPs (outside of the range of assignable addresses of the DHCP of course) do create entries in both zones after startup and can resolve all names just fine.

If the entries already existed, they get reused. You may need to delete them and then run ipconfig /registerdns. It is not as important that the reverse zone works right, but eventually, after DNS does its housecleaning, you should start seeing entries appear in that zone. Did you AD-integrate the reverse zone? If not, you should.
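To check from a client whether a given address has a reverse entry, something like the following may help. It is only a minimal sketch: the function names are made up for illustration, and it assumes the machine's configured resolver points at the DNS server in question.

```python
import ipaddress
import socket


def reverse_lookup_name(ip: str) -> str:
    """Return the PTR record name that the reverse zone would hold for ip."""
    return ipaddress.ip_address(ip).reverse_pointer


def has_ptr_record(ip: str) -> bool:
    """Ask the configured resolver whether a reverse (PTR) entry exists."""
    try:
        socket.gethostbyaddr(ip)
        return True
    except (socket.herror, socket.gaierror):
        # No reverse entry (or the resolver could not answer).
        return False
```

Calling reverse_lookup_name("192.168.1.50") (a made-up address) gives the record name to look for in the reverse zone, and has_ptr_record() on a DHCP client's address versus a static one should show the difference described above.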
 

IronCrown

Junior Member
Apr 23, 2012
12
0
0
It was already integrated. I also deleted all old entries and then started to reboot client computers. All the computers with static setup register in forward and reverse zone. The clients that pull their data via DHCP only show in forward.

Maybe I'll set up DHCP and DNS from scratch in the next few days. Especially if what I did today does not fix the problem...

If I do manage to solve the problem I'll definitely advocate that we change our IT contractor. I usually solve any IT problem when I try hard enough, but this is really something they should be able to do better than me.

We have a long-standing contract though and have been working with them since many years before I joined the firm. Replacing them would be a major step... and if it turned out that the new contractor cannot solve the problem either because it's something really, really complicated, then people would ask questions why we went through all the hassle for basically no improvement.
 

imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
I would keep at it a bit. Realize that we could be completely off base. We are not there to troubleshoot, so we only get a straw-hole view of your problem. Let us know what you see.
 

drebo

Diamond Member
Feb 24, 2006
7,034
1
81
This actually sounds a fair bit like a problem one of our customers had.

It was with a specific piece of legal software that requires file shares (Abacus).

The problem ended up being (among other things) mostly related to offline files being enabled on the clients that were seeing the issue. Disabling offline file access resolved the problem for good.

It didn't help that the person who originally set up the network was a moron, though.
 

IronCrown

Junior Member
Apr 23, 2012
12
0
0
We don't use Abacus atm but my new boss told me recently that he would like to have it installed :eek:

The offline files was something I considered only several weeks ago. I had noticed that offline files was enabled on the share although we don't need it, and figured that it might be the culprit. I disabled it and we did not have the problem for several days, I think it was even for more than a week. I was already triumphant that I had found the solution... and then it was back one day all of a sudden.

This happened several times already with many totally different "solutions"... right after I do something, the problem vanishes for a few hours, days, a week. It has to be coincidence but it's really weird...

edit: On your suggestion I thought again about offline files and noticed that the network share has them disabled, but the clients have them enabled. The server should be the important part I guess, but I have now disabled them on two of the most affected machines and will monitor what happens...
 

her209

No Lifer
Oct 11, 2000
56,336
11
0
Only SBS enforces the CAL limits. The "full" server OSes only monitor usage and report it to the licensing service for logging. You should never see Windows [OS] licensing blocking access as long as you are not on SBS.

That link is about configuring the licensing service and would have no bearing on connectivity.
Can OP verify that it's not the Small Business Server version that is running the file share?
 

IronCrown

Junior Member
Apr 23, 2012
12
0
0
System info says "Windows Server Standard".

Another massive hit today about 20 minutes into "prime time" (when all people are at the office).

One thing I noticed today for the first time when I monitored the open sessions and open files on the file server: Almost all users were, according to the server, "reading" a certain file in the base directory of the network share. The interesting part is that this file is a 33-gigabyte zip archive. This archive was created in May 2011 and contained all the content of the network share at that time, about 65,000 files.

Of course no one was actually using it, but obviously the clients automatically tried to read its contents nonetheless... the why of it is beyond me. I have now deleted this file. It does not seem entirely impossible to me that several users constantly accessing a huge compressed file may cause network problems. I shall see...
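For anyone wanting to hunt for similar forgotten giants on a share, a rough sketch along these lines could list them. The function names and the 1 GiB default threshold are arbitrary choices for illustration, not anything from the setup described here.

```python
import os


def find_large_files(root, min_bytes):
    """Yield (path, size) for every file under root of at least min_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                yield path, size


def largest_first(root, min_bytes=1024 ** 3):
    """Return the matching files sorted by size, biggest first."""
    return sorted(find_large_files(root, min_bytes),
                  key=lambda item: item[1], reverse=True)
```

Pointing largest_first() at the mapped share (or the folder on the server itself) returns a list of (path, size) pairs to review by hand; it only reads file sizes, never file contents.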
 

her209

No Lifer
Oct 11, 2000
56,336
11
0
Could be the anti-virus on the user's PC (assuming there is one installed) tries to open it and scan it.

:sneaky:
 

IronCrown

Junior Member
Apr 23, 2012
12
0
0
Yeah, though I had already tried to disable file scanning by the AV... but not on all affected computers at the same time.

I'm quite excited now because it all fits so well... I checked old invoices and we bought our new switch in July 2011... and I remember that the problem appeared some time, but not a long time, before our old switch broke. May 2011 could be it (me saying that we've had the problem for 14-15 months was only an estimate; because I couldn't initially know that this would be such a persistent problem, I didn't keep written records).

Initial check seems to indicate that all the computers that did not have the problem have a different AV software, mostly TrendMicro instead of McAfee. But there are two computers that run McAfee and did also not have the problem – probably the main reason I did not detect this pattern. These are the ones that do not have Win 7 but Vista/XP. (edit: Ok that didn't make sense. What I wanted to say: all affected machines have McAfee, but two of the not-affected machines do also have McAfee.)

It could still be just coincidence, but I'm crossing my fingers...
 

oynaz

Platinum Member
May 14, 2003
2,449
2
81
Crossing my fingers for you.

I have had the exact same problem twice.

One was crappy antivirus software (an old version of F-Prot).
The other was a user who had brought his own crappy switch and hidden it so he could download porn to his private laptop (yes, really!)
 

Rhyseh

Junior Member
Apr 2, 2012
5
0
0
Through a switch!? I don't even.... huh?

This is the reason why you hardcode all switch ports to either be trunks or access ports...
 

IronCrown

Junior Member
Apr 23, 2012
12
0
0
It's only been one day, but I'm fairly certain that I fixed the problem. That is because unlike with previous "solutions", I see a clear explanation why it happened and why my fix would work.

The situation was: A 33-gigabyte compressed file was lying unused and forgotten in the base directory of the network share, a folder everyone opens all the time. All affected machines accessed this file all the time according to the file server session monitor. After I deleted the file the problems abruptly stopped and did not appear again, without any client or server reboots.

So the file was the immediate problem. Why should McAfee be a part of the problem as well? All affected machines are running McAfee, and no machine that does not run McAfee was affected; they might be slow sometimes when accessing the network, but they never got hung up.

Two McAfee-using machines did not get hung up either; those have an OS other than Windows 7.

My conclusion is that the problem was caused by a combination of huge-compressed-file + McAfee + Windows 7. Though it should be said that the unwanted and erroneous scanning of the huge file (McAfee is configured to NOT scan on network drives on all clients!!) likely slowed down our network and would have been a source of bad performance even without the hang-ups.
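The kind of pattern-hunting described above (which attributes do all affected machines share that no healthy machine has?) can be sketched in a few lines. The inventory below is entirely hypothetical and only mirrors the pattern described in this thread; the function name is made up as well.

```python
from itertools import combinations


def distinguishing_combinations(machines, affected_key="affected"):
    """Return attribute combinations found on every affected machine and on
    no unaffected machine, stopping at the smallest combination size."""
    affected = [m for m in machines if m[affected_key]]
    healthy = [m for m in machines if not m[affected_key]]
    if not affected:
        return []
    attrs = sorted(k for k in affected[0] if k != affected_key)
    found = []
    for size in range(1, len(attrs) + 1):
        for keys in combinations(attrs, size):
            values = tuple(affected[0][k] for k in keys)
            on_all_affected = all(
                tuple(m[k] for k in keys) == values for m in affected)
            on_no_healthy = all(
                tuple(m[k] for k in keys) != values for m in healthy)
            if on_all_affected and on_no_healthy:
                found.append(dict(zip(keys, values)))
        if found:
            break  # smallest explanation found; do not look at larger ones
    return found


# Hypothetical inventory; real data would come from an asset list.
inventory = [
    {"av": "McAfee", "os": "Windows 7", "affected": True},
    {"av": "McAfee", "os": "Windows 7", "affected": True},
    {"av": "McAfee", "os": "Windows XP", "affected": False},
    {"av": "McAfee", "os": "Windows Vista", "affected": False},
    {"av": "TrendMicro", "os": "Windows 7", "affected": False},
]
```

With this made-up inventory, neither the AV product nor the OS alone separates the two groups, and only the pair McAfee plus Windows 7 does. A match like this is still only a correlation, so it narrows down where to look rather than proving the cause.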

Thanks again to everyone for the input. It made me think again about many possible causes and look at things I thought were better left to our contractor...
 

imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
Interesting, I would be curious why that would cause the server to stop responding. Does the server have AV? Maybe the AV is blocking the server from sharing it out and hanging it up while the AV tries to scan 33GB of files in the ZIP. Try moving the file out of a network share maybe?
 

drebo

Diamond Member
Feb 24, 2006
7,034
1
81
Well, consider that if the clients are using Offline Files, that 33 GB file is being synced to their systems, where McAfee scans it and changes its modified date. The systems then all sync it back to the server.
 

ScottMac

Moderator, Networking, Elite member
Mar 19, 2001
5,471
2
0
Through a switch!? I don't even.... huh?

This is the reason why you hardcode all switch ports to either be trunks or access ports...

...and set the hop count / Time-to-Live (TTL) such that the addition of another router causes TTL to expire (prevents NATing to multiple hosts behind the router).
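The TTL idea can be illustrated with a toy simulation. This is a simplified model of IP forwarding (each router decrements TTL and drops the packet when it hits zero), not a configuration for any particular device, and the function name is invented for the example.

```python
def routers_crossed(initial_ttl, routers_on_path):
    """Simulate TTL handling along a path.

    Each router decrements TTL before forwarding and drops the packet when
    the value reaches zero. Returns (delivered, routers_passed)."""
    ttl = initial_ttl
    for hop in range(routers_on_path):
        ttl -= 1
        if ttl <= 0:
            return False, hop  # dropped at this router, never forwarded
    return True, routers_on_path
```

With a TTL of 1, a packet reaches a host on the directly attached segment (zero routers in between) but is dropped as soon as someone inserts a router of their own, which is the effect described above.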
 

imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
Well, consider that if the clients are using Offline files, that 33gb file is being synced to their systems, where McAfee scans it and changes its modified date. The systems then all sync it back to the server.

Good point: sync down, scan, sync back [server scans], Offline Files would see the change, rinse and repeat.
 

IronCrown

Junior Member
Apr 23, 2012
12
0
0
Only that offline files had been disabled server-side for about two months. If the file server does not allow offline files on the network share, it shouldn't happen, no? (And the big file still had its 2011 modified date, so it was not changed.)

Also, the clients' McAfee was configured to not scan on network drives. So why did they access this file when no user ever knowingly used it?

I don't know. But it's gone now and everything continues to be fine.
 

imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
Offline Files continues to try to sync from the clients even if it is disabled on the share. It also syncs by directory, so that backup file is likely floating around your network and getting timestamp updates that are causing attempted and failed syncs.

PS: Now that things are more stable, it is a good time to go through and review settings to further improve stability.