Network Gurus Only!!

BIGMACC

Member
Oct 8, 2001
92
0
0
Here's the scenario: I admin a network of both Macintosh and Windows systems: 29 Macs running Mac OS 9.x, 9 Windows machines running 98, one Mac file server running ASIP 6.3, and one Windows NT 4 server. Seven 10/100BaseT switches make up the network over two floors of an office condo, sharing T1 internet access protected by an IPsec firewall/VPN/network appliance that is the DHCP server for the entire network at this time (I will be switching to Mac OS X Server to serve the Macs, and the NT 4 server will serve the Windows systems).

All systems are playing incredibly well together, with one glitch! I have one system (a Mac) that keeps getting an unexpected disconnect from the Apple file server. I have given this system a static IP address that is outside the addresses being served by the network appliance. I have switched the patch cable, switched ports on the switch, hell, I've plugged her into a different switch.

THIS IS NOT A MAC ISSUE, so please keep the comments to yourselves!! I have heard all the comments. I admin both Mac and Windows and can tell you the pros and cons of both systems. This is a networking issue, and I have racked my brain trying to figure this one out. Any help would be greatly appreciated!!!
 

vi edit

Elite Member
Super Moderator
Oct 28, 1999
62,484
8,345
126
Does the disconnect happen at any specific times, or is it completely random?

Sounds to me like you are having a rogue machine/appliance/piece of equipment hopping on there every now and then with the same IP addy.
 

BIGMACC

Completely random, in any application. Most of the time it happens when she is printing a file that has linked files located on the server, but not every instance is during printing.

I ruled out that problem by giving the system an IP addy that is not in the range of addys being served (i.e., the addys being served are 20-100). I gave the system an IP addy of 10. Nothing else has that IP addy.
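For anyone wanting to double-check that kind of assignment, here is a minimal Python sketch of the range test. Only the last octets (10 for the static address, 20-100 for the served pool) come from this post; the 192.168.1.x subnet and the function name are assumptions made up for illustration.

```python
import ipaddress


def in_dhcp_pool(address, pool_start, pool_end):
    """Return True if address lies inside the inclusive DHCP pool range."""
    addr = ipaddress.ip_address(address)
    return ipaddress.ip_address(pool_start) <= addr <= ipaddress.ip_address(pool_end)


# Hypothetical subnet: only the last octets are taken from the thread.
POOL_START = "192.168.1.20"
POOL_END = "192.168.1.100"
STATIC_CLIENT = "192.168.1.10"

print(in_dhcp_pool(STATIC_CLIENT, POOL_START, POOL_END))  # False
```

A False result only shows the DHCP server will not hand that address out; it says nothing about another statically configured device using it.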
 

ttn1

Senior member
Oct 24, 2000
680
0
0
If it's a complete network failure, I would check the power supply. I have had 2 PCs that would randomly lose the network. It went on for over a month without any reason that I could find. Turned out that the power supply fans had slowed and the power supplies were overheating. The computers ran fine other than losing the network. Never crashed or had other problems. Replaced the fans and they haven't lost the network in over 4 months.
 

vi edit

Is there any remote chance that somebody could be jacking into the hub with a laptop/PDA/printer/etc.?

Not any of your equipment per se, but just a tech/user/employee of sorts plugging into an open port?
 

BIGMACC

ttn1 thanx, I will def look into that!!

vi_edit No, I keep a tight grip around here, especially on the ones that think they know what they are doing! They are the most dangerous! The majority of the employees are graphic artists. Just a little insight: the laptops are rarely moved within the office, there is no WLAN, and, one big piece of info, nothing has been added to or taken away from the network. Well, that's not true, but this problem was occurring before the addition.
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
Macs are funny little things when they're on a network. :)

I remember a very peculiar problem with Macs and switched networks that causes extreme headaches. Macs announce their name and other info on boot up. If they don't hear a challenge, then they assume all is well and start using that name and other info. All well and good, but in a switched network the port is probably still in a listening/learning state for spanning-tree.

Cisco switches are notorious for this, as it takes a good 50 seconds for the Mac to truly join the bridge; during this time any traffic the Mac transmits is dropped.

What kind of switches are these, and can you bypass the listening/learning states of spanning-tree as well as any auto trunking/channel? Also, what is the network layout: are the switches just chained together willy-nilly, or is there some hierarchy to the net?
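To put rough numbers on the spanning-tree delay, here is a small Python sketch using the classic IEEE 802.1D default timers (forward delay 15 s, max age 20 s). Whether the switches in this office run spanning-tree at all, or use these defaults, is an assumption; the function names are made up for illustration.

```python
# Classic IEEE 802.1D default timers, in seconds.
FORWARD_DELAY = 15  # time spent in each of the listening and learning states
MAX_AGE = 20        # how long stale topology information is kept


def port_up_delay(forward_delay=FORWARD_DELAY):
    """Listening plus learning time before a freshly linked port forwards."""
    return 2 * forward_delay


def worst_case_convergence(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Max age expiry followed by listening and learning."""
    return max_age + 2 * forward_delay


print(port_up_delay())           # 30
print(worst_case_convergence())  # 50
```

The 30 seconds is what a port spends in listening and learning after link comes up. The 50 seconds adds the max age timer; on Cisco gear, channel and trunk negotiation on top of the 30 seconds is another way a port can take that long.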
 

BIGMACC

The switches upstairs are stacked, not so "willy-nilly": each dept. has its own switch, and the server and printers are on their own switch. We are not using AppleTalk (commonly known as ChattyTalk, since each node wants to be recognized with each network query) but TCP/IP, which cuts way down on the unnecessary network traffic.

Most of the systems do not shut down at night during the week, and are re-leased the same IP addys. But I removed that variable from the equation. Also, if there is another device on the network with the same IP addy, the Mac will let you know as soon as the system queries the network.

The switches are 5 Asante and 2 Linksys.
 

Tallgeese

Diamond Member
Feb 26, 2001
5,775
1
0
Originally posted by: BIGMACC
Here's the scenario: I admin a network of both Macintosh and Windows systems... (will be switching to Mac OS X server to serve the Macs and NT 4 server will serve the Windows systems).
A bit OT, but why have two different systems serving your network? NT or 2000 could easily handle both sets of clients... just a thought.
I have one system (a Mac) that keeps getting an unexpected disconnect from the Apple file server. I have given this system a static IP address that is outside the addresses being served by the network appliance. I have switched the patch cable, switched ports on the switch, hell, I've plugged her into a different switch.
Some Qs:
What model Macintosh?
Is ARA installed on this machine?
What version of OT and AppleShare client are installed?

Also, your t-shooting of the physical wiring and infrastructure was a good start.
Did you eliminate her in-wall wiring drop as a possible culprit? (don't think this is it, but when t-shooting: thoroughness is next to godliness, and gets you home earlier)

The switches you mention will make it harder to t-shoot errors, since they are most likely unmanaged (dumb) switches.
Also, I believe that 7 switches is the top end of spec for chaining switches (yes, there is a limit)
 

Tallgeese

Originally posted by: BIGMACC
Completely random, in any application. Most of the time it happens when she is printing a file that has linked files located on the server, but not every instance is during printing.
Does this machine lose access to the Internet (through the T-1) when this happens or not?
If not, then it's probably some interaction with the ASIP server directly.
 

Garion

Platinum Member
Apr 23, 2001
2,331
7
81
Interesting problem. Let me give you a bit of insight into my troubleshooting process. Keep in mind that I'm primarily a network guy - If I can ping it and I know the ports are open, it's a server or a client thing, which I don't usually get into much anymore.

I always do this kind of troubleshooting by the layers of the OSI model. Easier that way, and you know you've covered all the bases.

There are a few questions, however. I'm rusty on my Macspeak, but... Are you using straight IP, or is there any AppleTalk involved? When this computer "breaks", how do you fix it?

1: Physical layer - Cabling

You've tried another patch cable and switch, so you know that's not the problem. How about plugging the machine into a totally separate jack leading through the wall? Tried another machine on that same wall jack to see if it does the same thing, to tie the blame to the cable?

Do you have a link light after it's dead? Do you see the same kind of traffic activity blinking (assuming that you do have a traffic LED on the NIC) after it's died as when it's working?

2: Data link layer (NIC<->Switch communications)

Look at the speed/duplex settings on the Mac and the switch. Make sure it's all at auto/auto. If it is, try locking everything down to 10BaseT, half duplex. Yeah, it's slower, but he'll probably never notice. You might be getting a NIC that's having errors due to speed/duplex problems and either the switch or the NIC is shutting him down to prevent it from corrupting the rest of the network. This is more likely if it happens when printing, as that usually sends a LOT of data across the network, typically WAY bigger than the ordinary size of the file. Lots of traffic could mean lots of errors which means it gets shut down.

A big FTP or file transfer might also help identify this. Try to find a file that's 200+MB and transfer it from the server.

4: Make sure that you have one "main" switch that has connections to all the rest of the switches and to the file server. Make sure you don't have switches cascade one to another to another, except where absolutely necessary. Never go more than 3 cascaded switches deep. More than 5 switch hops is bad.
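As a rough way to audit cascade depth, here is a minimal Python sketch that does a breadth-first search from the "main" switch and flags anything sitting in a chain more than 3 switches deep. The switch names and uplinks are invented for illustration; they are not the layout from this thread.

```python
from collections import deque


def hops_from_core(links, core):
    """Breadth-first search: switch-to-switch hops from core to every switch."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    hops = {core: 0}
    queue = deque([core])
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, ()):
            if nxt not in hops:
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    return hops


# Hypothetical uplinks: a four-switch chain plus three switches off the core.
links = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"),
         ("sw1", "sw5"), ("sw1", "sw6"), ("sw1", "sw7")]

hops = hops_from_core(links, "sw1")
# A switch N hops from the core sits at the end of a chain of N + 1 switches.
too_deep = sorted(sw for sw, n in hops.items() if n + 1 > 3)
print(too_deep)  # ['sw4']
```

Moving the uplink of any flagged switch straight to the main switch brings the chain back within the 3-deep rule of thumb.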

If you can scrounge up a plain 10BaseT hub, try and put the hub between the Mac and the switch. That should eliminate speed/duplex problems. Not a 10/100 or any kind of "smart" hub or switch. Just a dumb little hub.

Make sure that all "fancy" features are turned off on the switch port it's plugged into. For example, a Cisco port will go through its speed/duplex check, then try to look for an EtherChannel, then make sure there's not a spanning tree loop, THEN go into active mode. This often breaks DHCP on Win9X machines, as they give up trying to get an address before the switch port goes active.

Any chance you've got a bad NIC in this box? Can you change it, or is it onboard? How about snagging a FireWire or USB NIC to try and see if it makes a difference?


Layer 3/4 - Network (IP communications)

1: Change it to a different static IP and see if that helps, to rule out someone ELSE with the same static IP plugging in occasionally.

2: Set up a ping -t from a Windows machine to do a constant ping to the device. Next time he has a problem, check and see when it stopped. Better yet, use a free ping checker app like Ping Plotter to watch the success rate. If the pings continue and his server access stops, then you've got some kind of client/server problem. (Not Mac bashing, just that the network is OK.) If you CAN still ping it and he's still broken, look at your ARP tables to see if you've got a different MAC address. If so, you've got another machine using the same IP.

Try and set up a similar ping from the Mac to the printer he's sending to. If the pings stop when his system loses network connectivity, it's probably a Layer 1 or Layer 2 problem.

As TG suggested, see if you can ping anything or browse the Internet after the communications die.
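If you log the pings and ARP entries, the two checks above can be sketched in Python like this. It assumes you have already collected (IP, MAC) pairs from ARP tables and (timestamp, success) pairs from a ping log by whatever means you like; the sample addresses, timestamps, and function names are all made up for illustration.

```python
def find_ip_conflicts(arp_observations):
    """Return the IPs that were seen with more than one MAC address."""
    seen = {}
    for ip, mac in arp_observations:
        seen.setdefault(ip, set()).add(mac.lower())
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}


def first_outage(ping_log):
    """Return the timestamp of the first failed ping, or None if none failed."""
    for timestamp, ok in ping_log:
        if not ok:
            return timestamp
    return None


# Hypothetical observations gathered over a day.
arp_observations = [
    ("192.168.1.10", "00:30:65:AA:BB:CC"),
    ("192.168.1.10", "00:a0:c9:11:22:33"),  # a second MAC on the same IP
    ("192.168.1.21", "00:30:65:dd:ee:ff"),
]
ping_log = [("14:02:10", True), ("14:02:11", True), ("14:02:12", False)]

print(sorted(find_ip_conflicts(arp_observations)))  # ['192.168.1.10']
print(first_outage(ping_log))                       # 14:02:12
```

Comparing the first failed ping against the time the user reports the disconnect tells you whether the network path died first or the server session dropped while pings kept flowing.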

Best of luck!

- G
 

brisco

Senior member
Apr 17, 2001
420
0
0
Well, you sound like you know what you are doing, so I doubt this is the answer, but it happened in my network exactly the same way. I found the problem to be that the system that kept losing connectivity was on a cable segment that was about 104 meters long. Whoops! 4 meters too long according to the rule book. As soon as I took about 10 meters out of that cable, the system never had a problem again.

My situation was identical, and I figured it out because I noticed that the system in question only went down when the network traffic started to get moderate to heavy.
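For what it's worth, the arithmetic behind that fix is a one-liner. This Python sketch uses the 100 meter limit for a twisted-pair Ethernet run and the 104 meter and 10 meter figures from my case; the function name is just for illustration.

```python
MAX_RUN_METERS = 100  # limit for a 10/100BaseT twisted-pair run


def excess_length(run_meters, limit_meters=MAX_RUN_METERS):
    """Return how many meters a run exceeds the limit by (0 if within it)."""
    return max(0, run_meters - limit_meters)


print(excess_length(104))       # 4
print(excess_length(104 - 10))  # 0
```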

Good luck. You'll have to let us know what the problem was when you figure it out.
 

BIGMACC

MAN, I REALLY DO APPRECIATE ALL OF THE RESPONSE, AND HOW PROFESSIONAL THE REPLIES HAVE BEEN!!!

THANK YOU VERY MUCH!!

Now, bear with me, as I will make every effort to reply to your questions and comment on your responses.

TallGeese I work for a national tourist publication and work with an incredible amount of files, and 95% of these files are very large. I'm not knocking Windows in any way, but NT Server just runs too slow to handle these files compared to ASIP or Mac OS X Server. It is also what was in place when I took this position, and I would do better at moving a mountain than changing it. The system is a Power Mac G4/400 AGP with the latest version of OT (Open Transport), running Mac OS 9.2.2, which has the latest AppleShare client before Mac OS X.

I have not switched her out to a different LAN drop. We are still in the busiest time of our year and I can only do so much t-shooting.

Also, this mobo has a Cuda chip on it, and I have reset that. I have also reset the Open Firmware on the system (done by booting up to the Mac OS prompt and keying in the command).

Thanks TallGeese!!


Garion Answer is above on trying another LAN drop.

No link lights, as it is on-board Ethernet.

The system does have a static IP addy, about the 3rd one I've given it, and no AppleTalk involved.

Dumb switches won't allow me to set the speed of the ports, nor will the OS. I have run network performance tests using Apple Network Assistant (remote admin software), and all tests passed. I even ran the same tests on systems that are slower and do not have this network problem, and got the same test results.

The file transfers are done, oh, about 80 to 100 times a day and average 45 to 60MB, with the occasional 150MB to 200MB file here and there.

Ok, on the switches cascading, you have hit something. We have four that cascade.

Macs are notorious for kicking out an error at the first hint of another hardware address on the network with the same IP addy. This doesn't appear to be the issue. All other workstations are DHCP served. Only the file servers and the printers have static IP addys.

My biggest disadvantage is that the operator continually reboots her system when this happens, so pinging it or checking the port connectivity becomes null.

Thanks Garion!!

brisco Well, it isn't the farthest drop in the building. I will take a look in the plenum to see if there is any excess cabling going to that drop.

Thanks brisco!!

Again, thank you for all the replies and the good, professional advice!!!
 

spidey07

So it only happens with a single Mac? Must be something with that particular switch port/cable drop or computer.

Pre-press networks are a pain in the arse. The files are so big and Macs just up and get screwy. As an aside you could improve performance greatly with a single 48 port switch or a chassis switch with multiple gig ports for the servers, AGFAs and heavy hitting pre-press guys. Having a single switch is great for performance but also helps a lot with troubleshooting.

Good luck.
 

Tallgeese

Originally posted by: BIGMACC
Ok, on the switches cascading, you have hit something. We have four that cascade.
I'd think that the same symptoms might show up on other machines if cascading was the problem, but better to remove that as a possible issue regardless. Post a link to a network diagram, and we might be able to suggest a better topology.