Spidey quiz - why is this tunnel having problems?

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
OK, time for some head knocking and scratching.

Why is this VPN tunnel between two cisco routers having trouble with all applications...

Debugs from "debug ip packet 130" where ACL 130 matches two IP addresses used for testing in both directions...

HLDS_CA-2621-RTR#
May 15 13:14:22.055: IP: recv fragment from <tunnel endpoint IP removed> offset 0 bytes
May 15 13:14:22.055: IP: recv fragment from <tunnel endpoint IP removed> offset 1480 bytes
May 15 13:14:22.063: IP: recv fragment from <tunnel endpoint IP removed> offset 0 bytes
May 15 13:14:22.063: IP: recv fragment from <tunnel endpoint IP removed> offset 1480 bytes
May 15 13:14:22.071: IP: recv fragment from <tunnel endpoint IP removed> offset 0 bytes
May 15 13:14:22.071: IP: recv fragment from <tunnel endpoint IP removed> offset 1480 bytes
May 15 13:14:22.079: IP: s=172.21.155.3 (FastEthernet0/1), d=172.21.33.62 (FastEthernet0/0), g=172.21.33.62, len 588, forward
May 15 13:14:22.083: IP Fragment, Ident = 34014, fragment offset = 4440
May 15 13:14:22.083: IP: s=172.21.155.3 (FastEthernet0/1), d=172.21.33.62 (FastEthernet0/0), len 588, sending last fragment
May 15 13:14:22.083: IP Fragment, Ident = 34014, fragment offset = 4440
May 15 13:14:27.059: IP: recv fragment from <tunnel endpoint IP removed> offset 0 bytes
May 15 13:14:27.059: IP: recv fragment from <tunnel endpoint IP removed> offset 1480 bytes
May 15 13:14:27.067: IP: recv fragment from <tunnel endpoint IP removed> offset 0 bytes
May 15 13:14:27.067: IP: recv fragment from <tunnel endpoint IP removed> offset 1480 bytes
May 15 13:14:27.075: IP: recv fragment from <tunnel endpoint IP removed> offset 0 bytes
May 15 13:14:27.079: IP: recv fragment from <tunnel endpoint IP removed> offset 1480 bytes
May 15 13:14:27.087: IP: s=172.21.155.3 (FastEthernet0/1), d=172.21.33.62 (FastEthernet0/0), g=172.21.33.62, len 588, forward
May 15 13:14:27.087: IP Fragment, Ident = 34029, fragment offset = 4440
May 15 13:14:27.087: IP: s=172.21.155.3 (FastEthernet0/1), d=172.21.33.62 (FastEthernet0/0), len 588, sending last fragment
May 15 13:14:27.087: IP Fragment, Ident = 34029, fragment offset = 4440


This debug is of a 5000 byte ping from 172.21.155.3 to 172.21.33.62. The DF (don't fragment) flag is cleared, and I've even created a global route-map to clear the DF bit on all IP frames. Normal pings work fine. Most apps work fine. But some don't.

*UPDATE*
Still same symptoms but new data - I CANNOT PING THE OUTSIDE INTERFACE OF THE LINUX BOX RUNNING IPTABLES/IPCHAINS (WHICH IS IT NOWADAYS?) with any packet larger than 1472 bytes. So is there a "don't fragment" setting in iptables, or something like that, that could cause trouble?
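For anyone wondering where the 1472-byte ceiling comes from: it's exactly what you'd expect for an unfragmented ping over a 1500-byte MTU, since the IP and ICMP headers eat 28 bytes. A quick sanity check of the arithmetic (Python, just illustrative):

```python
# Largest ping payload that fits in a single 1500-byte IP packet:
# the IP and ICMP headers consume 28 bytes between them.
IP_HEADER = 20    # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header

def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that avoids fragmentation at a given MTU."""
    return mtu - IP_HEADER - ICMP_HEADER

print(max_ping_payload(1500))  # 1472
```

So anything at 1473 or above has to arrive as fragments - which points the finger at fragment handling somewhere in the path, not at ICMP itself.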

Thanks
 

MysticLlama

Golden Member
Sep 19, 2000
1,003
0
0
Well, I'm waaaay below you in experience, but I'll just go for the random guessing thing and maybe I'll come up with something really simple that points you the right direction... :)

Could there be something with the MTU screwing with the packets? Just looking at it, I see that the frag offset is 1480, which happens to be pretty close to the 1492 of DSL or 1500 of Ethernet. Maybe in the process of going from one point to the next it hits a smaller MTU vs. a bigger one which freaks it out?

Is there anything similar between the applications that work/don't work? Like maybe anything with burstable information (e-mail, downloading a quick file, database queries) works, but sustained data (streaming, snmp info logging, copying lots of files at once) doesn't? (Or the reverse of that)

Or maybe (and this is really silly) it works better on lower port ranges than higher ones, or stuff on "normal" ports work, but dynamically allocated stuff works weird?
 

Saltin

Platinum Member
Jul 21, 2001
2,175
0
0
Off hand I'd think it was the MTU as well. Are the two routers using different MTU settings?
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
MTU on both routers is 1500.

PMTUD works both ways.

Of course only packets (including layer 3 and above) bigger than 1480 have problems. That's a hint because 1480 is a magic number for IPsec frames.

Good luck.

Another hint - Tunnel is IPsec, 3des, ESP, tunnel mode.
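For what it's worth, the 1480 in the debug offsets falls straight out of the link MTU. A sketch of the fragment math for the 5000-byte test ping (Python, illustrative only):

```python
# Each fragment on a 1500-byte-MTU link carries at most MTU minus the
# 20-byte IP header of data, so fragment offsets land on 1480-byte
# boundaries - matching the "offset 1480" lines in the debug.
MTU = 1500
IP_HEADER = 20
frag_data = MTU - IP_HEADER          # 1480 bytes of data per full fragment

# The 5000-byte ping payload plus the 8-byte ICMP header:
total = 5000 + 8
offsets = list(range(0, total, frag_data))
print(offsets)                        # [0, 1480, 2960, 4440]

# Last fragment: remaining data plus the IP header = the "len 588"
# seen in the debug at fragment offset 4440.
last_len = (total - offsets[-1]) + IP_HEADER
print(last_len)                       # 588
```

The "fragment offset = 4440, len 588" lines in the debug line up with this exactly, so the fragmentation itself looks textbook-correct.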
 

Xanathar

Golden Member
Oct 14, 1999
1,435
0
0
Check your IOS version; Cisco has had more than a handful of 12.x problems with fragmentation. I've had this problem (not only with tunnels) appear a couple of times.
 

Garion

Platinum Member
Apr 23, 2001
2,331
7
81
Methinks this is one of the Spidey problems where he's already figured out the answer and wants to see if anyone else can..

Sounds like some things are getting through and some not. Have you tried ICMP pings (with the DF bit set) of increasing payload size, to see how big of a packet you can push through the tunnel before it screams for mercy?

Some other questions.. I believe you can set an MTU on the tunnel interface - have you done so, or is there a default value that gets set? I know fragmentation in IPsec can get rather messy, especially when it's the IPsec packet itself that gets fragmented, not just the payload at the tunnel interface.

Are there other networks between your routers that could have a smaller MTU causing problems? I don't think PMTUD will catch that, will it? Is the circuit running across anything like a MPLS network which adds additional headers which might be pushing you over the limit?

You've got the DF bit off on the payload traffic, but what is getting set on the outbound IPSec traffic? If it's got the DF bit set, it could explain some of the symptoms you're seeing.

- G


 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
Actually this one has me stumped.

I'm sure it is a fragmentation issue, but it looks like the crypto process rejects the reassembled packet due to a MAC verify failure.

Something is doing the IPsec de-capsulation when it shouldn't, and I can't figure it out. Cisco has sent this case to the developers because the configurations have been verified.
 

ScottMac

Moderator, Networking, Elite member
Mar 19, 2001
5,471
2
0
My first guess would be the MTU as well. Since that's being discounted, would it be possible that there's a VLAN involved and the 802.1q tagging or ISL header is throwing a furball into the security verification / crc or pushing the MTU beyond an acceptable limit?

My recollection is that most / all of this runs with "Do Not Fragment" set.

Is this / can this be running on the Native VLAN (not a normal thing, but for testing ... )

Whaddya think?

.02

Scott
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
Here's a basic diagram from end-2-end

LAN---3030---GW1---I---IPchains---2620---LAN

Vlan and tagging seem to be OK, as I have verified packet/frame sizes and there is no tagging going on. It's really got me stumped, and Cisco as well. More sniffing and debugging leave me with the FACT that the 3030 is:

1) fragmenting a packet that needs to pass the tunnel (as matched by the SAs). DF bit is cleared from the source and from the 3030...this looks like normal fragmentation to me.
2) Those fragmented packets are arriving on the outside interface of the 2621 router intact with the DF bit cleared.
3) The 2621 receives IP layer fragments as noticed by the debugs.
4) Keep in mind I have a global IP policy (route-map) that clears the DF bit on any IP packet.

Pay attention to the frame sizes and offsets in the debugs. It appears to me that the packets are NOT getting fragmented in transit on the tunnel. The packets are fragmented (rightly so) at the 3030 end.

Once the 2621 router tries to re-assemble the packet and decrypt it is where the trouble lies. That's what I'm seeing.

Now the only thing that could be funny about this is the IPchains box in between. It is outside of my administrative control.

Sounds like fun don't it?
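For anyone following along, the global DF-clearing policy spidey describes would look something like this on IOS (a sketch from memory - the route-map name is made up, and exact syntax varies by release):

```
! Hypothetical route-map that clears the DF bit on matched packets,
! applied as a policy on the LAN-facing interface.
route-map CLEAR-DF permit 10
 set ip df 0
!
interface FastEthernet0/1
 ip policy route-map CLEAR-DF
```

With that in place, nothing sourced from the LAN should be able to veto fragmentation, which is why the fragments in the debugs arrive with DF cleared.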
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
Notice in the debugs that the full packet is never sent/forwarded. Only the remaining fragment.

I would say it is a bug, but I have already upgraded the IOS to 12.2(13b).

Most of the fragmentation bugs deal with CEF, fast switching or hardware accelerators. For debugging purposes I have placed both interfaces in process-switching mode.
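The per-interface knobs for forcing the process path look roughly like this (from memory - verify against your IOS release):

```
! Disable fast switching and CEF on an interface so packets take the
! process-switched path. This also lets "debug ip packet" see every
! packet instead of only the first of a flow.
interface FastEthernet0/0
 no ip route-cache
 no ip route-cache cef
```

Process switching is slow, but it rules out the CEF/fast-switching fragmentation bugs as a variable while debugging.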
 

ScottMac

Moderator, Networking, Elite member
Mar 19, 2001
5,471
2
0
Maybe try (for testing) moving the right-hand endpoint to the outside of the IPCHAINS box. At least that would eliminate (or implicate) the piece of the network you aren't controlling (or add a temporary component to be your endpoint).

Also, is there any chance of a split route (the return path being different than the originating path ... i.e., in one gateway and out another)? Are you catching traces of both the gozinta and gozoutta traffic from the 2600?

My home network has a LinkSys '41 that I use for SSH ingress, aimed at my Red Hat box. When I put in the PIX, I defined that as my default gateway, so traffic from the RH box would come in on the LinkSys and out the PIX .... SSH wouldn't work ... until I set the Linksys up as the default gateway for the RH box (back to the ISP). It's using 3DES, SHA1, etc...

Also, just for testing, would it be feasible to eliminate the policy routing ... just put in a static route or two (as needed) to eliminate the possibility that there's some policy that's twiddling something it shouldn't be? You know, kinda like "start basic" and add things till it breaks ....

(Have you looked under all the desks at the remote end to make sure someone doesn't have a red herring stored under there?)

Just tossing out some more straws, I know just enough about this stuff to be dangerous....

Good Luck

Scott


 

Saltin

Platinum Member
Jul 21, 2001
2,175
0
0
So it's the IPsec headers that are making the packets exceed the MTU, when the packet is close to the size of the MTU to begin with, right?

Generally, a Cisco router will pre-fragment in this case? This improves performance somehow, if I remember correctly? CEF path over process path...? (I'm not a plumber by trade)...

So you have them set to re-assemble in the process path in this case?

 

Poontos

Platinum Member
Mar 9, 2000
2,799
0
0
Probably IOS-related, but could it have anything to do with IPCHAINS? What version of the OS and IPCHAINS are you running?
 

Santa

Golden Member
Oct 11, 1999
1,168
0
0
From what I have read, IPsec requires an MTU of 1480 - approximately 20 bytes of overhead, I think.

Can you change your MTU to ~1400-1450 to see if it fixes it?

Also, in searching for similar issues, I came across many cases of DSL PPPoE / VPN clients sending too large a packet size.

Link

What apps do not work well? Just curious

Hope some of this sparks some brain cell..
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
To answer some questions...

The whole problem lies in delivery of packets (layer 3 and above) over 1500 bytes. When the 3030 tries to encapsulate and add IPsec overhead (20 bytes) the originating packet needs fragmentation. No big deal.

It is the re-assembly and de-encapsulation that is failing.

My main guess is that it's IOS-related, or the iptables/ipchains box that I cannot control; however, I have talked the administrator into allowing any and all traffic flowing across the linux box. So at that point PMTUD should work as intended.

Unfortunately I cannot take the linux box out of the path. The Internet T1 goes directly into an HDLC/sync card on the linux machine, therefore I don't have totally free-rein internet access.

I'm seriously leaning towards IOS problems as their T train of code has commands specifically dealing with this - "does the router apply crypto/ipsec to the pre-fragmented packets or after they've been assembled?"

grrrr.

Thanks for the tips guys. If I find a damned blue box in there somebody's getting fired. We've got payroll to run tomorrow.

<---wipes hands clean.

-edit- Santa, it seems like you are getting close, and the same kinds of symptoms are occurring. But generally a workaround for this "broken Internet...we see it often" is to mark all IP packets with the DF bit cleared - that's where my route-map comes in. Still puzzling.
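The T-train knob spidey is alluding to is, if memory serves, along these lines (a hypothetical sketch - check your release notes for the exact syntax and default):

```
! Tell the router to fragment the cleartext packet before encryption,
! so the far end never has to reassemble IPsec-encapsulated fragments
! before it can decrypt them.
crypto ipsec fragmentation before-encryption
```

Fragmenting before encryption means each fragment becomes its own complete, independently decryptable IPsec packet, sidestepping the reassemble-then-decrypt path entirely.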
 

Buddha Bart

Diamond Member
Oct 11, 1999
3,064
0
0
He's not NATing your stuff, is he? Like, does he have some global 'mangle everything out this interface' chain?
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
I've been assured there is no natting going on.

The weird part is if you look at the debugs the router is receiving both fragments of the original packet.
 

Santa

Golden Member
Oct 11, 1999
1,168
0
0
This is something that I hope only plagues cheaply/poorly implemented SOHO routers, but...

Linksys routers are known to drop packets that fail an internal CRC check, and the packets that tend to trip this are VPN-type traffic.

Our VPN (Checkpoint) absolutely does not work behind the current Linksys routers out there on the market because of this.

If this is the case for you then it is very hard to track down; the only sure way we even found out about it was to take the device we thought might be the issue out of the chain, and it started working.

I would be very worried if there is a little blue box like this in the mix.
 

chsh1ca

Golden Member
Feb 17, 2003
1,179
0
0
Spidey, Netfilter CAN be configured to block fragmented packets, in which case you would probably see the results you are getting, where one half of the fragmented packet is hitting the 2620 and the other is not. It may also depend on the version of netfilter on that box.

If it's ipchains, then I recall seeing something a while back about kernel options to always defragment packets as they traverse the box. I certainly don't think it would give you the behaviour you're seeing.
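A netfilter rule of the shape described above would look something like this (illustrative only - on the iptables side, the `-f` flag matches second and subsequent fragments):

```
# Drops every non-first fragment passing through the box: the first
# fragment arrives, the rest never do, and reassembly at the far
# router fails.
iptables -A FORWARD -f -j DROP
```

The ipchains-era equivalent was typically handled at the kernel level (e.g. a kernel built to always defragment traffic passing through), which is the option mentioned above.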

Have you tried sticking another machine in between the two Ciscos and trapping the traffic that way, to see how the 3030 is sending the stuff out and whether it's proper at that point?

Is this something that just suddenly came up, or has this been going on since you first tried to implement it?
 

Garion

Platinum Member
Apr 23, 2001
2,331
7
81
Rule of thumb: If it isn't the cabling or the network, it's always the firewall.

Can you take the EXACT same config on the far side and plug it in somewhere that's not behind an ipchains box and see if it works? That is the one wildcard in this problem and you need to prove it's not causing the problem.

- G
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
Update -

Reproduced in the LAB. Same symptoms.

Setting up a few sniffers (inside the 2621, the internet segment, inside the 3030 concentrator) revealed the true cause..

1) The 2621 fragments first and applies IPsec to both fragments, resulting in two complete IP datagrams
2) The 3030 decrypts both complete datagrams and sends them on as two fragments
3) The host responds
4) The 3030 encrypts the entire datagram and sends it as two fragments
5) The 2621 reassembles the two fragments and applies IPsec decryption.
Failure.

Cisco has now created a new bug for this condition and has recreated it in their lab. This occurs even on 12.2(13b). Sun servers seem to be the most affected, because with 1412-1418 byte datagrams the responses actually make it - they are decrypted and sent on the wire - but the resulting datagram has a corrupt payload.

Never had a problem before because we're using PIXes for all the other tunnel endpoints. The PIX uses an MSS of 1360 by default, so these fragmentation problems don't arise.

Man that is ugly. Lesson learned? Try not to fragment the IPsec packets themselves; fragmenting before IPsec is fine...just not after IPsec/tunnel encapsulation.
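The PIX-style safety margin can be approximated on IOS by clamping the TCP MSS on the LAN-facing interface (a sketch - the 1360 value mirrors the PIX default mentioned above, and interface names are illustrative):

```
! Clamp TCP MSS so TCP payloads never grow large enough to force
! fragmentation of the IPsec packets themselves.
interface FastEthernet0/1
 ip tcp adjust-mss 1360
```

This only helps TCP, of course - large UDP or ICMP datagrams would still hit the fragment-then-decrypt path.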
 

chsh1ca

Golden Member
Feb 17, 2003
1,179
0
0
Ouch, nice to see the problem is a real problem. Hopefully Cisco can fix it quickly for ya. :D