HSRP + BGP Troubles

randal

Golden Member
Jun 3, 2001
1,890
0
71
Ok, so I'm working in my lab on some neato things like HSRP + BGP. End goal is LAN & WAN redundancy.

Please see the attached diagram here: http://www.data102.com/~randal/hsrp_bgp_fun.gif

The general gist is that I have two routers, a 7204 (RTR_A) and a 3640 (RTR_B), running HSRP on the LAN side. The 7204 has two BGP sessions on it (faked from an openbgpd machine) that carry a default route and three 10.x.x.x networks, and that part is working fine.

I have read that using HSRP on the WAN side is a bad time, especially when it comes to maintaining BGP sessions and convergence times. Instead, both of the routers peer with both of the upstreams, with RTR_B being AS-Prepended by one as-hop to make it less desirable when RTR_A is up. This is up and running.
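To spell out what the prepend is meant to do, here is a minimal Python sketch of the upstream's view. It only models the AS-path-length step of BGP best-path selection, and the AS number 65001 and the helper names are made up for illustration; none of this is router config.

```python
from typing import List, NamedTuple, Tuple

LOCAL_AS = 65001  # hypothetical private AS number, not from the lab


class Announcement(NamedTuple):
    router: str                # which of our routers sent the announcement
    as_path: Tuple[int, ...]   # AS path as the upstream sees it


def prepend(path: Tuple[int, ...], times: int) -> Tuple[int, ...]:
    """Repeat the first AS in the path `times` extra times."""
    return (path[0],) * times + path


def best_path(candidates: List[Announcement]) -> Announcement:
    """Pick the shortest AS path (the only tie-breaker modelled here)."""
    return min(candidates, key=lambda a: len(a.as_path))


rtr_a = Announcement("RTR_A", (LOCAL_AS,))
rtr_b = Announcement("RTR_B", prepend((LOCAL_AS,), 1))  # one extra AS hop

both_up = best_path([rtr_a, rtr_b])   # upstream hears both routers
a_withdrawn = best_path([rtr_b])      # RTR_A has stopped announcing

print(both_up.router, a_withdrawn.router)
```

The point of the sketch: the upstream only moves to RTR_B once RTR_A's announcement is actually gone, so anything that keeps RTR_A announcing keeps inbound traffic on RTR_A.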

If I unplug RTR_A's ethernet interface, HSRP does work and the LAN side fails over appropriately. The problem I have is that if RTR_A's LAN interface goes down, the WAN interface + BGP sessions don't! All of the outgoing traffic goes out RTR_B as it should, but the return traffic still arrives at RTR_A, which has nowhere to send it.

So basically, my question is this: in the case of RTR_A's LAN failure, is there a way to make it close all of its BGP sessions or at least stop announcing the routes so that inbound traffic comes through RTR_B?

Followup question then turns to tracking the WAN links, decrementing priorities & removing routes, all of which should be straightforward.
 

Pheran

Diamond Member
Apr 26, 2001
5,740
35
91
There are a couple of ways to deal with this, but the first question I have to ask is where is the IGP route for 10.10.0.0/24 pointing? If it's heading out the LAN interface and that interface goes down, the route should be withdrawn from the routing table on RTR_A and thus BGP should stop announcing it, since BGP won't announce networks that are not in the routing table. The question is why isn't this happening? Do you have a null0 route in place that's keeping the announcement alive?
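As a rough illustration of that rule, here is a small Python sketch (a toy model, not IOS behaviour in full). It assumes the prefix is announced via a `network` statement that needs an exact match in the routing table; the function names and the two route sources are hypothetical.

```python
import ipaddress

LAN_PREFIX = ipaddress.ip_network("10.10.0.0/24")


def build_rib(lan_up: bool, null0_static: bool) -> set:
    """Return the set of prefixes currently installed in the routing table."""
    rib = set()
    if lan_up:
        rib.add(LAN_PREFIX)   # connected route, present only while the LAN is up
    if null0_static:
        rib.add(LAN_PREFIX)   # static route to null0, present regardless of the LAN
    return rib


def announced(network: str, rib: set) -> bool:
    """BGP advertises the network only if an exact match is in the table."""
    return ipaddress.ip_network(network) in rib


# LAN interface down, with and without the null0 static route
print(announced("10.10.0.0/24", build_rib(lan_up=False, null0_static=True)))
print(announced("10.10.0.0/24", build_rib(lan_up=False, null0_static=False)))
```

With the null0 static in place the prefix never leaves the table, so the LAN going down changes nothing as far as BGP is concerned.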
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
sorry...keybo@rd...

iBGP between routers, with HSRP for g@tew@y is how it is done.

BGP peer with RE@L IPs(loops or re@l IP),never HSRP virtu@ls. (e or i)

HSRP is only done for g@tew@y protection (like for hosts/st@tic routes)
 

randal

Yes, there is a null0 route for the 10.10.0.0/24 network. I presume that is why it is still being advertised. In the larger picture, though, there are multiple small subnets hanging off of the 10.10.0.0/24 interface, none of which are bigger than a /24. When the interface goes down the little /26s & /27s will go down, but the bigger /20 that holds them will still be null0'd.

Spidey, I am not sure what you are saying. Running iBGP between the two routers doesn't make sense to me unless they each have one of the upstreams, effectively load sharing. That still presents the problem that if LAN on RTR_A goes down, it will continue advertising and receiving traffic for the AS even though it can't route it to the LAN hosts.

And get a new keyboard ... I thought you were an all powerful network guy! ;)
 

spidey07

Well that's how it is normally done in your scenario. And you don't have to have load balancing - you just use a prefix list/route map to set local preference when advertising out eBGP.

In essence it takes care of what you are trying to do. Maintain correct reachability info between both routers.
 

randal

Right, but the problem persists in that if RTR_A's LAN side goes down, its WAN is still up and advertising (maybe 1/2, 3/4, whatever) the address space, effectively blackholing that portion of the net. This leaves RTR_B to sit and, with lower preferenced announcements, take whatever RTR_A isn't taking. Thus the problem - if LAN/RTR_A goes down, I need it to pull all of its announcements, shut off its WAN interfaces or something similar.

My end game is not reachability for both routers - my end game is being reliant on a single router with 100% failover & throughput onto the secondary.
 

spidey07

running iBGP will take care of this for you.

-edit- oh, I see you have forced the route up with a null interface. don't do that. That's why it is still being advertised.

you shouldn't be having any blackhole problems if setup correctly.

I've done this scenario no less than 50 times and never had a problem? It's kinda like the standard way for redundancy for Internet connections. There is no single point of failure.

1) two routers with one or more circuits each, peering with the serial connections IP address. Use local pref if you want to prefer one router over the other.
2) no synchronization
3) iBGP between the routers
4) HSRP/GLBP for gateway redundancy (your internal defaults point to this)
5) one router into one switch, the other into another.

Maybe I'm confused on what you want to accomplish? This method is normally how you accomplish redundancy.
 

randal

Right, that makes sense if I want to have two internet feeds, one of which is idle unless in failover. I am aiming to have both feeds active on the primary router for optimal routing, then have both feeds fail over to the secondary. In the real situation I am working on, it is 3 internet feeds/2 routers, and then again 2 feeds/2 routers.

So basically I need NxPeers to be active then failover - both WAN & LAN - to a secondary router. Which is why it's a tough situation ... idling one circuit is not an option as they are 100mbps links that have significant costs. Yes, I know, spend the money and do it in idle+failover, but unfortunately that is not in the budget. What is in the budget is the already existing spare routers that are doing nothing -- I can idle routers (HSRP) but not the uplinks.

OH, and the reason for all this. Two weeks ago one of our edge 7206s took a dive and took out 2xFE internet feeds (totalling 120mbps). Unfortunately I don't work at a megacorp, so yes, it was single threaded by design. The other edge took over on its 3xFeeds, but we limped along on barely-enough-bandwidth for almost an hour. So, the situation is that we need a router to totally blow up, failover WAN & LAN 100% onto another router with no loss in capacity, and not blackhole traffic. Asking for the world on a shoestring, as usual :)

 

spidey07

Oh, I see. Maybe if your diagram included the actual prefixes you are announcing and receiving, I could help.

It really shouldn't be any problem. If you lose a router the peering will tear down and you'll be fine (in your up state all your paths should be known). Just make sure you peer with the actual serial interfaces, or a loopback.

But do stay away from peering with an HSRP address. If you want full balancing with failover look at GLBP on your routers...it's a way of balancing traffic from inside to the outside via different routers...the return will follow BGP. Another way is to use two more routers as your WAN distribution layer and use GLBP and also run BGP on these to your WAN edge routers with the circuits. So four routers.
 

randal

What if I don't lose a whole router, but just the LAN side of it? I need to make the router say "LAN is down, stop announcing!" -- your suggestion of putting another layer behind this that originates the prefixes is very interesting, as it accomplishes that by having the iBGP-received routes pulled.

A 4-way router setup though is pretty complex. Anybody else have any ideas on how to have RTR_A track its LAN interface and drop other interfaces if it goes down?
 

spidey07

well it really should drop those routes if the FE goes down.

Why is the null0 route in there?

I still say iBGP needs to be exchanging information between the routers. But I'm no BGP expert.
 

randal

The null0 is there to announce a larger aggregate block instead of a multitude of sub-/24 blocks, as the lab is not fed by an IGP that is then aggregated. That is planned, as the real network also announces large blocks that are comprised of several small nets.

iBGP would work great if each router handled a single upstream. In the case of a failure, though, we would not have 2 or 3 upstreams, but just the one on the iBGP peer that is still up (thus the requirement that all upstreams stay up). iBGP is already in use between our Colorado Springs and Denver sites and works exactly that way.

I don't even know if what I am trying to do is possible without a much larger infrastructure behind it. Any help or ideas will continue to be very appreciated.
 

randal

Well, after a lot of blood, sweat and tears, including a trip to the Cisco NetPro forum, the solution is here! Cisco's Enhanced Object Tracking feature is the thing to use. Basically it provides the ability to track just about anything and then, if its state changes, do something (like notify HSRP).

In my situation, I would track (1) my null routes for BGP announcing, as well as track (2) my LAN interface. If track(2) goes down, then the object on track(1) goes with it, AKA the null routes. Pulling the null routes removes the networks from the BGP announcement, and voila a solution.
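For anyone following along, the logic can be sketched in a few lines of Python. This is a toy model under my own assumptions, not IOS config: the class and method names are made up, and it collapses the two tracked objects into one by tying the null route directly to the LAN interface state.

```python
class TrackedObject:
    """Reports up/down by calling a probe function each time it is read."""

    def __init__(self, probe):
        self._probe = probe

    @property
    def up(self) -> bool:
        return self._probe()


class Router:
    def __init__(self):
        self.lan_up = True
        # Tracked object that follows the LAN interface state
        self.lan_track = TrackedObject(lambda: self.lan_up)
        # Static null routes, each tied to a tracked object
        self.null_routes = {"10.10.0.0/24": self.lan_track}

    def routing_table(self) -> set:
        # A tracked static route is installed only while its object is up
        return {p for p, t in self.null_routes.items() if t.up}

    def bgp_announcements(self, network_statements) -> list:
        # BGP only announces networks that are present in the routing table
        rib = self.routing_table()
        return [p for p in network_statements if p in rib]


rtr_a = Router()
print(rtr_a.bgp_announcements(["10.10.0.0/24"]))  # LAN up: prefix announced

rtr_a.lan_up = False                              # unplug the LAN side
print(rtr_a.bgp_announcements(["10.10.0.0/24"]))  # null route pulled, nothing announced
```

Once the announcement is withdrawn, the upstreams fall back to RTR_B's prepended path and inbound traffic follows the HSRP failover.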

Problem is that my IOS feature set does not include this, and the IOS on another router that does support this consumes memory like no other, reducing how many full BGP views I can take from 3 -> 2. Which makes the whole thing essentially pointless.

Good learning exercise with little results. As previously suggested we will go with HSRP + one or two peers per router. If our primary router fails ... well ... slow is better than down.

Thanks again for the help & ideas.