Tuesday, August 4, 2009

Configuring Server Load Balancing

Hi all, at my workplace we're considering the purchase of a load balancer to provide better redundancy and service availability to our customers.
Well, a load balancer with decent performance costs a lot!
So my attention is focused on the ip slb feature that is present in IOS 12.1 and 12.2.

First, read the "Server Load Balancing Feature in IOS Release 12.2(18)SXF" document on the Cisco site.

Second, I'll try it in my lab, with an old 3660, simulating two web servers with routers ;--)

Let's take a look at the testing topology:


As the document above explains, there are several modes for SLB, depending on which service is hosted on the real servers; a simple one is SLB for HTTP servers.

After configuring basic ip addressing and ip routing for this topology, we can try to assign a virtual server ip address to the slb router:


!-- configure first the real server farm
ip slb serverfarm HTTPSERVERS
 nat server !-- notes on nat below...
 predictor leastconns
 real 10.0.0.2
  weight 2
  faildetect numconns 3 !-- note: 3 consecutive failed conns mark the real as failed (numclients is left at its default)
  inservice
 real 10.0.0.3
  weight 3
  faildetect numconns 3
  inservice

!-- then configure the virtual server
ip slb vserver 10.5.5.5
 virtual 10.5.5.5 tcp www !-- it balances only for www port
 serverfarm HTTPSERVERS !-- associates the real server farm to this virtual server
 sticky 180 !-- same client will use the same real server for 180 secs
 inservice
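
To get a feeling for what "predictor leastconns" does with the weights 2 and 3 configured above, here is a minimal Python sketch. It is not the IOS algorithm: I am assuming the router picks the real with the fewest active connections relative to its weight, and the tie-break rule (higher weight first) as well as the names Real and pick_real are my own invention.

```python
from dataclasses import dataclass


@dataclass
class Real:
    ip: str
    weight: int
    conns: int = 0
    in_service: bool = True


def pick_real(reals):
    """Return the in-service real with the lowest conns/weight ratio."""
    candidates = [r for r in reals if r.in_service]
    if not candidates:
        return None
    # Assumption: ties go to the real with the higher weight.
    return min(candidates, key=lambda r: (r.conns / r.weight, -r.weight))


# Same addresses and weights as the serverfarm in this post.
farm = [Real("10.0.0.2", weight=2), Real("10.0.0.3", weight=3)]

# Open ten connections that never close, and see how they are spread.
for _ in range(10):
    pick_real(farm).conns += 1

print({r.ip: r.conns for r in farm})
```

Under these assumptions the ten connections end up split 4 to 6, that is, in the same 2:3 proportion as the configured weights.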


SLB_ROUTER#sh ip route 10.5.5.5
Routing entry for 10.5.5.5/32
Known via "static", distance 1, metric 0 (connected)
Redistributing via eigrp 35
Advertised by eigrp 35
Routing Descriptor Blocks:
* 10.5.5.5, via Null0
Route metric is 0, traffic share count is 1

SLB_ROUTER#


Here I used the server NAT feature, so the web servers are completely unaware of the load balancer, and they can be several hops away from the SLB router.
If you don't use server NAT, the load balancer works only at L2, rewriting the destination MAC address, so the real servers must be L2-adjacent to it and you have to configure a loopback with the virtual server IP on each real server so that it accepts L3 packets addressed to the virtual IP.
A static route pointing to the Null0 interface is automatically added for each vserver, as the routing entry above shows.
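
The difference between the two modes can be sketched in a few lines of Python. This is only a toy model of which header field gets rewritten, not what IOS does internally; the Packet class, the function names and the MAC strings are hypothetical.

```python
from dataclasses import dataclass, replace

VIP = "10.5.5.5"  # virtual server address from this post


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_mac: str


def forward_server_nat(pkt, real_ip):
    """Client -> server with server NAT: the destination IP becomes the real's IP."""
    return replace(pkt, dst_ip=real_ip)


def reply_server_nat(pkt):
    """Server -> client with server NAT: the source IP is turned back into the VIP."""
    return replace(pkt, src_ip=VIP)


def forward_dispatched(pkt, real_mac):
    """Without server NAT: only the destination MAC changes, dst_ip stays the VIP."""
    return replace(pkt, dst_mac=real_mac)


# Placeholder MAC strings, for illustration only.
syn = Packet(src_ip="172.17.0.25", dst_ip=VIP, dst_mac="slb-router-mac")
print(forward_server_nat(syn, "10.0.0.3"))
print(forward_dispatched(syn, "real-server-mac"))
```

In the second case the packet still carries 10.5.5.5 as destination when it reaches the real server, which is why the server needs that address on a loopback.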

Verify that SLB is up and running:

SLB_ROUTER#sh ip slb vservers

slb vserver      prot  virtual               state         conns
-------------------------------------------------------------------
10.5.5.5         TCP   10.5.5.5:80           INSERVICE     0

SLB_ROUTER#sh ip slb serverfarms

server farm      predictor    nat   reals   bind id
---------------------------------------------------
HTTPSERVERS      LEASTCONNS   S     2       0

SLB_ROUTER#sh ip slb reals

real                server farm      weight  state          conns
-------------------------------------------------------------------
10.0.0.2            HTTPSERVERS      2       OPERATIONAL    0
10.0.0.3            HTTPSERVERS      3       OPERATIONAL    0

A real server shows "READY_TO_TEST" until the first connection is received, and "OPERATIONAL" once a connection attempt has succeeded.

On the real "servers" I have configured only "ip http server" and the route needed to reach the clients. Let's try from the client's perspective:


Client#telnet 10.5.5.5 80
Trying 10.5.5.5, 80 ... Open

...and then...

[Connection to 10.5.5.5 closed by foreign host]
Client#

and on the slb router you can see:

SLB_ROUTER#sh ip slb conns

vserver         prot client                real                  state     nat
-------------------------------------------------------------------------------
10.5.5.5        TCP  172.17.0.25:14455     10.0.0.3              ESTAB     S

SLB_ROUTER#sh ip slb sticky

client          netmask          group  real                  conns
-----------------------------------------------------------------------
172.17.0.25     255.255.255.255  4097   10.0.0.3              1

SLB_ROUTER#sh ip slb reals detail
10.0.0.2, HTTPSERVERS, state = OPERATIONAL
conns = 0, dummy_conns = 0, maxconns = 4294967295
weight = 2, weight(admin) = 2, metric = 0, remainder = 0
reassign = 3, retry = 60
failconn threshold = 3, failconn count = 0
failclient threshold = 2, failclient count = 0
total conns established = 0, total conn failures = 0
server failures = 0

10.0.0.3, HTTPSERVERS, state = OPERATIONAL
conns = 1, dummy_conns = 0, maxconns = 4294967295
weight = 3, weight(admin) = 3, metric = 0, remainder = 1
reassign = 3, retry = 60
failconn threshold = 0, failconn count = 0
failclient threshold = 0, failclient count = 0
total conns established = 2, total conn failures = 0
server failures = 0
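
The sticky entry shown above (client 172.17.0.25 bound to real 10.0.0.3) comes from the "sticky 180" line. A minimal Python sketch of such a table follows. It is simplified: I am assuming the timer starts when the binding is stored, which may not match exactly when IOS starts it, and the StickyTable class is hypothetical.

```python
import time


class StickyTable:
    """Maps a client IP to the real it was last sent to, for a limited time."""

    def __init__(self, timeout=180.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.entries = {}  # client IP -> (real IP, expiry time)

    def lookup(self, client_ip):
        entry = self.entries.get(client_ip)
        if entry is None:
            return None
        real_ip, expires = entry
        if self.clock() >= expires:
            # Binding is too old: forget it, the predictor decides again.
            del self.entries[client_ip]
            return None
        return real_ip

    def remember(self, client_ip, real_ip):
        self.entries[client_ip] = (real_ip, self.clock() + self.timeout)


# A fake clock makes the 180 s timeout easy to demonstrate.
now = [0.0]
table = StickyTable(timeout=180.0, clock=lambda: now[0])
table.remember("172.17.0.25", "10.0.0.3")

now[0] = 100.0
print(table.lookup("172.17.0.25"))  # 10.0.0.3, still inside the 180 s window

now[0] = 200.0
print(table.lookup("172.17.0.25"))  # None, the entry has expired
```

When lookup returns None, the load balancer would fall back to the predictor to choose a real for that client.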


Now, what happens if a server fails? Let's try shutting down a "server" interface:

Real-Web-Server2#conf t
Enter configuration commands, one per line. End with CNTL/Z.
Real-Web-Server2(config)#int eth 0/0
Real-Web-Server2(config-if)#shut
Real-Web-Server2(config-if)#shutdown
Real-Web-Server2(config-if)#end
Real-Web-Server2#


SLB_ROUTER#debug ip slb all
SLB All debugging is on
SLB_ROUTER#
4w3d: SLB_CONN_DEBUG: TCP event= SYN_CLIENT, state= INIT -> SYNCLIENT
4w3d: v_ip= 10.5.5.5:80 ( 3), real= 10.0.0.3, NAT(S)
4w3d: client= 172.17.0.25:21706
4w3d: SLB_CONN_DEBUG: TCP event= SYN_CLIENT, state= SYNCLIENT -> SYNCLIENT
4w3d: v_ip= 10.5.5.5:80 ( 3), real= 10.0.0.3, NAT(S)
4w3d: client= 172.17.0.25:21706
4w3d: SLB_CONN_DEBUG: TCP event= SYN_CLIENT, state= SYNCLIENT -> SYNCLIENT
4w3d: v_ip= 10.5.5.5:80 ( 3), real= 10.0.0.3, NAT(S)
4w3d: client= 172.17.0.25:21706
4w3d: SLB_CONN_DEBUG: TCP event= SYN_CLIENT, state= SYNCLIENT -> SYNCLIENT
4w3d: v_ip= 10.5.5.5:80 ( 3), real= 10.0.0.3, NAT(S)
4w3d: client= 172.17.0.25:21706
4w3d: SLB_REAL_DEBUG: 10.0.0.3 (HTTPSERVERS) event = SLB_CONN_FAIL state= OPERATIONAL -> OPERATIONAL
4w3d: SLB_REAL_DEBUG: 10.0.0.3 (HTTPSERVERS) event = SLB_REAL_FAILURE state= OPERATIONAL -> FAILED
4w3d: SLB_CONN_DEBUG: TCP event= SYNACK_SERVER, state= SYNCLIENT -> ESTAB
4w3d: v_ip= 10.5.5.5:80 ( 3), real= 10.0.0.2, NAT(S)
4w3d: client= 172.17.0.25:21706
4w3d: SLB_CONN_DEBUG: TCP event= DATA_CLIENT, state= ESTAB -> ESTAB
4w3d: v_ip= 10.5.5.5:80 ( 3), real= 10.0.0.2, NAT(S)
4w3d: client= 172.17.0.25:21706
4w3d: SLB_CONN_DEBUG: TCP event= DATA_CLIENT, state= ESTAB -> ESTAB
4w3d: v_ip= 10.5.5.5:80 ( 3), real= 10.0.0.2, NAT(S)
4w3d: client= 172.17.0.25:21706
SLB_ROUTER#
SLB_ROUTER#sh ip slb reals detail
10.0.0.2, HTTPSERVERS, state = OPERATIONAL
conns = 1, dummy_conns = 0, maxconns = 4294967295
weight = 2, weight(admin) = 2, metric = 0, remainder = 1
reassign = 3, retry = 60
failconn threshold = 3, failconn count = 0
failclient threshold = 2, failclient count = 0
total conns established = 3, total conn failures = 0
server failures = 0

10.0.0.3, HTTPSERVERS, state = FAILED
conns = 0, dummy_conns = 0, maxconns = 4294967295
weight = 3, weight(admin) = 3, metric = 0, remainder = 0
reassign = 3, retry = 60
failconn threshold = 0, failconn count = 1
failclient threshold = 0, failclient count = 1
total conns established = 2, total conn failures = 2
server failures = 1

SLB_ROUTER#

!-- after 60 sec the failed server is placed in "READY_TO_TEST" state
SLB_ROUTER#
4w3d: SLB_REAL_DEBUG: 10.0.0.3 (HTTPSERVERS) event = SLB_REAL_TIMEOUT state= FAILED -> READY_TO_TEST
SLB_ROUTER#
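
The state changes seen in the debug (OPERATIONAL -> FAILED -> READY_TO_TEST) can be summarised with a small Python state machine. This is a rough sketch based on my reading of the output above and of the Cisco document: it ignores the failclient threshold and the reassign counter, and the RealState class and its method names are hypothetical.

```python
class RealState:
    """Simplified health state of one real server."""

    def __init__(self, fail_threshold=3, retry=60.0):
        self.state = "READY_TO_TEST"
        self.fail_threshold = fail_threshold  # like "faildetect numconns 3"
        self.retry = retry                    # like "retry = 60" in the detail output
        self.fail_count = 0
        self.failed_at = None

    def conn_ok(self):
        # A successful connection puts the real (back) in service.
        self.fail_count = 0
        self.state = "OPERATIONAL"

    def conn_failed(self, now):
        self.fail_count += 1
        # Assumption: a real under test fails again on its first failed connection.
        if self.state == "READY_TO_TEST" or self.fail_count >= self.fail_threshold:
            self.state = "FAILED"
            self.failed_at = now

    def tick(self, now):
        # After the retry timer the real becomes eligible for one test connection.
        if self.state == "FAILED" and now - self.failed_at >= self.retry:
            self.state = "READY_TO_TEST"
            self.fail_count = 0


real = RealState(fail_threshold=3, retry=60.0)
real.conn_ok()
for t in (1.0, 2.0, 3.0):
    real.conn_failed(now=t)
print(real.state)  # FAILED

real.tick(now=63.0)
print(real.state)  # READY_TO_TEST
```

Note that in my lab output 10.0.0.3 went to FAILED after fewer failures than this sketch needs, so take the thresholds here as an illustration only.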





As a next step I'll test it on my production 6509...

5 comments:

Sara said...

What are your thoughts on KEMP's LoadMaster 2000?

Marco Rizzi said...

I don't have any experience with Kemps products, so, can't say.
Anyway, nice to see prices on every product page ;-)

Zachary said...

Load Balancers come with (A) web user interface for access through the internet; (B) command line interface (CLI) for local setup or (C) both. A load balancer is commonly compared to a Swiss Army Knife. It can perform many different functions simultaneously. (A) The load balancer is constantly checking the health of your servers and applications providing high availability to the users and automatically removes apps and/or servers that fail a health check; (B) It accelerates performance depending on the manufacturer and the processing power of the load balancer through offloading CPU intensive processing such as SSL encryption and/or by compressing data transmitted. (C) Is often the last line of defense sitting in front of the servers providing security. (D) Greater user satisfaction with persistence and high availability. Load balancers typically provide this and more depending on cost and functionality. It used to be the lower the cost less performance and functionality was available. In the year 2009 vendors such as Cainet, KEMP, Coyote Point deliver significant value in terms of cost and functionality. However if you have a big IP budget and need every function available to the market it can also be had. F5 and Citrix are examples of powerful, most functional, luxurious load balancers. “Free” load balancers are also available.

Marco Rizzi said...

Thanks for the contribution, Zachary. I agree, real load balancers can do much more than simply balance connections; however, this is an example of how to do it with a Cisco switch running 12.2 or with a router, a sort of zero-budget implementation... ;-)
Indeed, an F5 pre-sales engineer avoided calling their products "load balancers"; he preferred "application delivery", but that's another story....

adidibra said...

Did you make any changes to the NAT configuration?

Any ip nat commands?