Unequal Load-Balancing on Cisco IOS

I just wanted to share a neat trick that a fellow CCIE colleague showed me.

If you are connected to two ISPs, you can do unequal load-balancing with nothing more than static routes. For example, ISP X provides you with 25Mbps and ISP Y with 50Mbps, so ISP Y should carry twice as much traffic as ISP X (a 2:1 ratio).

In order to achieve any kind of load-balancing on the Cisco IOS, we need multiple entries in the routing table, pointing to the same specific destination. As we would like to load-balance our uplink traffic towards the internet, we would need multiple entries towards our default gateways.

We are all familiar with the concept that there can be only one default route for a specific gateway – you can’t have multiple routing entries pointing to the same default gateway. That means that if we have multiple ISPs and multiple default gateways, our load-balance ratio will always be 1:1, as there is just a single entry in the routing table for each default gateway.

However, we can install multiple routing entries for seemingly different default gateways. That way, we can fool the device and have the same default gateway listed multiple times in the routing table. Ok, it sounds confusing, but just take a look at the configuration and it’ll become clear.

! fake gateway, reachable via ISP X
ip route 10.0.1.1 255.255.255.255 192.168.1.2
! fake gateways, reachable via ISP Y
ip route 10.0.2.1 255.255.255.255 172.16.1.2
ip route 10.0.2.2 255.255.255.255 172.16.1.2

ip route 0.0.0.0 0.0.0.0 10.0.1.1
ip route 0.0.0.0 0.0.0.0 10.0.2.1
ip route 0.0.0.0 0.0.0.0 10.0.2.2

First, we define static routes for a couple of fake default gateways. These IPs do not exist anywhere in the network and are used only for this load-balancing trick, so choose them carefully and make sure they do not overlap with any addresses actually in use.

After that, we define these fake IPs as default gateways. Having in mind that the ratio of the link bandwidth is 2:1, we created two routes towards the faster ISP and a single route towards the slower ISP.

What happens is that IOS installs all three default routes, because each one points to a seemingly different next hop during the first lookup in the routing table. The second (recursive) lookup reveals that each fake gateway is reachable only through ISP X’s or ISP Y’s real next-hop router. This is quite an ingenious way of tricking the device into installing multiple entries in its routing table.
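The same trick scales to other bandwidth ratios. Here is a minimal Python sketch that reduces the link speeds to the smallest whole-number ratio and prints the matching static routes. The function names and the 10.0.x.y numbering scheme for the fake gateways are my own assumptions for illustration, not anything IOS requires.

```python
from math import gcd


def route_counts(bandwidths_mbps):
    """Return how many default routes each ISP needs for the given link speeds."""
    divisor = 0
    for bandwidth in bandwidths_mbps.values():
        divisor = gcd(divisor, bandwidth)
    return {isp: bandwidth // divisor for isp, bandwidth in bandwidths_mbps.items()}


def static_routes(next_hops, counts, fake_prefix="10.0"):
    """Build the host routes for the fake gateways, followed by the default routes."""
    host_routes, default_routes = [], []
    for index, (isp, real_next_hop) in enumerate(next_hops.items(), start=1):
        for copy in range(1, counts[isp] + 1):
            # Hypothetical numbering: 10.0.<ISP index>.<copy number>
            fake_gateway = f"{fake_prefix}.{index}.{copy}"
            host_routes.append(f"ip route {fake_gateway} 255.255.255.255 {real_next_hop}")
            default_routes.append(f"ip route 0.0.0.0 0.0.0.0 {fake_gateway}")
    return host_routes + default_routes


counts = route_counts({"X": 25, "Y": 50})
lines = static_routes({"X": "192.168.1.2", "Y": "172.16.1.2"}, counts)
print("\n".join(lines))
```

With the 25Mbps and 50Mbps links from the example, this reproduces the six routes shown in the configuration above.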

Voice VLANs on HP Networking

In order to configure voice vlans, we need to play around with trunks and vlan tags. However, one may be surprised to find that on H3C HP hardware, there are three port link types:

  • Access
  • Trunk
  • Hybrid

So what on earth is a hybrid port? In order to answer that question, it is necessary to point out that VLAN classification of frames/packets can be based on the following:

  • Port-based
  • MAC address-based
  • Protocol-based
  • IP-subnet-based
  • Policy-based
  • Other types

Normally, everything happens on port level – depending on the VLAN access port setup, traffic entering a certain port will get classified into the correct VLAN.

A hybrid port is a port that can belong to multiple VLANs and can send or receive packets for each of them, tagged or untagged per VLAN. It can be used to connect either user or network devices.

That basically means that a hybrid port can do almost whatever you want it to do. For example, you can assign the port to appear as an access port to a specific MAC address, while still functioning as a trunk, while having a native vlan for untagged traffic. Also, a hybrid port can function as a trunk port with a native vlan.

So, in a nutshell, in order to configure your regular port with tagged vlan for VoIP phones + access vlan for the PC, you can either choose the classic method, or use a hybrid port. Since I presume everybody knows how to do it the old-fashioned way, let us configure it using a hybrid port.

So, what do we need for an IP phone and a PC? A tagged vlan for the Voice packets and an access vlan for the PC. Let us assume that vlan 102 is the Voice vlan, and vlan 7 is the PC vlan.

[HP-KC51-V] interface GigabitEthernet1/0/1
[HP-KC51-V-GigabitEthernet1/0/1] port link-type hybrid
[HP-KC51-V-GigabitEthernet1/0/1] port hybrid pvid vlan 7
[HP-KC51-V-GigabitEthernet1/0/1] port hybrid vlan 102 tagged
Please wait... Done.

[HP-KC51-V-GigabitEthernet1/0/1] stp edged-port enable
Warning: Edge port should only be connected to terminal. It will cause temporary loops if port GigabitEthernet1/0/1 is connected to bridges. Please use it carefully!

[HP-KC51-V-GigabitEthernet1/0/1] poe enable
#May 11 13:38:28:863 2000 HP-KC51-V POE/1/PSE_PORT_ON_OFF_CHANGE:
Trap 1.3.6.1.2.1.105.0.1: PSE ID 4, IfIndex 9437185, Detection Status 3.
#May 11 13:38:30:978 2000 HP-KC51-V IFNET/4/INTERFACE UPDOWN:
Trap 1.3.6.1.6.3.1.1.5.4: Interface 9437185 is Up, ifAdminStatus is 1, ifOperStatus is 1
#May 11 13:38:31:169 2000 HP-KC51-V MSTP/1/PFWD: hwPortMstiStateForwarding: Instance 0's Port 0.9437185 has been set to forwarding state!
%May 11 13:38:31:525 2000 HP-KC51-V IFNET/3/LINK_UPDOWN: GigabitEthernet1/0/1 link status is UP.
%May 11 13:38:31:650 2000 HP-KC51-V MSTP/6/MSTP_FORWARDING: Instance 0's GigabitEthernet1/0/1 has been set to forwarding state.

Of course, we enable stp edged-port (the portfast equivalent), as well as the PoE power for the IP phone. Let us inspect the configuration so far:

[HP-KC51-V-GigabitEthernet1/0/1]display this
#
interface GigabitEthernet1/0/1
port link-type hybrid
port hybrid vlan 102 tagged
port hybrid vlan 1 untagged
port hybrid pvid vlan 7
poe enable
stp edged-port enable
#
return

After inspecting the configuration, we can observe that VLAN 1 is still permitted as an untagged VLAN, while the PVID (port VLAN ID) is set to 7. This may be confusing, so let’s elaborate. The untagged VLAN 1 entry means that the switch will send traffic from VLAN 1 (e.g. broadcasts) out of this port without a tag. The PVID of 7 means that when the switch receives untagged traffic on this port, it will place it in VLAN 7.
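If the difference between the PVID and the tagged/untagged lists is still fuzzy, here is a deliberately simplified Python model of the port as configured so far. The class and method names are made up for illustration, and the model ignores everything a real switch does beyond the basic classification rules described above.

```python
from dataclasses import dataclass, field


@dataclass
class HybridPort:
    pvid: int
    tagged: set = field(default_factory=set)
    untagged: set = field(default_factory=set)

    def ingress_vlan(self, frame_tag=None):
        """Classify a received frame; None means the frame is dropped."""
        if frame_tag is None:
            # Untagged frames are placed in the PVID.
            return self.pvid
        # Tagged frames are accepted only for VLANs the port permits.
        if frame_tag in self.tagged | self.untagged:
            return frame_tag
        return None

    def egress_action(self, vlan):
        """Decide how a frame belonging to `vlan` leaves the port."""
        if vlan in self.tagged:
            return "send tagged"
        if vlan in self.untagged:
            return "send untagged"
        return "drop"


port = HybridPort(pvid=7, tagged={102}, untagged={1})
print(port.ingress_vlan())        # untagged frame from the PC
print(port.ingress_vlan(102))     # tagged frame from the phone
print(port.egress_action(1))      # VLAN 1 traffic leaving the port
```

In this model, removing VLAN 1 from the untagged set is all it takes for VLAN 1 traffic to stop leaving the port.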

Because most device management interfaces are assigned to VLAN 1, it is not a good idea to keep the port a part of this VLAN. So, let us remove the untagged VLAN 1 from the hybrid port.

[HP-KC51-V-GigabitEthernet1/0/1]undo port hybrid vlan 1
Please wait... Done.

[HP-KC51-V-GigabitEthernet1/0/1]display this
#
interface GigabitEthernet1/0/1
port link-type hybrid
undo port hybrid vlan 1
port hybrid vlan 102 tagged
port hybrid pvid vlan 7
poe enable
stp edged-port enable
#
return

Now the port is configured. There, wasn’t that difficult, right? 🙂

Default gateway on Cisco IOS

On Cisco hardware, or at least on most of the IOS family, there are three ways of specifying a default gateway. Let us look into that:

ip default-gateway
ip default-network
and ip route 0.0.0.0 0.0.0.0

The ip default-gateway command differs from the other two commands, as it should only be used when ip routing is disabled on the Cisco router, which means most probably never, unless in boot mode. In such case, you can use it to define a gateway to use TFTP to transfer a Cisco IOS image to the router. Apparently, the router does not have ip routing enabled in boot mode.

***

ip default-network

So, unless in boot mode, you should probably be using the ip default-network command. When you configure ip default-network the router considers routes to that network for installation as the gateway of last resort on the router.

For every network configured with ip default-network, if a router has a route to that network, that route is flagged as a candidate default route.

show ip route
Gateway of last resort is not set
161.44.0.0/24 is subnetted, 1 subnets
C 161.44.192.0 is directly connected, Ethernet0
131.108.0.0/24 is subnetted, 1 subnets
C 131.108.99.0 is directly connected, Serial0
S 198.10.1.0/24 [1/0] via 161.44.192.2

ip default-network 198.10.1.0
show ip route

Gateway of last resort is 161.44.192.2 to network 198.10.1.0
161.44.0.0/24 is subnetted, 1 subnets
C 161.44.192.0 is directly connected, Ethernet0
131.108.0.0/24 is subnetted, 1 subnets
C 131.108.99.0 is directly connected, Serial0
S* 198.10.1.0/24 [1/0] via 161.44.192.2

The gateway of last resort is now set to 161.44.192.2. This result is independent of any routing protocol, which you can confirm with the show ip protocols command. This can help you solve some tricky connectivity scenarios, as well as utilize two (or even more) routing protocols for default gateway redundancy.

You can add another candidate default route by configuring another instance of ip default-network:

ip route 171.70.24.0 255.255.255.0 131.108.99.2
ip default-network 171.70.24.0
show ip route

Gateway of last resort is 161.44.192.2 to network 198.10.1.0
171.70.0.0/16 is variably subnetted, 2 subnets, 2 masks
S 171.70.0.0/16 [1/0] via 171.70.24.0
S 171.70.24.0/24 [1/0] via 131.108.99.2
161.44.0.0/24 is subnetted, 1 subnets
C 161.44.192.0 is directly connected, Ethernet0
131.108.0.0/24 is subnetted, 1 subnets
C 131.108.99.0 is directly connected, Serial0
S* 198.10.1.0/24 [1/0] via 161.44.192.2

However, changes did not take effect. That is because there is a potential pitfall with the ip default-network command – it is classful. Due to this, the command must be issued again, using the major net, in order to flag the candidate default route. Kind of like a recursive default-network path.
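To see why 198.10.1.0 worked right away while 171.70.24.0 did not, it helps to work out the classful major network of each address. A small Python sketch (the function name is mine):

```python
import ipaddress


def classful_major_network(address):
    """Return the class A/B/C major network that contains `address`."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        prefix = 8       # class A
    elif first_octet < 192:
        prefix = 16      # class B
    elif first_octet < 224:
        prefix = 24      # class C
    else:
        raise ValueError("class D/E addresses have no major network")
    return ipaddress.ip_network(f"{address}/{prefix}", strict=False)


print(classful_major_network("198.10.1.0"))   # 198.10.1.0/24
print(classful_major_network("171.70.24.0"))  # 171.70.0.0/16
```

198.10.1.0/24 is a class C major network in its own right, so it is flagged immediately. 171.70.24.0/24 is only a subnet of the class B network 171.70.0.0/16, which is the network the command has to be issued for.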

ip default-network 171.70.0.0
show ip route

Gateway of last resort is 171.70.24.0 to network 171.70.0.0
* 171.70.0.0/16 is variably subnetted, 2 subnets, 2 masks
S* 171.70.0.0/16 [1/0] via 171.70.24.0
S 171.70.24.0/24 [1/0] via 131.108.99.2
161.44.0.0/24 is subnetted, 1 subnets
C 161.44.192.0 is directly connected, Ethernet0
131.108.0.0/24 is subnetted, 1 subnets
C 131.108.99.0 is directly connected, Serial0
S* 198.10.1.0/24 [1/0] via 161.44.192.2

As the Cisco documentation describes this interesting ‘hack’, if the original static route had been to the major network, the extra step of configuring the default network twice would not have been necessary. As you can see, this has implications if your dynamic routing protocol advertises networks with a mask longer than the classful one. Do proceed with care, and lab and test any configuration change before deploying it in a production network. It is always easy to roll back with a quick ip route 0.0.0.0, but if you’re troubleshooting from afar, never take the risk of cutting yourself off. Should you have no other choice, a scheduled reload may save you in such a case.

Let us test the fallback mechanism of the ip default-network command. If we are to remove a route to the particular default network, the router selects the other candidate default. Let’s try it:

no ip route 171.70.24.0 255.255.255.0 131.108.99.2
show ip route

Gateway of last resort is 161.44.192.2 to network 198.10.1.0
161.44.0.0/24 is subnetted, 1 subnets
C 161.44.192.0 is directly connected, Ethernet0
131.108.0.0/24 is subnetted, 1 subnets
C 131.108.99.0 is directly connected, Serial0
S* 198.10.1.0/24 [1/0] via 161.44.192.2

Of course, there’s always the good old ip route command.

Creating a static route to network 0.0.0.0 0.0.0.0 is another way to set the default gateway on almost any layer 3 device.

Most network engineers think that a static route using the ip route command takes precedence over any other route, but there is an exception.

As stated by the Cisco documentation:
If you use both the ip default-network and ip route 0.0.0.0 0.0.0.0 commands to configure candidate default networks, and the network used by the ip default-network command is known statically, the network defined with the ip default-network command takes precedence and is chosen for the gateway of last resort. Otherwise if the network used by the ip default-network command is derived by a routing protocol, the ip route 0.0.0.0 0.0.0.0 command, which has a lower administrative distance, takes precedence and is chosen for the gateway of last resort. If you use multiple ip route 0.0.0.0 0.0.0.0 commands to configure a default route, traffic is load-balanced over the multiple routes.

So, to sum it up, if there is a default-network statement, and it points to a statically defined network, it overrides the ip route 0.0.0.0 0.0.0.0 command. Plain and simple.

Summary

When in doubt, always check the documentation first 🙂
The ip default-gateway command should only be used when ip routing is disabled on a Cisco router. In any other case, use either the ip default-network or ip route 0.0.0.0 0.0.0.0 commands to set the gateway. Just take care with the classful behavior of the former command.

Upgrade firmware and bootrom on HP A5120

A simple software upgrade of an HP A5120 EI switch is explained in the following post.

The device software includes the Boot ROM program and the system boot file. After being powered on, the device runs the Boot ROM program, initializes the hardware, and displays the hardware information. Then the device runs the boot file. The boot file provides drivers and adaptation for the hardware, and implements service features. Both the Boot ROM program and the system boot file are required for the startup and running of a device.

NOTE: Regarding commands on the device, the BootROM is called bootrom, while the boot file is called boot-loader. So boot-loader and boot file are interchangeable in context, but not in syntax.

The Boot ROM program and system boot file can both be upgraded at the Boot ROM menu or at the command line interface (CLI). We will perform this upgrade by the command line this time.

dis ver
HP Comware Platform Software
Comware Software, Version 5.20, Release 2208
Copyright (c) 2010-2011 Hewlett-Packard Development Company, L.P.
HP A5120-48G EI Switch with 2 Interface Slots uptime is 0 week, 0 day, 17 hours, 56 minutes
HP A5120-48G EI Switch with 2 Interface Slots with 1 Processor
128M bytes SDRAM
16384K bytes Flash Memory
Hardware Version is REV.B
CPLD Version is 007
Bootrom Version is 607
[SubSlot 0] 48GE+4SFP Hardware Version is REV.B

This is the output of the “display version” command before the updates take place. Now, on to the real update – first, enable the bootrom security check. This should help you in case you try to update your device with a wrong boot file, but do not rely too much on it. After all, we should know what we’re doing in the first place 🙂

system-view
[HP]bootrom-update security check enable
[HP]quit

tftp [tftp server IP] get A5120EI-BTM-610.btm
 ...
File will be transferred in binary mode
Downloading file from remote TFTP server, please wait...\
TFTP: 0 bytes received in 0 second(s)
File downloaded successfully.

bootrom update file flash:/a5120ei-btm-610.btm slot 1
This command will update bootrom file on the specified board(s), Continue? [Y/N]:y
Now updating bootrom, please wait...
Succeeded to update bootrom of Board 1.

We have successfully updated the bootrom, by downloading the new file from a TFTP server. I will cover more on TFTP servers in a future blogpost.

Due to the insufficient space on the device, the current boot loader file needs to be deleted before the new one is uploaded. That is an interesting situation, where the device is left running with its boot loader in the RAM. Do not reboot the device before setting up the new boot loader or recovery steps will need to be taken.

The /unreserved parameter deletes the file from flash permanently, as opposed to only moving it to the “Recycle Bin”. While in the Bin, the file still takes up space, hence the need for the complete removal.

delete /unreserved flash:/a5120ei-cmw520-r2208-s168.bin
The contents cannot be restored!!! Delete flash:/a5120ei-cmw520-r2208-s168.bin?[Y/N]:y
Deleting a file permanently will take a long time. Please wait...
.................................................................................................
%Delete file flash:/a5120ei-cmw520-r2208-s168.bin...Done.

tftp 192.168.15.39 get A5120EI-CMW520-R2215.bin
..
File will be transferred in binary mode
Downloading file from remote TFTP server, please wait......................................................................................................................................................................................................
TFTP: 12625865 bytes received in 198 second(s)
File downloaded successfully.

We are successful so far. Now, instruct the device to select the new boot-loader file. After that, verify that the new boot-loader will get loaded on the next reboot with the command “display boot-loader”. Do not forget to save the configuration before reloading, as missing that may make your device unbootable, and you may have to manually point to the new boot-loader again, from the bootrom (which means that you will incur downtime and would need physical access to the device – a nasty situation if you’re doing this from afar).

boot-loader file flash:/a5120ei-cmw520-r2215.bin slot 1 main
This command will set the boot file of the specified board. Continue? [Y/N]:y
The specified file will be used as the main boot file at the next reboot on slot 1!
display boot-loader
Slot 1
The current boot app is: flash:/a5120ei-cmw520-r2208-s168.bin
The main boot app is: flash:/a5120ei-cmw520-r2215.bin
The backup boot app is: flash:/
save main force
Validating file. Please wait......................
Saved the current configuration to mainboard device successfully.
Configuration is saved to device successfully.
reboot
Start to check configuration with next startup configuration file, please wait.........DONE!
This command will reboot the device. Continue? [Y/N]:y

After the reboot, check out the new version of both the bootrom and the boot-loader.

dis ver
HP Comware Platform Software
Comware Software, Version 5.20.99, Release 2215
Copyright (c) 2010-2012 Hewlett-Packard Development Company, L.P.
HP A5120-48G EI Switch with 2 Interface Slots uptime is 0 week, 0 day, 0 hour, 2 minutes
HP A5120-48G EI Switch with 2 Interface Slots with 1 Processor
128M bytes SDRAM
16384K bytes Flash Memory
Hardware Version is REV.B
CPLD Version is 007
Bootrom Version is 610
[SubSlot 0] 48GE+4SFP Hardware Version is REV.B

Always be very careful when doing this procedure remotely, and back up both bootroms and boot-loaders, as well as configuration files.
Never update the device outside of a maintenance window, and always be ready for the worst – which may very well be the need to physically access the device.

If the update takes place on an IRF system stack, you may speed up the procedure by enabling automatic boot-loader update during the stack formation, then updating only the master of the stack, and then rebooting the slave members.

By having the auto-update enabled, the slave members will download the new boot-loader from the master right after they have formed their neighborship. This way, you will only have to update a single device.

Traffic Shaping and Policing

Crash course in QoS

What is traffic shaping/policing? In a nutshell, policing is dropping packets when the traffic exceeds a certain speed threshold, while shaping is queuing the incoming traffic in order to send it at a lower rate. Naturally, shaping is applied to outbound traffic, while policing can be applied on both directions, although it is usually applied to the inbound traffic. The following is the general QoS terminology:

  • Tc – Time interval over which the committed burst (Bc) can be sent
  • Bc – Committed burst, measured in bits. This is the amount of traffic to be sent each Tc.
  • Be – Excess burst, in bits. This is the traffic sent above your Bc, which most of the time risks being dropped for being in excess.
  • CIR – Committed information rate, in bits per second. This is your allowed speed from your Internet contract.
  • Shaping rate – This is the rate at which your device will be sending traffic, which may be equal to the CIR, or even a little bit greater (more on that – later)
  • Policing rate – The rate above which your ISP starts to drop your traffic, in order to control your speed (this may be bigger than the actual CIR)

The deal with Bc, Be and Tc is that if you have a 128kbps line, and there are 10 intervals in a second (that can be configured), each 10th of a second you may send 12.8kb. If you don’t have anything to send during one interval, you’ve wasted 12.8kb. So, to reclaim it, you could send 25.6kb in the next interval, but now you’ve overused your allowance for that interval. That means that your Tc is 0.1s, your Bc is 12.8kb, and when you reclaimed your lost bandwidth, your Be was 12.8kb.
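The arithmetic from this example, spelled out in a few lines of Python (the variable and function names are mine):

```python
def committed_burst_bits(cir_bps, intervals_per_second):
    """Bc: the number of bits you may send in each Tc interval."""
    return cir_bps / intervals_per_second


cir_bps = 128_000
intervals_per_second = 10

tc_seconds = 1 / intervals_per_second
bc_bits = committed_burst_bits(cir_bps, intervals_per_second)

# Stay silent for one interval, then send two intervals' worth at once:
sent_after_idle_bits = 2 * bc_bits
be_bits = sent_after_idle_bits - bc_bits

print(tc_seconds, bc_bits, sent_after_idle_bits, be_bits)
```

Everything sent above Bc within a single interval counts as excess burst, even though the average over the two intervals never went above the CIR.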

As I mentioned, the time interval Tc can be configured. The time interval directly impacts your Be burst. Why would you modify your time interval, when you can burst all the traffic up as fast as you can, then just wait ‘in silence’ for the current time interval to end?

Consider a 32kbps serial line from your ISP, which means that you can transfer 32 kilobits per second. However, what if the interface is clocked at 64000 (clock rate 64000)? That means that the router is transmitting at the hardware speed of 64 kilobits per second. Does that mean that we get twice the allotted bandwidth for free? No. Our device, as the DTE end of the line, cannot change the physical speed of transmission. Then how do we maintain the 32kbps speed? Simple – we transmit as much as we can, and then wait. Since we can transmit 64 kilobits per second, we can transmit 32 kilobits in half a second, and then wait for the other half of the second.
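Put as a quick calculation, the send and wait times follow directly from the ratio between the contracted rate and the clock rate. A tiny sketch, with a function name of my own choosing:

```python
def transmit_window(cir_bps, clock_rate_bps, period_s=1.0):
    """Return (seconds spent sending, seconds spent waiting) per period."""
    send_time = period_s * cir_bps / clock_rate_bps
    return send_time, period_s - send_time


send_s, wait_s = transmit_window(cir_bps=32_000, clock_rate_bps=64_000)
print(send_s, wait_s)
```

Shortening the period shrinks the silent gap proportionally, which is exactly what a shaper does by working in many small intervals instead of one per second.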

The VoIP guys now scream in terror “500ms latency?”. Yeah, it’s no good – we need the use of a shaper in such case.

Shapers

Using a traffic shaper (usually) means transmitting at a lower rate than you receive. There are a couple of gotchas to traffic shaping: which traffic should be sent first and which should wait in line, and the speed at which you transmit. The first problem is resolved through queuing strategies, the second through careful planning of your shaping configuration. So let’s dive in!

We already established that if no shaping is used, our router will transmit at the physical clock rate for as long as possible, and when the limit is reached (in our former case, at half a second), the ISP will police (drop) any other traffic for the rest of the interval (again, in our former case, for the remaining half second). This 500ms latency is unacceptable most of the time, so we employ shaping. To guarantee many short intervals per second (in order to minimize delay), Cisco routers use a predefined default for the Bc value. How does the Bc affect your Tc? To calculate your Tc time interval in milliseconds, use the following formula, with Bc in bits and CIR in bits per second:

Tc = (Bc / CIR) x 1000

By default, Cisco routers use a Bc of 8000 bits if the interface bandwidth is <= 320kbps, and calculate the Tc using the formula above (that’s why it is important to set the bandwidth [speed] in the interface view). If your line is > 320kbps, your Tc will be fixed at 25ms, and your Bc will equal shaping rate * Tc.
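Here is the default rule as described above, written as a small Python function so you can plug in your own rate. This only mirrors the description in this post (the function name is mine), so check what your IOS version actually reports before relying on it.

```python
def default_shaping_parameters(rate_bps):
    """Return (Bc in bits, Tc in milliseconds) following the defaults described above."""
    if rate_bps <= 320_000:
        bc_bits = 8000
        tc_ms = bc_bits * 1000 / rate_bps   # Tc = (Bc / CIR) x 1000
    else:
        tc_ms = 25
        bc_bits = rate_bps * tc_ms // 1000  # Bc = shaping rate * Tc
    return bc_bits, tc_ms


for rate in (64_000, 320_000, 1_000_000):
    print(rate, default_shaping_parameters(rate))
```

Note how slow lines end up with long intervals: at 64kbps the fixed 8000-bit Bc gives a Tc of 125ms, which is already on the high side for voice.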

This setup ensures that delays are kept to a minimum, even with the default settings. Of course, you can tune your Bc, Be, CIR(using the bandwidth command), and by extension of the former values – the Tc (which cannot be directly modified).

Policing

Depending on the traffic contract with your ISP, your ISP may police your traffic with a Single rate or Dual rate policer. A Single rate traffic contract usually defines an average speed, which the contractor guarantees. However, taking into account the “burstiness” of the traffic on packet-switched networks, and oversubscription, the contractor may decide to offer a Dual rate scenario. With the latter option, the contractor guarantees a minimum speed, and also provides a higher one, which is non-guaranteed.

One should also know that policers are categorized into two groups: two-color and three-color. A two-color policer distinguishes traffic within the CIR from traffic above it, while a three-color policer has two kinds of exceeding traffic: regular exceeding traffic, and extremely exceeding (violating) traffic. Thus the policer has the notions of Conforming traffic, Exceeding traffic and, if three-colored, Violating traffic.

We can draw parallels with the Dual rate policer here. The CIR is the minimum guaranteed speed, which is the conforming traffic for the policer. The non-guaranteed traffic is the regular exceeding traffic, and when traffic goes above the non-guaranteed limit as well, it is considered violating traffic. As you can probably guess, the Dual rate policer can only be three-colored, as we have two clearly defined rates.

The way I like to differentiate between the single-rate and dual-rate three-color policers is by the way we reserve bandwidth for the exceeding speeds. Let’s visualise it with some ASCII art! Here the higher rate “sits” on top of the minimum guaranteed rate.

Maximum "unsafe" utilization
|Be rate>===============
|Bc rate>

Maximum "safe" utilization
|Be rate>
|Bc rate>===============

No utilization
|Be rate>
|Bc rate>_______________

A peak in the midst of no utilization
|Be rate>    /****\
|Bc rate>___/ *****\_____

A peak of violating traffic in the midst of no utilization
|Bv rate>     /*\
|Be rate>    /*** \
|Bc rate>___/      \_____

With a single-rate policer, whenever your traffic passes over the Bc rate, it will either get dropped, or get marked “eligible for discard”, which most of the time means that it will get dropped somewhere along the way if congestion occurs.

With each Tc interval, you gain the right to transmit Bc amount of traffic. If you transmit more than the Bc traffic, you get sanctioned.

With a double rate policer, whenever your traffic passes over the Bc rate, the Be rate is ‘utilized’. Each Tc interval you gain the right to transmit Bc+Be amounts of traffic. If your traffic is within the Bc rate, only the Bc ‘bucket’ is depleted. If you have stood a couple of intervals in silence, and then need to transmit, you could compensate for the previous ‘wasted’ time intervals, by filling in the Be bucket, thus getting a Bc+Be speed.
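To make the bucket talk a bit more tangible, here is a simplified two-bucket sketch in Python. It follows the general idea of a single-rate three-color marker (RFC 2697): unused committed tokens spill over into the excess bucket. It is not modelled on any particular vendor implementation, the class and method names are mine, and refilling once per Tc is a simplification.

```python
class TwoBucketPolicer:
    def __init__(self, bc_bits, be_bits):
        self.bc_bits = bc_bits
        self.be_bits = be_bits
        self.committed = bc_bits   # both buckets start full
        self.excess = be_bits

    def new_interval(self):
        """Each Tc adds Bc tokens; whatever overflows Bc spills into the Be bucket."""
        self.committed += self.bc_bits
        overflow = max(0, self.committed - self.bc_bits)
        self.committed -= overflow
        self.excess = min(self.be_bits, self.excess + overflow)

    def classify(self, packet_bits):
        """Return 'conform', 'exceed' or 'violate' and charge the right bucket."""
        if packet_bits <= self.committed:
            self.committed -= packet_bits
            return "conform"
        if packet_bits <= self.excess:
            self.excess -= packet_bits
            return "exceed"
        return "violate"


policer = TwoBucketPolicer(bc_bits=12_800, be_bits=12_800)
first = [policer.classify(12_800), policer.classify(12_800), policer.classify(1_000)]
policer.new_interval()           # traffic sent in this interval
busy = [policer.classify(12_800), policer.classify(100)]
policer.new_interval()           # refills the empty Bc bucket
policer.new_interval()           # silent interval: unused Bc spills into Be
after_silence = [policer.classify(12_800), policer.classify(12_800)]
print(first, busy, after_silence)
```

The last line is the “compensation” effect: after a silent interval the excess bucket is full again, so a burst of Bc+Be passes without any violating traffic.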

Summary

As one can see, ISP speed limits are usually expressed per second, while the actual enforcement happens over much shorter intervals, which makes them interesting to configure. Depending on what your goals are and which technology needs to be supported, uplink traffic can be shaped in many ways. Fine tuning the shaping variables can either work wonders for your network, or make it as unresponsive as a dead cloud unicorn.

So, have you got anything peculiar to share on the QoS matter?

IRF on HP 5800

This is a quick primer on running IRF on a couple of HP 5800.

IRFv2 systems are connected using any 10GbE interface:

  • CX4
  • SFP+
  • XFP
  • XENPAK

A best practice is to connect IRF members in a ring. This guarantees that a single link failure will not disrupt the stack. For example, if you have four devices, you should connect them like this: (1) – 2 – 3 – 4 – (1), with the first being connected to both the second and the fourth. Should the link between members 2 and 3 fail, what you’ll get is the daisy chain 3 – 4 – 1 – 2.
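If you want to convince yourself that a ring always degrades into a single chain, here is a toy Python function that “cuts” the ring at the failed link. It is purely illustrative (the function name is mine) and has nothing to do with how IRF computes its topology internally.

```python
def chain_after_link_failure(ring, failed_link):
    """Return the daisy chain left after one link of the ring fails."""
    a, b = failed_link
    index_a, index_b = ring.index(a), ring.index(b)
    size = len(ring)
    if (index_a + 1) % size == index_b:
        start = index_b          # the chain starts right after the broken link
    elif (index_b + 1) % size == index_a:
        start = index_a
    else:
        raise ValueError("members are not directly connected in this ring")
    return ring[start:] + ring[:start]


print(chain_after_link_failure([1, 2, 3, 4], (2, 3)))
```

All four members are still reachable, just over a longer path for some of them.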

Ok, enough of the theory. Let’s plug a couple of 10G SFP+ modules now.

%Apr 26 13:25:30:548 2000 HP OPTMOD/4/MODULE_IN: -Slot=2;
Ten-GigabitEthernet2/0/54: The transceiver is STACK_SFP_PLUS.

 

Now we proceed to configure basic stuff about the IRF stack:

system-view
System View: return to User View with Ctrl+Z.

irf domain [ID]

The domain ID does not have to match on the other members of the stack, but you should keep it the same for the sake of clarity later on.

 

Now we should renumber the IRF member ID if needed. To make more sense of this step, one should know that every IRF-enabled HP device assumes it is member number 1. That means that you should renumber every switch after the first one for the current stack. If two members have the same ID, they cannot form an IRF stack.

irf member 1 renumber [X]

In order for the renumbering to take place, the device should be reloaded. It is not necessary to save the configuration for the renumbering to take effect, but we should save it anyway. Issue a save, followed by a reboot and the device shall be renumbered.

Now it’s time we configured the logical IRF ports. Remember, an IRF port can be either a physical 10G port, or an aggregation group. In order to assign such a port to a logical IRF port, we should prepare the former first, by shutting it down. For example, we will use two physical 10G ports, namely Ten1/0/53 and Ten1/0/54.

system-view
interface Ten 1/0/53
shutdown
interface Ten 1/0/54
shutdown
quit

Now the ports are ready to be assigned to the logical IRF ports. Remember, each device has only two logical IRF ports, and they connect in a cross-link fashion: Port 1 on one device connects to Port 2 on the other, and vice versa.

irf-port 1/1
port group interface Ten 1/0/53
quit

irf-port 1/2
port group interface Ten 1/0/54
quit

By now you should have noticed the leading 1 in some of the interface and irf-port commands. This is the current chassis number, and by default it is one (1). You will notice its significance when you move on to configuring the second, third and subsequent devices in the stack. Remember, changing the IRF Member ID will change the whole internal addressing of the current device. The port Ten 1/0/53 would become Ten 2/0/53 on the second member of the stack, the IRF logical ports would become irf-port 2/1 and irf-port 2/2, and so on.

The same principle applies backwards, when you have an operating stack, and you renumber a device/chassis. Say, you renumber 2 to 3 and 3 to 2. What would happen is that when they get rebooted, they would download the configuration from the master switch, and end up with “exchanged” configurations from each other.

This seems like a good time to introduce the Master/Slave concepts of IRF. Basically, there is a priority value that plays an important role in the election, but there are a couple of pitfalls too. Let’s see what the process is:

Master election is held each time the topology changes, for example, when the IRF virtual device is established, a new member switch is plugged in, the master switch fails or is removed, or the partitioned IRF virtual devices merge. The master is elected based on the following rules in descending order:

  1. The current master, even if a new member has a higher priority. (When an IRF virtual device is being formed, all member switches consider themselves as the master, and this rule is skipped). If an election is held, and the current topology has 2 masters and N slaves, the election is held only between the current 2 masters.
  2. The switch with a higher priority.
  3. The switch with the longest system up-time. (The member switches exchange system up-time in the IRF hello packets)
  4. The switch with the lowest bridge MAC address.

After a master election, all slave member switches initialize and reboot with the configuration on the master, and their original configuration, even if it has been saved, will be lost.
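The election rules above boil down to a sort order. Below is a small Python sketch of that order. The Member class and its field names are made up for illustration, and a real device may apply extra conditions that are not listed in this post.

```python
from dataclasses import dataclass


@dataclass
class Member:
    member_id: int
    is_master: bool      # True for every member while the stack is first forming
    priority: int
    uptime_s: int
    bridge_mac: str


def mac_to_int(mac):
    """Turn a MAC such as 00e0-fc00-0001 into a number so it can be compared."""
    return int(mac.replace("-", "").replace(":", "").replace(".", ""), 16)


def elect_master(members):
    """Apply the four rules in descending order and return the winner."""
    def rank(member):
        return (
            not member.is_master,          # 1. current master first
            -member.priority,              # 2. higher priority
            -member.uptime_s,              # 3. longer uptime
            mac_to_int(member.bridge_mac), # 4. lowest bridge MAC
        )
    return min(members, key=rank)


running_stack = [
    Member(1, True, 1, 1_000, "00e0-fc00-0002"),
    Member(2, False, 32, 5_000, "00e0-fc00-0001"),
]
forming_stack = [
    Member(1, True, 1, 1_000, "00e0-fc00-0002"),
    Member(2, True, 32, 5_000, "00e0-fc00-0001"),
]
print(elect_master(running_stack).member_id)
print(elect_master(forming_stack).member_id)
```

In the running stack the existing master keeps its role despite the newcomer’s higher priority. While the stack is forming, everyone claims to be master, so rule 1 is a tie and priority decides.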

Phew, I got carried away. Now that we have set up the logical IRF ports, we can bring back up the physical ports they comprise.

interface Ten 1/0/53
undo shutdown
%Apr 26 13:28:46:723 2000 HP IFNET/3/LINK_UPDOWN: Ten-GigabitEthernet1/0/53 link status is UP.

interface Ten 1/0/54
undo shutdown
%Apr 26 13:28:46:723 2000 HP IFNET/3/LINK_UPDOWN: Ten-GigabitEthernet1/0/54 link status is UP.

Save the configuration now! The next command will cause the device to reboot, and lose all unsaved changes.

Just to illustrate a point, issue a display irf topology. If you have any ports in the DISABLED or DOWN state (and you most probably will), you need to activate the irf-port configuration with the following command

irf-port-configuration active

The device should now activate its interfaces, send/receive IRF hello packets, form adjacencies and then if not the master, proceed to reboot itself and join the stack. You can also just reboot instead of activating the irf-port configuration, and the result would still be the same, though don’t forget to save the configuration either way.

Another pitfall you need to watch out for is if you configure all the ports, issue irf-port-configuration active, and only then plug in the SFP+. If not the master, the device will catch you off-guard and reboot 🙂

If you unplug one or all SFP+ transceivers and sever the IRF member from the stack, the irf-port configuration is not erased. Thus, if you plug it back in later, the device will join the stack and reboot itself.

There you have it, a crash-course in IRF configuration. It was a lengthy post, but non-exhaustive nonetheless. For more info, check out the documentation for your specific device, as a couple of things depend on it, e.g. the maximum number of devices you’re able to join to the stack.

Remove GPT and go back to MBR

Note: if you would like to perform this action from within Windows, take a look at the comments section for a guide.

First of all, I assume you have an empty harddrive, or you don’t care about your data on it. It would be foolish to rewrite MBR and partition information on a drive with something even remotely usable on it. You’re going to lose it!

With that out of the way, and assuming you have backed up your data, let’s start – we’re gonna do this labeling DOS-style!

First, if you have no idea what GPT is, take a look at this.

Unlike MBR, GPT resides both at the beginning and the end of a drive, thus providing redundancy in case you wipe the beginning of your drive. If you write an MBR to the beginning of the drive, or even zero it out, it doesn’t remove GPT recognition from fdisk and other tools that detect GPT at the end. You have to zero out the end of the drive as well.

In order not to zero out the whole drive, we’ll just clear the  blocks used by GPT. Here’s the rundown:

  1. Get the size of the device in 1k blocks. fdisk -s /dev/[HDDNAME]
  2. Round the last five digits of the size down to zeros. For example, with a size of 156290904 blocks, you get 156200000
  3. Zero out everything from that offset to the end of the drive (at most 99 999 blocks). dd if=/dev/zero of=/dev/[HDDNAME] bs=1k seek=[ROUNDEDSIZE]
  4. Zero out the first 20 blocks (20k). dd if=/dev/zero of=/dev/[HDDNAME] bs=1k count=20

GPT layout table

 

To make more sense of what we just did, here’s the GPT layout image, courtesy of Wikipedia.

Each LBA entry on the diagram represents 512 bytes.

The third command zeroes out everything past the rounded offset, just to be safe. In the example, that is 156290904 - 156200000 = 90904 blocks of 1k, or roughly 89MB. Rounding this way never clears more than about 98MB.

Looking at the diagram, the backup GPT occupies only the last 33 LBAs, so zeroing out the last 17408 bytes (34 LBA * 512 bytes) could also work, if you’re in a hurry.

The fourth command zeroes out the first 20k (20480 bytes). Unlike MBR, which uses only the first 512 bytes, you can see that the primary GPT is spread over 34 LBAs (LBA 0 to 33), each 512 bytes in size. That means you can also delete only the first 17408 bytes (34 LBA * 512 bytes).
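If you would rather calculate the exact regions than round by hand, here is a small Python sketch. It assumes the usual GPT defaults of 128 partition entries of 128 bytes each, and the function name is mine. It only does the arithmetic and does not touch any disk.

```python
def gpt_regions(total_bytes, sector_bytes=512):
    """Return ((offset, length), (offset, length)) in bytes for primary and backup GPT."""
    entries_bytes = 128 * 128                          # assumed default entry array size
    entries_sectors = -(-entries_bytes // sector_bytes)  # round up to whole sectors
    primary_sectors = 2 + entries_sectors              # protective MBR + header + entries
    backup_sectors = 1 + entries_sectors               # entries + backup header
    primary = (0, primary_sectors * sector_bytes)
    backup_length = backup_sectors * sector_bytes
    backup = (total_bytes - backup_length, backup_length)
    return primary, backup


total_bytes = 156290904 * 1024        # size reported by fdisk -s, in 1k blocks
primary, backup = gpt_regions(total_bytes)
print(primary)
print(backup)

# The rounded offset used in step 3 starts well before the backup GPT:
print(156200000 * 1024 <= backup[0])
```

For the example drive, the primary region is the first 17408 bytes and the backup region is the last 16896 bytes, so both dd commands above cover more than they strictly need to.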

Note:

On newer drives, the block size could be as high as 4096 bytes. Adjust the bs parameter in the dd command accordingly.

Edit (Apr/13): Corrected drivers -> drives; Added comments info
