Cisco stackport admin down

Lately one of our customer's Cisco 3750 switch stacks has been logging a syslog entry stating that one of the stack ports is administratively down. About 10 seconds later the port comes up again. Because of the redundant stack cables, the stack's functionality is not affected. While troubleshooting, it became clear to me that it is possible to administratively shut down a stack port.

I never knew that a stack port could be configured as administratively down. After some research in the Cisco 3750 configuration guide I found out it is absolutely possible.
Normally a switchport is brought administratively down in “interface configuration” mode. A stack port, however, is brought administratively down in “privileged EXEC” mode, with the following command:

Switch01# switch <switch number> stack port <stack port number> disable/enable

Be careful with this command: if you shut the wrong stack port, traffic may be disrupted.
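For illustration, disabling and re-enabling stack port 1 on switch 1 would look roughly like this (the switch and port numbers are just an example); “show switch stack-ports” can be used to verify the result:

Switch01# switch 1 stack port 1 disable
Switch01# show switch stack-ports
Switch01# switch 1 stack port 1 enable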

 

 


Spontaneous reboot of Cisco 6509E switch

Lately one of our customer's Cisco 6509E switches rebooted spontaneously. At first sight there seemed to be no obvious reason for the problem. After some investigation, the cause of the spontaneous reboot became clear.

The first clue was found in the output of the “show version” command (most of the output is omitted):

!
switch#show version

System returned to ROM by s/w reset at 07:43:59 gmt Sun Dec 8 2013 (SP by processor memory parity error at PC 0x419EEB88, address 0x0)

switch#
!

The “s/w reset … processor memory parity error” part of the output prompted me to download the crashinfo file from the bootflash of the RP.

!
show bootflash:
!
-#- ED ----type---- --crc--- -seek-- nlen -length- ---------date/time--------- name
1 .. config 4F161C23 9D14C 21 118988 Apr 13 2012 00:26:20 +02:00 BackupSXH513apr12.bak
2 .. crashinfo 1A1C516D 104FF0 32 425506 Dec 8 2013 07:43:55 +01:00 crashinfo_RP_20131208-074355-gmt
!
copy bootflash:/crashinfo_RP_20131208-074355-gmt tftp:
!

After investigating the crashinfo I found the following lines:

!
674670: Dec 8 07:43:55: %C6K_PLATFORM-2-PEER_RESET: RP is being reset by the SP
%Software-forced reload

07:43:55 gmt Sun Dec 8 2013: Breakpoint exception, CPU signal 23, PC = 0x428A667C

--------------------------------------------------------------------
Possible software fault. Upon reccurence, please collect
crashinfo, "show tech" and contact Cisco Technical Support.
--------------------------------------------------------------------

After reading this message it became clear that the RP was reset by the SP, so for further information the logging from the SP had to be investigated.

The SP crashinfo can be found on the sup-bootflash:

!
show sup-bootflash:
!
-#- --length-- -----date/time------ path
1 74788836 Sep 24 2009 09:48:00 +02:00 s72033-ipservicesk9_wan-mz.122-33.SXH5.bin
2 33554432 Sep 24 2009 11:57:50 +02:00 sea_log.dat
3 119027 Apr 12 2012 02:09:34 +02:00 BackupSXH513apr12.bak
4 33554432 Apr 13 2012 01:08:52 +02:00 sea_console.dat
5 139998532 Mar 9 2012 11:40:28 +01:00 s72033-ipservicesk9_wan-mz.122-33.SXJ2.bin
6 439014 Dec 8 2013 07:43:58 +01:00 crashinfo_SP_20131208-074355-gmt
!
copy sup-bootflash:/crashinfo_SP_20131208-074355-gmt tftp:
!

In the SP crashinfo the following lines caught my attention:

Cache error detected!
CP0_ECC (reg 26/0): 0x000000EC
CP0_CACHERI (reg 27/0): 0x20000000
CP0_CAUSE (reg 13/0): 0x00000C00
Real cache error detected. System will be halted.
Error: Primary instr cache, fields: data,
Actual physical addr 0x00000000,
virtual address is imprecise.

Imprecise Data Parity Error
Imprecise Data Parity Error

07:43:55 gmt Sun Dec 8 2013: Interrupt exception, CPU signal 20, PC = 0x419EEB88

--------------------------------------------------------------------
Possible software fault. Upon reccurence, please collect
crashinfo, "show tech" and contact Cisco Technical Support.
--------------------------------------------------------------------

After looking up this error message on the Cisco support forum, only two possible causes remained:

1) soft-parity error

Cisco says the following about soft-parity errors:

“These errors occur when an energy level within the chip (for example, a one or a zero) changes, most often due to radiation. When referenced by the CPU, such errors cause the system to crash. In case of a soft parity error, there is no need to swap the board or any of the components.”

2) hard-parity error

Cisco says the following about hard-parity errors:

“These errors occur when there is a chip or board failure that corrupts data. In this case, you need to re-seat or replace the affected component, which usually involves a memory chip swap or a board swap.”

After reading some more about these possible causes it became clear that opening a TAC case for this problem was pointless. Cisco states on the official forum that they are not able to tell whether it was a soft or a hard parity error after just one spontaneous reboot.

They state that only 1 out of 100 reboots with this error is caused by a hard-parity error. If the switch doesn't reboot again within one or two days, it is safe to say that it was a soft-parity error. Cisco also states that in case of a soft-parity error, an upgrade to the latest IOS version can prevent further spontaneous reboots.

Our advice to the customer is to upgrade to the latest IOS version and monitor the switch for odd behaviour. If the switch reboots a second time, a TAC case has to be opened.

Cisco switch high CPU load due to IP Input process

A customer had a problem with one of their distribution switches. The switch was not reachable by Telnet or SSH, about 90% of the pings to the switch were lost, and normal traffic was not possible either.

After initial troubleshooting we concluded, using the “show processes cpu sorted” command, that the CPU utilization was constantly around 99% because of the “IP Input” process.

This process handles, as the name already says, IP traffic; more precisely, it handles IP packets that are punted to the CPU for process switching. To find out who or what was flooding our switch with such traffic, we issued the “debug ip packet detail” command.
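On a busy switch, unfiltered packet debugging can drive the CPU up even further, so a safer approach is to restrict the debug with an access list. A minimal sketch, in which ACL 199 (an arbitrary number chosen for this example) limits the debug to broadcast traffic:

!
access-list 199 permit ip any host 255.255.255.255
debug ip packet 199 detail
! turn debugging off as soon as the source has been identified
undebug all
!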

After enabling this debug it became clear that a server running HP thin-client discovery software was sending traffic to the broadcast address of the corresponding VLAN. Because of this, the IP Input process pegged the CPU and the switch was not able to forward traffic properly.

After a reboot of the server, the CPU utilization of the switch dropped below 10% and everything was normal again.

VLAN Hopping

At the office we had a discussion about VLAN hopping. What is possible and what is not?
Basically there are two possibilities: a “DTP attack” and “double tagging”.

The first option is based on the “Dynamic Trunking Protocol” (DTP). With this attack it is possible to negotiate a trunk from an attacker's PC: by sending malicious DTP packets to the switch, the attacker gets the port to form a trunk. From that point the attacker can reach all VLANs allowed on the trunk.

This attack can be avoided by disabling DTP. Use “allowed vlan” lists on trunks and make sure that unused ports are configured as access ports and placed in a dummy VLAN. And make sure those unused ports are administratively down!
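A minimal sketch of this port hardening, assuming VLAN 999 as the dummy VLAN and a 48-port switch (the interface range and VLAN number are just examples):

!
conf t
!
vlan 999
 name dummy
!
interface range GigabitEthernet1/0/1 - 48
 switchport mode access
 switchport access vlan 999
 switchport nonegotiate
 shutdown
!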

The second option is called double tagging. With this kind of attack the attacker double-tags a packet.
Let's say the following network situation is in place:

Switch 1 and switch 2 are connected by a trunk. This trunk has VLAN 1 configured as the native VLAN.
One PC is connected to switch 1 in VLAN 1 and one PC is connected to switch 2 in VLAN 100.
The attacker uses PC 1. He or she sends a malicious packet that is double tagged with VLAN 1 and VLAN 100. The packet arrives at switch 1 and the outer (native VLAN) tag is stripped off. The switch does not inspect the second tag, and because the traffic belongs to the native VLAN it is sent untagged over the trunk to switch 2. When the packet arrives at switch 2, the switch sees the remaining tag and puts the traffic in VLAN 100. As you can see, the traffic hopped from VLAN 1 to VLAN 100.
See the image below for a graphical view.

Double tagging

This attack can be avoided by using a dummy VLAN as the native VLAN on the trunk!
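A minimal sketch of the trunk-side mitigation, assuming VLAN 999 as the unused native VLAN and GigabitEthernet1/0/48 as the trunk port (both are just examples; the encapsulation command is only needed on platforms that also support ISL):

!
conf t
!
interface GigabitEthernet1/0/48
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport trunk native vlan 999
 switchport trunk allowed vlan 10,20,100
!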

VLAN Access Control Lists

A customer asked me if it is possible to block ping (ICMP) from host 1 to host 2 in the same VLAN.
Yes, this is possible if you use a so-called VLAN Access Control List, or VACL.

It's fairly easy to configure. First of all, create a VLAN:
!
conf t
!
vlan 100
name Test
!

Then create an SVI (this is not mandatory):
!
conf t
!
int vlan 100
ip address 10.0.0.254 255.255.255.0
no shut
!

Create an extended access list in which you permit the traffic you want to drop:
!
conf t
!
ip access-list extended VACL_test
permit icmp host 10.0.0.1 host 10.0.0.2
!

Create the access-map. The first clause (sequence 10) matches the access list created earlier and drops that traffic; the second clause (sequence 20) has no match statement and therefore forwards all other traffic. Without the sequence numbers, the second “vlan access-map” command would simply re-enter the first clause and overwrite the drop action:
!
conf t
!
vlan access-map VACL_no_icmp 10
 match ip address VACL_test
 action drop
vlan access-map VACL_no_icmp 20
 action forward
!

Now apply the access-map to the VLAN:
!
conf t
!
vlan filter VACL_no_icmp vlan-list 100
!

Try pinging from host 1 to host 2 and you will see it is not possible. Try pinging your default gateway and you will see that this still works.
In other words, the VACL is functioning as intended.
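To double-check the configuration, these show commands can be used (a quick sketch; the output is omitted here):

!
show vlan access-map VACL_no_icmp
show vlan filter
!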

Wireshark on Cisco 4500

For a customer I'm installing some Cisco 4507 switches in their network. While troubleshooting it came to my attention that it is possible to use Wireshark from the CLI of the 4500 switch.
One of the biggest advantages is that you no longer need to create SPAN or RSPAN sessions.

There are, of course, some prerequisites:
- Supervisor 7-E or 7L-E
- IOS-XE version 3.3(0) / 15.1(1) or higher
- Enterprise Services license
- CPU utilization lower than 50%

To use Wireshark:

monitor capture <name> interface <interface> <in/out/both>
monitor capture <name> file location bootflash:<name.pcap>
monitor capture <name> match <any/…>
monitor capture <name> start/stop
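Filled in with example values (the capture name “CAP”, the interface and the file name are just placeholders I made up), a capture session could look roughly like this:

monitor capture CAP interface GigabitEthernet1/1 both
monitor capture CAP match any
monitor capture CAP file location bootflash:CAP.pcap
monitor capture CAP start
monitor capture CAP stop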

To view the captured file:

show monitor capture file bootflash:<name.pcap>

The output will look like this:

switch#sh monitor capture file bootflash://Test.tcap
1 0.000000 10.112.200.1 -> 10.111.200.1 UDP Source port: 32664 Destination port: 32672
2 0.005996 10.111.200.2 -> 10.121.200.1 UDP Source port: 32656 Destination port: 32560
3 0.005996 10.111.200.2 -> 10.121.200.1 UDP Source port: 32692 Destination port: 32668
4 0.005996 10.111.200.2 -> 10.71.101.55 UDP Source port: 32676 Destination port: 32514
5 0.005996 10.111.200.2 -> 10.61.101.99 UDP Source port: 32608 Destination port: 32514
6 0.007004 10.121.200.1 -> 10.111.200.2 UDP Source port: 32560 Destination port: 32656
7 0.007004 10.121.200.1 -> 10.111.200.1 UDP Source port: 32532 Destination port: 32660

If you want a more detailed view of the captured traffic, use:

show monitor capture file bootflash:<name.pcap> detailed

The output will look like this:

CSS02#sh monitor capture file bootflash://Test.tcap detailed
Frame 1: 222 bytes on wire (1776 bits), 222 bytes captured (1776 bits)
Arrival Time: Mar 18, 2013 11:33:21.013991000 CET
Epoch Time: 1363602801.013991000 seconds
[Time delta from previous captured frame: 0.000000000 seconds]
[Time delta from previous displayed frame: 0.000000000 seconds]
[Time since reference or first frame: 0.000000000 seconds]
Frame Number: 1
Frame Length: 222 bytes (1776 bits)
Capture Length: 222 bytes (1776 bits)
[Frame is marked: False]
[Frame is ignored: False]
[Protocols in frame: eth:vlan:ip:udp:data]
Ethernet II, Src: 00:12:43:7d:b2:41 (00:12:43:7d:b2:41), Dst: e0:2f:6d:a4:ff:bf (e0:2f:6d:a4:ff:bf)
Destination: e0:2f:6d:a4:ff:bf (e0:2f:6d:a4:ff:bf)
Address: e0:2f:6d:a4:ff:bf (e0:2f:6d:a4:ff:bf)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
Source: 00:12:43:7d:b2:41 (00:12:43:7d:b2:41)
Address: 00:12:43:7d:b2:41 (00:12:43:7d:b2:41)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
Type: 802.1Q Virtual LAN (0x8100)
802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 100
000. .... .... .... = Priority: Best Effort (default) (0)
...0 .... .... .... = CFI: Canonical (0)
.... 0000 0110 0100 = ID: 100
Type: IP (0x0800)
Trailer: fe151ab9
Internet Protocol, Src: 10.112.200.1 (10.112.200.1), Dst: 10.111.200.1 (10.111.200.1)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 200
Identification: 0x0000 (0)
Flags: 0x02 (Don't Fragment)
0... .... = Reserved bit: Not set
.1.. .... = Don't fragment: Set
..0. .... = More fragments: Not set
Fragment offset: 0
Time to live: 63
Protocol: UDP (17)
Header checksum: 0x9643 [correct]
[Good: True]
[Bad: False]
Source: 10.112.200.1 (10.112.200.1)
Destination: 10.111.200.1 (10.111.200.1)
User Datagram Protocol, Src Port: 32664 (32664), Dst Port: 32672 (32672)
Source port: 32664 (32664)
Destination port: 32672 (32672)
Length: 180
Checksum: 0x0000 (none)
[Good Checksum: False]
[Bad Checksum: False]
Data (172 bytes)

0000 80 08 8c 70 b2 d2 79 60 84 64 76 45 d5 d5 d5 d5 ...p..y`.dvE....
0010 55 d5 55 d5 55 54 55 54 55 d5 55 55 55 d5 d5 55 U.U.UTUTU.UUU..U
0020 d5 d4 d4 d4 d5 d5 d5 d5 d5 d4 d5 55 55 55 55 54 ...........UUUUT
0030 d5 d5 55 d5 54 d5 d5 54 55 54 d5 d5 55 d5 55 d4 ..U.T..TUT..U.U.
0040 d4 d5 d4 d4 d5 55 55 55 d5 d5 54 d5 d5 55 d5 55 .....UUU..T..U.U
0050 d5 55 d5 55 55 d5 55 55 d5 d5 d5 55 d5 d5 d4 d7 .U.UU.UU...U....
0060 d5 d5 d5 55 d5 54 55 55 d5 d5 55 d5 54 55 d5 54 ...U.TUU..U.TU.T
0070 d5 d5 55 d5 d5 d4 d5 55 d5 55 55 d5 d5 d5 d5 d5 ..U....U.UU.....
0080 54 54 d5 d5 d5 d5 55 d5 d5 55 55 d5 d5 d4 d5 d5 TT....U..UU.....
0090 d5 d5 d5 d5 d5 d5 55 55 55 55 55 55 d5 55 55 d5 ......UUUUUU.UU.
00a0 d5 d4 d5 d5 d5 d5 d4 d5 d5 d5 55 55 ..........UU
Data: 80088c70b2d2796084647645d5d5d5d555d555d555545554...
[Length: 172]

If you want specifics from the file you can use the good old pipe sign:

show monitor capture file bootflash:<name.pcap> | include <detail>

Of course it's possible to transfer the *.pcap file to your local machine:

copy bootflash:<name.pcap> ftp:/tftp:

High CPU loads Cisco 2960

Lately I have been working a lot with Cisco 2960 access switches with different IOS versions. I noticed that the C2960-48TC-L and C2960-48PST-L models had high CPU loads. After contacting Cisco it turned out this is not a bug, but a “feature”.

On the switch there is a process called “Hulc LED Process”. In short, this process is responsible for link detection. When all of a switch's ports have the “connected” status there is no problem. If one or more ports are in the “notconnect” state, the process spikes up to 80% CPU.

Cisco's answer, according to the official Cisco forum: the process mentioned above is working as designed, and if you want to temper the CPU load, manually shut down the “not connected” ports.

I think that is a really poor answer, because a “not connected” port can simply have a switched-off computer behind it.

Nevertheless, the workaround works. After manually shutting down the “not connected” ports, the process CPU load drops to normal values.
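For reference, this is roughly how to check the process and apply the workaround (the interface range is just an example; shut down only the ports that really are unused):

!
show processes cpu sorted | include Hulc
show interfaces status | include notconnect
conf t
interface range FastEthernet0/13 - 24
 shutdown
!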