Tuesday, May 31, 2011

T1 Loopback Tests on Cisco 2821 Router

You're having an intermittent issue with one or several of your T1s in a multilink bundle, and the interfaces show errors.
You've cleared counters, watched them increment for several days, weeks, or longer, and opened multiple trouble tickets with your provider, only to have them tell you that their equipment is good all the way up to the Smart Jack. Well, I'm sorry to tell you that the burden of proof is now on you to prove them wrong. Your advantage? Well, you're reading this article of course! I will walk you through the loopback testing I performed for a customer in a similar situation.

First some prerequisite steps:
(Now I know you've already done a 'write mem' to save your current configuration and copied it to your pc to have at the ready, so I won't mention that.)
Do a 'show ppp multilink' to display the interfaces in the bundle:

2821# s ppp multilink
Multilink1
  Bundle name: host1034
  Remote Endpoint Discriminator: [1] host1034
  Local Endpoint Discriminator: [1] p1728536-1588737
  Bundle up for 1w0d, total bandwidth 1544, load 4/255
  Receive buffer limit 12000 bytes, frag timeout 1000 ms
    0/0 fragments/bytes in reassembly list
    18 lost fragments, 1627759 reordered
    0/0 discarded fragments/bytes, 1 lost received
    0x660BB1 received sequence, 0x77660C sent sequence
  Member links: 1 active, 1 inactive (max not set, min 1)
    Se0/3/0, since 01:10:02
    Se0/2/0, since 00:50:07
No inactive multilink interfaces

Here we see 2 "physical" interfaces in the "logical" bundle, serial 0/3/0 and serial 0/2/0. Now do 'show run int Multilink1' to see how the bundle is configured:

interface Multilink1
 description Verizon MPLS BTBFCEK0001
 bandwidth 3000
 ip address 3.17.259.30 255.255.255.252
 no peer neighbor-route
 ppp chap hostname p1728536-1588737
 ppp multilink
 ppp multilink links minimum 1
 ppp multilink group 1
 ppp multilink fragment disable

I like seeing "ppp multilink links minimum 1." This means the logical bundle stays up as long as at least one of the serial interfaces is up.
I can test each interface in turn and not take the office down! (I don't recommend this unless you are comfortable with these procedures or there is no alternative.)

Now I will prepare the first T1 card (Se0/3/0) for loopback tests. Note that the "3" here is for slot 3, which is labeled as HWIC 3 on the back of the router:

2821(config)# interface Serial0/3/0
2821(config-if)# no encapsulation ppp
2821(config-if)# encap hdlc
2821(config-if)# ip address 10.1.1.1 255.255.255.0
2821(config-if)# no shutdown
2821(config-if)# end
2821# clear counters
Clear "show interface" counters on all interfaces [confirm]
2821#

Now I trot on over to the Smart Jack, unplug the patch cable, and place my SuperLooper on the cable using the female end. (This will be testing both the extended wiring in between the Smart Jack and the router and the T1 card. If errors are found during the test, then you'll have to use the male end and plug directly into the T1 card to determine if the card is bad or some part of the wiring.)
After 10 seconds or so, you can run this command to verify that the card is looped and ready, 'show int se0/3/0':

2821# s int se0/3/0
Serial0/3/0 is up, line protocol is up (looped)
  Hardware is GT96K with integrated T1 CSU/DSU
  Description: 41/HCGS/039981 /NW
  Internet address is 10.1.1.1/24
  MTU 1500 bytes, BW 1544 Kbit/sec, DLY 20000 usec,
     reliability 255/255, txload 2/255, rxload 4/255
  Encapsulation HDLC, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:03, output 00:00:03, output hang never
  Last clearing of "show interface" counters 00:00:25
  ......
  ......
 
Now you're ready to start the ping tests. You'll run the same extended ping three times, each with a different data pattern:

2821# ping
Protocol [ip]:
Target IP address: 10.1.1.1
Repeat count [5]: 5000
Datagram size [100]: 1440
Timeout in seconds [2]:
Extended commands [n]: yes
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]: 0x1111
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5000, 1440-byte ICMP Echos to 10.1.1.1, timeout is 2 seconds:
Packet has data pattern 0x1111
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
......
......
Success rate is 100 percent (5000/5000), round-trip min/avg/max = 16/16/56 ms
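
The three data patterns used in these tests (0x1111, 0xaaaa, 0xffff) differ in how many one bits they carry. The Python sketch below is only an illustration and was not part of the original testing; it prints each 16-bit pattern in binary with the fraction of bits that are ones. The function names are made up for this example.

```python
# Illustration only: bit layout and ones density of the 16-bit ping
# data patterns used in this article.
PATTERNS = [0x1111, 0xAAAA, 0xFFFF]


def ones_density(pattern: int, width: int = 16) -> float:
    """Return the fraction of 1 bits in a pattern of the given bit width."""
    masked = pattern & ((1 << width) - 1)
    return bin(masked).count("1") / width


def describe(pattern: int, width: int = 16) -> str:
    """Format a pattern as hex, binary, and ones density."""
    density = ones_density(pattern, width)
    return f"0x{pattern:04X}  {pattern:0{width}b}  {density:.2f}"


if __name__ == "__main__":
    for p in PATTERNS:
        print(describe(p))
```

Running it shows a sparse pattern (a quarter of the bits set), an alternating pattern (half set), and an all-ones pattern, which is why the same ping is worth repeating with each one.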

Now do a show int and look for input or output errors:

2821# s int se0/3/0
Serial0/3/0 is up, line protocol is up (looped)
  Hardware is GT96K with integrated T1 CSU/DSU
  Description: 41/HCGS/039981 /NW
  Internet address is 10.1.1.1/24
  MTU 1500 bytes, BW 1544 Kbit/sec, DLY 20000 usec,
     reliability 255/255, txload 56/255, rxload 57/255
  Encapsulation HDLC, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:06, output 00:00:06, output hang never
  Last clearing of "show interface" counters 00:02:45
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/0 (size/max total/threshold/drops)
     Conversations  0/1/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1158 kilobits/sec
  5 minute input rate 349000 bits/sec, 38 packets/sec
  5 minute output rate 341000 bits/sec, 41 packets/sec
     10021 packets input, 14442074 bytes, 0 no buffer
     Received 21 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     10021 packets output, 14442074 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions
     DCD=up  DSR=up  DTR=up  RTS=up  CTS=up
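
If you capture the 'show int' output to a text file after each test, checking the counters by eye gets tedious. Here is a minimal Python sketch, assuming you paste the captured text into it yourself; the function names and the list of counters to watch are my own choices for illustration, not anything the router provides.

```python
import re

# Counter names as they appear in the serial "show interface" output above.
COUNTERS = [
    "input errors", "CRC", "frame", "overrun", "ignored", "abort",
    "output errors", "collisions", "interface resets", "carrier transitions",
]


def parse_counters(show_int_text: str) -> dict:
    """Extract '<number> <counter name>' pairs for the counters listed above."""
    found = {}
    for name in COUNTERS:
        match = re.search(r"(\d+) " + re.escape(name) + r"\b", show_int_text)
        if match:
            found[name] = int(match.group(1))
    return found


def nonzero(counters: dict) -> dict:
    """Keep only the counters that have incremented."""
    return {name: value for name, value in counters.items() if value != 0}


# A few lines copied from the output above.
SAMPLE = """\
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     10021 packets output, 14442074 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 carrier transitions
"""

if __name__ == "__main__":
    bad = nonzero(parse_counters(SAMPLE))
    print("clean" if not bad else f"errors: {bad}")
```

Any counter that comes back nonzero after a looped ping run is worth writing down along with the data pattern that produced it.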

Great!!! No errors. Next test:

2821# ping
Protocol [ip]:
Target IP address: 10.1.1.1
Repeat count [5]: 5000
Datagram size [100]: 1440
Timeout in seconds [2]:
Extended commands [n]: yes
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]: 0xaaaa
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5000, 1440-byte ICMP Echos to 10.1.1.1, timeout is 2 seconds:
Packet has data pattern 0xAAAA
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
.....
.....
Success rate is 100 percent (5000/5000), round-trip min/avg/max = 16/16/60 ms

Now do a show int again and look for input or output errors:

2821# s int se0/3/0
Serial0/3/0 is up, line protocol is up (looped)
  Hardware is GT96K with integrated T1 CSU/DSU
  Description: 41/HCGS/039981 /NW
  Internet address is 10.1.1.1/24
  .....
  .....
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     20032 packets output, 28882652 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions
     DCD=up  DSR=up  DTR=up  RTS=up  CTS=up

No errors. Last test:

2821# ping
Protocol [ip]:
Target IP address: 10.1.1.1
Repeat count [5]: 5000
Datagram size [100]: 1440
Timeout in seconds [2]:
Extended commands [n]: yes
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]: 0xffff
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5000, 1440-byte ICMP Echos to 10.1.1.1, timeout is 2 seconds:
Packet has data pattern 0xFFFF
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
.....
.....
Success rate is 100 percent (456/456), round-trip min/avg/max = 16/16/20 ms

2821# s int se0/3/0
Serial0/3/0 is up, line protocol is up (looped)
  Hardware is GT96K with integrated T1 CSU/DSU
  Description: 41/HCGS/081281 /NW
  Internet address is 10.1.1.1/24
  .....
  .....
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     30959 packets output, 44640882 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions
     DCD=up  DSR=up  DTR=up  RTS=up  CTS=up

Again, no errors!

Now replace the original interface config:

2821(config)# interface Serial0/3/0
2821(config-if)# no ip address
2821(config-if)# no encap hdlc
2821(config-if)# encapsulation ppp
2821(config-if)# service-module t1 timeslots 1-24
2821(config-if)# service-module t1 remote-alarm-enable
2821(config-if)# no cdp enable
2821(config-if)# ppp chap hostname <your ppp id>
2821(config-if)# ppp multilink
2821(config-if)# ppp multilink group 1
2821(config-if)# end

Ensure it is active in the bundle:

2821# show ppp multilink

Multilink1
  Bundle name: host1034
  Remote Endpoint Discriminator: [1] host1034
  Local Endpoint Discriminator: [1] p1728536-1588737
  Bundle up for 1w0d, total bandwidth 3088, load 42/255
  Receive buffer limit 24000 bytes, frag timeout 1000 ms
    0/0 fragments/bytes in reassembly list
    0 lost fragments, 81 reordered
    0/0 discarded fragments/bytes, 0 lost received
    0x66B3E0 received sequence, 0x7768E8 sent sequence
  Member links: 2 active, 0 inactive (max not set, min 1)
    Se0/2/0, since 01:35:35
    Se0/3/0, since 00:00:16
No inactive multilink interfaces
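
The line to look at is "Member links: 2 active, 0 inactive". If you're repeating this on several routers, a small Python sketch like the one below can pull that line and the member interfaces out of captured 'show ppp multilink' text. It assumes the output format shown above; the function name is hypothetical.

```python
import re


def parse_member_links(text: str) -> dict:
    """Return active/inactive counts and member interface names."""
    match = re.search(r"Member links: (\d+) active, (\d+) inactive", text)
    if match is None:
        raise ValueError("no 'Member links' line found")
    # Member lines are indented and look like "Se0/2/0, since 01:35:35".
    members = re.findall(r"^[ \t]+(Se\S+), since ", text, flags=re.MULTILINE)
    return {
        "active": int(match.group(1)),
        "inactive": int(match.group(2)),
        "members": members,
    }


# Lines copied from the output above.
SAMPLE = """\
Multilink1
  Bundle up for 1w0d, total bandwidth 3088, load 42/255
  Member links: 2 active, 0 inactive (max not set, min 1)
    Se0/2/0, since 01:35:35
    Se0/3/0, since 00:00:16
No inactive multilink interfaces
"""

if __name__ == "__main__":
    info = parse_member_links(SAMPLE)
    print(info)
    if info["inactive"] != 0:
        print("warning: at least one member link is not active")
```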

Now repeat the above steps for the other card (se0/2/0). Hopefully you don't have errors on either circuit, in which case you can contact your carrier again with this smoking gun in hand, and they should do a deeper dive into their network equipment. Additionally, a great command for detailed info on circuit errors is 'show service-module'. You may not be able to decipher all of its output, but sometimes the carrier technician can.

If you do have errors after running the tests, run the tests using the SuperLooper male end directly into the T1 card.
If there are errors, you'll have to look into replacing the card (and no they are NOT hot-swappable). If you get no errors from the card directly, this indicates a problem with your extended wiring.
Check your patch cables and/or call your cable installers to troubleshoot for you.
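
The decision process above is small enough to write down as code. This Python sketch only restates the steps in this article; the function name and the wording of the results are mine.

```python
from typing import Optional


def diagnose(errors_via_cable: bool, errors_at_card: Optional[bool] = None) -> str:
    """Map loopback test results to the next step described in the article.

    errors_via_cable: errors seen with the loopback plug on the far end of
                      the cable (tests the extended wiring plus the T1 card).
    errors_at_card:   errors seen with the plug directly in the T1 card,
                      or None if that test has not been run yet.
    """
    if not errors_via_cable:
        return "router side clean: take the results to the carrier"
    if errors_at_card is None:
        return "retest with the loopback plug directly in the T1 card"
    if errors_at_card:
        return "suspect the T1 card"
    return "suspect the extended wiring or patch cables"


if __name__ == "__main__":
    print(diagnose(errors_via_cable=True, errors_at_card=False))
```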

Troubleshooting:
  1. If the ping tests are timing out, have TC verify that the CD LED on the T1 card is green and no other LEDs are lit. If the loopback plug is in either the cable or the card, the CD LED should be green and the LP and AL LEDs should not be lit (except for the initial 10 seconds after it is first plugged in).

  2. The "show service-module serial 0/x/0" command can be run at any time to display alarms. (It also shows whether the loopback button has been pressed accidentally.)

  3. The "show int se0/x/0" command will show whether the interface is looped or not:
    "Serial0/3/0 is up, line protocol is down (looped)"



Hope you've learned something!

Sunday, May 29, 2011

Upgrading SNMP to Version 3 on Cisco Router


1) Check current snmp settings. You might remove or alter some of this later.
RTR1# s run | i snmp-
snmp-server community YourPrivateCommunity RO
snmp-server location Sometown, USA
snmp-server contact IT Guy
snmp-server enable traps
RTR1# s snmp view
*ilmi system - included nonvolatile active
*ilmi atmForumUni - included nonvolatile active
v1default iso - included permanent active
v1default internet.6.3.15 - excluded permanent active
v1default internet.6.3.16 - excluded permanent active
v1default internet.6.3.18 - excluded permanent active
v1default ciscoMgmt.394 - excluded permanent active
v1default ciscoMgmt.395 - excluded permanent active
v1default ciscoMgmt.399 - excluded permanent active
v1default ciscoMgmt.400 - excluded permanent active
*tv.FFFFFFFF.FFFFFFFF.FFFFFFFF.FFFFFFFF0F ieee802dot11 - included volatile active
*tv.FFFFFFFF.FFFFFFFF.FFFFFFFF.FFFFFFFF0F internet - included volatile active
2) Copy/paste commands below to add version 3 configuration. The exclusions shown here are route tables and arp entries that I didn't want to allow for security reasons.
config t
snmp-server ifindex persist
snmp-server view yourview system included
snmp-server view yourview interfaces included
snmp-server view yourview ip included
snmp-server view yourview ifMIB included
snmp-server view yourview ip.21 excluded
snmp-server view yourview ip.22 excluded
snmp-server view yourview ip.35 excluded
snmp-server view yourview ciscoRttMonMIB included
snmp-server enable traps snmp authentication linkdown linkup coldstart warmstart
snmp-server group yourgroup v3 priv read yourview
snmp-server user youruser yourgroup v3 auth md5 <password1> priv des <password2>
snmp-server host 10.13.4.267 version 3 priv youruser
If any of the above commands give errors on the “priv” section, then the device’s IOS version only supports “auth” and you’ll have to use the commands below:
snmp-server group yourgroup v3 auth read yourview
snmp-server user youruser yourgroup v3 auth md5 <password1>
snmp-server host 10.13.4.267 version 3 auth youruser
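
To see how the included and excluded lines in the view interact, here is a simplified Python sketch of the matching rule: the longest matching subtree decides, and anything that matches no subtree is not visible. It ignores view masks, leaves out ciscoRttMonMIB for brevity, and uses the standard MIB-2 numeric OIDs for system, interfaces, ip and ifMIB. Treat it as an illustration of the idea, not as how IOS implements it.

```python
# Numeric OIDs for the MIB names used in "yourview" (standard MIB-2 values).
VIEW = [
    ("1.3.6.1.2.1.1", True),      # system included
    ("1.3.6.1.2.1.2", True),      # interfaces included
    ("1.3.6.1.2.1.4", True),      # ip included
    ("1.3.6.1.2.1.31", True),     # ifMIB included
    ("1.3.6.1.2.1.4.21", False),  # ip.21 excluded
    ("1.3.6.1.2.1.4.22", False),  # ip.22 excluded
    ("1.3.6.1.2.1.4.35", False),  # ip.35 excluded
]


def to_tuple(oid: str) -> tuple:
    """Turn '.1.3.6' or '1.3.6' into (1, 3, 6)."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def is_visible(oid: str, view=VIEW) -> bool:
    """Longest matching subtree decides; no match means not visible."""
    target = to_tuple(oid)
    best_len, visible = -1, False
    for subtree, included in view:
        prefix = to_tuple(subtree)
        if target[:len(prefix)] == prefix and len(prefix) > best_len:
            best_len, visible = len(prefix), included
    return visible


if __name__ == "__main__":
    for oid in ("1.3.6.1.2.1.1.3.0",               # under system
                "1.3.6.1.2.1.4.21.1.1",            # under ip.21
                "1.3.6.1.4.1.9.9.48.1.1.1.6.1"):   # Cisco memory pool
        print(oid, "visible" if is_visible(oid) else "hidden")
```

Note that the Cisco memory pool OID is hidden simply because nothing in the view includes it, which is the kind of thing that makes a graph go quiet after the upgrade.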
3) At this point, before you remove version 1 or 2, run snmpwalk as follows to verify the v3 credentials. I use the free SnmpWalk tool on my Windows machine (http://www.snmpsoft.com/freetools/snmpwalk.html):
C:\snmpwalk> SnmpWalk.exe -r:10.13.4.267 -v:3 -sn:youruser -ap:MD5 -aw:password1 -pp:DES -pw:password2 -os:.1.3.6.1.2.1.2.2.1.1
For images that support auth only and not priv:
C:\snmpwalk> SnmpWalk.exe -r:10.13.4.267 -v:3 -sn:youruser -ap:MD5 -aw:password1 -os:.1.3.6.1.2.1.2.2.1.1
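
If you need to check a list of devices, it helps to build the two command variants above from one place. This Python sketch only assembles the argument list using the same SnmpWalk.exe flags shown above; the function name is hypothetical and the host in the example is a placeholder documentation address.

```python
from typing import List, Optional


def build_snmpwalk_args(host: str, user: str, auth_password: str, oid: str,
                        priv_password: Optional[str] = None) -> List[str]:
    """Build the SnmpWalk.exe argument list with the flags used above."""
    args = ["SnmpWalk.exe", f"-r:{host}", "-v:3", f"-sn:{user}",
            "-ap:MD5", f"-aw:{auth_password}"]
    if priv_password is not None:
        # Auth plus priv: add the DES privacy protocol and its password.
        args += ["-pp:DES", f"-pw:{priv_password}"]
    args.append(f"-os:{oid}")
    return args


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address; substitute your router.
    print(" ".join(build_snmpwalk_args(
        "192.0.2.10", "youruser", "password1",
        ".1.3.6.1.2.1.2.2.1.1", priv_password="password2")))
```

The returned list can be handed to subprocess.run from the folder that holds SnmpWalk.exe. Leaving priv_password out produces the auth-only form.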
4) Copy/paste commands below to remove unneeded v1 or v2c config:
no snmp-server community YourPrivateCommunity RO
no snmp-server community public RO
no snmp-server enable traps tty
etc
etc
….
5) Copy/paste commands below to disable any default or hidden Cisco views. You may have more than these:
snmp-server view *ilmi system excluded
snmp-server view *ilmi atmForumUni excluded
snmp-server view v1default iso excluded
end
6) Verify end result and wr:
RTR1# s run | i snmp-
snmp-server group yourgroup v3 priv read yourview notify *tv.FFFFFFFF.FFFFFFFF.FFFFFFFF.FFFFFFFF0F
snmp-server view *ilmi system excluded
snmp-server view *ilmi atmForumUni excluded
snmp-server view yourview system included
snmp-server view yourview interfaces included
snmp-server view yourview ip included
snmp-server view yourview ifMIB included
snmp-server view yourview ip.21 excluded
snmp-server view yourview ip.22 excluded
snmp-server view yourview ip.35 excluded
snmp-server view yourview ciscoRttMonMIB included
snmp-server view yourview lsystem.58.0 included
snmp-server view v1default iso excluded
snmp-server ifindex persist
snmp-server location Yourtown, USA
snmp-server contact IT Guy
snmp-server enable traps snmp authentication linkdown linkup coldstart warmstart
snmp-server host 10.13.4.267 version 3 priv youruser
RTR1# write mem
Notes:
Upgrading to v3 and restricting the view to specific OIDs broke my Cacti graphing. Since many people use Cacti, I figured I'd share the fix as well.
If you notice that one of your Cacti graphs has stopped returning data, like my Free Memory graph did, you'll need to find out exactly which OID or family the graph uses.
Since you’ve restricted the view to only include certain OIDs, you need to drill down and find the OID being polled by Cacti.
Console> Devices> RTR1> GraphList> RTR1 – Free Memory

On this page you'll see "RTR1 - Proc Mem Free"

Now that you know the Cacti OID name (Proc Mem Free) find the OID number:
Console> Data Templates> Cisco Router – Proc Mem Free

(Don’t forget to walk this OID with snmpwalk to verify you have the right OID.)
So, you add this OID to the config:

RTR1(config)# snmp-server view yourview 1.3.6.1.4.1.9.9.48.1.1.1.6.1 included
RTR1(config)# do show snmp view
Look what it added:
yourview ciscoMemoryPoolEntry.6.1 - included nonvolatile active
Now you know the MIB family that the device is using! This is very helpful because MIBs get deprecated or replaced with different names.
Now you might need to add the entire family, but check what other OIDs are in that family first. Go to Cisco’s SNMP Object Navigator website http://tools.cisco.com/Support/SNMP/do/BrowseOID.do?local=en and enter ciscoMemoryPoolEntry. You should get this OID back: 1.3.6.1.4.1.9.9.48.1.1.1
So, walk that to determine if you can add the entire family:

C:\snmpwalk> SnmpWalk.exe -r:10.13.4.267 -v:3 -sn:youruser -ap:MD5 -aw:password1 -pp:DES -pw:password2 -os:.1.3.6.1.4.1.9.9.48.1.1.1
OID=.1.3.6.1.4.1.9.9.48.1.1.1.2.1, Type=OctetString, Value=Processor
OID=.1.3.6.1.4.1.9.9.48.1.1.1.2.2, Type=OctetString, Value=I/O
OID=.1.3.6.1.4.1.9.9.48.1.1.1.3.1, Type=Integer, Value=0
OID=.1.3.6.1.4.1.9.9.48.1.1.1.3.2, Type=Integer, Value=0
OID=.1.3.6.1.4.1.9.9.48.1.1.1.4.1, Type=Integer, Value=1
OID=.1.3.6.1.4.1.9.9.48.1.1.1.4.2, Type=Integer, Value=1
OID=.1.3.6.1.4.1.9.9.48.1.1.1.5.1, Type=Gauge32, Value=27303960
OID=.1.3.6.1.4.1.9.9.48.1.1.1.5.2, Type=Gauge32, Value=5792672
OID=.1.3.6.1.4.1.9.9.48.1.1.1.6.1, Type=Gauge32, Value=103653048
OID=.1.3.6.1.4.1.9.9.48.1.1.1.6.2, Type=Gauge32, Value=19373152
OID=.1.3.6.1.4.1.9.9.48.1.1.1.7.1, Type=Gauge32, Value=101099660
OID=.1.3.6.1.4.1.9.9.48.1.1.1.7.2, Type=Gauge32, Value=19239868
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.2.1, Type=Integer, Value=0
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.3.1, Type=Gauge32, Value=2
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.4.1, Type=Gauge32, Value=3
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.5.1, Type=Gauge32, Value=3
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.6.1, Type=Gauge32, Value=2
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.7.1, Type=Gauge32, Value=3
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.8.1, Type=Gauge32, Value=3
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.9.1, Type=Gauge32, Value=5
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.10.1, Type=Gauge32, Value=2
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.11.1, Type=Gauge32, Value=1
Total: 22
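
Notice that the walk above kept going past the family: 12 of the 22 rows sit under 1.3.6.1.4.1.9.9.48.1.1.1 and the other 10 are under 1.3.6.1.4.1.9.9.109. Here is a minimal Python sketch for separating the two when the output is long, assuming the "OID=..., Type=..., Value=..." line format shown above; the function names are my own.

```python
def parse_walk(text: str) -> list:
    """Return the OIDs from lines shaped like 'OID=<oid>, Type=..., Value=...'."""
    oids = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("OID="):
            oids.append(line[len("OID="):].split(",", 1)[0])
    return oids


def under(oid: str, root: str) -> bool:
    """True if oid is root itself or lies below it in the OID tree."""
    oid_parts = oid.strip(".").split(".")
    root_parts = root.strip(".").split(".")
    return oid_parts[:len(root_parts)] == root_parts


def split_by_root(oids: list, root: str) -> tuple:
    """Split OIDs into those inside the subtree and those outside it."""
    inside = [o for o in oids if under(o, root)]
    outside = [o for o in oids if not under(o, root)]
    return inside, outside


# Three rows copied from the walk above.
SAMPLE = """\
OID=.1.3.6.1.4.1.9.9.48.1.1.1.5.1, Type=Gauge32, Value=27303960
OID=.1.3.6.1.4.1.9.9.48.1.1.1.6.1, Type=Gauge32, Value=103653048
OID=.1.3.6.1.4.1.9.9.109.1.1.1.1.2.1, Type=Integer, Value=0
"""

if __name__ == "__main__":
    inside, outside = split_by_root(parse_walk(SAMPLE), ".1.3.6.1.4.1.9.9.48.1.1.1")
    print(len(inside), "rows in the family,", len(outside), "outside it")
```

Only the rows inside the family matter when deciding whether the whole family is safe to include in the view.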
Now remove the previous entry and add this entire family:
RTR1(config)# no snmp-server view yourview 1.3.6.1.4.1.9.9.48.1.1.1.6.1 included
RTR1(config)# snmp-server view yourview ciscoMemoryPoolEntry included
Verify it’s there and save config:
RTR1(config)# do show snmp view
RTR1(config)# end
RTR1# write mem