More often than not, IT teams end up pointing fingers: the development or database team blames the networking guys for issues that may or may not be caused by the network, and the networking engineers insist that the network isn’t at fault. Whatever the case may be, we as networking engineers should equip ourselves with the skills to evaluate our networks without bias and draw the right conclusions.
This post explains day-to-day networking issues and how to resolve them. Specifically, I talk about:
- The “The Network is Slow !!” comment: how to know whether the network really is slow, or how to prove it isn’t, plus what causes a slow network and how to fix it.
- Key interface counters
- Finding devices — MAC address table
The Network is Slow — Speed and Duplex
Trust me, “the network is slow” is one of the worst and most beautiful sentences a network engineer hears in a day. It’s the worst because it hampers the productivity of teams if the network actually is slow. It’s also beautiful because it always gives valuable experience to the network engineer involved in troubleshooting.
When the network is slow, a speed or duplex mismatch is very often the cause. Speed tells how fast the interface can send and receive frames, whereas duplex defines whether it can send and receive at the same time (full duplex) or only do one at a time (half duplex).
Consider the following diagram in which a computer is connected to one of the ports of the switch. Note, however, that in place of the computer there can be a router, a server or even another switch.
If one end of the link is 100 Mbps full and the other end is 100 Mbps half, the half-duplex end can either send or receive at any given time, so it treats anything that arrives while it is transmitting as a collision. Frames get dropped, which leads to a slower response. If the traffic is TCP, the lost packets are retransmitted and the nodes adjust their window sizes dynamically, which slows things down further.
Auto sensing (autonegotiation) on an interface detects the speed and duplex settings of the other end of the link and sets its own configuration accordingly. There’s one catch in the auto sensing standard. If one end of the link is hard coded to 100 Mbps full, auto sensing on the other end can detect the speed but not the duplex, so it defaults to 100 Mbps half, not 100 Mbps full!
This auto sensing problem was addressed in Gigabit Ethernet (1000 Mbps), where autonegotiation is part of the standard.
Hence, if either end of the link is 100 Mbps, we should manually hard code the same speed and duplex settings on both ends of the link. If both ends support Gigabit Ethernet, we should set them to auto.
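The auto sensing catch described above can be sketched as a small model. This is a simplified illustration under stated assumptions, not the real negotiation protocol: the function names and the representation of a port as either the string "auto" or a hard-coded (speed, duplex) tuple are made up for this example, and two hard-coded ends with different speeds (where the link would not come up at all) are not modelled.

```python
def link_outcome(end_a, end_b, max_speed=100):
    """Return the (speed, duplex) each end of a 10/100 link settles on.

    Each end is either "auto" or a hard-coded (speed_mbps, duplex) tuple.
    """
    if end_a == "auto" and end_b == "auto":
        # Both ends negotiate and agree on the best common mode.
        best = (max_speed, "full")
        return best, best
    if end_a != "auto" and end_b != "auto":
        # Both ends are hard coded: each simply uses its own setting.
        return end_a, end_b
    # One end is hard coded, the other is on auto: the auto end can
    # sense the speed but falls back to half duplex.
    fixed = end_a if end_a != "auto" else end_b
    sensed = (fixed[0], "half")
    return (sensed, fixed) if end_a == "auto" else (fixed, sensed)


def has_duplex_mismatch(end_a, end_b):
    """True when the two ends end up with different duplex settings."""
    a, b = link_outcome(end_a, end_b)
    return a[1] != b[1]


print(has_duplex_mismatch("auto", (100, "full")))          # mismatch
print(has_duplex_mismatch((100, "full"), (100, "full")))   # no mismatch
```

The model makes the rule of thumb visible: the only mismatching combination is "one end hard coded to full, the other left on auto".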
In a Cisco router or switch, the speed and duplex settings can be manually hard coded on the interface like so:
Switch(config)#interface fastEthernet 0
Switch(config-if)#speed ?
  10    Force 10 Mbps operation
  100   Force 100 Mbps operation
  auto  Enable AUTO speed configuration
Switch(config-if)#speed 100
*Mar 1 04:23:02.427: %LINEPROTO-5-UPDOWN: Line protocol on Interface Fastethernet0, changed state to down
*Mar 1 04:23:04.440: %LINEPROTO-5-UPDOWN: Line protocol on Interface Fastethernet0, changed state to up
Switch(config-if)#duplex ?
  auto  Enable AUTO duplex configuration
  full  Force full duplex operation
  half  Force half-duplex operation
Switch(config-if)#duplex full
*Mar 1 04:23:09.322: %LINEPROTO-5-UPDOWN: Line protocol on Interface Fastethernet0, changed state to down
*Mar 1 04:23:15.026: %LINEPROTO-5-UPDOWN: Line protocol on Interface Fastethernet0, changed state to up
The Gigabit Ethernet interfaces of a router or switch are in auto mode by default.
Note that changing the speed and duplex settings will cause the interface to go down and then up. There will be a network outage of 5 to 10 seconds. Hence it’s not recommended to change the interface speed and duplex settings during business hours.
It’s sensible to hard code the speed and duplex settings only on the critical devices in the environment, such as routers, servers, surveillance cameras etc. Hard coding on hundreds of computers is not a good idea. It’s wiser to hard code only on the specific computer that’s facing the speed issue than to hard code on every computer.
In Windows, you have to open the network adapter settings –> Ethernet Properties –> Configure –> Link Speed.
In Linux, you have to run the following commands as root to change the NIC’s speed and duplex.
# ethtool -s eth0 speed 100 duplex full
# ethtool -s eth0 speed 10 duplex half
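After changing the settings, it helps to confirm what the NIC is actually running at. Running ethtool eth0 without -s prints the current settings, including Speed, Duplex and Auto-negotiation lines. The following is a minimal sketch of pulling those fields out with Python; the SAMPLE text is an illustrative excerpt rather than a capture from a real machine, and parse_ethtool is a hypothetical helper name.

```python
# Illustrative excerpt of `ethtool eth0` output (not a real capture).
SAMPLE = """\
Settings for eth0:
    Speed: 100Mb/s
    Duplex: Full
    Auto-negotiation: off
    Link detected: yes
"""


def parse_ethtool(text):
    """Collect 'Key: value' lines from ethtool output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip headers such as 'Settings for eth0:' that carry no value.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


info = parse_ethtool(SAMPLE)
print(info["Speed"], info["Duplex"], info["Auto-negotiation"])
```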
Note: To connect like devices, such as switch-to-switch, router-to-router or PC-to-PC/router, we require a cross-over cable. To connect unlike devices, such as switch-to-router or PC-to-switch, we require a straight-through cable.
The Auto-MDIX technology introduced with Gigabit Ethernet can automagically configure the function of its pins whether we use a straight-through or a cross-over cable. With Auto-MDIX, we can use a straight-through cable to connect a PC to a router or a switch to another switch. For Auto-MDIX to work, the interfaces must be in auto mode.
Key Interface Counters
How do we determine that there’s an actual problem in the network? In other words, where do we look when somebody complains about the network being slow?
When somebody says that the network is slow, we should check various counters on the interface.
Before we do so, we should verify the configuration on the interface.
Switch#sh run int gig0/2
Building configuration...

Current configuration : 60 bytes
!
interface GigabitEthernet0/2
 switchport mode access
end
There’s nothing configured on this particular interface apart from the access mode. The show interface command shows all the statistics about the interface, from which we can determine whether the network is healthy or not.
Switch#sh interfaces gig0/1
GigabitEthernet0/1 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet, address is 24e9.b3ca.1881 (bia 24e9.b3ca.1881)
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 2/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 100Mb/s, media type is 10/100/1000BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:03:54, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 2000 bits/sec, 1 packets/sec
  5 minute output rate 943000 bits/sec, 186 packets/sec
     103032 packets input, 13688193 bytes, 0 no buffer
     Received 2436 broadcasts (471 multicasts)
     0 runts, 0 giants, 0 throttles
     27 input errors, 27 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 471 multicast, 0 pause input
     0 input packets with dribble condition detected
     2751770 packets output, 1206078072 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out
Here are the important lines of the show interface output to pay attention to when determining the cause of network slowness.
1. Determining the port status
GigabitEthernet0/2 is up, line protocol is up (connected)
The first line tells us that the interface is up, meaning it’s physically connected. The line protocol is the layer-2 protocol status. Both the interface and the line protocol should be up for the connected node to be able to communicate.
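When many ports have to be checked, the status line can be read by a script instead of by eye. Here is a minimal sketch, assuming the first line of the output has the shape shown above; port_status and the returned dictionary keys are names invented for this example.

```python
import re

# '<name> is <physical state>, line protocol is <protocol state> ...'
STATUS_RE = re.compile(r"^(\S+) is (.+?), line protocol is (\w+)")


def port_status(first_line):
    """Split a 'show interface' status line into its two states."""
    match = STATUS_RE.match(first_line.strip())
    if not match:
        raise ValueError("not a 'show interface' status line")
    name, physical, protocol = match.groups()
    return {
        "name": name,
        "physical": physical,
        "protocol": protocol,
        # The node can communicate only when both are up.
        "usable": physical == "up" and protocol == "up",
    }


print(port_status("GigabitEthernet0/2 is up, line protocol is up (connected)"))
```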
2. The interface configuration
Full-duplex, 100Mb/s, media type is 10/100/1000BaseTX
Next are the speed and duplex settings on the interface. This interface, for example, is running at 100 Mbps, full duplex.
3. Traffic density
5 minute input rate 2000 bits/sec, 1 packets/sec
5 minute output rate 943000 bits/sec, 186 packets/sec
Next, get a feel for the traffic volume that’s coming into the port and leaving from the port.
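To turn these rates into something comparable, divide them by the interface bandwidth (BW 100000 Kbit in the show interface output above). A small worked example follows; utilization_percent is a hypothetical helper name.

```python
def utilization_percent(rate_bps, bandwidth_kbit):
    """Link utilization as a percentage of the interface bandwidth."""
    return 100.0 * rate_bps / (bandwidth_kbit * 1000)


# Values taken from the show interface output above.
output_util = utilization_percent(943000, 100000)
input_util = utilization_percent(2000, 100000)

print(f"output: {output_util:.3f}%  input: {input_util:.3f}%")
```

At under 1% in either direction, this particular port is nowhere near saturated, which matches the low txload 2/255 and rxload 1/255 values in the same output.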
4. Queue status
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
This is particularly important on routers, to see how many packets are in the queue and whether any are being dropped because the queue is full.
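The queue lines follow the size/max/drops/flushes layout shown in parentheses, so they are easy to pick apart. A minimal sketch, assuming the text contains the three lines shown above; parse_queues and its dictionary keys are made-up names.

```python
import re


def parse_queues(text):
    """Extract input/output queue counters from 'show interface' text."""
    in_size, in_max, in_drops, in_flushes = map(
        int, re.search(r"Input queue: (\d+)/(\d+)/(\d+)/(\d+)", text).groups()
    )
    out_size, out_max = map(
        int, re.search(r"Output queue: (\d+)/(\d+)", text).groups()
    )
    out_drops = int(re.search(r"Total output drops: (\d+)", text).group(1))
    return {
        "input": {"size": in_size, "max": in_max,
                  "drops": in_drops, "flushes": in_flushes},
        "output": {"size": out_size, "max": out_max, "drops": out_drops},
        # Any non-zero drop counter deserves a closer look.
        "has_drops": in_drops > 0 or out_drops > 0,
    }


QUEUE_LINES = """\
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
"""

print(parse_queues(QUEUE_LINES))
```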
5. Collisions and Late collisions
<< In a switched network, there must be no collisions. Period. >>
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
I tweaked the interface to be half-duplex and did some pinging. Now the collisions are happening.
2884286 packets output, 1287593847 bytes, 0 underruns
0 output errors, 29 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
Out of 2,884,286 packets there are 29 collisions. Run the show interface command a couple of times; if the number of collisions doesn’t increase, then it’s ok. Otherwise there’s a problem.
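The "run it a couple of times and compare" check can be expressed as a comparison of two snapshots. The sketch below assumes each snapshot contains the "N collisions" counter line; the function names are hypothetical, and the second snapshot in the usage lines (41 collisions) is an invented value for illustration only.

```python
import re


def collision_count(show_output):
    """Read the 'N collisions' counter (not the late collision one)."""
    match = re.search(r"(\d+) collisions", show_output)
    if not match:
        raise ValueError("no collisions counter found")
    return int(match.group(1))


def collisions_increasing(before, after):
    """True if the counter grew between two 'show interface' runs."""
    return collision_count(after) > collision_count(before)


first = "0 output errors, 29 collisions, 0 interface resets"
second = "0 output errors, 41 collisions, 0 interface resets"  # invented

print(collisions_increasing(first, first))   # counter is stable
print(collisions_increasing(first, second))  # counter is growing
```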
Collisions are normal collisions and late collisions are late.🙂
Back in the days of hubs, when the Ethernet standard was developed, it was designed so that collisions were always detected within the first 64 bytes of a frame. For example, if a PC sends a frame of 1500 bytes and a collision occurs, it is detected before the 64th byte of the frame hits the wire. These are normal collisions. Normal collisions happen if we connect a hub to the switch.
A late collision is one that gets detected after the first 64 bytes of the frame. For example, if a collision is detected after 723 bytes, that’s a late collision. When you see late collisions, it’s usually indicative of a duplex mismatch, although a cable run that is longer than the standard allows can cause them too.
6. CRC errors
27 input errors, 27 CRC, 0 frame, 0 overrun, 0 ignored
CRC stands for Cyclic Redundancy Check, a checksum calculated over the whole frame. It’s used to determine whether the frame changed during transit. If the CRC counter is high and keeps increasing, it’s most often a cabling issue, and we should replace the cable with a new one.
Now we have a nice counter-argument if anybody points a finger and says the network is slow. We can verify all these counters on each interface and tell them whether the network is ok or problematic. We should make sure that we verify the links from end to end, meaning point to point from source to destination.
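Whether a CRC counter is "high" is easier to judge relative to the number of packets received. A small sketch using the figures from the show interface output above (27 CRC errors against 103032 input packets); crc_error_percent is a hypothetical helper, and what counts as an acceptable percentage is left to the reader.

```python
import re


def crc_error_percent(show_output):
    """CRC errors as a percentage of packets received on the port."""
    packets = int(re.search(r"(\d+) packets input", show_output).group(1))
    crc = int(re.search(r"(\d+) CRC", show_output).group(1))
    return 100.0 * crc / packets if packets else 0.0


COUNTERS = """\
103032 packets input, 13688193 bytes, 0 no buffer
27 input errors, 27 CRC, 0 frame, 0 overrun, 0 ignored
"""

print(f"{crc_error_percent(COUNTERS):.4f}% of input packets failed the CRC")
```

As with collisions, the trend matters more than a single reading, so it is worth sampling the counter more than once.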
Finding devices — MAC address table
The last part is to locate the device using a series of ARP and show commands. This is helpful in identifying the device in our network diagram and ensuring it’s in the right place.
The MAC address of the local machine can be seen in the adapter details (ipconfig /all), and if another machine is within the local network, we can see its MAC address in the ARP cache.
c:\> ipconfig /all
...
Physical Address. . . . . . . . . : B8-CA-3A-8A-23-D7

c:\> arp -a
  Internet Address      Physical Address      Type
  169.254.111.229       24-be-05-1f-95-76     dynamic
  169.254.218.200       2c-44-fd-1d-89-88     dynamic
  192.168.1.1           24-be-05-18-b9-c0     dynamic
  192.168.1.8           c8-1f-66-33-c8-31     dynamic
  192.168.1.10          d4-c9-ef-f4-25-dd     dynamic
  192.168.1.17          00-1e-8c-f4-26-b5     dynamic
  192.168.1.20          24-be-05-1a-2e-6f     dynamic
  192.168.1.23          24-be-05-23-0f-26     dynamic
  192.168.1.27          78-2b-cb-86-32-e2     dynamic
  192.168.1.28          24-be-05-1f-95-76     dynamic
  192.168.1.40          c4-34-6b-51-ef-07     dynamic
...
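If the ARP cache is long, the MAC address for a given IP can be pulled out with a few lines of Python. This is a minimal sketch that assumes the three-column layout shown above; arp_lookup is a made-up helper name, and ARP_OUTPUT reuses a few rows from the capture.

```python
def arp_lookup(arp_output, ip):
    """Return the MAC address listed for `ip` in `arp -a` output."""
    for line in arp_output.splitlines():
        parts = line.split()
        # Data rows look like: <ip> <mac> <type>
        if len(parts) >= 2 and parts[0] == ip:
            return parts[1]
    return None  # not in the cache


ARP_OUTPUT = """\
Internet Address Physical Address Type
192.168.1.1 24-be-05-18-b9-c0 dynamic
192.168.1.8 c8-1f-66-33-c8-31 dynamic
192.168.1.10 d4-c9-ef-f4-25-dd dynamic
"""

print(arp_lookup(ARP_OUTPUT, "192.168.1.8"))
```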
The switch learns the MAC addresses seen on its interfaces and caches them in its CAM (Content Addressable Memory). To see the CAM table, we should run show mac address-table.
Switch#sh mac address-table
          Mac Address Table
-------------------------------------------
Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
 All    0100.0ccc.cccc    STATIC      CPU
 All    0100.0ccc.cccd    STATIC      CPU
 All    0180.c200.0000    STATIC      CPU
 All    0180.c200.0001    STATIC      CPU
 All    ffff.ffff.ffff    STATIC      CPU
   1    0001.e66c.aaf0    DYNAMIC     Gi0/23
   1    0004.a3fb.9029    DYNAMIC     Gi0/23
   1    0004.a3fb.906d    DYNAMIC     Gi0/23
   1    0009.0f09.0004    DYNAMIC     Gi0/23
   1    000c.2904.149b    DYNAMIC     Gi0/23
   1    000c.291c.e633    DYNAMIC     Gi0/23
   1    000c.2944.a15d    DYNAMIC     Gi0/23
   1    000c.2963.6eba    DYNAMIC     Gi0/23
   1    000c.2983.e879    DYNAMIC     Gi0/23
   1    000c.299a.f549    DYNAMIC     Gi0/23
   1    000c.29a0.a869    DYNAMIC     Gi0/23
...
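Conceptually, the table above is a mapping from (VLAN, MAC address) to a port, filled in from the source address of frames arriving on each port. The toy model below illustrates only that idea; the MacTable class is invented for this example, and a real switch also ages entries out and keeps the table in hardware.

```python
class MacTable:
    """Toy model of a switch's MAC address table."""

    def __init__(self):
        self.entries = {}  # (vlan, mac) -> port

    def learn(self, vlan, src_mac, port):
        """Record the port on which a source MAC was last seen."""
        self.entries[(vlan, src_mac.lower())] = port

    def lookup(self, vlan, dst_mac):
        """Port for a destination MAC, or None if it is unknown."""
        return self.entries.get((vlan, dst_mac.lower()))


table = MacTable()
table.learn(1, "0001.e66c.aaf0", "Gi0/23")
table.learn(1, "b8ca.3a8a.23d7", "Gi0/1")

print(table.lookup(1, "B8CA.3A8A.23D7"))  # known address
print(table.lookup(1, "0000.0000.0001"))  # unknown address
```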
The CAM table is usually very large. We can use filters in IOS to narrow it down.
Switch#show mac address-table | include 23d7
   1    b8ca.3a8a.23d7    STATIC      Gi0/1
From this I know that my PC is connected to Gi0/1 interface of the switch. If there are many devices in the path, we can discover them using the Cisco Discovery Protocol.
show cdp neighbors
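One practical wrinkle in the lookup above is that Windows prints the address as B8-CA-3A-8A-23-D7 while the switch prints it as b8ca.3a8a.23d7. The sketch below converts between the two and searches MAC table text for the port; both function names are hypothetical, and the four-column row layout is assumed from the output shown earlier.

```python
import re


def to_cisco_mac(mac):
    """Convert any common MAC notation to Cisco's xxxx.xxxx.xxxx form."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


def find_port(mac_table_output, mac):
    """Return the port listed for `mac` in MAC table text, if any."""
    target = to_cisco_mac(mac)
    for line in mac_table_output.splitlines():
        parts = line.split()
        # Rows look like: <vlan> <mac> <type> <port>
        if len(parts) == 4 and parts[1] == target:
            return parts[3]
    return None


TABLE = """\
   1    0001.e66c.aaf0    DYNAMIC     Gi0/23
   1    b8ca.3a8a.23d7    STATIC      Gi0/1
"""

print(to_cisco_mac("B8-CA-3A-8A-23-D7"))
print(find_port(TABLE, "B8-CA-3A-8A-23-D7"))
```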