Tuesday, October 14, 2008

T1 Circuit Problem...

We've been having problems with a T1 line for a couple of months. Verizon tested the circuit clean from smart jack to smart jack. They could run a loop between their smart jacks. They said they could loop our CSU at one end only, so we thought the problem was at the other end. We switched out the router, the WIC card, the cable, and even re-terminated the extended RJ48X jack. But the circuit would still start bouncing up and down after some hours or a couple of days.
Verizon's explanation was that when they are in the circuit testing, they break up the signal, so once they are out, the routers can communicate again. In other words, they claimed it was our equipment that couldn't resync itself!
It is worth noting that our circuit had been working for years, and there's another T1 circuit using the same VWIC card that has no problem whatsoever. If there were a problem with the VWIC/WIC cards, service would be down completely and not just intermittently.
Verizon then sent a technician out to test their smart jack and also the extended demarc. When you call in to test the circuit, the provider only tests to the front end of the smart jack. Although they also own the back end of it, they usually don't test to it, since that requires a technician to be onsite. When they do send one, they will usually charge you, at the discretion of the technician, if no problem is found.
Again, everything tested good. This time the circuit stayed up much longer than before: 2 days and 18 hours. But then it started bouncing again.
At this point we were almost out of options. We were clueless as to what was happening, since everything that could be wrong had been replaced and/or fixed.
But then persistence in calling Verizon paid off. We talked to a very knowledgeable tester. He pointed out that the rate of the line is at about 17 dB, which is high; it is supposed to be ~0. This happens because both sides of the circuit try to become the source of the clocking.
We had one side set to be the source and the other side take its clock from the line, since this is a point-to-point circuit. The configuration for the clock source device is as follows:

no network-clock-participate wic 0
controller T1 0/1/1
framing esf
clock source internal
linecode b8zs
channel-group 1 timeslots 1-24
description T1

So Verizon sent out a technician to replace the smart jack at the clock source end. It didn't help, since the rate is still up around 17 dB, but it did make the circuit stay up for a while. If the circuit goes down again, we will have to have them replace the smart jack at the other end. This comes with a catch: if they don't find any problem with their smart jack, we will be charged at the discretion of the technician. We are still waiting for the circuit to drop, which it is bound to do since the rate is around 17 dB, meaning that our equipment's clocking is out of sync with the Verizon central office DEC cross connect. Once it drops, we will get authorization for a tech dispatch.

Here is a little something about troubleshooting a T1 line, pasted here just for easy access and in case the link goes down in the future.

DEBUGGING T1

Before troubleshooting any aspect of a connectivity issue (e.g., ISDN, CAS, modem), you should always verify the physical integrity of the T1 line. Check the status of the T1 controllers and verify that you are not receiving any errors. The show controller T1 x command gives a snapshot of the T1 physical layer status. There should not be any framing errors, slips, or line code violations.

The following is sample output of show controller T1 0 and what to look for:

AS5300#show controllers t1 0
T1 0 is up.
Applique type is Channelized T1
Cablelength is long gain36 0db
No alarms detected.
Version info of slot 0: HW: 4, Firmware: 16, PLD Rev: 0

Manufacture Cookie Info:
EEPROM Type 0x0001, EEPROM Version 0x01, Board ID 0x42,
Board Hardware Version 1.32, Item Number 73-2217-5,
Board Revision B16, Serial Number 09356930,
PLD/ISP Version 0.0, Manufacture Date 18-Jun-1998.

Framing is ESF, Line Code is B8ZS, Clock Source is Line Primary.
Data in current interval (8 seconds elapsed):
0 Line Code Violations, 0 Path Code Violations
0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins
0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs

The main items to look at in the above output are:

  • The status of the line
  • Alarms
  • Linecode and Pathcode violations
  • Slip Secs
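
As a rough illustration of checking those items automatically, here is a minimal Python sketch that pulls the counters out of text shaped like the sample output above. The function names (parse_counters, line_is_clean) and the trimmed SAMPLE string are made up for this example, and it assumes the counter lines always look like "<number> <name>, <number> <name>"; output from other platforms or IOS versions may differ.

```python
import re

# Trimmed copy of the sample output above (hypothetical test input).
SAMPLE = """\
T1 0 is up.
No alarms detected.
Framing is ESF, Line Code is B8ZS, Clock Source is Line Primary.
Data in current interval (8 seconds elapsed):
0 Line Code Violations, 0 Path Code Violations
0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins
0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
"""


def parse_counters(text):
    """Return {counter name: value} for lines made only of '<n> <name>' items."""
    counters = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",") if p.strip()]
        matches = [re.fullmatch(r"(\d+)\s+([A-Za-z ]+)", p) for p in parts]
        # Only accept the line if every comma-separated item is a counter.
        if parts and all(matches):
            for m in matches:
                counters[m.group(2)] = int(m.group(1))
    return counters


def line_is_clean(text):
    """True if the line is up, no alarms are shown, and key counters are zero."""
    counters = parse_counters(text)
    watched = ("Line Code Violations", "Path Code Violations", "Slip Secs")
    is_up = re.search(r"^T1 \S+ is up\.", text, re.MULTILINE) is not None
    no_alarms = "No alarms detected." in text
    return is_up and no_alarms and all(counters.get(n) == 0 for n in watched)


if __name__ == "__main__":
    print(parse_counters(SAMPLE))
    print(line_is_clean(SAMPLE))
```

A missing counter is treated as "not clean" on purpose, so a truncated paste does not pass silently.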

The line status will tell us if the T1 is up, down, or administratively down. The Alarms section is very important, and it will tell us what type of problem may be present on the line. The presence of any alarms indicates a serious problem on the line.

It is recommended whenever you encounter a T1 that is in an alarm state that you verify the framing and linecoding parameters are configured correctly. Please refer to the show controller t1 commands in the Command Reference to find out all the possible values for the alarm state.

A common message you will see in the alarm field is "receiver has loss of frame." Some routers will also report a "loss of frame" even when it should be a "loss of signal." So whenever you receive these errors, make sure that the T1 signal is present and the framing is correct.

Another message you might receive is "receiver is getting AIS." This means the receiver is getting an alarm indication signal (blue alarm). This is a framed or unframed all-ones signal in both SF and ESF formats transmitted to maintain transmission continuity. This is typically seen when the far-end CSU has lost its terminal side equipment. The "receiver has remote alarm" indicates the presence of a yellow alarm. This means the downstream CSU is in a loss-of-frame or loss-of-signal state. Therefore, the remote CSU has a red alarm.

The "transmitter is sending remote alarm" indicates that the local CSU has detected either a loss-of-frame or loss-of-signal condition. This indicates that the local controller has a red alarm. This message will be accompanied by a "receiver has loss of frame." Always verify framing and T1 signal when troubleshooting this problem.
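
The alarm messages described above can be kept in a small lookup table for quick reference. This is only a sketch: the ALARM_MEANINGS table and the explain_alarms function are invented for this example, the matching is a plain case-insensitive substring search, and the exact wording printed by a given router is assumed to match the messages quoted in the text.

```python
# Alarm messages quoted in the text, mapped to the explanation given for each.
ALARM_MEANINGS = {
    "receiver has loss of frame":
        "local loss of frame: verify framing and that the T1 signal is present",
    "receiver is getting ais":
        "blue alarm: receiving an all-ones alarm indication signal",
    "receiver has remote alarm":
        "yellow alarm: the remote CSU is in a red alarm (loss of frame or signal)",
    "transmitter is sending remote alarm":
        "local red alarm: the local CSU detected loss of frame or loss of signal",
}


def explain_alarms(controller_output):
    """Return the explanation for every known alarm message found in the text."""
    lowered = controller_output.lower()
    return [meaning for message, meaning in ALARM_MEANINGS.items()
            if message in lowered]


if __name__ == "__main__":
    text = "Transmitter is sending remote alarm.\nReceiver has loss of frame."
    for meaning in explain_alarms(text):
        print(meaning)
```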

If any of the counters in the output above is not zero, here are some of the possible causes of the physical problem. A brief explanation of each field follows.

Line Code Violations

Indicates the occurrence of either a Bipolar Violation (BPV) or Excessive Zeros (EXZ) error event.

A BPV error event for an AMI-coded signal is the occurrence of a pulse of the same polarity as the previous pulse.

A BPV error event for a B8ZS is the occurrence of a pulse of the same polarity as the previous pulse without being a part of the zero substitution code. An EXZ is the violation of the pulse density requirement.
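
To make the AMI case concrete, here is a small Python sketch that counts bipolar violations in a list of pulses. The representation (+1, -1, and 0 for no pulse) and the function name count_ami_bpvs are assumptions made for this example. It covers AMI only: B8ZS deliberately inserts violations as part of its zero substitution code, and this sketch would wrongly count those.

```python
def count_ami_bpvs(pulses):
    """Count bipolar violations in an AMI pulse train.

    pulses is a sequence of +1, -1, or 0 (no pulse). In AMI every pulse
    should have the opposite polarity of the previous pulse, so a pulse
    with the same polarity as the previous one is a violation.
    """
    violations = 0
    last_pulse = 0  # no pulse seen yet
    for pulse in pulses:
        if pulse == 0:
            continue  # zeros carry no polarity
        if pulse == last_pulse:
            violations += 1
        last_pulse = pulse
    return violations


if __name__ == "__main__":
    print(count_ami_bpvs([1, 0, -1, 1, 0, 0, -1]))  # clean alternation
    print(count_ami_bpvs([1, 0, 1, -1]))            # second +1 repeats polarity
```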

Path Code Violations

Indicates a frame synchronization bit error in the D4 and E1-noCRC formats, or a CRC error in the ESF and E1-CRC formats.

Line Errored Seconds (LES)

A Line Errored Second, according to T1M1.3, is a second in which one or more Line Code Violation error events were detected.

In the T1M1.3 specification, near end Line Code Violations and far end Line Errored Seconds are counted. For consistency, we count Line Errored Seconds at both ends.

Slip Seconds

A Controlled Slip Second is a one-second interval containing one or more controlled slips.

Errored Seconds (ES)

For ESF and E1-CRC links an Errored Second is a second with one or more Path Code Violations OR one or more Out of Frame defects OR one or more Controlled Slip events OR a detected AIS defect.

For D4 and E1-noCRC links, the presence of Bipolar Violations also triggers an Errored Second.

This is not incremented during an Unavailable Second.

Bursty Errored Seconds (BES)

A Bursty Errored Second (also known as Errored Second type B) is a second with fewer than 320 and more than 1 Path Coding Violation error events, no Severely Errored Frame defects and no detected incoming AIS defects. Controlled slips are not included in this parameter.

This is not incremented during an Unavailable Second.

Severely Errored Seconds (SES)

A Severely Errored Second for ESF signals is a second with 320 or more Path Code Violation Error Events, one or more Out of Frame defects, or a detected AIS defect.

For E1-CRC signals, a Severely Errored Second is a second with 832 or more Path Code Violation error events or one or more Out of Frame defects.

For E1-noCRC signals, a Severely Errored Second is a second with 2048 or more LCVs.

For D4 signals, a Severely Errored Second is a second with Framing Error events, an OOF defect, or 1544 or more LCVs.

Controlled slips are not included in this parameter. This is not incremented during an Unavailable Second.
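
The ESF thresholds above can be written down as a small classifier. This Python sketch is only an illustration of the definitions as worded in this post: the function name classify_esf_second and its parameters are made up, the Severely Errored Frame defect is passed as a separate flag (sef) because the text names it separately from Out of Frame defects, and the rule that nothing is counted during an Unavailable Second is left out.

```python
def classify_esf_second(pcv, oof=0, slips=0, ais=False, sef=False):
    """Return the set of labels (ES, BES, SES) for one second on an ESF link.

    pcv   -- Path Code Violations seen in this second
    oof   -- Out of Frame defects seen in this second
    slips -- controlled slips seen in this second
    ais   -- True if an AIS defect was detected
    sef   -- True if a Severely Errored Frame defect was detected
    """
    labels = set()
    # Errored Second: any PCV, OOF defect, controlled slip, or AIS defect.
    if pcv >= 1 or oof >= 1 or slips >= 1 or ais:
        labels.add("ES")
    # Bursty Errored Second: more than 1 and fewer than 320 PCVs,
    # with no SEF defect and no AIS defect. Slips are not considered.
    if 1 < pcv < 320 and not sef and not ais:
        labels.add("BES")
    # Severely Errored Second: 320 or more PCVs, any OOF defect, or AIS.
    if pcv >= 320 or oof >= 1 or ais:
        labels.add("SES")
    return labels


if __name__ == "__main__":
    for count in (0, 1, 2, 319, 320):
        print(count, sorted(classify_esf_second(count)))
```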

Severely Errored Framing Second (SEFS)

A Severely Errored Framing Second is a second with one or more Out of Frame defects or a detected AIS (Alarm Indication Signal) defect.

Degraded Minutes

A Degraded Minute is one in which the estimated error rate exceeds 1E-6 but does not exceed 1E-3 (see G.821 [15]).

Degraded Minutes are determined by collecting all of the Available Seconds, removing any Severely Errored Seconds, grouping the result into 60-second-long groups, and counting a 60-second-long group (a.k.a. minute) as degraded if the cumulative errors during the seconds present in the group exceed 1E-6. Available Seconds are merely those seconds which are not Unavailable, as described below.

Unavailable Seconds (UAS)

Unavailable Seconds (UAS) are calculated by counting the number of seconds that the interface is unavailable. The DS1 interface is said to be unavailable from the onset of 10 contiguous SESs, or the onset of the condition leading to a failure (see Failure States). If the condition leading to the failure was immediately preceded by one or more contiguous SESs, then the DS1 interface unavailability starts from the onset of these SESs. Once unavailable, and if no failures are present, the DS1 interface becomes available at the onset of 10 contiguous seconds with no SESs. Once unavailable, and if a failure is present, the DS1 interface becomes available at the onset of 10 contiguous seconds with no SESs, if the failure clearing time is less than or equal to 10 seconds. If the failure clearing time is more than 10 seconds, the DS1 interface becomes available at the onset of 10 contiguous seconds with no SESs, or the onset of the period leading to the successful clearing condition, whichever occurs later. With respect to the DS1 error counts, all counters are incremented while the DS1 interface is deemed available. While the interface is deemed unavailable, the only count that is incremented is UASs.
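
The 10-second rule is easier to follow with a small simulation. The Python sketch below handles only the simplest case described above, with no failure states involved: the interface becomes unavailable from the onset of 10 contiguous SESs and available again from the onset of 10 contiguous seconds with no SESs. The function name count_unavailable_seconds and the list-of-booleans input are assumptions made for this example.

```python
def count_unavailable_seconds(ses_flags, threshold=10):
    """Count Unavailable Seconds from a per-second list of SES flags.

    ses_flags[i] is True if second i was a Severely Errored Second.
    Failure states are ignored in this simplified sketch.
    """
    unavailable = [False] * len(ses_flags)
    is_unavailable = False
    run = 0  # length of the current run that could flip the state
    for i, ses in enumerate(ses_flags):
        # While available, SESs count toward going unavailable;
        # while unavailable, non-SES seconds count toward recovering.
        counts_toward_flip = ses if not is_unavailable else not ses
        run = run + 1 if counts_toward_flip else 0
        unavailable[i] = is_unavailable
        if run == threshold:
            is_unavailable = not is_unavailable
            # The new state applies from the onset of the run.
            for j in range(i - threshold + 1, i + 1):
                unavailable[j] = is_unavailable
            run = 0
    return sum(unavailable)


if __name__ == "__main__":
    # 12 SESs, 5 clean seconds, 1 SES, then 10 clean seconds.
    flags = [True] * 12 + [False] * 5 + [True] + [False] * 10
    print(count_unavailable_seconds(flags))
```

In the example, the short clean stretch of 5 seconds does not end the unavailable period, so everything up to the start of the final 10 clean seconds is counted.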

Other Problems

Other common problems in T1 troubleshooting are misconfiguration of the T1 controller and wrong cabling between the AS5300 and the switch.

Make sure that the framing and line coding settings are the same for the switch and the AS5300. These physical problems also occur due to the line being built out. If the PBX is very close to the AS5300, make sure that the signal is not too hot.

Whenever you are troubleshooting a T1, always verify that both sides of the circuit are running clean. It is possible that only one side of the T1 is seeing errors. Remember that the T1 is going only between you and the provider. Always contact the provider to make sure they aren't seeing errors on their side of the circuit.
