Resolved Issues in Junos OS Release 12.3 for PTX Series Packet Transport Routers

Current Release

The following issues are resolved in Juniper Networks PTX Series Packet Transport Routers. The identifier following the descriptions is the tracking number in the Juniper Networks Problem Report (PR) tracking system.

Platform and Infrastructure

  • Distributed protocol adjacencies (LFM, BFD, and so on) might experience a delay in keepalive transmission or processing because of prolonged CPU usage on the FPC microkernel on T4000 Type 5-3D FPCs. The delay might cause peer devices to misdiagnose a link fault. The issue is seen several seconds after a Routing Engine mastership switch with NSR enabled, and the fault condition clears after a couple of minutes. PR849148: This issue has been resolved.

Previous Releases

Release 12.3R11

Interfaces and Chassis

  • After you remove a child link from an AE bundle, the packet count on the remaining child link spikes in the output of "show interface <AE> detail". If you add the removed child link back, the count returns to normal. PR1091425: This issue has been resolved.

Release 12.3R10

General Routing

  • On PTX5000, packets are dropped when a parity error is read from an l3bnd_ht entry corresponding to certain addresses. With this SRAM parity error, the ASIC unconditionally drops the packet even though the PTX5000 does not use l3bnd_ht during lookup. As a workaround, the parity check for l3bnd_ht lookup is disabled on the PTX5000 to avoid the SRAM parity error and the packet drop. A new log message also reports changes in the slu.hw_err trap count: TL[<num>]: SLU hw error count <xxx> (prev count <yyy>). PR1012513: This issue has been resolved.
  • When a port on the 24x 10GE(LWO) SFP+ PIC that has never been link up since the PIC came online is configured for CLI loopback, the port receives framing errors until the interface is physically linked up (that is, with real fiber instead of CLI loopback). Normal use is not affected; the problem is seen only in self-loopback testing with CLI loopback. PR1057364: This issue has been resolved.
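
The log message introduced for PR1012513 reports both the current and the previous slu.hw_err count, so the increase can be computed from a single line. The following is a minimal Python sketch of that idea, not a Juniper tool. It assumes the three placeholders in the template are plain decimal integers; the sample line and the function name parse_slu_error are made up for illustration.

```python
import re

# Pattern built from the message template quoted in the release note.
# Assumption: <num>, <xxx>, and <yyy> are plain decimal integers.
SLU_RE = re.compile(r"TL\[(\d+)\]: SLU hw error count (\d+) \(prev count (\d+)\)")


def parse_slu_error(line):
    """Return (tl_index, count, prev_count, delta), or None if the line does not match."""
    match = SLU_RE.search(line)
    if match is None:
        return None
    tl_index, count, prev_count = (int(group) for group in match.groups())
    return tl_index, count, prev_count, count - prev_count


# Hypothetical sample line; the numbers are invented for this example.
sample = "TL[2]: SLU hw error count 17 (prev count 12)"
result = parse_slu_error(sample)
```

A monitoring script could apply parse_slu_error to each syslog line and flag any line whose delta is greater than zero.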

Platform and Infrastructure

  • Distributed protocol adjacencies (LFM, BFD, and so on) might experience a delay in keepalive transmission or processing because of prolonged CPU usage on the FPC microkernel on PTX5000 Type 5-3D FPCs. The delay might cause peer devices to misdiagnose a link fault. The issue is seen several seconds after a Routing Engine mastership switch with NSR enabled, and the fault condition clears after a couple of minutes. PR849148: This issue has been resolved.

Routing Protocols

  • When Simple Network Management Protocol (SNMP) polling is run against a specific IS-IS Management Information Base (MIB) with an invalid variable, the routing protocol process (rpd) crashes. PR1060485: This issue has been resolved.

Release 12.3R9

General Routing

  • In a P2MP environment with an AE interface, when an interface flaps continually, in rare conditions the addition of a unilist next hop fails. The system does not clean up properly, and this leads to an FPC crash. PR980622: This issue has been resolved.

Release 12.3R8

General Routing

  • SFP+-10G-ZR (part number 740-052562) is not fully supported on the P1-PTX-24-10G-W-SFPP PIC. Inserting the optic into this PIC can cause the FPC to generate a core file. PR974783: This issue has been resolved.

Interfaces and Chassis

  • On PTX Series platforms, performing a Routing Engine switchover might cause the flabel (fabric token) to be out of sync between the master Routing Engine and the backup Routing Engine, resulting in an FPC crash. PR981202: This issue has been resolved.

Platform and Infrastructure

  • A "delete" or "deactivate" of an apply-group that defines the entire TACACS or RADIUS configuration under [edit system apply-group <>] does not take effect on commit. As a result, TACACS-based or RADIUS-based authentication might continue to work even after the configuration is removed (deleted or deactivated). PR992837: This issue has been resolved.

Release 12.3R7

Interfaces and Chassis

  • A kernel crash might occur when a router running a Junos OS installation that includes the fix for PR 937774 is rebooted. The problem is not observed during the upgrade to this Junos OS installation. It occurs late enough in the shutdown procedure that it should not interfere with normal operation. PR956691: This issue has been resolved.
  • The cosd process sometimes generates a core file when a child interface is added to or deleted from a LAG bundle. PR961119: This issue has been resolved.

Routing Policy and Firewall Filters

  • On PTX Series platforms, when a firewall filter has many terms, the terms might not all work correctly because misprogramming places them in an incorrect order. PR973545: This issue has been resolved.

Release 12.3R6

General Routing

  • If the rpd ACK feature is enabled through the "indirect-next-hop-change-acknowledgements" statement, and a quick route change happens on a route while that route is being added, routing protocol process (rpd) CPU utilization might rise above 90% and stay high until rpd is restarted. PR953712: This issue has been resolved.

High Availability (HA) and Resiliency

  • The rpd process on the backup Routing Engine might run out of memory and crash if BGP experiences many flaps. PR904721: This issue has been resolved.

Interfaces and Chassis

  • On PTX Series Packet Transport Routers, a change to the 'oam lfm pdu holdtime' on an interface is not updated correctly. This results in an incorrect LFM state, which should be reported as Adjacency Lost. As a workaround, issue the clear oam ethernet link-fault-management state command from the CLI to correctly update the 'pdu holdtimer.' PR792763: This issue has been resolved.
  • Ethernet-CCC encapsulation allows both untagged and tagged packets to flow through. PR807808: This issue has been resolved.
  • After an FPC comes online, the CCL ERR counter registers need to be cleared for the link between the FPC and the SIB. These counters are 8 bytes long and are cleared after link training. A 30-day history is also maintained, which makes 30 contiguous 8-byte locations. Because of this bug, only the first 30 bytes of the 8x30 bytes are initialized, leaving the rest with transient counter values from the SIB/FPC online event. This can cause spurious CRC errors to be reported, leading to FPC/SIB link alarms (link failure) if the FPC/SIB uptime is between 4 and 30 days. In addition, because AggrCRCErrCnt maintains a sum of the 30-day history and is a rollover counter, it can produce a negative value that is displayed as a high error count. The fix initializes all 8x30 bytes of the counter so that it correctly represents the current state of the link. PR948185: This issue has been resolved.
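
The size mismatch described for PR948185 can be modeled in a few lines. The sketch below is a simplified Python illustration, not the FPC code: the 0xFF fill pattern standing in for transient values, the little-endian byte order, and all names are assumptions made for this example. Only the sizes (30 counters of 8 bytes each, with 30 bytes cleared instead of 240) come from the note.

```python
import struct

COUNTER_BYTES = 8                            # each counter is 8 bytes long (from the note)
HISTORY_DAYS = 30                            # 30-day history, stored contiguously (from the note)
TOTAL_BYTES = COUNTER_BYTES * HISTORY_DAYS   # 240 bytes in total


def stale_region():
    # Hypothetical stand-in for transient values left by the online event.
    return bytearray(b"\xff" * TOTAL_BYTES)


def clear(buf, nbytes):
    # Zero the first nbytes of the region, leaving the rest untouched.
    buf[:nbytes] = bytes(nbytes)
    return buf


def counters(buf):
    # Interpret the region as 30 unsigned 64-bit values.
    # Little-endian byte order is an assumption for illustration.
    return list(struct.unpack("<%dQ" % HISTORY_DAYS, bytes(buf)))


buggy = counters(clear(stale_region(), HISTORY_DAYS))   # clears only 30 bytes
fixed = counters(clear(stale_region(), TOTAL_BYTES))    # clears all 240 bytes

buggy_nonzero = sum(1 for value in buggy if value != 0)
fixed_nonzero = sum(1 for value in fixed if value != 0)
```

In this model, clearing 30 bytes zeroes only the first three counters completely, so 27 of the 30 history slots still hold stale data, while clearing all 240 bytes leaves none. That the fourth slot is the first stale one appears consistent with the 4-day lower bound in the note, although the note does not state this connection explicitly.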

IPv6

  • PTX TLCHIP drops transit and host-bound packets containing the same source and destination IPs due to a protection mechanism built into TLCHIP. Such packets are counted as "Data error". This forces a change in loopback mode configuration on Ethernet interfaces. PR934364: This issue has been resolved.

MPLS

  • When Packet Forwarding Engine fast reroute (FRR) applications are in use (such as MPLS facility backup, fast-reroute, or loop-free alternates), a primary path interface flap could be triggered by Operation, Administration, and Maintenance (OAM) link failure detection or by Bidirectional Forwarding Detection (BFD). This interface flap might lead to permanent use of the backup path, which means that the original primary path cannot become active again. PR955231: This issue has been resolved.

Routing Protocols

  • On PTX Series platforms, after short protocol adjacency flaps, the rpd and kernel next hops might be out of sync, resulting in equal-cost multipath (ECMP) not working correctly. PR911307: This issue has been resolved.

Software Installation and Upgrade

  • Because the high-level package (that is, jinstall) is signed, the underlying component packages are not required to be signed explicitly. However, the infrastructure was written in such a way that it displays a warning message if a component package (for example, jpfe) is not signed. PR932974: This issue has been resolved.

User Interface and Configuration

  • Configuration of an extended community such as: rt-import:*:* src-as:*:* fails because the wildcard is not allowed during the configuration validation process. PR944400: This issue has been resolved.

Release 12.3R5

Class of Service (CoS)

  • In an RSVP point-to-multipoint crossover/pass-through scenario, more than one sub-LSP can use the same PHOP and NHOP. If link protection is enabled in this scenario and a 'primary link up' event is immediately followed by a Path Tear message, the routes and next hops are disassociated sequentially. If a sub-LSP receives a path tear or PSB delete while this disassociation is in progress, a core file is generated. PR739375: This issue has been resolved.
  • The ability to configure buffer size for SH queues was added. PR770583: This issue has been resolved.

General Routing

  • Processing of a neighbor advertisement can get into an infinite loop in the kernel, given a special set of events with regard to the Neighbor cache entry state and the incoming neighbor advertisement. PR756656: This issue has been resolved.

High Availability (HA) and Resiliency

  • An FPC might randomly crash during unified ISSU. The FPC remains offline after the unified ISSU period. PR773960: This issue has been resolved.
  • Distributed protocol adjacencies (LFM, BFD, and so on) might experience a delay in keepalive transmission or processing because of prolonged CPU usage on the FPC microkernel on T4000 Type 5-3D FPCs. The delay might cause peer devices to misdiagnose a link fault. The issue is seen several seconds after a Routing Engine mastership switch with NSR enabled. The fault condition clears after a couple of minutes. PR849148: This issue has been resolved.
  • VPLS connections might stay in MI state. In rare scenarios, the routing protocol process fails to read the mesh-group information from the kernel, which can leave the VPLS connections for that routing instance in the MI (Mesh-Group ID not available) state. The workaround is to deactivate and then activate the routing instance. PR892593: This issue has been resolved.

Interfaces and Chassis

  • On PTX5000, when the debug cos halp ifd <ifd_index> command is issued on a remote FPC, the FPC crashes. The core files can be viewed with the CLI command show system core-dumps. PR814935: This issue has been resolved.
  • Kernel message 'Only parameters changed ...Sending to Slave side' is seen continuously on both master and backup Routing Engines. PR820414: This issue has been resolved.
  • When an FPC fails because of a hardware fault and is stuck in boot mode, it might affect communication between the Routing Engine and the Packet Forwarding Engine for other FPCs, because the entire private next-hop index space is depleted. The following syslog entry is reported: /kernel: %KERN-4: Nexthop index allocation failed: private index space exhausted. PR831233: This issue has been resolved.

  • This issue is triggered in affected Junos OS versions, under some special conditions, when only one end of an AE link sees LACP timeouts or when there is intermittent LACP loss on the AE link. The trigger causes a problem only in Junos OS versions that lack this PR fix, because of a change in default behavior in which the AE member link was considered UP in any state other than DETACHED. The fix makes two changes: 1) The default behavior is restored to what it was before: in any LACP state other than COLLECTING_DISTRIBUTING, the AE member is considered DOWN. 2) A fast-failover knob is introduced that, if configured, changes the behavior so that in any LACP state other than DETACHED, the AE member is considered UP. Note that this issue is platform independent. PR908059: This issue has been resolved.
  • On PTX Series and T4000 routers, an FPC crash can be triggered by a "Single Bit Error" (SBE) event after access to a protected memory region, as indicated by the following log message: "System Exception: Illegal data access to protected memory." PR919681: This issue has been resolved.
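
The two AE member-link rules described for PR908059 in the list above can be written as a small decision function. This is a hedged Python sketch of the rule as the note words it, not Junos code. The release note names only the DETACHED and COLLECTING_DISTRIBUTING states; the WAITING and ATTACHED members of the enum are illustrative intermediate states, and the function and parameter names are hypothetical.

```python
from enum import Enum


class LacpState(Enum):
    DETACHED = "DETACHED"
    WAITING = "WAITING"      # illustrative intermediate state, not named in the note
    ATTACHED = "ATTACHED"    # illustrative intermediate state, not named in the note
    COLLECTING_DISTRIBUTING = "COLLECTING_DISTRIBUTING"


def member_is_up(state, fast_failover=False):
    """Return True if the AE member link is considered UP under the described rules."""
    if fast_failover:
        # With the fast-failover knob: UP in any state other than DETACHED.
        return state is not LacpState.DETACHED
    # Restored default: UP only in COLLECTING_DISTRIBUTING.
    return state is LacpState.COLLECTING_DISTRIBUTING


# Summary table of both behaviors for every modeled state.
summary = {
    state.name: (member_is_up(state), member_is_up(state, fast_failover=True))
    for state in LacpState
}
```

The two rules differ only for the intermediate states: the default treats them as DOWN, and the fast-failover behavior treats them as UP.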

IPv6

  • Setting OSPF overload through the configuration sets both the metric field in router LSAs and the te-metric field in opaque LSAs to 65535 (2^16-1). Because te-metric is a 32-bit field, it should be set to 2^32-1. PR797293: This issue has been resolved.
  • A change to the domain-name is not reflected in DNS queries unless a commit full is performed. This bug in the management process (mgd) has been resolved by ensuring that mgd propagates the new domain-name to the file /var/etc/resolv.conf, so that it is used for future DNS queries. PR918552: This issue has been resolved.
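
The arithmetic behind PR797293 in the list above is the maximum value of an unsigned field of a given width. The short Python sketch below works through both values; the constant and function names are hypothetical and chosen only for this example.

```python
ROUTER_LSA_METRIC_BITS = 16   # width of the router LSA metric field (from the note's 2^16-1)
TE_METRIC_BITS = 32           # te-metric is a 32-bit field (from the note)


def max_unsigned(bits):
    """Largest value that fits in an unsigned field of the given width."""
    return (1 << bits) - 1


router_lsa_max = max_unsigned(ROUTER_LSA_METRIC_BITS)   # 65535
te_metric_max = max_unsigned(TE_METRIC_BITS)            # 4294967295
```

Setting the te-metric to 65535 therefore leaves it far below the largest value the 32-bit field can hold, which is why the note says that 2^32-1 is the value that should be used.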

Network Management and Monitoring

  • Multiple SNMP queries for large volumes of information might cause Mib2d to grow in size and eventually create a core file. Mib2d will restart, possibly multiple times, but should recover by itself. PR742186: This issue has been resolved.

Software Installation and Upgrade

  • Because the high-level package (that is, jinstall) is signed, the underlying component packages are not required to be signed explicitly. However, the infrastructure was written in such a way that it displays a warning message if a component package (for example, jpfe) is not signed. PR932974: This issue has been resolved.

User Interface and Configuration

  • Configuration mode access is locked after the connection to the router is dropped. PR745280: This issue has been resolved.

VPNs

  • On the PTX Series, the routing protocol process (rpd) and the kernel might be out of sync regarding the forwarding next hops after short protocol adjacency flaps. PR911307: This issue has been resolved.

Release 12.3R4

Class of Service (CoS)

  • In Junos OS Release 12.1, excess-rate was an unsupported statement under [edit class-of-service] schedulers on PTX Series Packet Transport Routers. The excess-rate statement is now supported for scheduler configurations on PTX Series Packet Transport Routers. Subsequent versions of the Junos OS Class of Service Configuration Guide and other related documentation will be updated to reflect this change on the PTX Series. PR738552: This issue has been resolved.
  • Changing the preference on an LSP was considered a catastrophic event, which tore down the current path and then established a new one. With this fix, a preference change is treated as a minor event that only requires a new path to be resignaled in a make-before-break manner. PR897182: This issue has been resolved.

General Routing

  • Processing of a neighbor advertisement can get into an infinite loop in the kernel, given a special set of events with regard to the Neighbor cache entry state and the incoming neighbor advertisement. PR756656: This issue has been resolved.

High Availability (HA) and Resiliency

  • In a situation where prefix-export-limit and NSR are configured together, when there are Routing Engine mastership switches, the IS-IS overload bit might be set after the NSR switchover. This issue is triggered due to inconsistent state between the master Routing Engine and the backup Routing Engine. As a workaround, disable protocols isis prefix-export-limit. PR725478: This issue has been resolved.
  • LACP status disagreement occurs after a Routing Engine switchover. PR751745: This issue has been resolved.
  • The rpd process on the backup Routing Engine might crash when it receives a malformed message from the master. This can occur at high scale with nonstop active routing enabled when a large flood of updates is being sent to the backup. There is no workaround to avoid the problem, but it is rare. The backup rpd restarts and the system recovers without intervention. PR830057: This issue has been resolved.

Infrastructure

  • The Junos OS kernel might crash because of a timing issue in the ttymodem() internal I/O processing routine. The crash can be triggered by simple remote access (such as Telnet or SSH) to the device. PR755448: This issue has been resolved.

Interfaces and Chassis

  • NSR switchover does not work with aggressive hello and hold timers. This is a system limitation. Even the default 3-second timer interval (for LAN interfaces) does not work, and such aggressive timers cannot be used in a scaled scenario. A similar issue is seen in PR 719301 (though the scale numbers are different). To ensure zero traffic loss during a GRES or NSR switchover, do one of the following: 1. For non-point-to-point interfaces, increase the hello timer to around 30 seconds and the hold time to 90 seconds on all interfaces on all routers. 2. Configure the interfaces as "point-to-point" under IS-IS on all routers. PR772136: This issue has been resolved.
  • When an FPC fails because of a hardware fault and is stuck in boot mode, it might affect communication between the Routing Engine and the Packet Forwarding Engine for other FPCs, because the entire private next-hop index space is depleted. The following syslog entry is reported: /kernel: %KERN-4: Nexthop index allocation failed: private index space exhausted. PR831233: This issue has been resolved.
  • An interrupt storm occurs when the craft button is pressed with "craft-lockout" configured. PR870410: This issue has been resolved.

IPv6

  • A core file is generated because of a null pointer dereference in the ND6 code in the kernel. This bug was introduced when the new HFRR feature was added. An IPv6 route that points to a discard next hop does not require an ND6 cache entry, and a check for this has been added to fix the issue. PR755066: This issue has been resolved.

MPLS

  • Some MPLS LSPs might become stuck after either of the following triggers: 1. Aggressive link flapping on the LSP path. 2. The RSVP session is cleared from both ingress and egress within a 1-second interval. The LSP remains down on the ingress router indefinitely because RSVP RESV messages are stuck on one of the transit LSRs and are never sent upstream. PR751729: This issue has been resolved.
  • On PTX Series, in an l2circuit/l2vpn or VPLS scenario with chained composite next hops in use, the routing protocol process (rpd) might crash and create a core file during certain l2circuit/l2vpn/vpls pings. When the issue happens, the following behavior can be observed: user@router> ping mpls l2circuit interface et-1/0/10.2 Info request to rpd timed out, exiting. PR755489: This issue has been resolved.
  • When a PTX Series Packet Transport Switch is the penultimate hop of one P2MP LSP branch and acts as a transit LSR on another branch of the same P2MP LSP, the MPLS packets going out from the penultimate hop branch might be tagged with an incorrect Ethertype field. There is no workaround. PR867246: This issue has been resolved.

Multicast

  • The rpd process might generate a core file in some cases of incorrect next-hop deletion when multicast next hops are added and deleted. PR702359: This issue has been resolved.

Platform and Infrastructure

  • The following error messages appear in the logs when a commit synchronize is performed: mgd[1951]: UI_COMMIT: User 'user' requested 'commit synchronize' operation (comment: none) rpd[2787]: junos_dfw_trans_purge:1445 Error "session" is not allocated. rpd[2787]: junos_dfw_session_close:1120 Error "session" is NULL or its socket has an invalid value before session close. The error is purely cosmetic and occurs whenever a change results in rpd being notified, including simply activating and deactivating interfaces. PR737438: This issue has been resolved.

Routing Policy and Firewall Filters

  • If you issue the show krt next-hop or show krt iflist-next-hop command, and if you later delete a route or the route is removed, an rpd core file might be created. PR727014: This issue has been resolved.
  • In a scaled LDP scenario (about 250 LDP neighbors), the routing protocol process (rpd) crashes and creates a core file when the CLI command show ldp neighbor is issued, because the system incorrectly tries to remove an active LDP adjacency. The core files can be viewed with the CLI command show system core-dumps. When the issue occurs, the following logs are seen: init: routing (PID 4305) terminated by signal number 6. Core dumped! init: routing (PID 32773) started. PR747109: This issue has been resolved.
  • When a packet with an expired TTL is dropped on a PTX Series router acting as the penultimate-hop router, the following message can be seen: /kernel: rnh_comp_output(): rnh 1578: no af 2 iff context for chaining; discarding packet The issue is resolved in 12.1X48-D30 and later releases. PR785366: This issue has been resolved.
  • The rpd process on the backup Routing Engine might crash when it receives a malformed message from the master. This can occur at high scale with nonstop active routing enabled when a large flood of updates is being sent to the backup. There is no workaround to avoid the problem, but it is rare. The backup rpd restarts and the system recovers without intervention. PR830057: This issue has been resolved.
  • The Routing Engine might experience a kernel panic. PR851086: This issue has been resolved.
  • An FPC on PTX Series crashes continuously when a reject firewall action with a nonzero reject code is present. PR856473: This issue has been resolved.
  • On PTX Series, while deactivating or activating a firewall filter that has tcp-flags in the match condition on a loopback interface (e.g. lo0.0), memory corruption could occur when the filter configuration is pushed to the Packet Forwarding Engine, or is removed from the Packet Forwarding Engine, causing all FPCs to crash and generate core files. The following is logged by the FPCs a few seconds prior to the crash: fpc1 dfw_match_branch_db_destroy:77filter index 1, dfw 0x20bb2a90, match_branch_db not empty on filter delete fpc2 dfw_match_branch_db_destroy:77filter index 1, dfw 0x205a6340, match_branch_db not empty on filter delete fpc0 dfw_match_branch_db_destroy:77filter index 1, dfw 0x20471c38, match_branch_db not empty on filter delete PR874512: This issue has been resolved.

Routing Protocols

  • The routing protocol process (rpd) generates a core file on receipt of a RESV message with an unexpected NHOP address. To avoid the crash, a RESV message with a different NHOP IP address is now dropped. The LSP then times out because it is no longer refreshed by RESV messages, and the session resets. PR887734: This issue has been resolved.
  • When running the command "monitor label-switched-path <lsp-name>" on the PTX Series platform to display the real-time status of the specified RSVP label-switched path (LSP), the routing protocol process (rpd) might generate core files. PR773439: This issue has been resolved.

Subscriber Access Management

  • "Power Supply failure"/"Power Supply Removed" messages and SNMP trap occur hourly. PR860223: This issue has been resolved.

User Interface and Configuration

  • If a Telnet session is disconnected ungracefully while at the "load merge terminal" prompt, other CLI users might be unable to access configuration mode. PR745280: This issue has been resolved.

Release 12.3R3

Interfaces and Chassis

  • Distributed protocol adjacencies (LFM, BFD, and so on) might experience a delay in keepalive transmission or processing because of prolonged CPU usage on the FPC microkernel on PTX5000 Type 5-3D FPCs. The delay can cause peer devices to misdiagnose a link fault. The issue is seen several seconds after a Routing Engine mastership switch with nonstop active routing enabled, and the fault condition clears after a couple of minutes. PR849148: This issue has been resolved.

Release 12.3R2

High Availability and Resiliency

  • On PTX Series Packet Transport Routers with nonstop active routing configured, if LDP is deleted or deactivated from the master Routing Engine, the Layer 2 circuit connections enter an incorrect encapsulation information state. The Layer 2 circuit connection transitions to the correct state when LDP is reactivated on the master Routing Engine. PR799258: This issue has been resolved.
  • Deletion of IPv6 addresses with a prefix of /128 from an interface can cause the Routing Engine to crash. PR799755: This issue has been resolved.
  • When the Flexible PIC Concentrator (FPC) restarted after performing a master Routing Engine switchover, the aggregate interface flag was set to “down”. Any traffic that entered this FPC and traversed the equal-cost multipath (ECMP) to the aggregate interface was dropped. PR809383: This issue has been resolved.

Infrastructure

  • In an IPv6 scenario, when ipv6-duplicate-addr-detection-transmits is configured with a value of zero, IPv6 Neighbor Discovery might not function properly. PR805837: This issue has been resolved.

Interfaces and Chassis

  • An FPC crashes on PTX Series Packet Transport Routers when a reject firewall action with a nonzero reject code is present. PR856473: This issue has been resolved.

IPv6

  • On PTX Series Packet Transport Routers, only 48,000 longest prefix match (LPM) routes are supported. If this limit is exceeded, the kernel routing table (KRT) queue can become stuck with the error "Longest Prefix Match(LPM) route limit is exceeded." As a workaround, reduce the number of LPM routes to 48,000 or fewer. PR801271: This issue has been resolved.

User Interfaces and Configurations

  • PTX Series does not allow configuring buffer sizes on SH queues. PR770583: This issue has been resolved.

Modified: 2016-06-09