    Known Issues

    This section lists the known issues in hardware and software in Junos OS Release 16.1R5 for MX Series and T Series.

    For the most complete and latest information about known Junos OS defects, use the Juniper Networks online Junos Problem Report Search application.

    Forwarding and Sampling

    • Applying a policing filter to an LSP is known to be catastrophic for that LSP. Any active LSP carrying traffic is torn down and resignaled when a policing filter is applied to it, and traffic is dropped for approximately 2 seconds. In Junos OS Release 16.1R1, the LSP can take up to 30 seconds to come up in either of the following cases:

      1. The policing filter is created and applied to the LSP through configuration in the same commit sequence.

      2. A configuration file that contains both the policing filter and its application to the LSP is loaded with load override and then committed.

      The plan is to rectify this behavior in Junos OS Release 16.2. PR1160669

    • The "default-arp-policer" is applied to every relevant IFL to rate-limit ARP traffic. You can disable the "default-arp-policer" by configuring the hidden statement set firewall disable-arp-policer. Note that improper use can leave the Routing Engine overloaded with bulk ARP traffic, leading to a typical DoS scenario. The issue was that even after the "default-arp-policer" was disabled, it was still applied to IFLs in some scenarios, such as after a DUT reboot or when a new IFL was created. With the fix in this PR, wherever set firewall disable-arp-policer is configured, the "default-arp-policer" is not applied to any IFL in any scenario. PR1198107
    • Root cause: As per the investigation in rpd, the interface for a direct route starts in the ifdown condition. The remote side is then brought up, so the interface goes to ifup. Because it is a direct route, rpd does not install the route or next hop. It receives that information from the kernel and only updates a next hop in rpd local storage. The route and next hop for the interface are handled in the kernel, so there is no route change in rpd. By design, route_record depends on the route flash to learn about updates. Because there is no route change, there is no route flash, and route_record is not notified. Changing this would require a route flash for this case. Currently, direct and local routes or next hops are "do not care" in rpd as far as route updates go: rpd updates its next-hop information without marking it for any other notifications. A fix would need to find the correct place to decide that the route must be flashed while making sure that nothing else is harmed. A complication is the change made for PR 1002287, where the update is not sent to srrd if the NOTINSTALL flag is set. That flag is set for direct and local routes. This is day-one behavior. If the interface is up at startup, everything should work correctly.

      Why the Packet Forwarding Engine depends on rpd and srrd for sampling information that is already in the forwarding table: The FIB table can provide only OIF/GW. SRC_MASK, DST_MASK, SRC_AS, and DST_AS are not available in the Packet Forwarding Engine FIB table, so the srrd connection is required. Listening to both srrd and the FIB table and consolidating the information would complicate the implementation. Scanning the entire FIB table for only a few such routes would affect performance and would also complicate the present implementation. This is the day-one implementation for srrd and sampled.

      Workarounds: There are two possible workarounds. a) Have the far-end interface up when the DUT interface is brought up. If that does not happen, recover by disabling the DUT interface and then enabling it again. At that point, everything should be brought up in the expected state. b) Enable the next-hop learning configuration statement. PR1224105
    • On platforms that support Routing Engine-based sampling, if Routing Engine-based sampling is configured, sampled might stop collecting data every 5 to 7 days. PR1270723

    General Routing

    • EVPN uses several different subtypes of routes within the EVPN address family which are advertised through the control plane between Provider Edges using BGP. Multihoming Provider Edges use Ethernet segment (ES) routes to advertise the fact that the Provider Edges are connected to a given multihomed segment. All other multihoming Provider Edges attached to the same multihomed segment import those Ethernet segment routes, and combined with their own local state, elect a single designated forwarder (DF) for each EVPN instance that is part of the multihomed segment. When a new Provider Edge is added to an existing EVPN, the new Provider Edge needs to download the full set of EVPN routes advertised by the other existing Provider Edges. In cases of high MAC scaling, it is possible that remote Provider Edges will generate and send BGP updates for MAC routes (or other EVPN route types) before generating and sending the ES routes. If the time taken by the original multihoming Provider Edge(s) to send the ES routes is longer than the DF election hold timer on the new Provider Edge, the new Provider Edge and an existing multihoming Provider Edge might both consider themselves to be the DF for the same EVPN ES simultaneously. In this situation, broadcast traffic could be flooded by both Provider Edges. Additionally, in the case of single-active multihoming, transient/spurious MAC moves could happen between the two Provider Edges both considering themselves to be the DF, causing unnecessary BGP update churn and slowing convergence. PR968428
    • This issue is seen when the configured global-mac limit is less than the interface MAC limit and the same interface is configured with a packet action. When traffic is sent at a higher packet rate, the Packet Forwarding Engine learns all the MAC entries, and the Routing Engine later trims them to the configured global-mac limit. When traffic is sent at a lower packet rate, the Routing Engine learns somewhat more than the configured global-mac limit and subjects the remaining packets (with newer MACs) to the configured drop-action. PR1002774
    • In a PBB EVPN scenario, after configuration changes to the EVPN routing instance, a rare condition might occur in which an internal reference count unexpectedly becomes zero while some deletes are yet to be processed. As a result, L2ald crashes. L2ald runs on the Routing Engine and mainly manages MAC learning, aging, removal, and so on. The crash might affect MAC learning related features for around 3 to 4 seconds. PR1015297
    • Some configuration-related functions in rpd and L2cpd use special memory APIs called lite pools. When these pools were reset, the control information related to them was not freed, which resulted in a leak. This is not a day-one issue. The bug was introduced in Junos OS Release 15.1 when the LIBJTASK memory subsystem was reimplemented. This PR affects all daemons that use LIBJTASK (including rpd) on all platforms, provided that those daemons use memory lite pools. PR1071191
    • For Junos OS Release 13.3R5, 14.1R1, and later, the MX-VC Series Virtual Chassis inter-chassis TCP control flows are changed to Virtual Chassis high priority, so high volume of VC inter-chassis TCP control flow might impact Virtual Chassis stability and responsiveness to external protocol events. Now with the fix, the priority of VC inter-chassis TCP control flow has been reverted. PR1074760
    • On MX Series platforms with MS-MPC/MS-MIC, memory leaks are seen with the jnx_msp_jbuf_small_oc object when millions of PPTP control connections (3-5M) alone are sent at a high CPS (> 150K CPS). This issue is not seen with up to 50000 control connections at 10000-30000 CPS. PR1087561
    • On XL-based cards such as MPC or IOC3, PPE thread time out errors are triggered when the FPC allocates illegal memory space for forwarding state of routing operations. In certain cases, this results in packet loss depending on the number of packets using this forwarding state. PR1100357
    • On MX Series routers containing multiple Packet Forwarding Engines such as MX240/MX480/MX960/MX2010/MX2020, with MPC3E/MPC4E/MPC5E/MPC6E cards, if the routers have GRE decap, then certain packet sizes coming via these line cards at very high rate can cause these line cards to exhibit a lockup, and one or more of their Packet Forwarding Engines corrupt traffic toward the router fabric. PR1117665
    • In the case of IPsec, if a member interface, say mams-x/y/z, is part of an AMS bundle in one-to-one mode, the same-numbered ms-x/y/z interface should not be used for IPsec or any other services. PR1134645
    • Dynamic-tunnel interface bounces can cause memory corruption that leads to an rpd crash. Once the new rpd process is up, it synchronizes with the kernel, which might still hold information about the GRE tunnel IFL created by the previous rpd process. Using this information from the kernel can trigger a subsequent rpd crash. The following logs might be seen when this issue occurs: root@abc>show log messages| match "Address already in use" %DAEMON-3: Error creating dynamic logical interface from sub-unit 32792: Address already in use %DAEMON-3-RPD_KRT_Q_RETRIES: kqp 0x49df00d0: op add queue low-add attempts 4010 ifd index 284, ifl unit 32792, family 2 instance id 0, state CreateIFL RPD_KRT_Q_RETRIES: IFL IFF Update: Address already in use. PR1152912
    • On MX Series routers with MS-MICs and MS-MPCs with the syslog statement included at the [edit services CoS rule rule-name term term-name then] hierarchy level, a system log message is not generated when a CoS rule term is matched, in contrast to the expected behavior in which system log messages are generated when a NAT rule term is matched. PR1159231
    • On MX Series platforms, if dynamic VLANs are used in a subscriber scenario, ksyncd might crash if subscribers log in and log out continuously. This is a timing issue caused by a transient VLAN replication error between the master Routing Engine and the backup Routing Engine. PR1161487
    • This is an intermittent issue. Assume that aggregated Ethernet is configured with the bypass-queuing-chip configuration statement. If a single commit then both removes child links from the aggregated Ethernet bundle and configures per-unit-scheduler on the removed child links, the per-unit-scheduler configuration updates to CoSd and the Packet Forwarding Engine intermittently fail. As a result, dedicated scheduler nodes might not be created for all units or logical interfaces. PR1162006
    • With an MX Series platform acting as the TWAMP client and a vMX platform acting as the TWAMP server, probe packet loss is seen at the TWAMP server, that is, on vMX with Junos OS Release 15.1F5. When the TWAMP target interface address is configured on a media interface (ge-/xe-), probe packets are dropped at vMX because of endian conversion of the UDP checksum in the probe packet (vMX is a little-endian platform and MX is a big-endian platform). This issue was seen earlier in Junos OS Release 15.1F4 and was resolved through PR1125516. However, because of a merge issue, the fix was overwritten and the issue resurfaced. When the TWAMP target interface address is configured on an si- interface, probe packet loss is also seen, but not because of a UDP checksum error. In this case, the issue appears to be caused by a looping problem: after the packet is processed and timestamped at the LU, it cannot go out of the media interface. Enabling some debug logs at the Packet Forwarding Engine and changing the TWAMP probe packet size sometimes resolves the issue, but not always. PR1164093
    • When using MS-MPC or MS-MIC service cards, a single pool cannot be used in different service-sets. Separate pools with different names would then need to be used. Additionally, pools created automatically by a source-prefix or destination-prefix statement will not work if the same source-prefix or destination-prefix statement appears in a different service-set. PR1175664
    • Starting with Junos OS Release 15.1F5-S2, 15.1F6, 16.2R1, and 17.1R1, vMX introduces the new CLI command set chassis fpc X performance-mode num-of-ucode-workers Y to support dedicated workers for control and multicast traffic. This prevents unicast traffic from being hashed to the workers that do ucode processing. As usual, vMX X86 can be configured to run in lite-mode or performance-mode. With this new CLI option, users can configure the number of ucode workers that process multicast and control traffic on separate worker cores. The intention of this command is to separate flow cache and non-flow cache traffic, but as part of this fix, only control and multicast traffic is separated from the remaining traffic. In the future, other non-flow cache traffic could be moved to these dedicated ucode workers. Whenever num-of-ucode-workers changes, RIOT is rebooted; the first Y workers process control and multicast traffic, and the remaining workers process flow cache traffic. PR1178811
    • Chef for Junos OS supports additional resources to enable easier configuration of networking devices. These are available in the form of netdev-resources. The netdev-resource developed for interface configuration cannot be used to configure an XE interface: the netdev-interface resource assumes that 'speed' is a configurable parameter, which is supported on a GE interface but not on an XE interface. This limitation applies to the packages chef-11.10.4_1.1.*.tgz and chef-11.10.4_2.0_*.tgz on all platforms {i386/x86-32/powerpc}. PR1181475
    • If the operator wants to deactivate the global DDoS parameters (so that only the protocol-specific DDoS configurations for flow detection remain in effect), it is recommended to use the following two commands: CLI> deactivate system ddos-protection global flow-detection-mode and CLI> deactivate system ddos-protection global flow-level-control. Using only 'deactivate system ddos-protection global' instead disables flow detection completely, because this command also deactivates the master flow-detection configuration statement under it. PR1182078
    • On MX2010/2020 routers with SFB2 and empty fabric slots, a system defect that fetches wrong fabric information might prevent MPC7E/8E/9E from coming online. PR1182404
    • GUMEM errors for the same address might continually be logged if a parity error occurs in a locked location in GUMEM. Since GUMEM utilizes ECC memory, any error is self-correcting and has no impact to router's operation. In a rare case, such parity error might appear repeatedly at a specific location. Without this software improvement, such error can be cleared by rebooting the FPC. PR1200503
    • A few sessions are always dropped during session setup with IPsec. This is seen consistently above 1M sessions. PR1204566
    • Changing the members of an AMS bundle impacts traffic and SAs. The correct procedure is to reboot the members after new members are added to an existing AMS bundle. PR1205932
    • In certain cases, the subscriber created on the MS-MPC will not be cleared even though all the sessions associated with that subscriber are cleared. Any new sessions for that subscriber will go through the session creation based on the rules configured and the subscriber will be re-used. No new sessions will be automatically allowed because of the existing subscriber. PR1210820
    • Major errors might be seen on MPC3/FPC3 with 1X100 and 5x100 DWDM MIC/PIC.

      user@router> show chassis alarms no-forwarding
      1 alarms currently active
      Alarm time Class Description
      Major FPC 3 Major Errors

      The following messages are seen in the logs:

      fpc3 Cmerror Op Sub Set: CORDOBA : CORDOBA(3/0) link 0 : DSP loss of lock
      fpc3 Cmerror Op Sub Set: CORDOBA : CORDOBA(3/0) link 0 : DFE tuning failed
      alarmd[16241]: Alarm set: FPC color=RED, class=CHASSIS, reason=FPC 3 Major Errors
      craftd[15906]: Major alarm set, FPC 3 Major Errors

      PR1212089
    • On MX Series platforms with multihomed (MH) PEs in an EVPN, rpd might crash during MAC moves between the multihomed PEs, causing traffic loss. PR1216144
    • The /etc/passwd file is created in the process of the first commit when a pristine jinstall image is used to boot for the first time. If event options is configured, the system will try to read the configuration from the available event scripts which requires privileges obtained from the /etc/passwd file. That causes a circular dependency as the commit will not pass if the configuration includes event-options the first time a pristine image boots up, which is the case of an upgrade performed with virsh create. PR1220671
    • Multicast processing is processor intensive on vMX since flow cache is not supported. Ucode workers are hyper threaded, so they are sharing a physical core. PR1221036
    • Change of behavior of reflexive keyword. If "reflexive" keyword is configured in CoS rule, then CoS-service-plugin will store CoS-VALUES [DSCP, Forwarding class] received in forward flow and apply same CoS-VALUES [DSCP, Forwarding class] to packets going back in reverse flow. PR1227021
    • After the virtual switch type is changed from IRB type to regular bridge, all interfaces under the OpenFlow protocol are removed, and the OpenFlow daemon fails to program any flows. PR1234141
    • The mobiled daemon is not yet supported in an MX Series Virtual Chassis environment. As a result, BNG advanced services functionality does not work in MX Series Virtual Chassis mode. PR1241857
    • The na-grpcd process leaks memory and eventually crashes. Root cause: a single thread was streaming the data, so it kept allocating memory. Fix: na-grpcd is now multithreaded, which keeps memory usage under control and also improves performance. PR1254794
    • If two logical interfaces with the same vlan-id are present on an lt- interface, the bbe-smgd process crashes continuously. The issue is specific to Junos OS Release 15.1F5. PR1257931
    • Duplicate sensor resources are created when the only difference between them is a trailing "/". PR1263446
    • Due to transient hardware error conditions, only the syslog event XMCHIP(x) FI: Cell underflow at the state stage - Stream 0, Count 65535 is reported, which is a sign of a fabric stream wedge. Additional traffic flow register pointers are now validated, and if they are stalled, a new CMERROR alarm is raised: "XMCHIP(x) FI: Cell underflow errors with reorder engine pointers stalled - Stream 0, late_cell_value 65535, max_rdr_ptr 0x6a9, reorder_ptr 0x2ae". PR1264656
    • On a MX Series Virtual Chassis system in a scaled subscriber management scenario, when you perform an in-service software upgrade (ISSU) while protocol sessions are active, the protocols might go down and come up again, which can cause traffic loss. PR1265407
    • On MX Series platforms, the device might become inaccessible and services might go down after a GRES switchover. PR1266636
    • The smg-service daemon can core on the backup Routing Engine with distributed IGMP configuration. The likely trigger is that during a subscriber login with multiple service activations, the multicast service is activated successfully but the login is aborted for other reasons. The backup Routing Engine, which is in the midst of replicating the multicast state, has to abort the login, and there is a problem in this clean-up code. PR1288465
    • CPCDD generates a core file when Routing Engine-based http-redirect is used. PR1293553
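For the duplicate sensor resource issue above (PR1263446), the following is a minimal sketch of how an operator-side script might spot resource strings that differ only by a trailing "/" before they are provisioned. The function names and the example resource strings are assumptions made for illustration; they are not part of Junos OS or of any Juniper tool.

```python
def normalize_resource(path):
    """Strip trailing slashes so that '/a/b' and '/a/b/' compare equal."""
    stripped = path.rstrip("/")
    # Keep the root path "/" intact instead of collapsing it to "".
    return stripped or "/"


def find_trailing_slash_duplicates(resources):
    """Group resource strings that differ only by trailing '/' characters."""
    groups = {}
    for resource in resources:
        groups.setdefault(normalize_resource(resource), []).append(resource)
    # Report only groups that contain more than one distinct spelling.
    return {
        key: sorted(set(values))
        for key, values in groups.items()
        if len(set(values)) > 1
    }


if __name__ == "__main__":
    # Hypothetical resource strings, used only to exercise the helper.
    example = [
        "/junos/system/linecard/interface/",
        "/junos/system/linecard/interface",
        "/junos/system/linecard/firewall/",
    ]
    print(find_trailing_slash_duplicates(example))
```

Normalizing before comparison is one simple choice; it treats any number of trailing slashes as equivalent, which may be broader than the condition the PR describes.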

    Infrastructure

    • When configured, the statement set system ports console log-out-on-disconnect logs the user out of the console and closes the console connection. If the statement set system syslog console any warning is used together with it and there is no active telnet connection to the console, the daemons try to open the console and hang while they wait for a "serial connect", which is received only when a telnet session to the console is opened. As a workaround, remove the latter statement, set system syslog console any warning. PR1230657
    • When any operation that fetches interface statistics is executed, the pfem process or the FPC might crash in rare scenarios and generate core files. PR1247026
    • The show system users CLI output displays additional users who are not using the router. The request system logout CLI command cannot clear the stale telnet sessions. This is a cosmetic issue, because show system connections and the CLI process list show only the current session.

      user@router> show system users
      5:39PM up 8 mins, 3 users, load averages: 0.27, 0.43, 0.26
      USER TTY FROM LOGIN@ IDLE WHAT
      lab pts/0 172.27.208.216 5:36PM - -cli (cli) <---- old telnet session
      lab pts/0 172.27.208.216 5:38PM - -cli (cli) <---- old telnet session
      lab pts/0 172.27.208.216 5:39PM - -cli (cli) <---- current telnet session

      user@router> show system connections |match 172.27.208.216
      tcp4 0 0 172.27.116.36.23 172.27.208.216.63830 ESTABLISHED

      user@router> start shell
      % ps -aux |grep cli|grep -v grep
      lab 21016 0.0 0.2 786268 50304 0 S 5:39PM 0:00.15 -cli (cli)
      %

      PR1247546
    • This issue occurs when logging to the console is enabled (set system syslog console and set system ports console log-out-on-disconnect) but there is no active console connection. The console log buffer becomes full, and processes (such as eventd) that have to log messages hang. This can lead to undesired behavior, such as rpd, lacpd, l2ald, and other processes not starting correctly. PR1253544
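The two console-related issues above (PR1230657 and PR1253544) both involve the same pair of statements. As a minimal sketch, assuming the configuration has been exported in set format (for example with show configuration | display set) and saved as text lines, a script could flag the combination as follows. The function name is hypothetical, and the check only looks for active set lines; it does not interpret deactivate statements or configuration groups.

```python
def console_logging_risk(set_lines):
    """Return True when both console statements named in the PRs are present.

    set_lines is an iterable of set-format configuration lines.
    """
    has_logout_on_disconnect = False
    has_console_syslog = False
    for raw in set_lines:
        tokens = raw.split()
        if tokens[:1] != ["set"]:
            continue  # skip blank lines and anything that is not a set line
        if tokens[1:5] == ["system", "ports", "console", "log-out-on-disconnect"]:
            has_logout_on_disconnect = True
        elif tokens[1:4] == ["system", "syslog", "console"]:
            has_console_syslog = True
    return has_logout_on_disconnect and has_console_syslog


if __name__ == "__main__":
    sample = [
        "set system ports console log-out-on-disconnect",
        "set system syslog console any warning",
    ]
    print(console_logging_risk(sample))
```

A positive result only means that the configuration matches the pattern described in the PRs; whether processes actually hang also depends on there being no active console connection.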

    Interfaces and Chassis

    • After the MTU is changed on the IFD, the IPv6 link-local address is not assigned on the static VLAN demux interface above the IFD. PR1063404
    • When a virtual IP address is reused as an interface address during configuration changes, first delete the configuration and commit. Then add the interface address configuration in the following commit. PR1191371
    • On EX2300 and EX3400, IPv6 neighborship is not created on the IRB interface. PR1198482
    • The first IP address from the framed prefix (returned in Framed-IPv6-Prefix) is assigned to the subscriber interface. PR1214647
    • If an iflset configuration is present, the following issues might be seen:

      - After ISSU from FreeBSD 6.1-based Junos OS to Junos OS Release 15.1F throttle, interfaces of the windsurf card stay down. When the card is restarted, it goes to the ready state.

      - After unified ISSU from FreeBSD 6.1-based Junos OS to Junos OS Release 16.1 throttle, windsurf card interfaces stay down but the NEO card goes to the ready state. (The issue is seen if ISSU is done from Junos OS Release 14.2 to 15.1 onwards with an interface set.)

      - Before performing an upgrade (ISSU) to Junos OS Release 15.1 or later, static interface sets have to be disabled. The disabled interface sets can be enabled after the upgrade.

      PR1252360

    • In a VPLS multi-homing scenario, the CFM packets are forwarded over the standby PE link resulting in duplicate packets or loop between the active and standby link. PR1253542
    • By default, in Junos OS, the minimum length of the CHAP challenge is 16 bytes, and the maximum length is 32 bytes. In Juniper lab tests run without the configuration statement challenge-length minimum XX maximum XX, it was found that MX Series does not initialize the default CHAP challenge length, which, as documented, should be a minimum of 16 and a maximum of 32. PR1280263
    • A Junos OS upgrade involving Release 14.2R5 (and later 14.2 maintenance releases) and Release 16.1 and later mainline releases with CFM configuration can cause a CFMD core after the upgrade. This is due to the old version of /var/db/cfm.db. PR1281073
    • After GRES switchover "master-only" IP address of fxp0 might not be available. PR1289451

    Junos Fusion Provider Edge

    • In a Junos fusion setup, log capture does not work when the CLI command request support information is issued. PR1220575

    Layer 2 Features

    • When "input-vlan-map" with "push" operation is enabled for dual-tagged interfaces in "enhanced-ip" mode, there is a probability that the broadcast, unknown unicast, and multicast (BUM) traffic might be blackholed on some of the child interfaces of the egress aggregated Ethernet (AE) interfaces or on some of the equal-cost multi-path (ECMP) core-links. PR1078617

    Layer 2 Ethernet Services

    • When an MX Series router functions as a DHCP local server and an invalid configuration is used to deactivate the local server, the server might be halted while the subscriber entries remain active and stranded. This in turn has unexpected consequences: for instance, it prevents deactivating all dynamic profiles prior to the upgrade to enable the dynamic-profile versioning feature, and the subscribers cannot be pinged after the upgrade. PR935931
    • After changing the underlying IFD for a static VLAN demux interface the NAS-Port-ID is formed still based on the previous IFD. PR1255377

    Multiprotocol Label Switching (MPLS)

    • In BGP prefix-independent convergence (PIC) edge scenario, when the ingress route (the primary route) fails, due to the fact that LDP might fail to send the session down event to Packet Forwarding Engine correctly, the Packet Forwarding Engine might still use the primary path to forward traffic until (in some cases, 3-5 seconds for 30000 prefixes) the global convergence is completed by the interior gateway protocol (IGP). In addition, the issue might also be seen when the delay-delete CLI command is configured. In this scenario, the session down event might get sent to the Packet Forwarding Engine correctly. However, due to local reversion, the primary path might also be chosen as forwarding path when it is deleted. PR1097642
    • When graceful Routing Engine switchover (GRES) is done between the master and backup Routing Engines of different memory capabilities (such that one has only enough memory to run routing protocol process (rpd) in 32-bit mode while the other is capable of 64-bit mode, which could be caused by using Junos OS Release 13.3 onwards with the configuration statement auto-64-bit configured, or, using Junos OS Release 15.1 onwards even without the configuration statement), rpd might crash on the new master Routing Engine. As a workaround, this issue could be avoided by the CLI command set system processes routing force-32-bit. PR1141728
    • When configuring a CCC remote-interface switch or LSP switch, disable self-ping on the LSPs referred to in the CCC configuration by configuring the following: [edit protocols mpls label-switched-path lsp1] + no-self-ping; Otherwise, the LSPs do not complete MBB (make before break). PR1181407
    • A new configuration protocols mpls traffic-engineering bgp-igp-both-ribs in the routing-instance is required to make COC work. PR1252043
    • When SNMP get operations are performed for MPLS LSP statistics MIBs (such as mplsLspInfoList and mplsLspInfoOctets), or when LDP traffic statistics are enabled, a kernel memory leak might be seen. The growing kernel memory usage eventually causes the system to run out of memory and affects functionality. If the router runs out of memory, logs such as the following might be seen:

      root rpd[65406]: RPD_KRT_Q_RETRIES: route add: Resource temporarily unavailable
      root last message repeated 10 times
      root mgd[46362]: check_regex_add: 1072 regex_add = 0
      root rpd[65406]: RPD_KRT_Q_RETRIES: route add: Resource temporarily unavailable
      root kernel: rts_rnh_num_alloc(6190) error: 12, errmsg: rts_rnh_num_alloc Memory allocation Problem - selidlist
      root kernel: rt_pfe_veto: Memory over consumed. Op 1 err 55, rtsm_id 5:-1, msg type 2
      root kernel: rt_pfe_veto: Memory over consumed. Op 1 err 55, rtsm_id 5:-1, msg type 2
      root kernel: rt_pfe_veto: Memory over consumed. Op 1 err 55, rtsm_id 5:-1, msg type 2
      root kernel: rt_pfe_veto: Memory over consumed. Op 4 err 55, rtsm_id 5:-1, msg type 21

      To recover from this issue, a Routing Engine switchover can be performed. To check whether the issue is being hit, a specific kernel logger can be enabled:

      1. Check for temp usage in kernel virtual memory:

         root@root> show system virtual-memory | match temp
         temp 81718 1525K - 274135 16,32,64,128,256,512,1024,2048,4096,8192,65536
         {master}
         root@root> show system virtual-memory | match temp
         temp 82688 1540K - 275132 16,32,64,128,256,512,1024,2048,4096,8192,65536 <<< continuous increment should be seen

      2. Enable the kernel logger:

         >start shell user root
         # sysctl -w net.kern_logerr_enabled=1

         The following log might be seen in the messages file:

         rts_rnh_num_alloc(6190) error: 12, errmsg: rts_rnh_num_alloc Memory allocation Problem - selidlist

         After performing this change, remember to set the logger flag to 0 again:

         # sysctl -w net.kern_logerr_enabled=0

      This issue is fixed through internal PR1258308. Fixed in 16.1R4-S1/15.1R6/14.1R9. PR1265042
    • The throughput measurement might be inaccurate when performance measurement is done on an MPLS label-switched path. PR1274822
    • The RPD core involved an LDP egress route, which was stitched to a BGP route via the ldp egress-policy configuration. There is an assumption that LDP egress route should always be associated with a label, in order to install the ldp route. Shortly after Routing Engine switchover during a route flash, LDP found an egress route without a label and the RPD core happened. The issue was due to erroneous logic in LDP during Routing Engine switchover, which caused LDP to delete the label from the egress route. This triggered the RPD core during subsequent route flash. PR1290789
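For the kernel memory leak described above (PR1265042), the check for a continuously increasing temp line in show system virtual-memory | match temp can be sketched as a small parser. This is only an illustration under stated assumptions: the first numeric column is taken to be the in-use allocation count and the second the memory use in kilobytes, as in the sample lines quoted in the PR, and the function names are hypothetical.

```python
import re

# Matches lines such as:
#   temp 81718 1525K - 274135 16,32,64,...
# Assumed columns: type name, in-use count, memory use in KB.
TEMP_LINE = re.compile(r"^\s*temp\s+(\d+)\s+(\d+)K\s")


def parse_temp_usage(line):
    """Return (in_use, mem_kb) for a 'temp' line, or None if it does not match."""
    match = TEMP_LINE.match(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_steadily_growing(samples):
    """Return True when the in-use count rises across every consecutive sample."""
    usages = [u for u in (parse_temp_usage(s) for s in samples) if u is not None]
    if len(usages) < 2:
        return False  # not enough data points to call it a trend
    return all(later[0] > earlier[0] for earlier, later in zip(usages, usages[1:]))


if __name__ == "__main__":
    # The two sample lines quoted in the PR text.
    first = "temp 81718 1525K - 274135 16,32,64,128,256,512,1024,2048,4096,8192,65536"
    second = "temp 82688 1540K - 275132 16,32,64,128,256,512,1024,2048,4096,8192,65536"
    print(parse_temp_usage(first), is_steadily_growing([first, second]))
```

Two samples are the minimum; in practice more samples taken over a longer interval would give a more reliable indication than a single pair, since short-term growth in this counter can be normal.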

    Platform and Infrastructure

    • On T Series platform, when reloading the chassis which has SONET Clock Generators (SCGs) equipped, due to the timing issue (the issue might not be consistently observed), "No CG online" RED alarm might be displayed on the LCD panel and not cleared while in fact the SCGs are coming up later and this alarm should be cleared. PR991533
    • When TCP authentication is enabled on a TCP session, the TCP session might not use the selective acknowledgement (SACK) TCP extensions. PR1024798
    • On MX Series with MPCs/MICs based platform, when the feature flow-control is disabled (enabled by default) by using CLI command no-flow-control (for example, under "gigether-options" hierarchy), after bringing up or rebooting the MPC, due to the fact that status of the hardware might not be updated correctly, the flow control on that MAC might remain enabled. PR1045052
    • In configurations with IRB interfaces, during times of interface deletion, such as an FPC reboot, the Packet Forwarding Engine might log errors stating nh_ucast_change:291Referenced l2ifl not found. This condition should be transient, with the system re-converging on the expected state. PR1054798
    • On MX Series platform, parity memory errors might happen in pre-classifier engines within a MPC. Packets will be silently discarded as such errors are not reported and makes it harder to diagnose. After the change in this PR, CM-ERRORs, such as syslogs and alarms, will be raised when parity memory errors occur. PR1059137
    • StartTime and EndTime of the flow in inline-jflow (version 9) has future time-stamp. PR1067307
    • SNMP queries to retrieve jnxRpmResSumPercentLost return the RPM/TWAMP probe loss percentage as an integer value, whereas the precise value (including decimal points) can be retrieved through the CLI by using the following commands: show services rpm probe-results and show services rpm twamp client probe-results. PR1104897
    • In Junos OS Release 15.1F5 and later, the hidden configuration statement filter-list-template is enabled by default for all firewall filters on MX Series based platforms, so that all interfaces that use the same filter list share a common program on the MX Series boards. This can save microkernel memory and DMEM memory on the MX Series boards. The hidden configuration statement no-filter-list-template can be configured to disable this behavior. PR1157079
    • On MPC5E and MPC6E line cards, automatic next-hop tracing upon PPE traps might have a permanent impact on packet forwarding and is now disabled. PR1166479
    • A delegated BFD session over an aggregated Ethernet interface fails to come up after a FEB switchover with a FEB redundancy group (1:1 and 1:N). PR1169018
    • Multicast traffic might be dropped when the STP port role is changed. As a workaround, toggle the IGMP snooping membership. PR1193325
    • The junos:key attribute, which is emitted in the XML format of the configuration, is not emitted in the JSON format of the configuration. PR1195928
    • Due to a code defect related to the ephemeral database, rpd might crash if the ephemeral database is enabled. PR1214298
    • Due to transient hardware events, a fabric stream might continuously report 'CPQ1: Queue unrderrun indication - Queue <q>'. For each such event, all fabric traffic is queued for the Packet Forwarding Engine that reports the error, which causes a very high number of fabric drops. PR1265385

    Routing Protocols

    • On MX Series routers, when an instance type is changed from VPLS to EVPN and an interface is added to the EVPN instance in the same commit, the newly added EVPN interface might not come up. PR1016797
    • With Shared Risk Link Group (SRLG) enabled, under corner conditions, rpd might crash after the clear isis database command is executed, because the IS-IS database tree becomes corrupted. PR1152940
    • While qualifying Junos OS Release 16.1X60-D40 on an MX960 for BNG/subscriber management functionality, a customer encountered several times an issue in which the rpd process unexpectedly goes to 100 percent CPU utilization:

      {master}
      root@router> show system processes extensive | no-more
      last pid: 76128; load averages: 1.51, 1.46, 1.68 up 6+04:38:02 14:32:44
      198 processes: 2 running, 195 sleeping, 1 waiting
      Mem: 1415M Active, 5284M Inact, 2441M Wired, 2088M Buf, 6752M Free
      Swap: 8192M Total, 8192M Free

        PID USERNAME THR PRI NICE   SIZE     RES STATE    C    TIME    WCPU COMMAND
         10 root       4 155 ki31     0K     64K RUN      3  509.5H 304.10% idle
       5207 root       4  20    0  3017M   2140M kqread   0   23.0H 100.00% rpd
       4925 root       2 -26  r26   556M  47060K nanslp   1  511:02   5.08% chassisd
       5185 root       1  20    0   698M    176M select   2  139:31   0.20% authd
       5002 root       1  20    0   455M   7464K select   1   32:43   0.10% license-check
         11 root      30 -72    -     0K    480K WAIT   255  888:28   0.00% intr
      52981 root       1  35   15   459M  10360K select   1  469:19   0.00% sampled
      ..

      The following messages can be observed in the syslogs:

      Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_KRT_Q_RETRIES: route table add: Resource temporarily unavailable
      Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_SYSTEM: Get index for rt table failed: Resource temporarily unavailable
      Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_KRT_Q_RETRIES: route table add: Resource temporarily unavailable
      Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_SYSTEM: Get index for rt table failed: Resource temporarily unavailable
      Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_KRT_Q_RETRIES: route table add: Resource temporarily unavailable

      PR1240273
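
    To check whether a saved syslog file shows the message pattern quoted for PR1240273, the rpd messages can be tallied by tag. The following is a minimal Python sketch, not a Juniper tool. The function name count_rpd_tags and the regular expression are illustrative assumptions derived only from the sample lines quoted above, not from a documented syslog grammar.

```python
import re
from collections import Counter

# Sample lines copied from the syslog excerpt quoted for PR1240273.
SAMPLE_LOG = """\
Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_KRT_Q_RETRIES: route table add: Resource temporarily unavailable
Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_SYSTEM: Get index for rt table failed: Resource temporarily unavailable
Dec 7 03:36:56.615 2016 lab31 rpd[5474]: RPD_KRT_Q_RETRIES: route table add: Resource temporarily unavailable
"""

# Assumed layout: "<process>[<pid>]: <TAG>: <message>".
# This pattern is inferred from the sample lines only.
LINE_RE = re.compile(
    r"\s(?P<proc>[\w-]+)\[(?P<pid>\d+)\]: (?P<tag>[A-Z0-9_]+): (?P<msg>.*)$"
)


def count_rpd_tags(text):
    """Count syslog messages logged by rpd, grouped by message tag."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("proc") == "rpd":
            counts[match.group("tag")] += 1
    return counts


if __name__ == "__main__":
    # Print each tag with its count, most frequent first.
    for tag, total in count_rpd_tags(SAMPLE_LOG).most_common():
        print(tag, total)
```

    A steadily growing RPD_KRT_Q_RETRIES count alongside high rpd CPU utilization would be consistent with the symptom described, but the sketch only counts messages and does not diagnose the cause.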

    Services Applications

    • In an L2TP scenario, when the LNS is flooded by a high rate of L2TP messages from the LAC, the CPU on the Routing Engine might remain too busy to bring up new sessions. PR990081
    • When jnxNatSrcNumPortInuse is polled through an SNMP MIB get operation, the value might not be displayed correctly. PR1100696
    • In Junos OS Release 13.3 and later, when a /31 subnet address is configured under a NAT pool, the adaptive services daemon (SPD) crashes continuously. PR1103237
    • This issue is seen only with a very specific configuration in which thousands of routes are added by the SPD daemon, the daemon that manages the installation of NAT return routes and destination routes. Commits to the configuration would have to be performed nearly back to back. PR1223729
    • After business services are activated and a Routing Engine switchover is performed, deactivating the business services (also known as ESSM subscribers) by logging out the parent PPP session causes the business services to become stuck in the terminating state. Business services that have LI applied are stuck, whereas services that do not have LI are logged out successfully. PR1280074
    • For a multiservices PIC, SAs are retained in kmd (the IPsec key management daemon) when the PIC goes offline and are marked as 'not installed'. The same SAs are installed again when the PIC comes online. Sometimes, reverse routes have a reference count problem if SAs are retained during a PIC restart, which results in an incorrect next hop for the reverse routes. PR1285907
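
    As a precaution against the /31 NAT pool issue (PR1103237) listed in this section, pool prefixes can be screened before they are configured. The following is a minimal Python sketch that uses only the standard ipaddress module. The function name find_slash31_pools, the pool names, and the documentation-range addresses are hypothetical, and the sketch assumes that the issue concerns IPv4 /31 prefixes.

```python
import ipaddress


def find_slash31_pools(pools):
    """Return the names of pools that contain an IPv4 /31 prefix.

    `pools` maps a pool name to a list of prefix strings. This structure is
    a hypothetical stand-in for however the pool data is actually exported.
    """
    flagged = []
    for name, prefixes in pools.items():
        for prefix in prefixes:
            # strict=False accepts a host address with a mask, e.g. 203.0.113.11/31.
            network = ipaddress.ip_network(prefix, strict=False)
            # Assumption: only IPv4 /31 prefixes are relevant to this issue.
            if network.version == 4 and network.prefixlen == 31:
                flagged.append(name)
                break
    return flagged


# Hypothetical pool names with documentation-range addresses.
pools = {
    "pool-a": ["198.51.100.0/24"],
    "pool-b": ["203.0.113.10/31"],
}

print(find_slash31_pools(pools))
```

    Any pool that the sketch reports could be reviewed against PR1103237 before the configuration is committed.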

    Subscriber Access Management

    • In a subscriber management environment, if a graceful Routing Engine switchover (GRES) happens before the Acct-Start response is received, and the timeout on the service session happens before the timeout on the subscriber session, the authentication process (authd) might crash. PR1074011
    • Subscribers are stuck in the terminated state during a PPPoE login or logout test. PR1262219

    VPNs

    • In a Layer 2 circuit environment, when the l2ckt configuration includes the backup-neighbor statement, the flow label operation is blocked at the configuration level. PR1056777
    • In an NG-MVPN scenario, when forwarding-cache timeout never non-discard-entry-only is configured for an MVPN instance, the route disappears after 7 to 8 minutes, even though the cache lifetime is shown as forever in the output of the show multicast route instance X extensive command. PR1212061

    Modified: 2017-07-24