    Understand What Happens When Memory Failures Occur

    Most Juniper Networks Routing Engines support Error Checking and Correction (ECC) protected memory. ECC memory distinguishes two types of memory errors: single-bit and multiple-bit.

    A single-bit error occurs when one bit in memory holds the wrong value (a 0 that should be a 1, or the reverse). The system detects and corrects single-bit errors, then logs each event in the /var/log/eccd file. If single-bit errors persist, the Routing Engine controller reboots the Routing Engine. Persistent single-bit errors can be a symptom of faulty RAM.

    A multiple-bit error occurs when more than one bit is incorrect. By default, when a multiple-bit error is detected, a nonmaskable interrupt (NMI) is generated that interrupts the Routing Engine and panics the kernel. The kernel leaves a vmcore file, and the router then reboots. Multiple-bit parity error detection is available in Junos OS Release 5.3 and later.
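
    The difference between a correctable single-bit error and a detectable but uncorrectable multiple-bit error can be illustrated with a small single-error-correcting, double-error-detecting (SECDED) code. The sketch below is a toy extended Hamming code over 4 data bits, written only to show the principle. It is not the ECC scheme used by any Routing Engine hardware or by Junos OS (ECC memory modules typically protect much wider words), and the function names encode and decode are hypothetical.

```python
# Toy SECDED (extended Hamming 8,4) code: an illustration only.
# Assumption: this is NOT the Routing Engine's actual ECC implementation.

def encode(nibble: int) -> list:
    """Encode 4 data bits into an 8-bit codeword (list of bits, index 0..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]      # d[0] is the least significant bit
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d         # data bits sit at non-power-of-two positions
    code[1] = code[3] ^ code[5] ^ code[7]          # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]          # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]          # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2                    # overall parity bit: whole word has even parity
    return code


def decode(code: list) -> tuple:
    """Return (status, nibble); status is 'ok', 'corrected', or 'uncorrectable'."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                        # XOR of the positions of all set bits
    overall = sum(code) % 2                        # 1 means an odd number of bits flipped

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Single-bit error: the syndrome is the position of the flipped bit
        # (a syndrome of 0 means the overall parity bit itself flipped).
        code = list(code)
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity: two bits flipped.
        # The error is detected but cannot be corrected.
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)

    single = list(word)
    single[5] ^= 1                                 # flip one bit
    print(decode(single))                          # ('corrected', 11)

    double = list(single)
    double[2] ^= 1                                 # flip a second bit
    print(decode(double))                          # ('uncorrectable', None)
```

    In this sketch, the "corrected" result corresponds to the single-bit case, where the data is repaired and the event can simply be logged. The "uncorrectable" result corresponds to the multiple-bit case, where the data can no longer be trusted and the only safe response is to stop, which is why a multiple-bit error leads to a kernel panic and reboot.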

    Published: 2012-08-20