To bring ARM64 servers into large-scale production, both out-of-band and in-band RAS (Reliability/Availability/Serviceability) solutions have to be on par with those of servers based on other architectures. This presentation discusses the requirements and designs of various RAS solutions and how hardware/firmware and higher-level software work together to achieve these goals, with a focus on firmware. RAS topics covered include memory, PCIe, CPU, thermal management, and catastrophic errors. In addition, future directions and expectations for server RAS technologies are presented.


