Computer clusters power everything from Google and Facebook to online retail and banking. They consist of hundreds or even thousands of machines connected by networks, typically housed in a vast data centre.
“It’s a complex system,” says Cheriton School of Computer Science Professor Samer Al-Kiswany. “In a big cluster, you have thousands of computers connected by hundreds of switches and routers — switching devices that forward traffic between the nodes — and you have complex software to manage the switches and routers. Systems need to be able to tolerate not just node failure, but network failure as well.”