Operating System for the Earth Simulator.

Accession number;03A0133863
Title;Operating System for the Earth Simulator.
Author;YANAGAWA T (NEC Corp.)
Journal Title;NEC Res Dev
Journal Code;G0138A
ISSN;0547-051X
VOL.44;NO.1;PAGE.43-46(2003)
Figure&Table&Reference;FIG.3, REF.1
Pub. Country;Japan
Language;English
Abstract;The Earth Simulator is a large-scale distributed-memory parallel computer system consisting of processor nodes, each a shared-memory vector multiprocessor. Its development started in 1997 under the former Science and Technology Agency (now the Ministry of Education, Culture, Sports, Science and Technology) and was completed in February 2002. This paper presents an overview of the operating system of the Earth Simulator (ES). The operating system for the ES is based on SUPER-UX, the UNIX operating system for the general-purpose SX series supercomputers. To realize high-performance parallel processing on this highly parallel machine, the operating system is enhanced, especially in scalability. The ES system is managed as a two-level cluster system called the Super Cluster System. In the Super Cluster System, the ES system is divided into 40 clusters (16 nodes/cluster), and a single controller called the Super Cluster Control Station (SCCS) manages all the clusters. This management system provides Single System Image (SSI) operation, management and job control for the large-scale multi-node system. The Job Scheduler (JS) and NQS running on the SCCS control all the jobs of the system. They schedule resources that generally have not been treated as schedulable resources, such as processing nodes and files. This function realizes efficient scheduling for large-scale jobs. (author abst.)
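
Note;The two-level layout described in the abstract (40 clusters of 16 nodes each) can be illustrated with a minimal Python sketch. Only the cluster and node counts come from the abstract. The zero-based contiguous node numbering, and the names NodeLocation, locate and total_nodes, are assumptions made for illustration and do not describe the actual SCCS or SUPER-UX interfaces.

```python
from dataclasses import dataclass

# Figures taken from the abstract.
NUM_CLUSTERS = 40
NODES_PER_CLUSTER = 16


@dataclass(frozen=True)
class NodeLocation:
    """Hypothetical address of a node in the two-level layout."""
    cluster: int  # 0 .. NUM_CLUSTERS - 1
    slot: int     # 0 .. NODES_PER_CLUSTER - 1


def total_nodes() -> int:
    """Number of processor nodes implied by the cluster layout."""
    return NUM_CLUSTERS * NODES_PER_CLUSTER


def locate(node_id: int) -> NodeLocation:
    """Map a global node id to (cluster, slot).

    Assumes nodes are numbered contiguously from 0, cluster by cluster;
    the real numbering scheme is not given in the abstract.
    """
    if not 0 <= node_id < total_nodes():
        raise ValueError(f"node_id {node_id} is outside 0..{total_nodes() - 1}")
    cluster, slot = divmod(node_id, NODES_PER_CLUSTER)
    return NodeLocation(cluster, slot)
```

Under these assumptions, total_nodes() gives 640 and locate(16) falls in the second cluster, which is the kind of single, system-wide addressing a central controller would need in order to treat nodes as a schedulable resource.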