University of Texas at El Paso
Sacagawea

Welcome to Sacagawea, UTEP's Penguin Computing Linux cluster. Sacagawea consists of 64 compute nodes, 1 head node and 1 storage node, interconnected by two high-speed Gigabit Ethernet switches. Every node (compute, head and storage) has 2 AMD Opteron processors running at 2 GHz, 4 gigabytes of RAM and a local 120-gigabyte hard disk (accessible as /scratch). The storage node also has a disk array attached, which provides approximately 3 terabytes of disk space in a redundant RAID-5 configuration. This storage is divided into two partitions that are mounted on every node under /u1 and /u2.
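To put the per-node figures in perspective, the short Python sketch below multiplies them out for the 64 compute nodes. The constant and function names are illustrative only. Note that /scratch is local to each node, so the scratch total is a sum of separate disks and not one shared pool.

```python
# Per-node figures taken from the hardware description above.
COMPUTE_NODES = 64
PROCESSORS_PER_NODE = 2
RAM_GB_PER_NODE = 4
SCRATCH_GB_PER_NODE = 120


def compute_totals(nodes=COMPUTE_NODES):
    """Return aggregate resources for the given number of identical nodes."""
    return {
        "processors": nodes * PROCESSORS_PER_NODE,
        "ram_gb": nodes * RAM_GB_PER_NODE,
        # Sum of node-local disks; not a single shared file system.
        "scratch_gb": nodes * SCRATCH_GB_PER_NODE,
    }


if __name__ == "__main__":
    print(compute_totals())
```

Passing 66 instead of the default includes the head node and the storage node in the totals.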



Sacagawea, the new Beowulf cluster at UTEP

The cluster runs the Scyld operating system, which was developed by Donald Becker, one of the two original inventors of the Beowulf cluster. Scyld is based on the Single System Image (SSI) model, in which cluster users do not see processes running on the individual compute nodes; instead, all processes are visible on the head node. This is achieved via a special Linux kernel module called BProc. As a result, running an MPI program on the cluster, for example, does not require any knowledge of the individual nodes. Just tell the scheduler how many nodes you need and your job will transparently run on that number of nodes.
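As an illustration of "telling the scheduler how many nodes you need", here is a minimal Python sketch that renders a Torque-style (PBS) job script. It is a sketch under assumptions, not the documented procedure for Sacagawea: the function name, the job name, the walltime and the `mpirun -np` launch line are all hypothetical, and the default of 2 processes per node simply mirrors the 2 processors in each node. Check the Manuals page for the commands that actually apply.

```python
def render_job_script(program, nodes, procs_per_node=2,
                      walltime="01:00:00", name="mpi_job"):
    """Build the text of a Torque-style job script (illustrative only).

    Only the node count is requested; no individual node names appear,
    which matches the single-system-image model described above.
    """
    if nodes < 1 or procs_per_node < 1:
        raise ValueError("nodes and procs_per_node must be at least 1")
    total_procs = nodes * procs_per_node
    lines = [
        "#!/bin/sh",
        f"#PBS -N {name}",
        f"#PBS -l nodes={nodes}:ppn={procs_per_node}",
        f"#PBS -l walltime={walltime}",
        # Torque sets PBS_O_WORKDIR to the directory qsub was run from.
        "cd $PBS_O_WORKDIR",
        # Assumed launch line; the real command on this cluster may differ.
        f"mpirun -np {total_procs} {program}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_job_script("./hello_mpi", nodes=4), end="")
```

The generated text could be saved to a file and handed to the scheduler's submit command; `./hello_mpi` is a placeholder program name.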


If you need an account on Sacagawea, please contact Sergio Zapata (snzapata at utep.edu).

If you have an account, please subscribe to the sacagawea-users email list to stay up to date on the latest news and maintenance downtime.

Sacagawea was purchased with funding from the UT High Assurance Systems Infrastructure STAR award.

   
News and Information    

Scheduler problem resolved - Monday, April 16, 2007
The problem with the job scheduler has been resolved and the cluster is ready to accept jobs again. 

Problems with mpirun - Wednesday, March 14, 2007
MPI programs currently cannot be started on the cluster because the mpirun/mpiexec programs are broken. We are working on the problem together with the vendor. Thanks for your patience.

Cluster is operational again - Friday, February 16, 2007
A reboot of all the compute nodes solved the problems with the scheduler. Please resubmit your jobs. Again, sorry for the inconvenience.

Cluster scheduler problems - Thursday, February 15, 2007
Probably as a result of this morning's problems, scheduling jobs on the cluster currently fails with errors. I am working to find the cause and fix it as soon as possible. Until then, please be patient. Sorry for the inconvenience.

NFS problems this morning - Thursday, February 15, 2007
The NFS node experienced a kernel panic this morning and had to be rebooted. Because many processes were left hanging on the head node, it was rebooted as well. For the moment we have also gone back to the Torque scheduler because of scheduling problems with Maui. As a result, the showq command does not work; everything else is operational.

Ganglia cluster monitoring available - Friday, February 02, 2007
The web-based cluster monitoring tool Ganglia is available at http://sacagawea.utep.edu/ganglia. This will show you a detailed overview of the current load on the cluster nodes. 

Cluster maintenance finished - Friday, February 02, 2007
Sacagawea was down for maintenance for most of today, but is back up again. All nodes have been rebooted and the Maui scheduler has been installed on the system. See the Manuals page for more details on Maui.

Node number 1 fixed and operational - Friday, January 12, 2007
Compute node number 1 in the cluster had a faulty memory bank. The vendor has replaced it and the node is operational again. Thanks for your patience.

New password policy implemented - Friday, January 05, 2007
We have implemented a new password policy for the cluster. You must change your password at least every 60 days, counted from your last password change. The system will start warning you 7 days before your account is locked. If you have problems logging in, please contact hipcep@utep.edu.
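The dates implied by this policy can be worked out with a few lines of Python. This is only a sketch: it assumes the account is locked exactly 60 days after the last change and that warnings begin exactly 7 days before that; the function and constant names are illustrative, and the system's own accounting may differ by a day.

```python
from datetime import date, timedelta

MAX_PASSWORD_AGE_DAYS = 60   # from the policy above
WARNING_DAYS = 7             # warnings start this many days before lockout


def password_deadlines(last_change):
    """Return (first_warning_date, lock_date) for a given last-change date."""
    lock_date = last_change + timedelta(days=MAX_PASSWORD_AGE_DAYS)
    first_warning = lock_date - timedelta(days=WARNING_DAYS)
    return first_warning, lock_date


if __name__ == "__main__":
    warn_from, locked_on = password_deadlines(date(2007, 1, 5))
    print("warnings start:", warn_from)
    print("account locked:", locked_on)
```

For a password last changed on January 5, 2007, this gives warnings from February 27, 2007 and a lock date of March 6, 2007.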

Beorun bug for stdin/stdout discovered - Friday, January 05, 2007
We have discovered a bug in beorun that makes it impossible to schedule jobs that need to read input from stdin (<) or write to stdout (>). More info can be found in note 2 on the Manuals page.

Happy new year! - Friday, January 05, 2007
A warm welcome back to all users of the Sacagawea cluster. Hope you all had a great holiday and are looking forward to doing tons of work now that you're back.  

PGI compilers installed on the cluster - Friday, November 03, 2006
The Portland Group C/C++ and Fortran compilers have been installed on the cluster. Drop akerstens@utep.edu an email if you need access to these tools. Many thanks to Dr. Carrasco for providing the funding for this.