Wednesday, December 22, 2010

CIS(theta) Meetings VII & VIII (2010-2011) - pelicanHPC Troubleshoot!

Aim: 
pelicanHPC Troubleshoot!

Attending: 
CIS(theta) 2010-2011: DavidG, HerbertK, JoshG, RyanH

Visiting:
CIS(theta) 2009-2010: SteveB
CIS(theta) 2008-2009: MarcA
CIS(theta) 2007-2008: ChrisR, FrankK

In Spirit:
CIS(theta) 2010-2011: JayW
CIS(theta) 2009-2010: ArthurD, DevinB, JeremyA
CIS(theta) 2008-2009: MitchelW
CIS(theta) 2007-2008: NathanielR

Reading: 
Nope, just troubleshooting...

Research: 

As you can see, we had a full house today! Wow, thanx guys for the great turnout right before the holidays and for all the troubleshooting you did! BTW, this meeting took the place of both meetings for this month as I had to cancel two weeks ago.

Anyway, we tried booting a single node of the cluster from a student station (AMD Athlon 64-bit dual-core) using a pelicanHPC 64-bit 2.1 CD. We ran the pelican_setup script and compiled flops.f:

mpif77 -o flops flops.f

and ran it stressing only one core to see how many MFLOPS we could get:

mpirun -np 1 --hostfile tmp/bhosts flops

and we got nearly 400 MFLOPS. Then we tried 2 cores and got a bit over 750 MFLOPS. Try as we might, we could not get two or more nodes of this type to boot up via PXE boot over the school Linux network. Then we tried a crossover Ethernet cable and got 4 cores running flops at over 1500 MFLOPS! We tried the same thing with the two new Intel Xeon 32-bit dual-core servers we just inherited. The difference here was that we used a pelicanHPC 32-bit 2.2 CD, which compiles and runs flops automatically for a single node. Also, we got PXE boot to work over the school Linux network and got 4 cores running at over 2200 MFLOPS!
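
For the record, scaling a run up or down is just a matter of the -np flag plus the hostfile that pelican_setup writes out for you. A hostfile for one of those 4-core runs would look something like the sketch below (the node names are made up here; pelicanHPC generates tmp/bhosts automatically, and "slots" is standard Open MPI hostfile syntax for cores per node):

# hypothetical tmp/bhosts: two dual-core compute nodes
node1 slots=2
node2 slots=2

mpirun -np 4 --hostfile tmp/bhosts flops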

We also tried hooking some laptops up to a private router running pelicanHPC 2.2 and mixed in some of the student stations. This worked well since it was off the school network. At one point we had 4 nodes running 8 cores at about 3400 MFLOPS. Thanx for bringing in your own laptops, routers, and cables! Maybe we should get our own gigabit switches?

OK, so that's progress! I think we are sick of playing with the school network: public vs. private, eth0 vs. eth1, starting a DHCP server or not. So, the consensus now is to roll our own version of pelicanHPC (see the research links above - the last link is not HPC related and is there just for fun). We'll remaster pelicanHPC without its own DHCP server so it can use the existing DHCP server on the public Windows network on eth0, since pelicanHPC seems to work best on eth0. Not to worry, pelicanHPC is based on debianLive, which is supposed to be easy to remaster.
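
In case anyone wants to follow along at home, here is a rough sketch of how a debianLive remaster might go using Debian's live-build toolchain. This is our best guess at this point, not a tested recipe: the package-list path varies by live-build version, and pelicanHPC itself is generated by its author's make_pelican script, which may be the saner thing to edit directly.

sudo apt-get install live-build
mkdir shadowfaxHPC && cd shadowfaxHPC
lb config
# add the MPI bits to the image (path varies by live-build version)
echo "openmpi-bin libopenmpi-dev gfortran" > config/package-lists/mpi.list.chroot
sudo lb build

The DHCP change itself would just mean not starting pelican's own dhcpd in its setup scripts and letting the nodes PXE boot off the existing server on eth0 (the Windows DHCP server would need its next-server and bootfile options, 066 and 067, pointed at a TFTP server hosting the boot image).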

Let us proclaim it throughout the land: our remaster shall be named shadowfaxHPC! Oops, sorry to get all LOTRy there. Where's Gandalf when you need him?

Have a great Winter Break and Happy Holidays!

Rest up, you're going to need it....

Happy Clustering,

2 comments:

  1. Hi, can you share your remastered shadowfaxHPC version? It would save me a lot of work and I only use it for academic and non-commercial purposes. Thanks.

  2. Sorry, I thought I replied to this a while ago. Anyway, we never did remaster the DVD. We use pelicanHPC.org as is. HTH, AJG
