Wednesday, March 25, 2020

COVID-19: Finding a cure with Folding@Home

Got extra compute power and want to help research scientists create a cure for the virus?  Folding@Home is a distributed computing network where we donate our spare compute power by running a client that processes work units, which are like puzzle pieces of a bigger computation. The network has already grown to more total compute power than the world's 7 fastest supercomputers combined.

How do you know if you are helping research the cure?  Check this link and look at your project ID.  I've stood up an 8-computer F@H farm and plan on adding another 2 computers once I get parts delivered.  4 of these nodes are actually unraid VMs that each have a GPU assigned, even though they are physically one server. You can see a picture of the unraid server build for the GPUs here.  The 5 CPU-only nodes are below, minus my laptop.  These are all the systems I replaced with the single unraid server.



I've used the advanced control tool to add in all of the node members.  Each node has to be set up to allow remote access first, as it's locked down by default.  Once they are all added, you can see your total Points Per Day for all nodes at the bottom:


I found that running 4 GPUs plus all VM vcores draws more than 1,100 watts and overloads my UPS.  GPUs fold much faster than CPU cores, so I've prioritized GPUs on the unraid server, which is why the 4 VM nodes don't have CPU slots (saving just enough juice to not overload the UPS).  The 6 other machines are CPU-only: they are using onboard video since I pulled their GPUs for the unraid server build.  So in total, the unraid server pulls about 900 watts, while 4 of the 6 CPU-only systems are currently pulling about 500 watts.
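To put those numbers together, here is a quick back-of-the-envelope sketch. The two wattage figures come from the paragraph above; the 24-hour duty cycle is my assumption for machines that fold around the clock, and the function names are made up for illustration.

```python
# Power figures quoted in the post, in watts.
UNRAID_SERVER_W = 900     # the single unraid server hosting the 4 GPU VMs
CPU_ONLY_NODES_W = 500    # the 4 CPU-only systems currently running


def total_draw_watts(*loads_w):
    """Sum the individual loads into one total draw."""
    return sum(loads_w)


def daily_energy_kwh(watts, hours=24):
    """Energy used over `hours` at a constant draw (assumes 24/7 folding)."""
    return watts * hours / 1000


total_w = total_draw_watts(UNRAID_SERVER_W, CPU_ONLY_NODES_W)
print(f"Total draw: {total_w} W")
print(f"Energy per day: {daily_energy_kwh(total_w):.1f} kWh")
```

Multiply the daily kWh figure by your own electricity rate to get a rough running cost.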

Update: You'll want to disable the Spectre & Meltdown microcode mitigations, as they slow down Intel-based systems by up to 30%.

What I've found is that the back end servers are having bandwidth/connectivity issues, which cause the client to spend a long time waiting for work.  I created a few different scripts for forcing the client to attempt a download more frequently.  Here's the script I run locally on my main system when it gets stuck waiting for work.
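For readers who want to try something similar, here is a minimal sketch of one way to nudge the client. It assumes the v7 FAHClient is running on the same machine with its text command interface on the default port 36330 and no password set, and it assumes that a pause followed by an unpause makes the client ask for work again sooner. The function names are hypothetical.

```python
import socket
import time

FAH_HOST = "127.0.0.1"   # assumption: the client runs on this machine
FAH_PORT = 36330         # assumption: default FAHClient command port


def retry_commands(slot=None):
    """Build the pause/unpause pair, optionally limited to one slot id."""
    suffix = f" {slot}" if slot is not None else ""
    return [f"pause{suffix}", f"unpause{suffix}"]


def nudge_client(host=FAH_HOST, port=FAH_PORT, slot=None, gap_s=5.0):
    """Pause, wait a few seconds, then unpause so the client retries."""
    pause_cmd, unpause_cmd = retry_commands(slot)
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall((pause_cmd + "\n").encode("ascii"))
        time.sleep(gap_s)
        conn.sendall((unpause_cmd + "\n").encode("ascii"))
        conn.sendall(b"exit\n")  # close the command session cleanly


# Example (needs a running client): nudge_client() or nudge_client(slot=1)
```

If the client has a password configured, an `auth` command would have to be sent before the others. Be careful pausing a slot that is actively folding, since this sketch does not check slot state first.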



I also created this script, which runs on startup on pure folding machines and has loop logic to restart when CPU & GPU usage is low.
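The loop logic can be sketched roughly like this. It is a simplified illustration, not a drop-in tool: `read_usage` and `restart` are placeholder callables you would supply yourself (for example, a function returning the higher of CPU and GPU utilization in percent, and a function that restarts the client), and the window, interval and threshold values are assumptions.

```python
import time


def is_idle(samples, threshold_pct=10.0):
    """True when there are samples and every one is below the threshold."""
    return bool(samples) and all(s < threshold_pct for s in samples)


def watchdog(read_usage, restart, window=6, interval_s=30.0,
             threshold_pct=10.0, max_checks=None):
    """Poll usage and call restart() when the last `window` readings are low.

    read_usage: hypothetical callable returning utilization in percent.
    restart:    hypothetical callable that restarts the folding client.
    max_checks: stop after this many polls (None means run forever).
    """
    recent = []
    checks = 0
    while max_checks is None or checks < max_checks:
        recent.append(read_usage())
        recent = recent[-window:]          # keep only the newest readings
        if len(recent) == window and is_idle(recent, threshold_pct):
            restart()
            recent = []                    # start a fresh window afterwards
        checks += 1
        time.sleep(interval_s)
```

Requiring a full window of low readings, rather than a single one, avoids restarting during the brief dip in usage between work units.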