Running a stupid test, on a stupid cluster, to flush out a stupid non-repeatable error, so that we can fix it. Basically, the damn thing dies “every now and then” with no discernible stimulus. Wrote up a crap-tastic test suite to stress each of the following factors in turn: network, CPU, disk reads, disk writes, and jobs in the queue. Right now it’s busily populating 20 different files from 20 different computers with lines of the form “I was here at $date”. If that doesn’t provoke the error in an hour or so, I shall have all 20 computers computing pi for several hours. If that doesn’t kill it, they will begin to talk to each other very rapidly. “Hello?” “Hello.” “Hi?” “Hi.” “Hellooooo?” “Hi.” “Can you hear me now?” “How about now?”
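For the curious, the file-populating stage boils down to something like the sketch below. It is a minimal Python stand-in, not the actual suite: the function name `write_marker_lines`, the hostname prefix, and the `fsync` after every line are all my assumptions for illustration, chosen so that each line really hits the disk instead of sitting in a buffer.

```python
import datetime
import os
import socket


def write_marker_lines(path, count):
    """Append `count` timestamped lines to `path`, forcing each one to disk.

    Hypothetical helper: the real suite's file layout and line count
    are not shown here.
    """
    host = socket.gethostname()
    with open(path, "a") as f:
        for _ in range(count):
            stamp = datetime.datetime.now().isoformat()
            f.write(f"{host}: I was here at {stamp}\n")
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push it to the disk
    return count
```

Each of the 20 machines would run this against its own file, with a large `count`, until the hour is up or the cluster falls over.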
In other news, there is a cat in heat in my neighborhood. It has been mrowling loudly for about four days now. If anyone has a male cat that needs lovin’, please send it to my street. This “mrowl, mrowl, mrooooooowl, mrow, ow,…” needs to stop.