This is a follow-up to my previous two posts, MarkUs Performance Analysis (1) and MarkUs Performance Analysis (2). Please have a look at them, as they detail the set-up of the lab machines I’ve been using as well as which scripts were used and how to use them. Also note that I’ve added a couple of additional scripts to make it easier to kick off load test runs. More on the load-testing scripts is in this review request. At this point I am able to report first results. If you are interested in the gory details, please have a look at this Git repo, which contains all raw logs in addition to the spreadsheets and other things I’ve produced while conducting the experiments.
Here is a table with first results of the various experiments I’ve conducted so far. A “Runner” is one call to ./post_submissions.sh; each call to ./post_submissions.sh in turn generates 7 requests to the MarkUs server. Unless otherwise noted, results are for a clean MarkUs installation (no Subversion repositories exist). Note that the timing info is the elapsed time as reported by /usr/bin/time for each curl call. The results listed are averages over the number of students (one student is one sample).
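To make the measurement concrete, here is a minimal sketch of the timing loop: each request’s elapsed time is recorded and the samples are averaged afterwards. The real scripts wrap each curl call in /usr/bin/time; here a sleep stands in for one request, and the file name and request count are placeholders so the sketch runs anywhere:

```shell
# Sketch of the per-request timing loop (sleep stands in for one curl call).
times_file=$(mktemp)
for i in 1 2 3; do
  start=$(date +%s%N)                              # nanoseconds (GNU date)
  sleep 0.05                                       # stand-in for one request
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 )) >> "$times_file"   # elapsed ms
done
# Average the samples, as done for the per-student numbers in the table.
awk '{ sum += $1 } END { printf "avg %.1f ms\n", sum / NR }' "$times_file"
rm -f "$times_file"
```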
Here are a few observations I’ve made during my experiments.
- Both mongrel-based and Passenger-based MarkUs set-ups seem to be IO bound under load on a clean MarkUs instance (load averages > 2, up to ~17 on a dual-core server machine). A clean MarkUs instance means that no Subversion repositories exist prior to each experiment.
- Memory does not seem to be an issue. 2 GB of RAM was no problem for 24 mongrels.
- top reports around 40% IO wait while experiments are running. The question is: where are these IO requests coming from?
- When an experiment runs, Ruby processes show up at the top of top’s process list (for both mongrel and Passenger setups). If I add the WCHAN field in top, the sleep functions of these processes seem to be filesystem- or scheduler-related. Note that the server’s filesystem is ext4. In particular, the most prominent “sleep functions” are jbd2_log, synch_page, get_write and blk_dev_is.
- I’ve used oprofile to profile the entire system while an experiment is running. Here is an example of the opreport output of one run. Not surprisingly, for about 60% of the time the CPU was not halted, a Ruby process was running (libruby.so.1.8.7). Note that the Subversion- and PostgreSQL-related percentages are negligible. I’m not sure why. Does anybody have thoughts on this?
- A second run of ./run-load-tests.sh, with Subversion repositories already existing and submissions resulting in conflicts (i.e. no SVN IO), is significantly faster (~3 times faster). See the last experiment in the table.
- Passenger-based setups seem to use 6 Ruby processes. This seems similar to what could be achieved via an Apache reverse proxy in front of a cluster of 6 mongrels. Perhaps this is different for non-IO-heavy workloads.
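For anyone who wants to reproduce the WCHAN observation without clicking through top’s interactive field toggles: on Linux the same information can be read straight from /proc. A minimal sketch (it falls back to the current shell’s pid when no Ruby process is running, so it works on any machine):

```shell
# Print the kernel sleep function (WCHAN) for each running Ruby process.
pids=$(pgrep ruby 2>/dev/null)
[ -n "$pids" ] || pids=$$        # fall back to this shell so the sketch runs anywhere
for pid in $pids; do
  # /proc/<pid>/wchan holds the symbol top shows in its WCHAN column
  printf '%s\t%s\n' "$pid" "$(cat /proc/"$pid"/wchan)"
done
```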
So what does all of this mean? Good question. Here are my thoughts:
- I think the main causes of heavy IO, and hence the main sources of slowdown, are request number 5 (where Subversion repositories get created) and request number 7 (where file submissions happen and files are stored in the Subversion repositories).
- There does not seem to be a significant difference between mongrel-based and Passenger-based set-ups (at least for the IO-heavy experiment).
- Something else?
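One follow-up experiment that might help confirm the first point: isolate repository creation from the rest of the request path with a tiny micro-benchmark. A hedged sketch (the repo count and paths are arbitrary, and it skips gracefully when svnadmin is not installed):

```shell
# Micro-benchmark: time N back-to-back `svnadmin create` calls in isolation,
# to see how much of the per-request cost is just repository creation IO.
if command -v svnadmin >/dev/null 2>&1; then
  dir=$(mktemp -d)
  start=$(date +%s%N)
  for i in 1 2 3 4 5; do
    svnadmin create "$dir/repo_$i"
  done
  end=$(date +%s%N)
  echo "created 5 repos in $(( (end - start) / 1000000 )) ms"
  rm -rf "$dir"
else
  echo "svnadmin not installed; skipping"
fi
```

If this alone accounts for a large share of request 5’s elapsed time, that would support the repository-creation hypothesis.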
Now I’d be interested in your input. How would you interpret the data? What additional experiments should I run? Did I make a mistake somewhere? Please leave your feedback in the comments. Thanks!