[IIAB] [UKids] Internet-in-a-Box speed profiling tips on different CPUs?

Richard Smith smithbone at gmail.com
Sun Jun 8 12:59:20 PDT 2014


On 06/07/2014 09:05 AM, Tim Moody wrote:

> Of course, the trick in all this is to make the tests repeatable so that
> comparisons are meaningful.  I agree with Tony that accessing select
> maps, wiki pages, and perhaps videos are good candidate load tests and I
> would add some collaborative tests that exercise ejabberd.

Just to set expectations, I highly doubt there will be some miracle 
change that makes the setup run like greased lightning.

Also, the scope of the testing I'm talking about doing is fairly limited. 
Specifically, I'm going to try to determine whether the index lookups 
IIAB uses to find data are primarily an IO problem, or whether aspects 
of them are CPU bound.

This knowledge will be useful in deciding where best to spend $$ on XS 
hardware designed to serve that content.

> Some efforts have been made in the past such as
> http://wiki.laptop.org/go/XS_Load_Testing

Thanks.  I'll take a look at that.

> Perhaps Adam can give us an idea of the situation at the deployments in
> Haiti.

+1 Adam or George? This would be useful information.

> Saint Jacob's represents a reasonable example of the deployments
> I am familiar with: three classrooms, 30-40 XOs per classroom, a total
> of 100 XOs connected to a single school server, one router per
> classroom, dhcp provided by the server (all laptops on the same LAN).

I fear that getting low latency out of this setup will be a tall order 
for any low-end server.  Consider the following:

A reasonable number for the throughput of an 802.11g wireless network 
with 40 nodes is 10 Mbit/s.  If a search request involves 1 kB of data 
then all 40 nodes could issue a request in the space of 32 ms.  
Essentially all at once.  So there could be 40 search requests 
outstanding on the server.  I don't know what the multiplication factor 
from a search request to disk IO is, but I'd be surprised if it was less 
than 50.  Let's be optimistic, assume I'm wrong, and call it 25.  So 
those 40 nodes can generate 40*25 or 1000 IO requests.
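A quick sketch of that arithmetic, using the numbers above (the 1 kB 
request size and the 25x IO multiplier are the assumptions, not 
measurements):

```python
# Back-of-envelope check of the burst arithmetic above.
# Assumptions (from the text): 40 nodes, ~10 Mbit/s usable 802.11g
# throughput, 1 kB per search request, ~25 disk IOs per search.
nodes = 40
throughput_bps = 10_000_000          # 10 Mbit/s
request_bits = 1_000 * 8             # 1 kB request on the wire
burst_window_s = nodes * request_bits / throughput_bps

io_multiplier = 25                   # optimistic search -> disk IO fan-out
total_ios = nodes * io_multiplier

print(f"burst window: {burst_window_s * 1000:.0f} ms")   # ~32 ms
print(f"outstanding disk IOs: {total_ios}")              # 1000
```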

A look at Newegg shows that the best 7200 rpm 1 TB drives quote 5 ms 
average seek time; others are 15 ms.  That's 5 to 15 seconds of just 
head-seek time.  Adding data transfer time on top of that could easily 
push total latency up toward a minute.
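The seek-time floor, under the same assumptions (1000 outstanding IOs 
on a single spindle, seek times from the drive specs above):

```python
# Seek-time floor for 1000 outstanding IOs on one 7200 rpm drive.
# The 5 ms and 15 ms average seek times are the figures quoted above.
total_ios = 1000
for seek_ms in (5, 15):
    total_s = total_ios * seek_ms / 1000
    print(f"{seek_ms} ms avg seek -> {total_s:.0f} s of head-seek alone")
```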

Of course it's just lies until I have data to back it up, but I'm going 
to make a prediction that latency performance correlates more with the 
amount of RAM and HD seek speed than with CPU speed.

That said there's a wildcard in the system.  I've only done a cursory 
look at the stuff that is on the IIAB disk Adam brought me but the 
indexing appears to use a python package called Whoosh that is pure 
python.

The Whoosh docs claim that just because it's pure Python does not 
automatically mean it's slow.  The docs have a single-point, 
non-concurrent benchmark showing that search performance is comparable 
to a non-Python engine like Xapian.  That benchmark, though, was on a 
pretty beefy machine.  In general, Python programs tend to thrash the 
CPU cache.  Better/faster (more $$) CPUs usually have more cache to 
keep the CPU fed, so this could be an explanation if performance turns 
out to correlate with CPU.

Also, depending on how it's done, calling the search implementation 
concurrently via the web server front end is ripe for scaling problems. 
If 40 search requests involve creating 40 Python instances, then that's 
going to be a massive increase in the peak load.
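For what it's worth, a front end that keeps a persistent pool of 
workers (rather than spawning a fresh interpreter per request) 
sidesteps that particular cliff.  A minimal sketch of the idea; the 
`search` function here is just a stand-in for whatever index lookup 
IIAB actually makes, not its real API:

```python
# Hypothetical sketch: serve 40 concurrent search requests from a
# small persistent worker pool instead of 40 fresh python instances.
from concurrent.futures import ThreadPoolExecutor

def search(query):
    # Stand-in for the real index lookup (e.g. a Whoosh query).
    return f"results for {query!r}"

queries = [f"reptile {i}" for i in range(40)]
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(search, queries))

print(len(results))  # 40 requests served by at most 8 persistent workers
```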

> A class doing a research project on different types of reptiles using
> wikipedia for schools. Each student is to write a report of 50-100 words
> with an image of a reptile they select. Assume they search for reptiles.
> Go to the article. Select a reptile type, e.g. lizard and then download
> an image from the article.

Thanks.  This seems like a great workload to use as a baseline test.

-- 
Richard A. Smith


