Consistency graphs


Comments:
- As expected, the results are consistent from one test to another (the confidence intervals overlap well).
- The SpamAssassin filter did not produce any false positives during the tests. We therefore do not scale the false-positive values to make them readable on the graphs, as is normally done when those values are small but nonzero.
- The result of the test with the time scale equal to 1 is missing because we need more time to generate it (will be added soon).
- Changing the time scale reveals a very interesting property of the tool. As expected, changing the time scale from 60 to 30 leaves the results almost unchanged, because the same initial seed produces the same scenarios of email user and spammer behavior in both tests, and because the tests run on the same machines (the same network). The noticeable difference at a time scale of 15 is due to the longer real time needed to perform the runs: some Planet Lab machines were rebooted, and the affected runs were invalidated and excluded from the computation of the mean and confidence interval (by default each test consists of 20 runs). This additional randomness made the result at time scale 15 differ from the results at time scales 60 and 30, but the confidence intervals overlap in all three cases, which still indicates good consistency among the three tests. We can conclude that using a larger time scale not only speeds up the tests but also better avoids the problems caused by unstable Planet Lab nodes.
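
The consistency check described in the comments above (drop invalidated runs, compute a mean and confidence interval per test, then see whether the intervals overlap) can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function names `confidence_interval` and `intervals_overlap`, the use of `None` to mark an invalidated run, the 95% level, and the normal approximation to the interval are all assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(runs, level=0.95):
    """Return (mean, low, high) computed over the valid runs only.

    Assumption: an invalidated run (e.g. a node rebooted mid-run)
    is represented as None and is simply excluded.
    """
    valid = [r for r in runs if r is not None]
    if len(valid) < 2:
        raise ValueError("need at least two valid runs")
    m = mean(valid)
    # Normal approximation (assumption); with about 20 runs a
    # Student t quantile would give a slightly wider interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(valid) / sqrt(len(valid))
    return m, m - half_width, m + half_width


def intervals_overlap(a, b):
    """True if the (mean, low, high) intervals a and b share any point."""
    return a[1] <= b[2] and b[1] <= a[2]
```

To compare two tests, one would call `confidence_interval` on the per-run values of each test (for example, the runs at time scale 60 and the runs at time scale 15, with invalidated runs set to `None`) and pass the two results to `intervals_overlap`. Overlapping intervals are the informal consistency criterion used in the comments above; they are not a formal significance test.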