2006-02-01

Comparison of speed of different CPUs and variation of client/server
JVM and 32 or 64 bit matlab on the 64-bit machine.

NOTE: don't pay TOO much attention to the means: there can be
considerable differences between different algorithms' speed on
different processors.

Computers:

  `box':  amd64 3500+, Venice core, 2.2GHz, 512KB L2
          1G RAM (dualch DDR400)
          linux-2.6.15 (gentoo system)

  `one':  p4, HT, 3.0GHz, 512KB L2
          1G RAM
          linux-2.6.13 (gentoo system)

Using Matlab Version 7.1.0.183 (R14) Service Pack 3, with variation of
JVM and architecture (64-bit amd, 32-bit intel). Matlab was started
with the -nodesktop option, in a `konsole'.

Tests are as in the matlab `bench' command in the above version. Only
the non-graphical ones are performed. A parameter can be used to change
the size of the data worked upon.

Results given are means of several (3-5) consecutive runs, and are not
given unless every test had the same time to within about 2% between
runs.

The `server' Java virtual machine is used unless otherwise stated. The
choice of JVM is expected to make no difference, as no GUI or plotting
is performed.

==========================================================
1: parameter of 1 (same data-size as matlab bench.m)

  LU      FFT     ODE     Sparse  TOTAL
  0.40    0.48    0.20    0.56     1.65   (box, 32-bit)
  0.29    0.38    0.14    0.50     1.31   (box, 64-bit)
  0.23    0.53    0.32    0.57     1.66   (one, client-jvm)
  0.23    0.54    0.32    0.57     1.66   (one)

==========================================================
2: parameter of 1.15 (1150x1150 matrix for LU, FFT on 2^23 instead
   of 2^20, longer ODE time...)

  LU      FFT     ODE     Sparse  TOTAL
  0.59    6.21    0.24    0.96     8.00   (box, 32-bit)
  0.42    6.33    0.17    0.83     7.75   (box, 64-bit)
  0.43    8.55    0.37    0.95    10.29   (one)

==========================================================
3: parameter of 0.9

  LU      FFT     ODE     Sparse  TOTAL
  0.31    0.10    0.18    0.38     0.97   (box, 32-bit)
  0.22    0.09    0.13    0.34     0.78   (box, 64-bit)
  0.18    0.12    0.29    0.40     0.99   (one)

==========================================================

Conclusions?

On the same processor (`box'), the 64bit version does have a
considerable advantage over the 32bit version: for LU factorisation it
takes roughly 27-29% less time at every parameter tested.

For the benchmark mixture of 4 tests of similar time consumption (at
parameter of 1), the 3.0GHz P4 does overall about the same as the
2.2GHz AMD64 in 32bit mode (totals of 1.66 and 1.65).

The Pentium 4 is rather faster than even the AMD64 in 64bit mode for LU
factorisation: at parameter of 1, for example, the AMD takes ~25%
longer (0.29 against 0.23).

The AMD64 is much faster than the Pentium 4 at ODE solving: in 64bit
mode it's about twice as fast (0.14 against 0.32 at parameter of 1).
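
The ratios quoted in these conclusions can be rechecked from the tables
with a few lines of Python. This is only a sketch: the timings are
copied from the three tables above, the labels box32, box64 and one are
shorthand introduced here for the table rows (for parameter 1 the
default server-JVM row is used for `one'), and time_ratio is a
hypothetical helper, not part of bench.

```python
# Timings copied from the three result tables; order matches TESTS.
TESTS = ("LU", "FFT", "ODE", "Sparse")

# Keys box32/box64/one are shorthand labels for the table rows (assumed names).
RESULTS = {
    1.0: {
        "box32": (0.40, 0.48, 0.20, 0.56),
        "box64": (0.29, 0.38, 0.14, 0.50),
        "one":   (0.23, 0.54, 0.32, 0.57),  # server JVM row
    },
    1.15: {
        "box32": (0.59, 6.21, 0.24, 0.96),
        "box64": (0.42, 6.33, 0.17, 0.83),
        "one":   (0.43, 8.55, 0.37, 0.95),
    },
    0.9: {
        "box32": (0.31, 0.10, 0.18, 0.38),
        "box64": (0.22, 0.09, 0.13, 0.34),
        "one":   (0.18, 0.12, 0.29, 0.40),
    },
}


def time_ratio(param, numerator, denominator, test):
    """Return time(numerator) / time(denominator) for one test.

    A value above 1 means the numerator machine took longer.
    """
    i = TESTS.index(test)
    runs = RESULTS[param]
    return runs[numerator][i] / runs[denominator][i]


def report():
    """Print, per parameter and test, how box32 and one compare with box64."""
    for param in sorted(RESULTS):
        print(f"parameter {param}")
        for test in TESTS:
            r_bits = time_ratio(param, "box32", "box64", test)
            r_cpu = time_ratio(param, "one", "box64", test)
            print(f"  {test:<6} box32/box64 = {r_bits:.2f}   one/box64 = {r_cpu:.2f}")


if __name__ == "__main__":
    report()
```

At parameter of 1, time_ratio(1.0, "box64", "one", "LU") comes to about
1.26 and time_ratio(1.0, "one", "box64", "ODE") to about 2.29, matching
the "~25% longer" and "about twice as fast" figures.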