
Intel/EM64T Linux Compiler Comparisons
Fortran Execution Time Benchmarks

Benchmark	Absoft 10.0.3 GA	g95 4.0.3 (14-OCT-06)	gfortran 4.2.0	Intel 10.0.023	Lahey L8.00a	NAG 5.1 (282)	Pathscale 2.4	PGI 6.2-4
AC	18.71	30.82	21.84	21.37	34.10	76.75	18.64	23.05
*AERMOD*	58.00	62.98	68.46	37.80	43.04	61.36	56.39	78.23
*AIR*	17.76	21.78	18.97	12.34	11.39	17.87	16.43	15.89
*CAPACITA*	64.84	99.93	87.27	69.76	89.56	114.30	66.09	65.28
*CHANNEL*	14.11	28.72	10.26	12.06	12.37	12.64	13.86	11.67
*DODUC*	71.35	80.90	67.38	50.21	62.35	87.32	60.17	61.46
*FATIGUE*	12.13	62.24	24.02	14.98	22.20	34.02	11.86	15.46
*GAS_DYN*	6.94	35.00	25.84	7.55	18.38	24.31	9.01	9.77
*INDUCT*	51.02	60.78	53.22	48.81	57.38	100.20	53.24	51.61
*LINPK*	22.99	23.49	23.18	22.44	21.59	23.40	23.00	21.98
*MDBX*	32.24	31.63	28.17	26.32	27.80	37.39	33.83	30.83
NF	23.16	51.45	35.33	22.71	24.74	26.00	22.43	27.87
*PROTEIN*	77.74	102.33	83.24	63.13	118.11	108.02	73.29	91.21
*RNFLOW*	44.75	91.17	76.67	46.43	54.23	73.77	45.08	56.73
*TEST_FPU*	24.80	39.08	21.10	17.10	21.83	26.89	24.59	18.79
*TFFT*	4.72	5.52	4.90	4.56	4.54	4.94	4.54	4.40

Geometric Mean	25.77	42.17	31.33	23.22	29.03	38.27	25.64	27.10
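
The summary row can be checked by hand. The following is a minimal sketch, assuming the usual definition of the geometric mean (the n-th root of the product of the n timings); the names `absoft_times` and `geometric_mean` are illustrative and do not come from the original benchmark harness.

```python
import math

# Absoft 10.0.3 GA column from the table above (seconds), AC through TFFT.
absoft_times = [18.71, 58.00, 17.76, 64.84, 14.11, 71.35, 12.13, 6.94,
                51.02, 22.99, 32.24, 23.16, 77.74, 44.75, 24.80, 4.72]


def geometric_mean(times):
    """Return the geometric mean, computed via logarithms to avoid overflow."""
    return math.exp(sum(math.log(t) for t in times) / len(times))


absoft_geomean = geometric_mean(absoft_times)
print(f"Absoft geometric mean: {absoft_geomean:.2f} s")
```

The result should agree with the tabulated 25.77 to within rounding. A geometric mean is a common choice for this kind of summary because a single long-running benchmark such as PROTEIN does not dominate it the way it would dominate an arithmetic mean.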

Compiler Switches
*Absoft*	af95 -m64 -Ofast -speed_math=9 -WOPT:if_conv=off -LNO:fu=9:full_unroll_size=7000 -march=em64t -TARG:sse3=on -xINTEGER
*g95*	g95 -march=nocona -ffast-math -funroll-loops -O3
*gfortran*	gfortran -march=nocona -ffast-math -funroll-loops -O3
*Intel*	ifort -O3 -xP -ipo -no-prec-div
*Lahey*	lf64 --fast -static
*NAG*	f95 -O4 -mismatch_all -ieee=full -Bstatic
*Pathscale*	pathf90 -march=em64t -Ofast -WOPT:if_conv=0 -LNO:fu=9:full_unroll_size=7000:prefetch=1:trip_count=50
*PGI*	pgf90 -Bstatic -V -fastsse -Munroll=n:4 -Mipa=fast,inline -tp p7-64

Notes

All figures are execution times in seconds, measured on a Dell Dimension 9100 with a dual-core 2.8 GHz Pentium D 820 processor (2 x 1 MB L2 cache, 800 MHz FSB) running SUSE 9.3 Linux. Each figure is the average over at least 10 runs (many more for some). Measurement error is typically <1%. Green cells highlight figures within 10% of the fastest. Red cells indicate figures that are more than 150% of the fastest.
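
The colour rule can be applied to any row of the table. The sketch below is one possible reading of it, using the GAS_DYN row: "within 10% of the fastest" is assumed to mean a time no greater than 1.10 times the fastest, and "more than 150% of the fastest" a time greater than 1.50 times the fastest. The function name `classify` and the label "plain" are illustrative only.

```python
# GAS_DYN row from the table above (seconds).
gas_dyn = {
    "Absoft": 6.94, "g95": 35.00, "gfortran": 25.84, "Intel": 7.55,
    "Lahey": 18.38, "NAG": 24.31, "Pathscale": 9.01, "PGI": 9.77,
}


def classify(times):
    """Label each timing relative to the fastest entry in the same row."""
    fastest = min(times.values())
    labels = {}
    for compiler, seconds in times.items():
        if seconds <= 1.10 * fastest:    # assumed meaning of "within 10%"
            labels[compiler] = "green"
        elif seconds > 1.50 * fastest:   # assumed meaning of "more than 150%"
            labels[compiler] = "red"
        else:
            labels[compiler] = "plain"
    return labels


gas_dyn_labels = classify(gas_dyn)
for compiler, label in gas_dyn_labels.items():
    print(f"{compiler:10s} {gas_dyn[compiler]:6.2f}  {label}")
```

Under these assumptions, Absoft and Intel would be highlighted green for GAS_DYN, Pathscale and PGI would be left unmarked, and the remaining four compilers would be red.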

As far as possible, we have used the compiler switches that give the best overall results. We have not attempted to tune individual benchmarks, and in particular cases different switch settings may give better results. For all except LF95, compiler switches were set to generate 64-bit executables.

Thanks are due to Jos Bergervoet for permission to use his CAPACITA benchmark, to Quetzal Associates for permission to use their CHANNEL, FATIGUE, GAS_DYN, INDUCT, PROTEIN and RNFLOW benchmarks, to David Frank for his TEST_FPU benchmark, and to Ted Addison of McVehil-Monnett Associates for permission to use AERMOD, an air quality model used by the US Environmental Protection Agency.

All the benchmarks have been modified slightly to fit into our benchmarking harness.
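
The harness itself is not described here. As a rough illustration of the averaging mentioned in the notes (each figure is a mean over at least 10 runs), a timing loop might look like the sketch below. It is not Polyhedron's harness: the function name `mean_runtime`, the use of wall-clock time, and the placeholder command are all assumptions made for the example.

```python
import statistics
import subprocess
import sys
import time


def mean_runtime(command, runs=10):
    """Run `command` `runs` times and return the mean wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the program exits with a non-zero status,
        # so a crashed run is never silently averaged in.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)


if __name__ == "__main__":
    # Placeholder workload: a do-nothing Python process stands in for a
    # compiled benchmark executable, which is not available here.
    demo_command = [sys.executable, "-c", "pass"]
    print(f"mean over 3 runs: {mean_runtime(demo_command, runs=3):.3f} s")
```

In practice the placeholder command would be replaced by the path to a benchmark executable built with one of the switch sets listed above.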

The NF benchmark uses "nested factorization", a little-known but very effective iterative linear solver for huge finite difference matrices. A paper describing nested factorization, and comparing it to other methods, is available here.

These benchmarks were also used to compare Win32 compilers on the same machine and on an AMD/Vista machine.

Download Polyhedron Benchmarks

(C) Polyhedron Software Ltd. (2007)
Last Updated on 11/07/2007
By John Appleyard
