<HTML>
<HEAD>
<BASE HREF="http://www.mcs.anl.gov/petsc/benchmarks.html">
<TITLE>PETSc Benchmarks</TITLE>
</HEAD>
<BODY BGCOLOR="#ffffff" LINK="#0000ff" VLINK="#ff0000" ALINK="#ff0000" TEXT="#000000">

<H1 align=center>Sample PETSc Floating Point Performance</H1>
<P>
<H3>
<MENU>
<LI> <a href="petsc.html#singleprocessor">Single Processor Floating Point Performance</a>
<LI> <a href="petsc.html#multiprocessor">Parallel Performance for Euler Solver</a>
<LI> <a href="petsc.html#laplacian">Scalability for Laplacian</a>
</MENU>
</H3>
<P>
We provide these floating point performance numbers as a guide to the
floating point rates users should expect while using PETSc. We have done
our best to provide fair and accurate values, but we do not guarantee
any of the numbers presented here.
<P>
See the "Profiling" chapter of <a href="http://www.mcs.anl.gov/petsc/manual.html#Node100">
the PETSc users manual</a> for techniques to obtain accurate performance
numbers with PETSc.

<P><HR><P>

<A NAME="singleprocessor"> <H1 align=center>Single Processor Performance</H1></A>

In many PDE application codes one must solve systems of linear equations
arising from the discretization of multicomponent PDEs; the sparse matrices
that arise naturally have a block structure.
<P>
PETSc has special sparse matrix storage formats and routines that take advantage of
this block structure to deliver much higher (two to three times as high) floating
point computation rates. Below we give the floating point rates for the
matrix-vector product for a 1503 by 1503 sparse matrix with a block
size of three arising from a simple oil reservoir simulation.

<p>
<A HREF="http://ftp.mcs.anl.gov/pub/petsc/matmultbench.ps">Embed here</A>
<p>

The next table depicts performance for the entire linear solve using GMRES(30) and
ILU(0) preconditioning.

<P>
<A HREF="http://ftp.mcs.anl.gov/pub/petsc/solvebench.ps">Embed here</A>
<P>

These tests were run using
the code src/sles/examples/tutorials/ex10.c with the options
<p>
<tt>
mpiexec -n 1 ex10 -f0 arco1 -f1 arco1 -pc_type ilu -ksp_gmres_unmodifiedgramschmidt -optionsleft -mat_baij -matload_block_size 3 -log_view
</tt>

<P><HR><P>

<A NAME="multiprocessor"> <H1 align=center>Parallel Performance for Euler Solver</H1></A>

<A NAME="laplacian"> <H1 align=center>Scalability for Laplacian</H1></A>
A typical "model" problem in numerical analysis for PDEs is the
Laplacian. Discretization of the Laplacian in two dimensions with finite differences
is typically done using the "five point" stencil. This results in a very sparse
(at most five nonzeros per row), ill-conditioned matrix.

<P>
Because the matrix is so sparse and has no block structure, it is difficult to get
very good sequential or parallel floating point performance, especially for small
problems. Here we demonstrate the scalability of the parallel PETSc matrix-vector product
for the five point stencil on two grids. These tests were run on three machines:
an IBM SP2 with the Power2Super chip and two memory cards at ANL, the Cray T3E at NERSC, and
the Origin2000 at NCSA.

<P>
Since PETSc is intended for much more general problems than the Laplacian, we do not consider
the Laplacian a particularly important benchmark; we include it due to interest
from the community.

<P><HR><P>

<H2 align=center>100 by 100 Grid: Absolute Time and Speed-Up</H2>

100by100 grid
<P>
Notes: The problem here is simply too small to parallelize on a distributed memory
computer.
<P>

<H2 align=center>1000 by 1000 Grid: Absolute Time and Speed-Up</H2>

1000by1000 grid
<P>

</BODY>
</HTML>