C = P^T * B:
Mat Object: 2 MPI processes
  type: mpiaij
  rows=2, cols=5
  total: nonzeros=10, allocated nonzeros=10
  total number of mallocs used during MatSetValues calls=0
    not using I-node (on process 0) routines
C = P^T * B after MatProductClear():
Mat Object: 2 MPI processes
  type: mpiaij
  rows=2, cols=5
  total: nonzeros=10, allocated nonzeros=10
  total number of mallocs used during MatSetValues calls=0
    not using I-node (on process 0) routines
C = P^T * A * P:
Mat Object: 2 MPI processes
  type: mpiaij
  rows=2, cols=2
  total: nonzeros=2, allocated nonzeros=2
  total number of mallocs used during MatSetValues calls=0
    using nonscalable MatPtAP() implementation
    not using I-node (on process 0) routines
C = P^T * A * P after MatProductClear():
Mat Object: 2 MPI processes
  type: mpiaij
  rows=2, cols=2
  total: nonzeros=2, allocated nonzeros=2
  total number of mallocs used during MatSetValues calls=0
    not using I-node (on process 0) routines
Cdup:
Mat Object: 2 MPI processes
  type: mpiaij
  rows=2, cols=2
  total: nonzeros=2, allocated nonzeros=2
  total number of mallocs used during MatSetValues calls=0
    not using I-node (on process 0) routines