PHASTA/File Formats

PHASTA takes several custom file formats as input. This wiki page reviews the basics of how they work.

solver.inp

This file format is fairly ad hoc, but follows a standard key-value syntax in which the key and value are separated by a colon (:).

Values can be strings, integers, real numbers, or "arrays", depending on what the key expects. The # character starts a comment, either on its own line or after a value. Here's an example:

       Spanwise File Binary Write : 1
  
   #MATERIAL PROPERTIES
   #{
       Viscosity: 1.50e-5      # fills datmat (2 values REQUIRED if iLset=1)
       Density: 1.0           # ditto
       Body Force Option: None # ibody=0 => matflag(5,n)
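
The syntax described above can be sketched as a small Python reader. This is only an illustration, not PHASTA's own parser: the function name `parse_solver_inp` is hypothetical, and it assumes that everything after the first # on a line is a comment and that the key ends at the first colon. Values are left as strings, since how a value is interpreted depends on the key.

```python
def parse_solver_inp(text):
    """Parse 'Key : value' lines into a dict of raw string values.

    Assumptions (not taken from the PHASTA source): '#' always starts a
    comment, and the first ':' on a line separates key from value.
    """
    settings = {}
    for raw_line in text.splitlines():
        # Drop the comment part (if any), then trim whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue  # blank, comment-only, or not a key-value line
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


example = """
       Spanwise File Binary Write : 1

   #MATERIAL PROPERTIES
   #{
       Viscosity: 1.50e-5      # fills datmat (2 values REQUIRED if iLset=1)
       Density: 1.0           # ditto
       Body Force Option: None # ibody=0 => matflag(5,n)
"""

settings = parse_solver_inp(example)
for key, value in settings.items():
    print(f"{key!r} -> {value!r}")
```

Converting a value is then up to the caller, for example `float(settings["Viscosity"])`.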

Generic Array Entries

Most generic array files in ASCII format (i.e. not binary), such as STGInflow.dat or STGRand.dat, share a similar format.

The first line gives the size of the array to be read by PHASTA, and the rest of the file is the array itself in a space-delimited format:

            517           3
    0.113416095562452E+00  -0.232694658451470E+00  -0.965914067189997E+00
    0.587836800593012E+00  -0.805356262211647E+00  -0.764799763667374E-01
   -0.320179182204988E+00   0.945214971754362E+00   0.636706247334280E-01

The first line here defines how many rows and columns are in the array (517 and 3 in this case). The rest of the file is the array. Sometimes the number of columns is assumed, so only the number of rows is given. In that case, there is only one integer in the first line.
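
A minimal Python sketch of reading such a file follows. The function name `read_ascii_array` and the `assumed_cols` parameter are hypothetical, and the sketch assumes one array row per text line, as in the example above. The sample data is the excerpt above shortened to three rows, so its header reads 3 3 rather than 517 3.

```python
def read_ascii_array(text, assumed_cols=None):
    """Read a size header followed by whitespace-delimited rows.

    The header holds either 'rows cols' or just 'rows'. In the second
    case the column count comes from assumed_cols (None = unchecked).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [int(tok) for tok in lines[0].split()]
    if len(header) == 2:
        n_rows, n_cols = header
    elif len(header) == 1:
        n_rows, n_cols = header[0], assumed_cols
    else:
        raise ValueError("expected one or two integers in the first line")

    body = lines[1:1 + n_rows]
    if len(body) != n_rows:
        raise ValueError(f"expected {n_rows} rows, found {len(body)}")

    rows = [[float(tok) for tok in ln.split()] for ln in body]
    if n_cols is not None:
        for row in rows:
            if len(row) != n_cols:
                raise ValueError(f"expected {n_cols} columns, found {len(row)}")
    return rows


sample = """
              3           3
    0.113416095562452E+00  -0.232694658451470E+00  -0.965914067189997E+00
    0.587836800593012E+00  -0.805356262211647E+00  -0.764799763667374E-01
   -0.320179182204988E+00   0.945214971754362E+00   0.636706247334280E-01
"""

rows = read_ascii_array(sample)
print(len(rows), len(rows[0]))
```

For a file whose first line holds only the row count, the caller would pass the assumed width, e.g. `read_ascii_array(text, assumed_cols=3)`.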

Restart and Geombc files

Hint: When trying to read through binary files with ASCII headers, use the strings command in Linux to strip out the binary portions of the file and leave only ASCII.
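
Where the strings command is not available, roughly the same effect can be had in Python. The sketch below is an approximation rather than a reimplementation of strings: the function name `ascii_strings` is hypothetical, and it keeps only runs of at least four printable ASCII bytes (tabs are not counted as printable here). The sample bytes imitate a header line, a few binary bytes, and another header line.

```python
import re


def ascii_strings(data, min_len=4):
    """Return runs of printable ASCII (0x20-0x7e) at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


sample = (
    b"number of nodes : < 0 > 15787\n"
    b"\xc4\x87\x05\x00\n"
    b"number of modes : < 0 > 15787\n"
)

found = ascii_strings(sample)
for line in found:
    print(line)
```

For a real file, the bytes would come from something like `open(path, "rb").read()`.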

The Restart and Geombc files for PHASTA hold the solution and problem information, respectively. They can be used in one of two formats: POSIX (as in the filesystem) and SyncIO.

POSIX simply stores the partitioned solution and problem information as individual files. So if you have a problem that was partitioned into 1024 parts, then you would have 1024 restart and geombc files. When PHASTA runs, every MPI rank issues its file requests at the same time. This can overwhelm a file system with thousands of simultaneous read/write requests.

To solve this problem, the SyncIO format was developed. This simply takes the individually partitioned POSIX files and "glues" them together into collections within the same file (somewhat like archiving files into a zip file).

POSIX Format

These files are built on a key-value system, but in this case the "value" for each key is often a binary array of data. At the top of a file, there is general information about the file. For example, in a POSIX geombc file we have this:

   # PHASTA Input File Version 2.0
   # Byte Order Magic Number : 362436
   # Output generated by libph version: yes
   byteorder magic number : < 5 > 1
   Ä<87>^E^@
   number of nodes : < 0 > 15787
   number of modes : < 0 > 15787
   number of shapefunctions soved on processor : < 0 > 0
   number of global modes : < 0 > 0
   number of interior elements : < 0 > 13560
   number of boundary elements : < 0 > 3239
   maximum number of element nodes : < 0 > 8
   number of interior tpblocks : < 0 > 1
   number of boundary tpblocks : < 0 > 1
   number of nodes with Dirichlet BCs : < 0 > 1493

while for restart files we only have:

   # PHASTA Input File Version 2.0
   # Byte Order Magic Number : 362436
   # Output generated by libph version: yes
   byteorder magic number : < 5 > 1
   Ä<87>^E^@
   number of modes : < 0 > 15787
   number of variables : < 0 > 7

Lines beginning with a # are not read by PHASTA. The other lines give global information about the data in the file. Each array actually stored in the file gets a header line followed by the data in binary format:

   co-ordinates : < 378889 > 15787 3
   ^F<81><95>C<8b>l§¿^F<81><95>C<8b>l§¿^F^R^T?ÆÜ¥¿^F<81><95>C<8b>l
   ....

The header may contain information about the size of the array being read in (the 15787 3 in the example above).
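
The header lines shown above all follow the same visible pattern: a name, a colon, a number in angle brackets, then zero or more integers. A minimal Python sketch for pulling those pieces apart is given below. The function name `parse_header` and the regular expression are assumptions based only on the examples on this page, not on the PHASTA I/O source.

```python
import re

# name : < number > int int ...
HEADER_RE = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*<\s*(?P<bracket>\d+)\s*>\s*(?P<ints>.*)$"
)


def parse_header(line):
    """Return (name, bracket_number, [ints]) or None if not a header."""
    if line.lstrip().startswith("#"):
        return None  # comment lines are not headers
    match = HEADER_RE.match(line)
    if match is None:
        return None
    ints = [int(tok) for tok in match.group("ints").split()]
    return match.group("name"), int(match.group("bracket")), ints


print(parse_header("   co-ordinates : < 378889 > 15787 3"))
print(parse_header("   number of nodes : < 0 > 15787"))
print(parse_header("   # Byte Order Magic Number : 362436"))
```

Binary lines will generally not match the pattern, so a caller scanning a whole file can skip anything for which `parse_header` returns `None`.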

Note: I'm not sure what the purpose/significance of the numbers in angle brackets is.
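
One reading that fits every example on this page, though it has not been checked here against the PHASTA source, is that the bracketed number is the size in bytes of the binary block that follows the header, counting one trailing newline. The Python sketch below tests that reading. It assumes 8-byte reals and 4-byte integers; `expected_block_bytes` is a hypothetical helper. It also decodes the four bytes shown after the byteorder magic number header as a little-endian integer.

```python
import struct


def expected_block_bytes(n_items, item_size):
    """Bytes in a data block under the assumed 'payload + newline' rule."""
    return n_items * item_size + 1  # +1 for an assumed trailing newline


# co-ordinates : < 378889 > 15787 3   (POSIX example)
print(expected_block_bytes(15787 * 3, 8))   # 378889

# co-ordinates@1 : < 1275913 > 53163  3   (SyncIO example)
print(expected_block_bytes(53163 * 3, 8))   # 1275913

# byteorder magic number : < 5 > 1
print(expected_block_bytes(1, 4))           # 5

# The bytes displayed as "Ä<87>^E^@" are 0xC4 0x87 0x05 0x00.
magic = struct.unpack("<i", b"\xc4\x87\x05\x00")[0]
print(magic)                                # 362436
```

Under the same reading, entries such as number of nodes carry < 0 > because no binary block follows them; the value is given in the header itself.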

SyncIO Format

The SyncIO format is very similar to the above except for two changes. First, additional information is included in the first several lines. This includes MPI_IO_Tag and nPPF. The latter stands for the "number of parts per file".

Second, the headers for the stored data have an @n appended to them, where n is the partition number of that data block. For example:

   co-ordinates@1 : < 1275913 > 53163  3
   ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
   ......

is the header for the co-ordinates array corresponding to partition 1, along with the binary information afterward.
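
Separating the array name from its partition number is a one-line string operation. The Python sketch below shows one way to do it; the function name `split_syncio_name` is hypothetical, and names without an @n suffix (as in the POSIX format) are returned unchanged with `None` as the partition.

```python
def split_syncio_name(name):
    """Split 'array@n' into ('array', n); return (name, None) otherwise."""
    base, sep, part = name.rpartition("@")
    if sep and part.isdigit():
        return base, int(part)
    return name, None


print(split_syncio_name("co-ordinates@1"))  # ('co-ordinates', 1)
print(split_syncio_name("co-ordinates"))    # ('co-ordinates', None)
```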