(ch_buildsystem)=

# BuildSystem

`BuildSystem` (located in `config/BuildSystem`) configures PETSc before PETSc is compiled with make.
It is much like [GNU Autoconf (configure)](https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.71/html_node/index.html#Top)
but written in Python especially for PETSc.

## What is a build?

The build stage compiles source to object files, stores them
(usually in archives), and links shared libraries and executables. These
are mechanical operations that reduce to applying a construction rule to
sets of files. The [Make](http://www.gnu.org/software/make/) tool is
great at this job.

## Why is configure necessary?

The `configure` program is designed to assemble all information and preconditions
necessary for the build stage. This is a far more complicated task, heavily dependent on
the local hardware and software environment. It is also the source of nearly every build
problem. The most crucial aspect of a configure system is not performance, scalability, or
even functionality, but *debuggability*. Configuration failure is at least as common as
success, due to broken tools, operating system upgrades, hardware incompatibilities, user
error, and a host of other reasons. Problem diagnosis is the single biggest bottleneck for
development and maintenance time. Unfortunately, current systems are built to optimize the
successful case rather than the unsuccessful. In PETSc, we have developed the
`BuildSystem` package to remedy the shortcomings of configuration systems such as
Autoconf, CMake, and SCons.

## Why use PETSc BuildSystem?

As several configuration tools
currently exist, it is instructive to consider why PETSc would choose to create another
from scratch. Below we list features and design considerations which lead us to prefer
`BuildSystem` to the alternatives.

### Namespacing

`BuildSystem` wraps collections of related tests in Python modules, which also hold
the test results. Thus results are accessed using normal Python
namespacing. As rudimentary as this sounds, no namespacing beyond the
use of variable name prefixes is present in Autoconf, CMake, and SCons.
Instead, a flat namespace is used. This
tendency appears again when composing command lines for external tools,
such as the compiler and linker. In the traditional configure tools,
options are aggregated in a single bucket variable, such as `INCLUDE`
or `LIBS`, whereas in `BuildSystem` one can trace the provenance of a flag before it
is added to the command line. CMake also makes the unfortunate decision
to force all link options to resolve to full paths, which causes havoc
with compiler-private libraries.

### Explicit control flow

The `BuildSystem` configure modules mentioned above, containing one `Configure` object
per module, are organized explicitly into a directed acyclic graph
(DAG). The user indicates dependence, an *edge* in the dependence graph,
with a single call, `require('path.to.other.test', self)`, which not
only structures the DAG, but also returns the `Configure` object. The caller
can then use this object to access the results of the tests run by the
dependency, achieving test and result encapsulation simply.

### Multi-language tests

`BuildSystem` maintains an explicit language stack, so that the current language
can be manipulated by the test environment. A compile or link can be run
using any language, complete with the proper compilers, flags,
libraries, etc., with a single call. This automation is crucial
for cross-language tests, which are thinly supported in current
tools. In fact, the design of these tools inhibits this kind of check.
The `check_function_exists()` call in Autoconf and CMake looks only
for the presence of a particular symbol in a library, and fails in C++
and on Microsoft Windows, whereas the equivalent `BuildSystem` test can also take a
declaration. The `try_compile()` test in Autoconf and CMake requires
the entire list of libraries be present in the `LIBS` variable,
providing no good way to obtain libraries from other tests in a modular
fashion. As another example, if the user has a dependent library that
requires `libstdc++`, but they are working with a C project, no
straightforward method exists to add this dependency.

### Subpackages

The most complicated, yet perhaps most useful, part of `BuildSystem` is
support for dependent packages. It provides an object scaffolding for
including a 3rd party package (more than 100 are now available) so that
PETSc downloads and builds the package for use by PETSc. The native
configure and build system for the package is used, and special support
exists for Autoconf and CMake packages. No similar system exists in the other
tools, which rely on static declarations, such as `pkg-config` or
`FindPackage.cmake` files, that are not tested and often become
obsolete.

### Batch environments

Most systems, such as Autoconf and CMake, do not actually run tests in a
batch environment, but rather require a direct specification, in CMake a
"platform file". This requires a human expert to write and maintain the
platform file. Alternatively, `BuildSystem` submits a dynamically
generated set of tests to the batch system, enabling automatic
cross-configuration and cross-compilation.

### Caching

Caching often seems like an attractive option since configuration can be
quite time-consuming, and both Autoconf and CMake enable caching by
default. However, no system has the ability to reliably invalidate the
cache when the environment for the configuration changes.
For example, a
compiler or library dependency may be upgraded on the system. Moreover,
dependencies between cached variables are not tracked, so that even if
some variables are correctly updated after an upgrade, others which
depend on them may not be. In addition, CMake mixes together information
which is discovered automatically with that explicitly provided by the
user, the latter of which is often not tested.

### Concision

The cognitive load is usually larger for larger code bases,
and our observation is that the addition of logic to Autoconf
and CMake is often quite cumbersome and verbose as they do not employ a modern,
higher-level language. Although `BuildSystem` itself is not widely used,
it has the advantage of being written in Python, a widely understood, high-level
language.

## High level organization

A minimal `BuildSystem` setup consists of a `config` directory off the
package root, which contains all the Python necessary to run (in addition
to the `BuildSystem` source). At minimum, the `config` directory contains
`configure.py`, which is executed to run the configure process, and a
module for the package itself. For example, PETSc contains
`config/PETSc/petsc.py`. It is also common to include a top level
`configure` file to execute the configure, as this looks like
Autotools,

```python
#!/usr/bin/env python3
import os
# execfile() does not exist in Python 3, so compile and exec the script instead
path = os.path.join(os.path.dirname(__file__), 'config', 'configure.py')
with open(path) as f:
    exec(compile(f.read(), path, 'exec'))
```

The `configure.py` script constructs a tree of configure modules and
executes the configure process over it.
A minimal version of this would
be

```python
import sys
import config.framework  # provided by the BuildSystem source

package = 'PETSc'

def configure(configure_options):
    # Command line arguments take precedence (but don't destroy argv[0])
    sys.argv = sys.argv[:1] + configure_options + sys.argv[1:]
    framework = config.framework.Framework(['--configModules='+package+'.Configure', '--optionsModule='+package+'.compilerOptions']+sys.argv[1:], loadArgDB = 0)
    framework.setup()
    framework.configure(out = sys.stdout)
    framework.storeSubstitutions(framework.argDB)
    framework.printSummary()
    framework.argDB.save(force = True)
    framework.logClear()
    framework.closeLog()

if __name__ == '__main__':
    configure([])
```

The PETSc `configure.py` is quite a bit longer than this, as it
performs specialized command line processing and error handling, and
integrates logging with the rest of PETSc.

The `config/package/Configure.py` module determines how the tree of
`Configure` objects is built and how the configure information is output.
The `configure()` method of the module will be run by the `Framework`
object created at the top level. A minimal `configure()` method would look
like

```python
def configure(self):
    self.framework.header = self.arch.arch+'/include/'+self.project+'conf.h'
    self.framework.makeMacroHeader = self.arch.arch+'/conf/'+self.project+'variables'
    self.framework.makeRuleHeader = self.arch.arch+'/conf/'+self.project+'rules'

    self.Dump()
    self.logClear()
    return
```

The `Dump` method runs over the tree of configure modules, and outputs
the data necessary for building, usually employing the
`addMakeMacro()`, `addMakeRule()` and `addDefine()` methods. These
methods funnel output to the include and make files defined by the
framework object, and set at the beginning of this `configure()`
method.
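
As a rough illustration of this funneling, the following standalone sketch collects defines and make macros and renders them into two separate texts. It is not the `BuildSystem` implementation: `ToySink`, `render_header()`, and `render_make_macros()` are hypothetical names, and the output formats are assumptions. Only the method names `addDefine()` and `addMakeMacro()` come from the text above.

```python
class ToySink:
    """Hypothetical stand-in for the output side of a configure module."""

    def __init__(self):
        self.defines = {}      # name -> value, destined for the configure header
        self.make_macros = {}  # name -> value, destined for the make include file

    def addDefine(self, name, value):
        # Record a preprocessor define for later output
        self.defines[name] = value

    def addMakeMacro(self, name, value):
        # Record a make macro for later output
        self.make_macros[name] = value

    def render_header(self):
        # Assumed format: one #define per recorded entry
        return ''.join(f'#define {n} {v}\n' for n, v in self.defines.items())

    def render_make_macros(self):
        # Assumed format: one NAME = value line per recorded entry
        return ''.join(f'{n} = {v}\n' for n, v in self.make_macros.items())


sink = ToySink()
sink.addDefine('HAVE_MPI', 1)
sink.addMakeMacro('CC', 'mpicc')
header_text = sink.render_header()
make_text = sink.render_make_macros()
```

The point of the sketch is only that tests record results through a small set of calls, and the decision about which file receives them is made once, in one place.
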
There is also some simple information that is often used, which
we define in the initializer,

```python
def __init__(self, framework):
    config.base.Configure.__init__(self, framework)
    self.Project = 'PETSc'
    self.project = self.Project.lower()
    self.PROJECT = self.Project.upper()
    self.headerPrefix = self.PROJECT
    self.substPrefix = self.PROJECT
    self.framework.Project = self.Project
    return
```

More sophisticated configure assemblies, like PETSc, output some other
custom information, such as information about the machine and the configure
process, and a script to recreate the configure run.

The `Package` configure module has two other main functions. First, top
level options can be defined in the `setupHelp()` method,

```python
def setupHelp(self, help):
    import nargs
    help.addArgument(self.Project, '-prefix=<path>', nargs.Arg(None, '', 'Specify location to install '+self.Project+' (eg. /usr/local)'))
    help.addArgument(self.Project, '-load-path=<path>', nargs.Arg(None, os.path.join(os.getcwd(), 'modules'), 'Specify location of auxiliary modules'))
    help.addArgument(self.Project, '-with-shared-libraries', nargs.ArgBool(None, 0, 'Make libraries shared'))
    help.addArgument(self.Project, '-with-dynamic-loading', nargs.ArgBool(None, 0, 'Make libraries dynamic'))
    return
```

This uses the `BuildSystem` help facility that is used to define options
for all configure modules. The first argument groups these options into
a section named for the package. Second, the tree of
modules for the configure run is built using the `setupDependencies()` method.
A simple way to do this is by explicitly declaring dependencies,

```python
def setupDependencies(self, framework):
    config.base.Configure.setupDependencies(self, framework)
    self.setCompilers = framework.require('config.setCompilers', self)
    self.arch = framework.require(self.Project+'.utilities.arch', self.setCompilers)
    self.projectdir = framework.require(self.Project+'.utilities.projectdir', self.arch)
    self.compilers = framework.require('config.compilers', self)
    self.types = framework.require('config.types', self)
    self.headers = framework.require('config.headers', self)
    self.functions = framework.require('config.functions', self)
    self.libraries = framework.require('config.libraries', self)

    self.compilers.headerPrefix = self.headerPrefix
    self.types.headerPrefix = self.headerPrefix
    self.headers.headerPrefix = self.headerPrefix
    self.functions.headerPrefix = self.headerPrefix
    self.libraries.headerPrefix = self.headerPrefix
```

The `projectdir` and `arch` modules define the project root
directory and a build name so that multiple independent builds can be
managed. The `Framework.require()` method creates an edge in the
dependency graph for configure modules, and returns the module object so
that it can be queried after the configure information is determined.
Setting the header prefix routes all the defines made inside those
modules to our package configure header.
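
The way `require()` calls turn into a run order can be pictured with a small standalone model. This is a sketch under stated assumptions, not `BuildSystem` code: `ToyFramework`, `ToyModule`, and `run_order()` are hypothetical names, and the ordering is delegated to the standard library `graphlib.TopologicalSorter` (Python 3.9+) rather than to whatever the real `Framework` does internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class ToyModule:
    """Hypothetical configure module identified only by its name."""

    def __init__(self, name):
        self.name = name


class ToyFramework:
    """Hypothetical miniature of the dependency bookkeeping."""

    def __init__(self):
        self.modules = {}     # module name -> module object
        self.runs_after = {}  # module name -> names that must run first

    def require(self, name, depender):
        # Create the module on first request, reuse it afterwards
        module = self.modules.setdefault(name, ToyModule(name))
        self.runs_after.setdefault(name, set())
        if depender is not None:
            # Record the edge: `name` must run before `depender`
            self.modules.setdefault(depender.name, depender)
            self.runs_after.setdefault(depender.name, set()).add(name)
        return module

    def run_order(self):
        # Any order consistent with the recorded edges is acceptable
        return list(TopologicalSorter(self.runs_after).static_order())


framework = ToyFramework()
package = ToyModule('PETSc.Configure')
set_compilers = framework.require('config.setCompilers', package)
arch = framework.require('PETSc.utilities.arch', set_compilers)
compilers = framework.require('config.compilers', package)
order = framework.run_order()
```

In this model `PETSc.utilities.arch` is ordered before `config.setCompilers`, which is ordered before the package module, mirroring the second argument of each `require()` call. Because each call returns the module object, the caller keeps a handle for querying results later.
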
We can also automatically
create configure modules based upon what we see on the filesystem,

```python
for utility in os.listdir(os.path.join('config', self.Project, 'utilities')):
    (utilityName, ext) = os.path.splitext(utility)
    if not utilityName.startswith('.') and not utilityName.startswith('#') and ext == '.py' and not utilityName == '__init__':
        utilityObj = self.framework.require(self.Project+'.utilities.'+utilityName, self)
        utilityObj.headerPrefix = self.headerPrefix
        utilityObj.archProvider = self.arch
        utilityObj.languageProvider = self.languages
        utilityObj.precisionProvider = self.scalartypes
        utilityObj.installDirProvider = self.installdir
        utilityObj.externalPackagesDirProvider = self.externalpackagesdir
        setattr(self, utilityName.lower(), utilityObj)
```

The provider modules customize the information given to the module based
upon settings for our package. For example, PETSc can be compiled with a
scalar type that is single, double, or quad precision, and thus has a
`precisionProvider`. If a package does not have this capability, the
provider setting can be omitted.

## Main objects

### Framework

The `config.framework.Framework` object serves as the central control
for a configure run. It maintains a graph of all the configure modules
involved, which is also used to track dependencies between them. It
initiates the run, compiles the results, and handles the final output.
It maintains the help list for all options available in the run. The
`setup()` method performs generic `Script` setup and then is called
recursively on all the child modules. The `cleanup()` method performs
the final output and logging actions,

- Substitute files
- Output configure header
- Log filesystem actions

Children may be added to the Framework using `addChild()` or
`getChild()`, but the far more frequent method is to use
`require()`.
Here a module is requested, as in `getChild()`, but it
is also required to run before another module, usually the one executing
the `require()`. This provides a simple local interface to establish
dependencies between the child modules, and provides a partial order on
the children to the Framework.

A backwards compatibility mode is provided in which the user specifies
a configure header and a set of files to undergo substitution,
mirroring the common usage of Autoconf. Slight improvements have been
made in that all defines are now guarded, various prefixes are allowed
for defines and substitutions, and C specific constructs such as
function prototypes and typedefs are moved to a separate header.
However, this is not the intended future usage. The use of configure
modules by other modules in the same run provides a model for the
suggested interaction of a new build system with the Framework. If a
module requires another, it merely executes a `require()`. For
instance, the PETSc configure module for hypre requires information
about MPI, and thus contains

```python
self.mpi = self.framework.require("config.packages.MPI", self)
```

Notice that passing `self` as the last argument means that the MPI
module will run before the hypre module. Furthermore, we save the
resulting object as `self.mpi` so that we may interrogate it later.
hypre can initially test whether MPI was indeed found using
`self.mpi.found`. When hypre requires the list of MPI libraries in
order to link a test object, the module can use `self.mpi.lib`.

### Base

The `config.base.Configure` class is the base class for all configure
objects. It handles several types of interaction. First, it has hooks
that allow the Framework to initialize it correctly. The Framework will
first instantiate the object and call `setupDependencies()`. All
`require()` calls should be made in that method.
The Framework will
then call `configure()`. If it succeeds, the object will be marked as
configured. Second, all configure tests should be run using
`executeTest()` which formats the output and adds metadata for the
log.

Third, all tests that involve preprocessing, compiling, linking, and
running operate through `base`. Two forms of this check are provided
for each operation. The first is an "output" form which is intended to
provide the status and complete output of the command. The second, or
"check" form will return a success or failure indication based upon the
status and output. The routines are

```python
outputPreprocess(), checkPreprocess(), preprocess()
outputCompile(), checkCompile()
outputLink(), checkLink()
outputRun(), checkRun()
```

The language used for these operations is managed with a stack, similar
to Autoconf, using `pushLanguage()` and `popLanguage()`. We also
provide special forms used to check for valid compiler and linker flags,
optionally adding them to the defaults.

```python
checkCompilerFlag(), addCompilerFlag()
checkLinkerFlag(), addLinkerFlag()
```

You can also use `getExecutable()` to search for executables.

After configure tests have been run, various kinds of output can be
generated. A `#define` statement can be added to the configure header using
`addDefine()`, and `addTypedef()` and `addPrototype()` also put
information in this header file. Using `addMakeMacro()` and
`addMakeRule()` will add make macros and rules to the output makefiles
specified in the framework. In addition we provide `addSubstitution()`
and `addArgumentSubstitution()` to mimic the behavior of Autoconf if
necessary. The object may define a `headerPrefix` member, which will
be prepended, followed by an underscore, to every define which is output
from it.
Similarly, a `substPrefix` can be defined which applies to
every substitution from the object. Typedefs and function prototypes are
placed in a separate header in order to accommodate languages such as
Fortran whose preprocessor can sometimes fail at these statements.
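
The language stack managed by `pushLanguage()` and `popLanguage()`, described earlier in this section, can be pictured with a minimal standalone model. This is a sketch, not the `config.base.Configure` implementation: `ToyLanguageStack`, its `language` property, the default of `'C'`, and the language names used below are assumptions made for illustration.

```python
class ToyLanguageStack:
    """Hypothetical model of a language stack; not BuildSystem code."""

    def __init__(self, default='C'):
        self._stack = [default]

    @property
    def language(self):
        # The language used for the next compile or link is the top entry
        return self._stack[-1]

    def pushLanguage(self, language):
        self._stack.append(language)
        return self.language

    def popLanguage(self):
        if len(self._stack) == 1:
            raise RuntimeError('cannot pop the default language')
        self._stack.pop()
        return self.language


langs = ToyLanguageStack()
seen = [langs.language]
langs.pushLanguage('Cxx')
seen.append(langs.language)  # checks made here would use the C++ toolchain
langs.pushLanguage('FC')
seen.append(langs.language)  # a nested Fortran check
langs.popLanguage()
langs.popLanguage()
seen.append(langs.language)  # back to the language in effect before the pushes
```

Pairing every push with a pop lets a test switch languages temporarily without disturbing the caller, which is what makes nested cross-language checks straightforward.
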