(ch_buildsystem)=

# BuildSystem

`BuildSystem` (located in `config/BuildSystem`) configures PETSc before PETSc is compiled with make.
It is much like [GNU Autoconf (configure)](https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.71/html_node/index.html#Top)
but written in Python especially for PETSc.

## What is a build?

The build stage compiles source to object files, stores them
(usually in archives), and links shared libraries and executables. These
are mechanical operations that reduce to applying a construction rule to
sets of files. The [Make](http://www.gnu.org/software/make/) tool is
great at this job.

## Why is configure necessary?

The `configure` program is designed to assemble all information and preconditions
necessary for the build stage. This is a far more complicated task, heavily dependent on
the local hardware and software environment. It is also the source of nearly every build
problem. The most crucial aspect of a configure system is not performance, scalability, or
even functionality, but *debuggability*. Configuration failure is at least as common as
success, due to broken tools, operating system upgrades, hardware incompatibilities, user
error, and a host of other reasons. Problem diagnosis is the single biggest bottleneck for
development and maintenance time. Unfortunately, current systems are built to optimize the
successful case rather than the unsuccessful. In PETSc, we have developed the
`BuildSystem` package to remedy the shortcomings of configuration systems such as
Autoconf, CMake, and SCons.

## Why use PETSc BuildSystem?

As several configuration tools
currently exist, it is instructive to consider why PETSc would choose to create another
from scratch. Below we list the features and design considerations that lead us to prefer
`BuildSystem` to the alternatives.

### Namespacing

`BuildSystem` wraps collections of related tests in Python modules, which also hold
the test results. Thus results are accessed using normal Python
namespacing. As rudimentary as this sounds, no namespacing beyond the
use of variable name prefixes is present in Autoconf, CMake, and SCons.
Instead, a flat namespace is used. The same
tendency appears again when composing command lines for external tools,
such as the compiler and linker. In the traditional configure tools,
options are aggregated in a single bucket variable, such as `INCLUDE`
or `LIBS`, whereas in `BuildSystem` one can trace the provenance of a flag before it
is added to the command line. CMake also makes the unfortunate decision
to force all link options to resolve to full paths, which causes havoc
with compiler-private libraries.

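The difference between a flat bucket variable and namespaced results can be illustrated with a small, self-contained sketch. The names `ModuleResults` and `collect_link_line`, the second module name, and the flag values are hypothetical and are not part of `BuildSystem`; the sketch only shows how keeping results on per-module objects preserves where each flag came from.

```python
# Illustrative sketch only: these names are hypothetical, not BuildSystem API.

class ModuleResults:
    """Holds the flags discovered by one group of related tests."""
    def __init__(self, name, libs):
        self.name = name
        self.libs = list(libs)

def collect_link_line(modules):
    """Return (flag, provenance) pairs so every flag can be traced to its module."""
    return [(flag, m.name) for m in modules for flag in m.libs]

mpi    = ModuleResults('config.packages.MPI', ['-lmpi'])
linalg = ModuleResults('example.linalg', ['-llapack', '-lblas'])  # made-up module

traced = collect_link_line([mpi, linalg])
# Flat-bucket equivalent: the same flags, but the origin of each one is lost.
flat_libs = ' '.join(flag for flag, _ in traced)

for flag, origin in traced:
    print(f'{flag:10s} from {origin}')
print('LIBS =', flat_libs)
```

With the traced form, a bad flag on the link line can be attributed to the module that produced it, which is the debugging benefit the section describes.
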
### Explicit control flow

The `BuildSystem` configure modules mentioned above, containing one `Configure` object
per module, are organized explicitly into a directed acyclic graph
(DAG). The user indicates dependence, an *edge* in the dependence graph,
with a single call, `require('path.to.other.test', self)`, which not
only structures the DAG, but also returns the `Configure` object. The caller
can then use this object to access the results of the tests run by the
dependency, achieving test and result encapsulation simply.

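As a rough illustration of how a single `require()` call can both record an edge and hand back the dependency, here is a toy model built on the standard-library `graphlib` module (Python 3.9 or later). `ToyFramework` and `ToyConfigure` are hypothetical stand-ins; the real `config.framework` implementation is not reproduced here and may differ.

```python
# Toy model of require(): hypothetical names, not the real config.framework code.
from graphlib import TopologicalSorter  # Python 3.9+

class ToyConfigure:
    def __init__(self, name):
        self.name = name
        self.results = {}

class ToyFramework:
    def __init__(self):
        self.modules = {}  # module name -> ToyConfigure
        self.before = {}   # module name -> names that must run first

    def require(self, name, depender):
        """Get or create module `name` and record that it runs before `depender`."""
        module = self.modules.setdefault(name, ToyConfigure(name))
        self.before.setdefault(name, set())
        if depender is not None:
            self.modules.setdefault(depender.name, depender)
            self.before.setdefault(depender.name, set()).add(name)
        return module

    def order(self):
        """Return a run order consistent with the recorded edges."""
        return list(TopologicalSorter(self.before).static_order())

framework = ToyFramework()
package   = ToyConfigure('package')
compilers = framework.require('compilers', package)
libraries = framework.require('libraries', package)
framework.require('compilers', libraries)  # libraries also needs compilers

run_order = framework.order()
print(run_order)  # compilers, then libraries, then package
```

Because each module only declares its own dependencies, the global ordering falls out of the graph rather than being written down in one place.
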
### Multi-language tests

`BuildSystem` maintains an explicit language stack, so that the current language
can be manipulated by the test environment. A compile or link can be run
using any language, complete with the proper compilers, flags,
libraries, etc., with a single call. This automation is crucial
for cross-language tests, which are thinly supported in current
tools. In fact, the design of these tools inhibits this kind of check.
The `check_function_exists()` call in Autoconf and CMake looks only
for the presence of a particular symbol in a library, and fails in C++
and on Microsoft Windows, whereas the equivalent `BuildSystem` test can also take a
declaration. The `try_compile()` test in Autoconf and CMake requires
the entire list of libraries be present in the `LIBS` variable,
providing no good way to obtain libraries from other tests in a modular
fashion. As another example, if the user has a dependent library that
requires `libstdc++`, but they are working with a C project, no
straightforward method exists to add this dependency.

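The idea of a language stack can be sketched in a few lines. The `LanguageStack` class, the language keys, and the compiler names below are assumptions made for illustration; they are not taken from the `BuildSystem` source.

```python
# Hypothetical stand-in for a language stack; not the BuildSystem implementation.
class LanguageStack:
    def __init__(self, default='C'):
        self._stack = [default]

    def push(self, language):
        self._stack.append(language)
        return language

    def pop(self):
        if len(self._stack) == 1:
            raise RuntimeError('cannot pop the default language')
        self._stack.pop()
        return self._stack[-1]

    @property
    def current(self):
        return self._stack[-1]

# Per-language settings, looked up by whatever language is current (made-up values).
COMPILERS = {'C': 'cc', 'Cxx': 'c++', 'FC': 'gfortran'}

languages = LanguageStack()
languages.push('Cxx')
inner = COMPILERS[languages.current]  # a C++ check inside a C project
languages.pop()
outer = COMPILERS[languages.current]  # back to C
print(inner, outer)
```

A test that needs another language pushes it, runs its check with that language's tools, and pops, leaving the surrounding test environment unchanged.
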
### Subpackages

The most complicated, yet perhaps most useful, part of `BuildSystem` is
support for dependent packages. It provides an object scaffolding for
including a third-party package (more than 100 are now available) so that
PETSc downloads and builds the package for use by PETSc. The native
configure and build system for the package is used, and special support
exists for Autoconf and CMake packages. No similar system exists in the other
tools, which rely on static declarations, such as `pkg-config` or
`FindPackage.cmake` files, that are not tested and often become
obsolete.

### Batch environments

Most systems, such as Autoconf and CMake, do not actually run tests in a
batch environment, but rather require a direct specification, in CMake a
"platform file". This requires a human expert to write and maintain the
platform file. Alternatively, `BuildSystem` submits a dynamically
generated set of tests to the batch system, enabling automatic
cross-configuration and cross-compilation.

### Caching

Caching often seems like an attractive option since configuration can be
quite time-consuming, and both Autoconf and CMake enable caching by
default. However, no system has the ability to reliably invalidate the
cache when the environment for the configuration changes. For example, a
compiler or library dependency may be upgraded on the system. Moreover,
dependencies between cached variables are not tracked, so that even if
some variables are correctly updated after an upgrade, others which
depend on them may not be. In addition, CMake mixes together information
which is discovered automatically with that explicitly provided by the
user, which is often not tested.

### Concision

The cognitive load is usually larger for larger code bases,
and our observation is that the addition of logic to Autoconf
and CMake is often quite cumbersome and verbose, as they do not employ a modern,
higher-level language. Although `BuildSystem` itself is not widely used,
it has the advantage of being written in Python, a widely understood, high-level
language.

## High level organization

A minimal `BuildSystem` setup consists of a `config` directory off the
package root, which contains all the Python necessary to run (in addition
to the `BuildSystem` source). At minimum, the `config` directory contains
`configure.py`, which is executed to run the configure process, and a
module for the package itself. For example, PETSc contains
`config/PETSc/petsc.py`. It is also common to include a top-level
`configure` file to execute the configure, as this looks like
Autotools,

```python
#!/usr/bin/env python3
import os

# execfile() does not exist in Python 3, so read and exec the script instead
script = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'config', 'configure.py')
with open(script) as f:
  exec(compile(f.read(), script, 'exec'))
```

The `configure.py` script constructs a tree of configure modules and
executes the configure process over it. A minimal version of this would
be

```python
import sys
import config.framework  # assumes config/BuildSystem is on sys.path

package = 'PETSc'

def configure(configure_options):
  # Command line arguments take precedence (but don't destroy argv[0])
  sys.argv = sys.argv[:1] + configure_options + sys.argv[1:]
  framework = config.framework.Framework(['--configModules='+package+'.Configure', '--optionsModule='+package+'.compilerOptions']+sys.argv[1:], loadArgDB = 0)
  framework.setup()
  framework.configure(out = sys.stdout)
  framework.storeSubstitutions(framework.argDB)
  framework.printSummary()
  framework.argDB.save(force = True)
  framework.logClear()
  framework.closeLog()

if __name__ == '__main__':
  configure([])
```

The PETSc `configure.py` is quite a bit longer than this, as it
performs specialized command line processing and error handling, and
integrates logging with the rest of PETSc.

The `config/package/Configure.py` module determines how the tree of
`Configure` objects is built and how the configure information is output.
The `configure()` method of the module will be run by the `Framework`
object created at the top level. A minimal `configure()` method would look
like

```python
def configure(self):
  self.framework.header          = self.arch.arch+'/include/'+self.project+'conf.h'
  self.framework.makeMacroHeader = self.arch.arch+'/conf/'+self.project+'variables'
  self.framework.makeRuleHeader  = self.arch.arch+'/conf/'+self.project+'rules'

  self.Dump()
  self.logClear()
  return
```

The `Dump` method runs over the tree of configure modules, and outputs
the data necessary for building, usually employing the
`addMakeMacro()`, `addMakeRule()` and `addDefine()` methods. These
methods funnel output to the include and make files defined by the
framework object, which are set at the beginning of this `configure()`
method. There is also some simple information that is often used, which
we define in the initializer,

```python
def __init__(self, framework):
  config.base.Configure.__init__(self, framework)
  self.Project      = 'PETSc'
  self.project      = self.Project.lower()
  self.PROJECT      = self.Project.upper()
  self.headerPrefix = self.PROJECT
  self.substPrefix  = self.PROJECT
  self.framework.Project = self.Project
  return
```

More sophisticated configure assemblies, like PETSc, output some other
custom information, such as information about the machine, configure
process, and a script to recreate the configure run.

The `Package` configure module has two other main functions. First, top
level options can be defined in the `setupHelp()` method,

```python
def setupHelp(self, help):
  import os     # needed for the default value of -load-path
  import nargs
  help.addArgument(self.Project, '-prefix=<path>', nargs.Arg(None, '', 'Specify location to install '+self.Project+' (eg. /usr/local)'))
  help.addArgument(self.Project, '-load-path=<path>', nargs.Arg(None, os.path.join(os.getcwd(), 'modules'), 'Specify location of auxiliary modules'))
  help.addArgument(self.Project, '-with-shared-libraries', nargs.ArgBool(None, 0, 'Make libraries shared'))
  help.addArgument(self.Project, '-with-dynamic-loading', nargs.ArgBool(None, 0, 'Make libraries dynamic'))
  return
```

This uses the `BuildSystem` help facility that is used to define options
for all configure modules. The first argument groups these options into
a section named for the package. The second task is to build the tree of
modules for the configure run, using the `setupDependencies()` method.
A simple way to do this is by explicitly declaring dependencies,

```python
def setupDependencies(self, framework):
    config.base.Configure.setupDependencies(self, framework)
    self.setCompilers  = framework.require('config.setCompilers',                self)
    self.arch          = framework.require(self.Project+'.utilities.arch',       self.setCompilers)
    self.projectdir    = framework.require(self.Project+'.utilities.projectdir', self.arch)
    self.compilers     = framework.require('config.compilers',                   self)
    self.types         = framework.require('config.types',                       self)
    self.headers       = framework.require('config.headers',                     self)
    self.functions     = framework.require('config.functions',                   self)
    self.libraries     = framework.require('config.libraries',                   self)

    self.compilers.headerPrefix  = self.headerPrefix
    self.types.headerPrefix      = self.headerPrefix
    self.headers.headerPrefix    = self.headerPrefix
    self.functions.headerPrefix  = self.headerPrefix
    self.libraries.headerPrefix  = self.headerPrefix
```

The `projectdir` and `arch` modules define the project root
directory and a build name so that multiple independent builds can be
managed. The `Framework.require()` method creates an edge in the
dependency graph for configure modules, and returns the module object so
that it can be queried after the configure information is determined.
Setting the header prefix routes all the defines made inside those
modules to our package configure header. We can also automatically
create configure modules based upon what we see on the filesystem,

```python
for utility in os.listdir(os.path.join('config', self.Project, 'utilities')):
  (utilityName, ext) = os.path.splitext(utility)
  if not utilityName.startswith('.') and not utilityName.startswith('#') and ext == '.py' and not utilityName == '__init__':
    utilityObj                    = self.framework.require(self.Project+'.utilities.'+utilityName, self)
    utilityObj.headerPrefix       = self.headerPrefix
    utilityObj.archProvider       = self.arch
    utilityObj.languageProvider   = self.languages
    utilityObj.precisionProvider  = self.scalartypes
    utilityObj.installDirProvider = self.installdir
    utilityObj.externalPackagesDirProvider = self.externalpackagesdir
    setattr(self, utilityName.lower(), utilityObj)
```

The provider modules customize the information given to the module based
upon settings for our package. For example, PETSc can be compiled with a
scalar type that is single, double, or quad precision, and thus has a
`precisionProvider`. If a package does not have this capability, the
provider setting can be omitted.

## Main objects

### Framework

The `config.framework.Framework` object serves as the central control
for a configure run. It maintains a graph of all the configure modules
involved, which is also used to track dependencies between them. It
initiates the run, compiles the results, and handles the final output.
It maintains the help list for all options available in the run. The
`setup()` method performs generic `Script` setup and then is called
recursively on all the child modules. The `cleanup()` method performs
the final output and logging actions:

- Substitute files
- Output configure header
- Log filesystem actions

Children may be added to the Framework using `addChild()` or
`getChild()`, but the far more frequent method is to use
`require()`. Here a module is requested, as in `getChild()`, but it
is also required to run before another module, usually the one executing
the `require()`. This provides a simple local interface to establish
dependencies between the child modules, and provides a partial order on
the children to the Framework.

A backwards compatibility mode is provided in which the user specifies
a configure header and a set of files to undergo substitution,
mirroring the common usage of Autoconf. Slight improvements have been
made in that all defines are now guarded, various prefixes are allowed
for defines and substitutions, and C specific constructs such as
function prototypes and typedefs are removed to a separate header.
However, this is not the intended future usage. The use of configure
modules by other modules in the same run provides a model for the
suggested interaction of a new build system with the Framework. If a
module requires another, it merely executes a `require()`. For
instance, the PETSc configure module for hypre requires information
about MPI, and thus contains

```python
self.mpi = self.framework.require("config.packages.MPI", self)
```

Notice that passing `self` as the last argument means that the MPI
module will run before the hypre module. Furthermore, we save the
resulting object as `self.mpi` so that we may interrogate it later.
hypre can initially test whether MPI was indeed found using
`self.mpi.found`. When hypre requires the list of MPI libraries in
order to link a test object, the module can use `self.mpi.lib`.

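The pattern of requiring a module and then interrogating the returned object can be tried out without PETSc by using stand-in objects. In the sketch below only the attribute names `found` and `lib` and the module name `config.packages.MPI` come from the text above; `StubFramework`, `HypreLike`, `link_libraries()`, and the flag values are hypothetical.

```python
# Stand-in objects so the snippet runs without PETSc; only `found` and `lib`
# come from the text above, everything else is hypothetical.
from types import SimpleNamespace

class StubFramework:
    def __init__(self, modules):
        self._modules = modules

    def require(self, name, depender):
        # A real framework would also record that `name` runs before `depender`.
        return self._modules[name]

class HypreLike:
    def __init__(self, framework):
        self.framework = framework
        self.mpi = self.framework.require('config.packages.MPI', self)

    def link_libraries(self):
        """Combine this package's library with those reported by the MPI module."""
        if not self.mpi.found:
            raise RuntimeError('MPI is required but was not found')
        return ['-lHYPRE'] + list(self.mpi.lib)

mpi = SimpleNamespace(found=True, lib=['-lmpi'])
hypre = HypreLike(StubFramework({'config.packages.MPI': mpi}))
print(hypre.link_libraries())
```

The dependent module never reads a global variable; everything it needs about MPI is reached through the object returned by `require()`.
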
### Base

`config.base.Configure` is the base class for all configure
objects. It handles several types of interaction. First, it has hooks
that allow the Framework to initialize it correctly. The Framework will
first instantiate the object and call `setupDependencies()`. All
`require()` calls should be made in that method. The Framework will
then call `configure()`. If it succeeds, the object will be marked as
configured. Second, all configure tests should be run using
`executeTest()`, which formats the output and adds metadata for the
log.

Third, all tests that involve preprocessing, compiling, linking, and
running operate through `base`. Two forms of this check are provided
for each operation. The first is an "output" form, which is intended to
provide the status and complete output of the command. The second, or
"check" form, will return a success or failure indication based upon the
status and output. The routines are

```python
outputPreprocess(), checkPreprocess(), preprocess()
outputCompile(),    checkCompile()
outputLink(),       checkLink()
outputRun(),        checkRun()
```

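The relationship between the two forms can be sketched with the standard library alone. The functions `output_run()` and `check_run()` below are hypothetical analogues that run a Python snippet in a subprocess; they are not the `BuildSystem` routines, whose signatures and return values are not shown here.

```python
# Sketch of the "output" / "check" pairing using only the standard library.
# The function names are hypothetical; they are not the BuildSystem routines.
import subprocess
import sys

def output_run(source):
    """'Output' form: return (stdout, stderr, status) of running a Python snippet."""
    proc = subprocess.run([sys.executable, '-c', source],
                          capture_output=True, text=True)
    return proc.stdout, proc.stderr, proc.returncode

def check_run(source):
    """'Check' form: reduce status and output to a single success flag."""
    out, err, status = output_run(source)
    return status == 0 and 'error' not in err.lower()

good = check_run('print("hello")')
bad  = check_run('import sys; sys.exit(3)')
out, err, status = output_run('print("hello")')
print(good, bad, status, out.strip())
```

Layering the check form on top of the output form keeps the full command output available for the log even when the caller only wants a yes or no answer.
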
The language used for these operations is managed with a stack, similar
to Autoconf, using `pushLanguage()` and `popLanguage()`. We also
provide special forms used to check for valid compiler and linker flags,
optionally adding them to the defaults.

```python
checkCompilerFlag(), addCompilerFlag()
checkLinkerFlag(),   addLinkerFlag()
```

You can also use `getExecutable()` to search for executables.

After configure tests have been run, various kinds of output can be
generated. A `#define` statement can be added to the configure header using
`addDefine()`, and `addTypedef()` and `addPrototype()` also put
information in this header file. Using `addMakeMacro()` and
`addMakeRule()` will add make macros and rules to the output makefiles
specified in the framework. In addition we provide `addSubstitution()`
and `addArgumentSubstitution()` to mimic the behavior of Autoconf if
necessary. The object may define a `headerPrefix` member, which will
be prepended, followed by an underscore, to every define which is output
from it. Similarly, a `substPrefix` can be defined which applies to
every substitution from the object. Typedefs and function prototypes are
placed in a separate header in order to accommodate languages such as
Fortran whose preprocessor can sometimes fail at these statements.

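To make the prefixing and guarding concrete, the following sketch formats a single define. The helper `format_define()` and the define name `HAVE_MPI` are hypothetical, and the exact text that `BuildSystem` writes to the configure header may differ.

```python
# Hypothetical helper showing prefixing and guarding; not the BuildSystem code.
def format_define(name, value, header_prefix=''):
    """Return a guarded #define, prepending '<header_prefix>_' when a prefix is set."""
    full_name = f'{header_prefix}_{name}' if header_prefix else name
    return (f'#if !defined({full_name})\n'
            f'#define {full_name} {value}\n'
            f'#endif\n')

text = format_define('HAVE_MPI', 1, header_prefix='PETSC')
print(text)
```

Because the prefix is applied when the define is written, a generic module can be reused by any package, and the guard avoids redefinition if the same macro is already set.
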