Learning the GNU development tools

Copyright (C) 1998 Eleftherios Gkioulekas. All rights reserved.
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the sections entitled "Copying" and "Philosophical issues" are included exactly as in the original, and provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice and the sections entitled "Copying" and "Philosophical issues" may be stated in a translation approved by the Free Software Foundation instead of the original English.
The purpose of this document is to introduce you to the GNU build system and show you how to use it to write good code. It is also meant to serve as a manual for Autotools, a package that provides a variety of additional features. Finally, it discusses peripheral topics such as how to use GNU Emacs as a source code navigator and how to make heads or tails of Texinfo. The intended reader is a software developer who understands his programming languages and wants to learn how to put together his programs the way a typical FSF program is put together.
When we speak of the GNU build system we refer primarily to the following three programs: Autoconf, Automake and Libtool.
The GNU build system has two goals. The first is to simplify the development of portable programs. The second is to simplify the building of programs that are distributed as source code. The first goal is achieved by the automatic generation of a `configure' shell script. The second goal is achieved by the automatic generation of Makefiles and other shell scripts that are typically used in the building process. This way the developer can concentrate on debugging his source code, instead of his overly complex Makefiles. And the installer can compile and install the program directly from the source code distribution by a simple and automatic procedure.
The GNU build system needs to be installed only when you are developing programs that are meant to be distributed. To build a program from distributed source code, you only need make, the compiler, a shell, and occasionally standard Unix utilities like sed, awk, yacc and lex.
Some tasks that are simplified by the GNU build system include:
- Invoking make recursively to build a project that spans many subdirectories. Having simplified this step, the developer is encouraged to organize his source code in a deep directory tree rather than lump everything under the same directory. Developers that use raw make often can't justify the inconvenience of recursive make and prefer to disorganize their source code. With the GNU tools this is no longer necessary.
- Running test suites. The generated makefiles provide a check target such that you can compile and run the entire test suite by running make check.
- Cutting source code distributions, which you can build and verify with make distcheck.
The Autotools package complements the GNU build system by providing a number of additional features, such as the `gpl' and `acmkdir' utilities and the Emacs support packages that we describe later in this manual.
Autotools is still under development and there may still be bugs. At the moment Autotools doesn't do shared libraries, but that will change in the future.
This effort began as my attempt to write a tutorial for Autoconf. It evolved into "Learning Autoconf and Automake". Along the way I developed Autotools to deal with things that annoyed me or to cover needs from my own work. Ultimately I want this document to be both a unified introduction to the GNU build system as well as documentation for the Autotools package.
I believe that knowing these tools and having this know-how is very important, and should not be missed by engineering or science students who will one day go out and do software development for academic or industrial research. Many students are incredibly undertrained in software engineering and write a lot of bad code. This is very sad, because of all people, it is they who have the greatest need to write portable, robust and reliable code. I found from my own experience that moving away from Fortran and C, and towards C++, is the first step in writing better code. The second step is to use the sophisticated GNU build system, and to use it properly, as described in this document. Ultimately, I am hoping that this document will help people get over the learning curve of the second step, so they can be productive and ready to study the reference manuals that are distributed with all these tools.
This manual, of course, is still under construction. When I am done constructing it, a paragraph will be inserted somewhere with the traditional run-down of summaries of each chapter. I write this manual in a highly non-linear way, so while it is under construction you will find that some parts are better developed than others. If you wish to contribute sections of the manual that I haven't written or haven't yet developed fully, please contact me.
Chapters 1, 2, 3 and 4 are okay. Chapter 5 is okay too, but needs a little more work. I removed the other chapters to minimize confusion, but the sources for them are still being distributed as part of the Autotools package for those that found them useful. The other chapters need a lot of rewriting and they would do more harm than good at this point to the unsuspecting reader. Please contact me if you have any suggestions for improving this manual.
This document and the Autotools package have originally been written by Eleftherios Gkioulekas. Many people have further contributed to this effort, directly or indirectly, in various ways. Here is a list of these people. Please help me keep it complete and free of errors.
FIXME: I need to start keeping track of acknowledgements here
The following notice refers to the Autotools package, with which this document is being distributed, and which includes this documentation as well as the source code for utilities like `acmkdir' and for additional Autoconf macros.
The complete GNU build system involves other packages also, such as Autoconf, Automake, Libtool and a few other accessories. These packages are also free software, and you can obtain them from the Free Software Foundation. For details on doing so, please visit their web site at http://www.fsf.org/. Although Autotools has been designed to work with the GNU build system, it is not yet an official part of the GNU project.
The Autotools package is "free"; this means that everyone is free to use it and free to redistribute it on a free basis. The Autotools package is not in the public domain; it is copyrighted and there are restrictions on its distribution, but these restrictions are designed to permit everything that a good cooperating citizen would want to do. What is not allowed is to try to prevent others from further sharing any version of this package that they might get from you.
Specifically, we want to make sure that you have the right to give away copies of the programs that relate to Autotools, that you receive source code or else can get it if you want it, that you can change these programs or use pieces of them in new free programs, and that you know you can do these things.
To make sure that everyone has such rights, we have to forbid you to deprive anyone else of these rights. For example, if you distribute copies of the Autotools-related code, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights.
Also, for our own protection, we must make certain that everyone finds out that there is no warranty for the programs that relate to Autotools. If these programs are modified by someone else and passed on, we want their recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation.
The precise conditions of the licenses for the programs currently being distributed that relate to Autotools are found in the General Public Licenses that accompany it.
When you download an autoconfiguring package, it usually has a filename like `foo-1.0.tar.gz', where the number is a version number. To install it, first you have to unpack the package to a directory someplace:
% gunzip foo-1.0.tar.gz
% tar xf foo-1.0.tar
Then you enter the directory and look for files like `README' or `INSTALL' that explain what you need to do. Almost always this amounts to typing the following commands:
% cd foo-1.0
% configure
% make
% make check
% su
# make install
The `configure' command invokes a shell script, distributed with the package, that configures the package for you automatically. First it probes your system through a set of tests that allow it to determine things it needs to know, and then it uses this knowledge to generate automatically a `Makefile' from a template stored in a file called `Makefile.in'. When you invoke `make' with no argument, it executes the default target of the generated `Makefile'. That target will compile your source code, but will not install it. If your software comes with self-tests, then you can compile and run them by typing `make check'. To install your software, you need to explicitly invoke `make' again with the target `install'. For `make' to work, the directory where the `Makefile' is located must be your current directory.
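To get a feeling for what the template looks like, here is a minimal sketch of a `Makefile.in' fragment (simplified for illustration); the `@...@' markers are placeholders that `configure' substitutes with the values it determined while probing your system:

prefix = @prefix@
CC = @CC@
CFLAGS = @CFLAGS@

Everything else in the template is copied into the generated `Makefile' verbatim.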
During installation, the following files go to the following places:
Executables  -> /usr/local/bin
Libraries    -> /usr/local/lib
Header files -> /usr/local/include
Man pages    -> /usr/local/man/man?
Info files   -> /usr/local/info
The `/usr/local' directory is called the prefix. The default prefix is always `/usr/local', but you can set it to anything you like when you call `configure'. For example, if you want to install the package to your home directory instead, you will have to do this:
% configure --prefix=/home/skeletor
% make
% make check
% make install
The `--prefix' argument tells `configure' where you want to install your package, and `configure' will take that into account and build the proper makefile automatically.
The `configure' script is generated by `autoconf' from the contents of a file called `configure.in'. These files are very easy to maintain, and in this tutorial we will teach you how they work. The `Makefile.in' file is likewise generated by `automake' from a very high-level specification stored in a file called `Makefile.am'. The developer then only needs to maintain `configure.in' and `Makefile.am'. As it turns out, these are so much easier to work with than raw Makefiles, and so much more powerful, that once you get the hang of them you will not want to go back to Makefiles ever again.
In some packages, the `configure' script supports many more options than just `--prefix'. To find out about these options you should consult the files `INSTALL' and `README' that are traditionally distributed with the package, and also look at `configure''s self-documenting facility:
% configure --help
Configure scripts can also report the version of Autoconf that generated them:
% configure --version
The makefiles generated by `automake' support a few more targets for undoing the installation process to various levels. More specifically:
- If `configure' or `make' did it, `make distclean' undoes it.
- If `make' did it, `make clean' undoes it.
- If `make install' did it, `make uninstall' undoes it.
Also, in the spirit of free redistributable code, there are targets for cutting a source code distribution. If you type
% make dist
it will rebuild the `foo-1.0.tar.gz' file that you started with. If you modified the source, the modifications will be included in the distribution (and you should probably change the version number). Before putting a distribution up on FTP, you can test its integrity with:
% make distcheck
This makes the distribution, then unpacks it in a temporary subdirectory and tries to configure it, build it, run the test suite, and check if the installation script works. If everything is okay, then you're told that your distribution is ready.
Once you go through this tutorial, you'll have the know-how you need to develop autoconfiguring programs with such powerful Makefiles.
It is not unusual to be stuck on a system that does not have the GNU build tools installed. If you do have them installed, check to see whether you have the most recent versions. To do that type:
% autoconf --version
% automake --version
% libtool --version
If you don't have any of the above packages, you need to get a copy and install them on your computer. The distribution filenames for the GNU build tools are:
autoconf-2.12.tar.gz
automake-1.3.tar.gz
libtool-1.3.tar.gz
Before installing these packages, however, you will first need to install the following packages from the FSF:
make-3.76.1.tar.gz
m4-1.4.tar.gz
texinfo-3.9.tar.gz
tar-1.12.shar.gz
You will need the GNU versions of make, m4 and tar even if your system already has native versions of these utilities. To check whether you have the GNU versions, see whether they accept the --version flag. If you have proprietary versions of make or m4, rename them and then install the GNU ones.
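For example:

% make --version
% m4 --version
% tar --version

The GNU versions will identify themselves and print a version number; a proprietary version will typically just complain about an unrecognized option.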
You will also need to install Perl, the GNU C compiler, and the TeX typesetter. It is important to note that the end user will only need a decent shell and a working make to build a source code distribution. The developer, however, needs to gather all of these tools in order to create the distribution.
Finally, to install Autotools begin by installing the following additional utilities from FSF:
bash-2.01.tar.gz
sharutils-4.2.tar.gz
and then install:

autotools-X.X.tar.gz
You should be able to obtain a copy of Autotools from the same site from which you received this document.
The installation process for most of these tools is rather straightforward:
% ./configure
% make
% make check
% make install
Most of these tools include documentation which you can build with
% make dvi
Exceptions to the rule are Perl, the GNU C compiler and TeX, which have more complicated installation procedures. However, you are very likely to have these installed already.
The version numbers indicated above were the current ones at the time of this writing. If more recent versions are available, you may want to use them instead.
To get your feet wet we will show you how to do the Hello world program using `autoconf' and `automake'. In the fine tradition of K&R, the C version of the hello world program is:
#include <stdio.h>

main()
{
  printf("Howdy world!\n");
}
Let's say we've put this in a file called `hello.c'. Please place this file under an empty directory, since we will be producing a lot of clutter soon enough! It can be compiled and run directly with the following commands:
% gcc hello.c -o hello
% hello
If you are on a non-GNU variant of Unix, your compiler might be called `cc' but the usage will be pretty much the same.
Now, to do the same thing the `autoconf' and `automake' way, first create the following two files. In `Makefile.am' put:
bin_PROGRAMS = hello
hello_SOURCES = hello.c

and in `configure.in' put:

AC_INIT(hello.c)
AM_INIT_AUTOMAKE(hello,1.0)
AC_PROG_CC
AC_PROG_INSTALL
AC_OUTPUT(Makefile)
Now run `autoconf':
% aclocal
% autoconf
This will create the shell script `configure'. Next, run `automake':
% automake -a
required file "./install-sh" not found; installing
required file "./mkinstalldirs" not found; installing
required file "./missing" not found; installing
required file "./INSTALL" not found; installing
required file "./NEWS" not found
required file "./README" not found
required file "./COPYING" not found; installing
required file "./AUTHORS" not found
required file "./ChangeLog" not found
The first time you do this, you get a spew of messages. It says that `automake' installed a whole bunch of cryptic stuff: `install-sh', `mkinstalldirs' and `missing'. These are shell scripts that are needed by the makefiles that `automake' generates. You don't have to worry about what they do. It also complains that the following files are not around:
INSTALL, COPYING, NEWS, README, AUTHORS, ChangeLog
These files are required to be present by the GNU coding standards, and we will discuss them in detail later. Nevertheless, it is important that these files are at least touched, because when we try to make a test distribution by calling `make distcheck' later on, it will cause a fatal error if any of these files are missing. Eventually, we will suggest that you use the `acmkdir' utility to automatically generate templates for these files which you can edit at will. To make these files exist, now please type:
% touch NEWS README AUTHORS ChangeLog
and to make Automake aware of the existence of these files, please rerun it:
% automake -a
Only when Automake completes without error messages can you assume that the generated `Makefile.in' is correct.
Now you are "all set" in the sense that your package is in the state that will allow you, as well as the end-user to type:
% configure
% make
% hello
to compile and run the hello world program. The idea of course is that the end-user will get the package "all-set" and will not have to have a copy of `automake' and `autoconf' to get it compiled. This is the developer's responsibility. If you really want to install it, go ahead and do it:
# make install
Oops, you changed your mind! Then uninstall it:
# make uninstall
If you didn't use the `--prefix' argument to point to your home directory you may need to be superuser to invoke the install commands.
Please note that in order for the above to work you need to use the GNU `gcc' compiler. Automake depends on `gcc''s ability to compute the dependencies, so without `gcc' this example will not work. If you do have `gcc' installed, then the `configure' script will select it for you.
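You can see the kind of dependency information that `gcc' computes. For the hello world example:

% gcc -MM hello.c
hello.o: hello.c

The `-MM' flag prints a makefile-style dependency line for the source file, omitting system headers such as `stdio.h'.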
If you feel like cutting a distribution, you can do it with:
% make distcheck
This will create a file called `hello-1.0.tar.gz' in the current working directory so that when unpacked it is "all-set" for the user to fire away `configure' and start building. While building that file, Automake includes the precomputed dependencies and disables the dependencies from the end-user makefiles. This way the end-user will not have to have `gcc' to compile the package.
Now pretend that you are the end-user, unpack this file, enter it and compile it all over again:
% gunzip hello-1.0.tar.gz
% tar xf hello-1.0.tar
% cd hello-1.0
% configure
% make
% hello
And this is the full circle.
It is very important that when you run Automake the `configure' file already exists, otherwise Automake will not include it in the distribution when you do `make dist' and the target `distcheck' will fail to build. This means that you should run Autoconf before running Automake. To see this effect go back up to the toplevel directory and do the following:
% rm -f configure
% automake
% make distcheck
You will notice that the `distcheck' target fails. Before you ever cut a distribution and put it up on FTP, you should put content into the files
INSTALL, COPYING, NEWS, README, AUTHORS, ChangeLog
The file `COPYING' has to do with copyright issues, which we will discuss in a separate chapter. The other files are part of the software documentation. The GNU coding standards require that these files be present when you distribute your source code.
In this section we give a summary overview of how you should maintain these files. For more details, please see the GNU coding standards, as published by the FSF.
The `INSTALL' file contains generic installation instructions; a standard copy is installed for you by Automake. If you have something very important to say, it may be best to say it in the `README' file instead; the `INSTALL' file is mostly for the benefit of people who've never installed a GNU package before. However, if your package is very unusual, you may decide that it is best to modify the standard `INSTALL' file or write your own.
The `AUTHORS' file credits the people who wrote or contributed code to your package. A typical one looks like this:

Authors of FOO

See also the files THANKS and ChangeLog

Bart Simpson:
designed and implemented FOO

Principal Skinner:
entire files bob1.cc, bob2.cc, bob3.cc
extensive changes in foo1.cc, foo2.cc, foo3.cc
The `THANKS' file credits everyone else, and it can look like this:

FOO THANKS file

FOO has originally been written by Your Name. Many people have
further contributed to FOO by reporting problems, suggesting
various improvements, or submitting actual code. Here is a list
of these people. Help me keep it complete and free of errors.

Name1 <email address1>
Name2 <email address2>
....

A good habit is to use the `THANKS' file to record people's email addresses instead of having them in many places (like `AUTHORS', `ChangeLog'). This will make it easier to keep them updated.
To maintain the `ChangeLog' file under Emacs, use the command M-x add-change-log-entry-other-window. It may be easier to bind a key (for example f8) to this command by adding:

(global-set-key [f8] 'add-change-log-entry-other-window)

to your `.emacs' file. Then, after having made a modification and while the cursor is still at the place where you made the modification, press f8 and record your entry. Recently Emacs has adopted the ISO 8601 standard for dates, which is:

YYYY-MM-DD

(year-month-day).
A typical `ChangeLog' entry looks like this:
1998-05-17  Eleftherios Gkioulekas  <lf@amath.washington.edu>

  * src/acmkdir.sh: Now acmkdir will put better default content
  to the files README, NEWS, AUTHORS, THANKS

Every entry contains all the changes you made within the period of a day. The most recent changes are listed at the top, and the older changes slowly scroll to the bottom. If you are a vi user, please read my "Emacs for Vi users" document (which I still have not written). Emacs has excellent support for vi emulation, and it comes with a lot more helpful features than mere ChangeLog maintenance, such as editing files over an FTP link, highlighting your language's syntax with colors, and many others.
The `COPYING' file contains the full text of the GNU General Public License, under which your package is distributed. You can generate it with the `gpl' utility:

% gpl -l COPYING
Most of these files are easy to maintain. Later we will show you how to use `acmkdir' to create a new directory for a new distribution. The `acmkdir' utility will provide you with templates for all of these files from which you can begin editing.
If you are just writing programs for your own internal use and you don't plan to redistribute them, you don't really need to worry too much about copyright. However, if you want to give your programs to other people, then copyright issues become relevant. The main reason why `autoconf' and `automake' were developed was to facilitate the distribution of source code by making packages autoconfiguring. So, if you want to use these tools, you probably also want to know something about copyright issues. The following sections will focus primarily on the legal issues surrounding software. For a discussion of the philosophical issues, please see the section "Philosophical issues". At this point, I should point out that I am not a lawyer, this is not legal advice, and I do not represent the opinions of the Free Software Foundation.
When you create a work, like a computer program, or a novel, and so on, you automatically have a set of legal rights called copyright. This means that you have the right to forbid others to use, modify and redistribute your work. By default no-one, except you the owner, is allowed to do any of these things. To relax these restrictions, you need to enter into an agreement with other people individually when they receive a copy from you. Such an agreement is called a License Agreement, which potentially entails rights and obligations to both you and them. It is very important that the License is written by a lawyer, and invoked from every file that is part of the work, in order for that file to fall under the terms of the License. This can be done either by including the full text of the license or by including a legalese reference to the full text of the License. In the free software community, we standardize on using primarily the GNU General Public License, which we will discuss in the next section.
Copyright is transferable. This means that you have the right to transfer most of your rights, which we call copyright, to another person or organization, with the exception of the moral right. The moral right is your right to say that you were the first owner of the work. This transfer is called copyright assignment. The moral right will force others to credit you, even if you have assigned your copyright to them. When a work is being developed by a team, it makes legal sense to transfer the copyright to a single organization that can then coordinate enforcement of the copyright. In the free software community, some people assign their software to the Free Software Foundation. The arrangement is that copyright is transferred to the FSF. The FSF then grants you all the rights back in the form of a License Agreement, and commits itself legally to distribute the work only as free software. If you want to do this, you should contact the FSF for more information. It is not a good idea to assign your copyright to anyone else, unless you know very well that this is what you want to do.
The legal meaning of the word "use", as it refers to software, is peculiar, because software itself is very peculiar compared to all other forms of copyrighted work. For an executable program, "use" means to run it. But for a library, "use" refers to the act of linking it to your program. Copyright also covers derived work. If someone takes your code and modifies it, he is legally bound by the conditions under which you permitted him to do that. Similarly, if he links against a library that you wrote, then although he has a copyright to his code, he is bound by the license agreement that allowed him to do the linking, and he can only license his work in a way that is also consistent with that agreement.
The concept of derived work is actually very slippery ground. Supposedly, what is copyrighted is not the algorithm but the implementation. What this means is that if you take someone's code, fire up an editor and modify it, then the resulting code is derived work. If you take someone's code, understand the idea behind the implementation, and reimplement the idea, then it is not derived work, even if the two end results are remarkably similar, which they will be if the idea is very simple. So the property of a work being derived is not an inherent property of the work itself, but of the process with which you created the work. In practical terms, it's derived work if a judge says so in court.
Because copyright law is by default restrictive, you must explicitly grant permissions to your users to enable them to use your work. You do this, when you grant them a License Agreement. Even though the user never signs the agreement, nothing else grants the user any rights, so merely by using the program, the user is bound by the agreement. With some proprietary software, you are bound by the agreement the minute you break the seal in the packaging to unpack the box that contains the media with your software.
In addition to copyright law, there is another legal beast: the patent law. Unlike copyright, which you own automatically by the act of creating the work, you don't get a patent unless you file an application for it. If approved, the work is published but others must pay you royalties in order to use it in any way.
The problem with patents is that they cover algorithms, and if an algorithm is patented you can't write an implementation for it without a license. What makes it worse is that it is very difficult and expensive to find out whether the algorithms that you use are patented or will be patented in the future. What makes it insane is that the patent office, in its infinite stupidity, has patented algorithms that are very trivial with nothing innovative about them. For example, the use of backing store in a multiprocessing window system, like X11, is covered by patent 4,555,775. In the spring of 1991, the owner of the patent, AT&T, threatened to sue every member of the X Consortium, including MIT. Backing store is the idea that the windowing system saves the contents of all windows at all times. This way, when a window is covered by another window and then exposed again, it is redrawn by the windowing system, and not by the code responsible for the application. Other insane patents include the IBM patent 4,674,040, which covers "cut and paste between files" in a text editor. Recently, a Microsoft-backed company called "Wang" took Netscape to court over a patent that covered "bookmarks"! Wang lost.
Although most of these patents don't stand a chance in court, the cost of litigation is sufficient to terrorize small businesses, non-profit organizations like the Free Software Foundation, as well as individual software developers. For this reason, companies are all too eager to patent whatever they can get away with patenting to protect themselves from being sued by others, further complicating this problem. In practice, you will not be sued unless your code threatens the interests of a big corporation, or if a big corporation's lawyers get too much time on their hands.
Both copyright and patent laws are being used mainly to destroy our freedom. By freedom we refer to three things: the freedom to use software, the freedom to modify it and improve it, and the freedom to redistribute it with the modifications and improvements so that the whole community benefits. When you purchase commercial software, you are not really purchasing the software but a license that gives you limited rights to the software. You never ever get source code, and you are most definitely not granted rights to redistribute it. Finally, your rights to use are also restricted in many ways. There are licenses that require that you use the software on only one computer screen. Other licenses allow only a limited number of users to use the software at the same time. And to top this, some licenses expire after a year and you have to renew them, and other licenses even have illegal terms such as granting the right to use on the condition that you do not compete with the company that produced it!
The opposite to this is free software. We must emphasize that by free we mean freedom, and not price. For example, although Internet Explorer is distributed for free, it is not free software, because you don't get the source code and permission to modify. The price is not as important as the freedom. You may have to pay money to obtain free software. In fact it is okay to sell free software and use some of the funds raised to develop more free software. Obscene pricing and rent-like licensing that requires you to pay thousands of dollars per year is only a consequence of not having freedom. But freedom is about more than just that. It is about being free from other people controlling our computer lives. With free software, the only restrictions that you operate under are the technical limits of the software itself. With non-free software, additional legal restrictions are imposed on you, and these restrictions reduce essentially to other people controlling your life. You find suddenly that you can't grab a copy of your software to install on your laptop. You find that you can't share copies with your students for classroom use. You find that you can not modify it to suit your needs, or maintain it when the owner goes out of business. You find that you can not verify that the software does not contain Trojan horses. Software freedom refers to breaking these walls. Because making software free increases the usefulness of your work, you are encouraged to free your software.
The preferred way to free your software is to distribute it under the terms of the GNU General Public License, also known as the "GPL". To best understand the GPL, you simply have to sit down and read the original document very carefully. In broad strokes, the license grants everyone the rights to use, copy, modify and redistribute the software, and it requires that modified and extended versions be distributed under the same terms, with source code available.
The purpose of the GPL is to use the copyright law to encourage a world in which software is not copyrighted. If copyright didn't cover software, then we would all be free to use, modify and redistribute software, and we would not be able to restrict others from enjoying these freedoms because there would be no law giving anyone such power. One way to grant the freedoms to the users of your software is to revoke your copyright on the software completely. This is called putting your work in the public domain. The problem with this is that it only grants the freedoms. It does not create the reality in which no-one can take these freedoms away from derived works. In fact the copyright law covers by default derived works regardless of whether the original was public domain or copyrighted. By distributing your work under the GPL, you grant the same freedoms, and at the same time you protect these freedoms from hoarders.
The philosophy behind the GPL is that software should not be copyrighted. Copyright is more appropriate in artistic expression, because the purpose of art is to express an idea in an "artistic" manner. The copyright law itself makes a distinction between idea and expression and covers only the expression, not the idea. If any number of writers are asked to write a story based on a given story idea, they will end up writing completely different stories. Even when different writers enliven a historical event, there is still plenty of room to write about it from many different angles. As a result, what promotes the growth of the arts is not the sharing of the expression but the sharing of ideas, and copyright does promote the growth of the arts by protecting the expression but not covering the ideas.
This separation of idea and expression breaks down in the field of software. The established legal thinking is that algorithms are ideas and software is expression. The mathematical way of thinking about algorithms, however, points out that algorithms must be precisely defined, and that can only be done if the algorithms are completely "expressed" in terms of a notational system. In mathematics, equations are such a notational system that can represent a certain set of algorithms. Software is just a different notational system, which can represent all the Turing-computable algorithms. Unlike art, the point of the exercise is to represent the algorithm, not to write a poem about it. As modern research begins to move away from inventing equations, and towards inventing algorithms, it becomes increasingly important that we be free to use, modify and redistribute algorithms, in their software representation, in the same way as mathematical equations.
The GNU GPL is a legal instrument that has been designed to create a safe haven in which software can be written free from copyright law encumbrance. It creates a notion of public good, which is similar to public domain with the difference that if you derive your work from public good, the result must also be a public good.
To apply the GPL to your programs you need to do the following things:
First, attach a notice like the following to every source file:

// Copyright (C) (year) (Your Name) <your@email.address>
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation; either version 2 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

If you have assigned your copyright to an organization, like the Free Software Foundation, then you should probably fashion your copyright notice like this:
// Copyright (C) (year) Free Software Foundation
// (your name) <your@email.address> (initial year)
// etc...

This legal notice works like a subroutine. By invoking it, you invoke the full text of the GNU General Public License, which is too lengthy to include in every source file. Where you see `(year)' you need to list all the years in which you finished preparing a version that was actually released, and which was an ancestor to the current version. This list is not the list of years in which versions were released. It is a list of years in which versions, later released, were completed. If you finish a version on Dec 31, 1997 and release it on Jan 1, 1998, you need to include 1997, but you do not need to include 1998. This rule is complicated, but it is dictated by international copyright law.
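For example, if (hypothetically) you completed released versions in 1996 and 1997, the first line of the notice would read:

// Copyright (C) 1996, 1997 (Your Name) <your@email.address>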
If you are distributing a library, and you want to permit linking proprietary executables against it, you can append an exception notice like the following:

// As a special exception, permission is granted for additional uses of
// the text contained in its release of LIB.
//
// The exception is that, if you link the LIB library with other files
// to produce an executable, this does not by itself cause the
// resulting executable to be covered by the GNU General Public License.
// Your use of that executable is in no way restricted on account of
// linking the LIB library code into it.
//
// This exception does not however invalidate any other reasons why
// the executable file might be covered by the GNU General Public License.
//
// This exception applies only to the code released under the
// name LIB.  If you copy code from other releases into a copy of
// LIB, as the General Public License permits, the exception does
// not apply to the code that you add in this way.  To avoid misleading
// anyone as to the status of such modified files, you must delete
// this exception notice from them.
//
// If you write modifications of your own for LIB, it is your choice
// whether to permit this exception to apply to your modifications.
// If you do not wish that, delete this exception notice.

Make sure to substitute "LIB" with the name of your library. This wording is used by the GNU project in the GUILE library. Similar terms are also used for the GNU C++ Standard Library to allow proprietary developers to use the GNU compiler. It is important to understand that you can not take a GPLed file and slap this additional wording on it without the original author's permission. You can however, at your option, use this wording in files that are your own original work. Also, if a file already invokes such wording, you can at your option retain or discard the wording in derived versions. Finally, if the library is linking any files that do not invoke this wording, then the permissions do not apply to the library as a whole, and if you link it to an executable, then you can not distribute that executable under a proprietary license. The individual files, however, that do contain this notice can be used to form a library that you can link into a proprietary executable, if that is technically possible.
For files where copyleft would be overkill, there is also a very permissive special notice:

// This file is free software; as a special exception the author gives
// unlimited permission to copy and/or distribute it, with or without
// modifications, as long as this notice is preserved.
//
// This program is distributed in the hope that it will be useful, but
// WITHOUT ANY WARRANTY, to the extent permitted by law; without even the
// implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Note that this notice is very liberal. Please use it mainly for cases where you'd find it absurd to claim copyright. Such cases are mainly cases where it is rather hard to distinguish derived work from original work because it's all so similar.
Finally, have your programs print a short copyright notice when they are invoked with the --version command-line flag. For details, please read the GPL. Also see the GNITS coding standards for suggestions on the output of --version and --help, keeping in mind that these are simply suggestions.
You may feel that all this legal crap is stupid, and you just want to write code and get your work done. Every true hacker feels the same way, and resents the fact that to write code nowadays we have to have this sort of legal education. However, unless you apply some license to your program, you are not granting anyone any permissions whatsoever. The `gpl' utility has been written so that maintaining the legalese is less of a bother. Many people write free software with ambiguous copyright terms, making it unusable to people who want to be precise about using the GNU GPL.
Maintaining these legalese notices can be quite painful after some time. To ease the burden, Autotools distributes a utility called `gpl'. This utility will conveniently generate for you all the legal wording you will ever want to use. It is important to know that this application is not approved in any way by the Free Software Foundation. By this I mean that I haven't asked their opinion of it yet.
To create the file `COPYING' type:
% gpl -l COPYING
If you want to include a copy of the GPL in your documentation, you can generate a copy in texinfo format like this:
% gpl -lt gpl.texi
Also, every time you want to create a new file, use the `gpl' to generate the copyright notice. If you want it covered by the GPL use the standard notice. If you want to invoke the Guile-like permissions, then also use the library notice. If you want to grant unlimited permissions, meaning no copyleft, use the special notice. The `gpl' utility takes many different flags to take into account the different commenting conventions.
For C source files, you can generate the standard notice with:

% gpl -c file.c

the library notice with:

% gpl -cL file.c

and the special notice with:

% gpl -cS file.c
For C++ source files, generate the standard notice with:

% gpl -cc file.cc

the library notice with:

% gpl -ccL file.cc

and the special notice with:

% gpl -ccS file.cc
For scripts that use `#' as the comment character (such as Perl, Tcl or shell scripts), generate the standard notice with:

% gpl -sh foo.pl

the library notice with:

% gpl -shL foo.tcl

and the special notice with:

% gpl -shS foo.pl

It does not make sense to use the library notice if no executable is being formed from the file. If, however, you parse that file into C code that is then compiled into object code, then you may consider using the library notice on it instead of the special notice. One of the features provided by Autotools allows you to embed text, such as Tcl scripts, into the executable. In that case, you can use the library notice to license the original text.
For Autoconf macro files, use:

% gpl -m4 file.m4

In general, we exempt autoconf macro files from the GNU GPL because the terms of autoconf also exclude its output, the `configure' script, from the GPL.
For `Makefile.am' files, use:

% gpl -am Makefile.am

These we also exempt from the GPL, because they are so trivial that it makes no sense to add copyleft protection.
If you are using GNU Emacs, then you can insert these copyright notices on demand while you're editing your source code. Autotools bundles two Emacs packages, gpl and gpl-copying, which provide equivalents of the `gpl' command that can be run under Emacs. These packages will be byte-compiled and installed automatically for you while installing Autotools.
To use these packages, in your `.emacs' you must declare your identity by adding the following commands:
(setq user-mail-address "me@here.com")
(setq user-full-name "My Name")
Then you must require the packages to be loaded:
(require 'gpl)
(require 'gpl-copying)
These packages introduce a set of Emacs commands, all of which are prefixed as gpl-. To invoke any of these commands, press M-x, type the name of the command and press enter.
Among them are commands that generate the notices described above for your source code.
There is also a command, provided by gpl-copying, that generates an unnumbered chapter titled "Copying" in the Texinfo documentation of your source code. You will be prompted for the title of your package. That title will substitute the word Autotools as it appears in the corresponding section in this manual.
We begin at the beginning. If you recall, we showed you that the hello world program can be compiled very simply with the following command:
% gcc hello.c -o hello
Even in this simple case you have quite a few options: the `-o' flag sets the name of the output executable, `-g' includes debugging information, `-O3' turns on optimization, and `-Wall' turns on all the warnings. Here are some variations of the above example:
% gcc -g -O3 hello.c -o hello
% gcc -g -Wall hello.c -o hello
% gcc -g -Wall -O3 hello.c -o hello
Compilers have many more flags like that, and some of these flags are compiler dependent.
Now let's consider the case where you have a much larger program, made of source files `foo1.c', `foo2.c', `foo3.c' and header files `header1.h' and `header2.h'. One way to compile the program is like this:
% gcc foo1.c foo2.c foo3.c -o foo
This is fine when you have only a few files to deal with. Eventually, when you have more than a hundred files, this becomes very slow and inefficient, because every time you change one of the `foo' files, all of them have to be recompiled. In large projects this can very well take quite a few minutes, and in very large projects, hours. The solution is to compile each part separately and put them all together at the end, like this:
% gcc -c foo1.c
% gcc -c foo2.c
% gcc -c foo3.c
% gcc foo1.o foo2.o foo3.o -o foo
The first three lines compile the three parts separately and generate output in the files `foo1.o', `foo2.o', `foo3.o'. The fourth line puts it all back together. This way, if you make a change only in `foo1.c', you just do:
% gcc -c foo1.c
% gcc foo1.o foo2.o foo3.o -o foo
This feature of the compiler offers a way out, but it's hardly a solution: you must remember which files you have changed and recompile them by hand. Also, cleaning up the object files with

% rm -f *.o

is dangerous, because you may misspell `o' for `c', or you may do this:

% rm -f * .o

and become depressed.
The `make' utility was written to address these problems.
The `make' utility takes its instructions from a file called `Makefile' in the directory in which it was invoked. The `Makefile' involves four concepts: the target, the dependencies, the rules, and the source. Before we illustrate these concepts with examples, we will explain them in abstract terms for those who are mathematically minded.
The `Makefile' is essentially a collection of logical statements about these four concepts. The content of each statement in English is:
To build this target, first make sure that these dependencies are up to date. If not build them first in the order in which they are listed. Then execute these rules to build this target.
Given a complete collection of such statements it is possible to infer what action needs to be taken to build a specific target, from the source files and the current state of the distribution. By action we mean passing commands to the shell. One reason why this is useful is because if part of the building process does not need to be repeated, it will not be repeated. The `make' program will detect that certain dependencies have not changed and skip the action required for rebuilding their targets. Another reason why this approach is useful is because it is intuitive in human terms. At least, it will be intuitive when we illustrate it to you.
In make-speak each statement has the following form:
target: dependency1 dependency2 ....
        shell-command-1
        shell-command-2
        shell-command-3
where target is the name of the target, and dependency* are the names of the dependencies, which can be either source files or other targets. The shell commands that follow are the commands that need to be passed to the shell to build the target after the dependencies have been built. To be compatible with most versions of make, you must separate these statements with a blank line. Also, the shell-command* lines must be indented with the tab key. Don't forget your tab keys, otherwise make will not work.
When you run make you can pass the target that you want to build as an argument. If you omit arguments and call make by itself, then the first target mentioned in the Makefile is the one that gets built.
The makefiles that Automake generates have the phony target all be the default target. That target will compile your code but not install it. They also provide a few more phony targets, such as install, check, dist, distcheck, clean and distclean, as we have discussed earlier. So Automake is saving you quite a lot of work, because without it you would have to write a lot of repetitive code to provide all these phony targets.
To illustrate these concepts with an example, suppose that you have this situation: an executable `foo' is built from the source files `foo1.c', `foo2.c', `foo3.c', `foo4.c', which between them include the header files `gleep1.h', `gleep2.h' and `gleep3.h'.
To build an executable `foo' you need to build object files and then link them together. We say that the executable depends on the object files and that each object file depends on a corresponding `*.c' file and the `*.h' files that it includes. Then to get to an executable `foo' you need to go through the following dependencies:
foo: foo1.o foo2.o foo3.o foo4.o
foo1.o: foo1.c gleep2.h gleep3.h
foo2.o: foo2.c gleep1.h
foo3.o: foo3.c gleep1.h gleep2.h
foo4.o: foo4.c gleep3.h
The thing on the left-hand side is the target, and the things on the right-hand side are the dependencies. The logic is that to build the thing on the left, you need to build the things on the right first. So, if `foo1.c' changes, `foo1.o' must be rebuilt. If `gleep3.h' changes, then `foo1.o' and `foo4.o' must be rebuilt. That's the game.
What the `Makefile' actually looks like is this:
foo: foo1.o foo2.o foo3.o foo4.o
        gcc foo1.o foo2.o foo3.o foo4.o -o foo

foo1.o: foo1.c gleep2.h gleep3.h
        gcc -c foo1.c

foo2.o: foo2.c gleep1.h
        gcc -c foo2.c

foo3.o: foo3.c gleep1.h gleep2.h
        gcc -c foo3.c

foo4.o: foo4.c gleep3.h
        gcc -c foo4.c
It's the same thing as before except that we have supplemented the rules by which the target is built from the dependencies. Things to note about syntax:
Recall that when you call `make' with no arguments, it builds the first target in the `Makefile':

% make

Therefore, the target for the executable must go at the beginning.
If you omit the tabs or the blank line, then the Makefile will not work. Some versions of `make' have relaxed the blank line rule, since it's redundant, but to be portable, just put the damn blank line in.
You may ask, "how does `make' know what I changed?". It knows because UNIX keeps track of the exact date and time in which every file and directory was modified. This is called the Unix time-stamp. What happens then is that `make' checks whether any of the dependencies is newer than the main target. If so, then the target must be rebuilt. Cool. Now do the target's dependencies have to be rebuilt? Let's look at their dependencies and find out! In this recursive fashion, the logic is untangled and `make' does the Right Thing.
The `touch' command allows you to fake time-stamps and make a file look as if it has been just modified. This way you can force make to rebuild everything by saying something like:
% touch *.c *.h
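Conversely, touching a single file rebuilds only the targets that depend on it. With the example `Makefile' above, you would see something like this:

% touch gleep3.h
% make
gcc -c foo1.c
gcc -c foo4.c
gcc foo1.o foo2.o foo3.o foo4.o -o foo

Only `foo1.o' and `foo4.o' list `gleep3.h' as a dependency, so only they are recompiled before the executable is relinked.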
If you are building more than one executable, then you may want to make a phony target all be the first target:
all: foo1 foo2 foo3
Then calling make will attempt to build all, and that will cause make to loop over `foo1', `foo2', `foo3' and get them built. Of course you can also tell make to build these individually by typing:
% make foo1
% make foo2
% make foo3
Anything that is a target can be an argument. You might even say
% make bar.o
if all you want is to build a certain object file and then stop.
The main problem with maintaining Makefiles, in fact what we mean when we complain about maintaining Makefiles, is keeping track of the dependencies. The `make' utility will do its job if you tell it what the dependencies are, but it won't figure them out for you. There's a good reason for this of course, and herein lies the wisdom of Unix. To figure out the dependencies, you need to know something about the syntax of the files that you are working with! And syntax is the turf of the compiler, not `make'. The GNU compiler honors this responsibility, and if you type:
% gcc -MM foo1.c
% gcc -MM foo2.c
% gcc -MM foo3.c
% gcc -MM foo4.c
it will compute the dependencies and print them on standard output. Even so, it is clear that something else is needed to take advantage of this feature, where available, to generate a correct `Makefile' automatically. This is the main problem, for which the only work-around is to use another tool that generates Makefiles.
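For instance, given the dependencies in our example, the first of these commands would print something like:

foo1.o: foo1.c gleep2.h gleep3.h

which is exactly the dependency line we wrote by hand earlier.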
The other big problem comes about with situations in which a software project spans many subdirectories. Each subdirectory needs to have a Makefile, and every Makefile must have a way to make sure that `make' gets called recursively to handle the subdirectories. This can be done, but it is quite cumbersome and annoying. Some programmers may choose to do without the advantages of a well-organized directory tree for this reason.
There are a few other little problems, but they have for the most part solutions within the realm of the `make' utility. One such problem is that if you move to a system where the compiler is called `cc' instead of `gcc', you need to edit the Makefile everywhere. Here's a solution:
CC = gcc
#CFLAGS = -Wall -g -O3
CFLAGS = -Wall -g

foo: foo1.o foo2.o foo3.o foo4.o
        $(CC) $(CFLAGS) foo1.o foo2.o foo3.o foo4.o -o foo

foo1.o: foo1.c gleep2.h gleep3.h
        $(CC) $(CFLAGS) -c foo1.c

foo2.o: foo2.c gleep1.h
        $(CC) $(CFLAGS) -c foo2.c

foo3.o: foo3.c gleep1.h gleep2.h
        $(CC) $(CFLAGS) -c foo3.c

foo4.o: foo4.c gleep3.h
        $(CC) $(CFLAGS) -c foo4.c
Now the user just has to modify the first line, where the macro-variable `CC' is defined, and whatever he puts there gets substituted in the rules below. The other macro-variable, `CFLAGS', can be used to turn optimization on and off. Putting a `#' mark at the beginning of a line makes the line a comment, and the line is ignored.
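A pleasant side effect of using macro-variables is that the installer can also override them when invoking `make', without editing the `Makefile' at all. For example:

% make CC=cc CFLAGS="-g -O"

Variables assigned on the command line take precedence over the assignments inside the `Makefile'.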
Another problem is that there is a lot of redundancy in this makefile. Every object file is built from the source file the same way. Clearly there should be a way to take advantage of that right? Here it is:
CC = gcc
CFLAGS = -Wall -g

.SUFFIXES: .c .o

.c.o:
        $(CC) $(CFLAGS) -c $<
.o:
        $(CC) $(CFLAGS) $< -o $@

foo: foo1.o foo2.o foo3.o foo4.o

foo1.o: foo1.c gleep2.h gleep3.h
foo2.o: foo2.c gleep1.h
foo3.o: foo3.c gleep1.h gleep2.h
foo4.o: foo4.c gleep3.h
Now this is more abstract, and has some cool punctuation. The `.SUFFIXES' line tells `make' that files that are possible targets fall under three categories: files that end in `.c', files that end in `.o', and files that end in nothing. Now let's look at the next line:
.c.o:
        $(CC) $(CFLAGS) -c $<
This line is an abstract rule that tells `make' how to make `.o' files from `.c' files. The punctuation marks have the following meanings: `$<' stands for the file that the target depends on (here, the `.c' file), and `$@' stands for the name of the target itself.
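For instance, when this abstract rule is used to build `foo1.o', `$<' expands to `foo1.c', `$@' expands to `foo1.o', and with the macros defined above the command that actually runs is:

gcc -Wall -g -c foo1.c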
In the same spirit, the next rule tells how to make the executable file from the `.o' files.
.o:
        $(CC) $(CFLAGS) $< -o $@
All that has to follow the abstract rules are the dependencies, without the specific rules! If you are using `gcc', these dependencies can be generated automatically, and then you can include them from your Makefile. Unfortunately this approach doesn't work with all of the other compilers, and there is no standard way to include another file into Makefile source. Of course, what we will point out eventually is that `automake' can take care of the dependencies for you.
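If you happen to know that GNU make will always be available, one common (but non-portable) trick is to collect the computed dependencies in a file and include that file from the `Makefile'. A sketch:

% gcc -MM *.c > .depend

and then, somewhere in the `Makefile', the GNU-make-specific line:

include .depend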
The Makefile in our example can be enhanced in the following way:
CC = gcc
CFLAGS = -Wall -g
OBJECTS = foo1.o foo2.o foo3.o foo4.o
PREFIX = /usr/local

.SUFFIXES: .c .o

.c.o:
        $(CC) $(CFLAGS) -c $<
.o:
        $(CC) $(CFLAGS) $< -o $@

foo: $(OBJECTS)

foo1.o: foo1.c gleep2.h gleep3.h
foo2.o: foo2.c gleep1.h
foo3.o: foo3.c gleep1.h gleep2.h
foo4.o: foo4.c gleep3.h

clean:
        rm -f $(OBJECTS)

distclean:
        rm -f $(OBJECTS) foo

install:
        rm -f $(PREFIX)/bin/foo
        cp foo $(PREFIX)/bin/foo
We've added three fake targets called `clean', `distclean' and `install', and introduced a few more macro-variables to control redundancy. I am sure some bells are ringing now. When you type:
% make
the first target (which is `foo') gets built, and your program compiles. When you type
% make install
since there is no file called `install' anywhere, the rule there is executed, which has the effect of copying the executable over to `/usr/local/bin'. To get rid of the object files, type
% make clean
and to get rid of the executable as well
% make distclean
Such fake targets are called phony targets in makefile parlance. As you can see, the `make' utility is quite powerful, and there's a lot it can do. If you want to become a `make' wizard, all you need to do is read the GNU Make Manual and waste a lot of time spiffying up your makefiles instead of getting your programs debugged. The GNU Make manual is extremely well written, and will make for enjoyable reading. It is also free, unlike "published" books.
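One refinement worth knowing about: GNU make and most modern implementations let you declare phony targets explicitly with the special target `.PHONY', which protects you if a file named, say, `clean' ever appears in your directory:

.PHONY: clean distclean install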
The reason we went to the trouble to explain `make' is because it is important to understand what happens under the hood, and because in many cases `make' is a fine thing to use. It works for simple programs. And it works for many other things, such as formatting TeX documents and so on.
As we evolve to more and more complicated projects, there are two things that we need: a more high-level way of specifying what you want to build, and a way of automatically determining the values that you want to assign to things like CFLAGS, PREFIX and so on. The first thing is what `automake' does; the second thing is what `autoconf' does.
There's one last thing that we need to mention before moving on, and that's libraries. As you recall, to put together an executable, we make a whole bunch of `.o' files and then put them all together. It just so happens in many cases that a set of `.o' files together forms a cohesive unit that can be reused in many applications, and you'd like to use them in other programs. To make things simpler, what you do is put the `.o' files together and make a library.
A library is usually composed of many `.c' files and hopefully only one, or at most two, `.h' files. It's good practice to minimize the use of header files and put all your gunk in one header file, because this way the user of your library won't have to type an endless stream of `#include' directives for every `.c' file he writes that depends on the library. Be considerate. The user might be you! Header files fall into two categories: public and private. The public header files must be installed at `/prefix/include' whereas the private ones are only meant to be used internally. The public header files export documented library features to the user. The private header files export undocumented library features that are to be used only by the developer of the library, and only for the purpose of developing the library.
Suppose that we have a library called `barf' that's made of the following files:
`barf.h', `barf1.c', `barf2.c', `barf3.c'
In real life, the names should be more meaningful than that, but we're being general here. To build it, you first make the `.o' files:
% gcc -c barf1.c
% gcc -c barf2.c
% gcc -c barf3.c
and then you do this magic:
% rm -f libbarf.a
% ar cru libbarf.a barf1.o barf2.o barf3.o
This will create a file `libbarf.a' from the object files `barf1.o', `barf2.o', `barf3.o'.
On most Unix systems, the library won't work unless it's "blessed" by a program called `ranlib':
% ranlib libbarf.a
On other Unix systems, you might find that `ranlib' doesn't even exist because it's not needed.
The reason for this is historical. Originally `ar' was meant to be used merely for packaging files together. The more well-known program `tar' is a descendant of `ar' that was designed to handle making such archives on a tape device. Now that tape devices are more or less obsolete, `tar' is playing the role that was originally meant for `ar'. As for `ar', way back, some people thought to use it to package `*.o' files. However, the linker wanted a symbol table to be passed along with the archive, for the convenience of the people writing the code for the linker, and perhaps also for efficiency. So the `ranlib' program was written to generate that table and add it to the `*.a' file. Then some Unix vendors thought that if they incorporated `ranlib' into `ar', users wouldn't have to worry about forgetting to call `ranlib'. So they provided `ranlib', but it did nothing. Some of the more evil ones dropped it altogether, breaking many people's makefiles that tried to run `ranlib'. In the next chapter we will show you that Autoconf and Automake will automatically determine for you how to deal with `ranlib' in a portable manner.
Anyway, once you have a library, you put the header file `barf.h' under `/usr/local/include' and the `libbarf.a' file under `/usr/local/lib'. If you are in the development phase, you put them somewhere else, under a prefix other than `/usr/local'.
Now, how do we use libraries? Well, suppose that a program uses the barf function defined in the barf library. Then a typical program might look like:
// -* main.c *-
#include <stdio.h>
#include <barf.h>

main()
{
 printf("This is barf!\n");
 barf();
 printf("Barf me!\n");
}
If the library was installed in `/usr/local' then you can compile like this:
% gcc -c main.c
% gcc main.o -o main -lbarf
Of course, if the library was installed under some `/prefix' other than `/usr/local' or `/usr', then you are in trouble. Now you have to do it this way:
% gcc -I/prefix/include -c main.c
% gcc main.o -o main -L/prefix/lib -lbarf
The `-I' flag tells the compiler where to find any extra header files (like `barf.h'), and the `-L' flag tells the compiler where to find any extra libraries (like `libbarf.a'). The `-lbarf' flag tells the compiler to link in the `libbarf.a' library, together with whathaveyou, to produce the executable.
If the library hasn't been installed yet, and is present in the same directory as the object file `main.o' then you can link them by passing its filename instead:
% gcc main.o libbarf.a -o main
Please link libraries with their full file names if they haven't yet been installed under the prefix directory, and reserve using the `-l' flag only for libraries that have already been installed. This is very important. When you use Automake, it helps Automake keep the dependencies straight. And when you use shared libraries, it is absolutely essential.
Also, please pay attention to the order in which you link your libraries. When the linker links a library, it does not embed the entire library into the executable, but only the symbols that are needed from it. In order for the linker to know what symbols are really needed from any given library, it must have already parsed all the other libraries and object files that depend on that library! This implies that you first link your object files, then you link the higher-level libraries, then the lower-level libraries. If you are the author of the libraries, you must write them in such a manner that the dependency graph of your libraries is a tree. If two libraries depend on each other bidirectionally, then you may have trouble linking them in. This suggests that they should be one library instead!
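To illustrate the ordering rule with a sketch, suppose (hypothetically) that `libhigh.a' calls functions defined in `liblow.a'. Then the link line must mention `libhigh.a' first:

% gcc main.o -o main -lhigh -llow

Reversing the two `-l' flags may make the linker report undefined symbols, because when it scanned `liblow.a' it did not yet know that `libhigh.a' would need some of its symbols.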
While we are at the topic, when you compile ordinary programs like the hello world program what really goes on behind the scenes is this:
% gcc -c hello.c
% gcc -o hello hello.o -lc
This links in the C system library `libc.a'. The standard include files that you use, such as `stdio.h', `stdlib.h' and whathaveyou, all refer to various parts of these libraries. These libraries get linked in by default when the `-o' flag is present. Note that other C compilers may call their system libraries something else. For this reason the corresponding flags are assumed, and you don't have to supply them.
The catch is that there are many functions that you think of as standard that are not included in the `libc.a' library. For example all the math functions that are declared in `math.h' are defined in a library called `libm.a' which is not linked by default. So if the hello world program needed the math library you should be doing this instead:
% gcc -c hello.c
% gcc -o hello hello.o -lm
On some old Linux systems it used to be required that you also link a `libieee.a' library:
% gcc -o hello hello.o -lieee -lm
More problems of this sort occur when you use more esoteric system calls like sockets. Some systems require you to link in additional system libraries such as `libbsd.a', `libsocket.a', `libnsl.a'. Also, if you are linking Fortran and C code together, you must also link the Fortran run-time libraries. These libraries have non-standard names and depend on the Fortran compiler you use. Finally, a very common problem is encountered when you are writing X applications. The X libraries and header files like to be placed in non-standard locations, so you must provide system-dependent `-I' and `-L' flags so that the compiler can find them. Also, the most recent version of X requires you to link in some additional libraries on top of `libX11.a', and some rare systems require you to link some additional system libraries to access networking features (recall that X is built on top of the sockets interface and is essentially a communications protocol between the computer running the program and the computer that controls the screen on which the X program is displayed). Fortunately, Autoconf can help you deal with all of this. We will cover these issues in more detail in subsequent chapters.
Because it is necessary to link system libraries to form an executable, under copyright law the executable is a derived work of the system libraries. This means that you must pay attention to the license terms of these libraries. The GNU `libc' library is under the LGPL license, which allows you to link and distribute both free and proprietary executables. The `stdc++' library is also under terms that permit the distribution of proprietary executables. The `libg++' library however only permits you to build free executables. If you are on a GNU system, including Linux-based GNU systems, the legalese is pretty straightforward. If you are on a proprietary Unix system, you need to be more careful. The GNU GPL does not allow GPLed code to be linked against proprietary libraries. Because the system libraries on Unix systems are proprietary, their terms may not allow you to distribute executables derived from them. In practice they do, however, since proprietary Unix systems do want to attract proprietary applications. In the same spirit, the GNU GPL also makes an exception and explicitly permits the linking of GPL code with proprietary system libraries, provided that said libraries are system libraries. This includes proprietary `libc.a' libraries, the `libdxml.a' library in Digital Unix, proprietary Fortran system libraries like `libUfor.a', and the X11 libraries.
To begin, let's review the simplest example, the hello world program:
#include <stdio.h>

main()
{
 printf("Howdy, world!\n");
}
bin_PROGRAMS = hello
hello_SOURCES = hello.c
AC_INIT(hello.c)
AM_INIT_AUTOMAKE(hello,1.0)
AC_PROG_CC
AC_PROG_INSTALL
AC_OUTPUT(Makefile)
The language of `Makefile.am' is a logic language. There is no explicit statement of execution, only a statement of relations from which execution is inferred. On the other hand, the language of `configure.in' is procedural: each line of `configure.in' is a command that is executed.
Seen in this light, here's what the `configure.in' commands shown do:
The AC_INIT command initializes the configure script. It must be passed as argument the name of one of the source files. Any source file will do.

AM_INIT_AUTOMAKE performs some further initializations that are related to the fact that we are using `automake'. If you are writing your `Makefile.in' by hand, then you don't need to call this command. The two comma-separated arguments are the name of the package and the version number.

AC_PROG_CC checks to see which C compiler you have.

AC_PROG_INSTALL checks to see whether your system has a BSD-compatible install utility. If not, then it uses `install-sh', which `automake' will install at the root of your package directory if it's not there yet.

AC_OUTPUT tells the configure script to generate `Makefile' from `Makefile.in'.
The `Makefile.am' is more obvious. The first line specifies the name of the program we are building. The second line specifies the source files that compose the program.
For now, as far as `configure.in' is concerned you need to know the following additional facts:
If you are building libraries, you must also call the AC_PROG_RANLIB command.

If your source code is spread over many directories and `make' must call itself recursively, you must also call the AC_PROG_MAKE_SET command.

If there are many directory levels, each carrying its own `Makefile', then all of these Makefiles must be listed in the AC_OUTPUT statement like this:
AC_OUTPUT(Makefile \
dir1/Makefile \
dir2/Makefile \
)

Note that the backslashes are not needed if you are using the bash shell. For portability reasons, however, it is a good idea to include them.
As we explained before, to build this package you need to execute the following commands:
% aclocal
% autoconf
% touch README AUTHORS NEWS ChangeLog
% automake -a
% configure
% make
The first four commands are for the maintainer only. When the user unpacks a distribution, he should be able to start from `configure' and move on.
The `aclocal' command is needed because `configure.in' invokes the AM_INIT_AUTOMAKE macro, which is not part of the standard `autoconf' macros. For this reason, its definition needs to be placed in `aclocal.m4'. If you call `aclocal' with no arguments, then it will generate the appropriate `aclocal.m4' file. Later we will show you how to use `aclocal' to also install your own `autoconf' macros.
If you are curious you can take a look at the generated `Makefile'. It looks like gorilla spit but it will give you an idea of how one gets there from the `Makefile.am'.
The `configure' script is an information gatherer. It finds out things about your system. That information is given to you in two ways. One way is by defining C preprocessor macros that you can test for directly in your source code with preprocessor directives. This is done by passing `-D' flags to the compiler. The other way is by making certain variables defined at the `Makefile.am' level. This way you can, for example, have the configure script find out how a certain library is linked, export it as a `Makefile.am' variable, and use that variable in your `Makefile.am'. Also, through certain special variables, `configure' can control how the compiler is invoked by the `Makefile'.
As you may have noticed, the `configure' script in the previous example defines two preprocessor macros that you can use in your code: PACKAGE and VERSION. As you become a power-user of `autoconf' you will get to define even more such macros. If you inspect the output of `make' during compilation, you will see that these macros get defined by passing `-D' flags to the compiler, one for each macro. When there are too many of these flags getting passed around, this can cause two problems: it can make the `make' output hard to read, and more importantly it can hit the buffer limits of various braindead implementations of `make'. To work around this problem, an alternative approach is to define all these macros in a special header file and include it in all the sources.
A hello world program using this technique looks like this:
AC_INIT
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(hello,0.1)
AC_PROG_CC
AC_PROG_INSTALL
AC_OUTPUT(Makefile)
bin_PROGRAMS = hello
hello_SOURCES = hello.c
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <stdio.h>

main()
{
 printf("Howdy, pardner!\n");
}
Note that we call a new macro in `configure.in': AM_CONFIG_HEADER. Also, we include the configuration file conditionally with the following three lines:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
It is important to make sure that the `config.h' file is the first thing that gets included. Now do the usual routine:
% aclocal
% autoconf
% touch NEWS README AUTHORS ChangeLog
% automake -a
Automake will give you an error message saying that it needs a file called `config.h.in'. You can generate such a file with the `autoheader' program. So run:
% autoheader
Symbol `PACKAGE' is not covered by acconfig.h
Symbol `VERSION' is not covered by acconfig.h
Again, you get error messages. The problem is that `autoheader' is bundled with the `autoconf' distribution, not the `automake' distribution, and consequently doesn't know how to deal with the PACKAGE and VERSION macros. Of course, if `configure' defines a macro, there's nothing to know. On the other hand, when a macro is not defined, then there are at least two possible defaults:
#undef PACKAGE
#define PACKAGE 0
The `autoheader' program here complains that it doesn't know the defaults for the PACKAGE and VERSION macros. To provide the defaults, create a new file `acconfig.h':
#undef PACKAGE
#undef VERSION
and run `autoheader' again:
% autoheader
At this point you must run `autoconf' again, so that it takes into account the presence of `acconfig.h':
% aclocal
% autoconf
Now you can go ahead and build the program:
% configure
% make
Computing dependencies for hello.c...
echo > .deps/.P
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c hello.c
gcc -g -O2 -o hello hello.o
Note that now, instead of multiple `-D' flags, there is only one such flag passed: `-DHAVE_CONFIG_H'. Also, appropriate `-I' flags are passed to make sure that `hello.c' can find and include `config.h'.
To test the distribution, type:
% make distcheck
......
========================
hello-0.1.tar.gz is ready for distribution
========================
and it should all work out.
The `config.h' files go a long way back in history. In the past, there used to be packages where you would have to manually edit `config.h' files and adjust the macros you wanted defined by hand. This made these packages very difficult to install because they required intimate knowledge of your operating system. For example, it was not unusual to see a comment saying "if your system has a broken vfork, then define this macro". How the hell are you supposed to know if your system's vfork is broken?? With auto-configuring packages all of these details are taken care of automatically, shifting the burden from the user to the developer, where it belongs.
Normally in the `acconfig.h' file you put statements like
#undef MACRO
#define MACRO default
These values are copied over to `config.h.in' and are supplemented with additional defaults for C preprocessor macros that get defined by native `autoconf' macros like AC_CHECK_HEADERS, AC_CHECK_FUNCS, AC_CHECK_SIZEOF, AC_CHECK_LIB.
If the file `acconfig.h' contains the string @TOP@, then all the lines before the string will be included verbatim in `config.h' before the custom definitions. Also, if the file `acconfig.h' contains the string @BOTTOM@, then all the lines after the string will be included verbatim in `config.h' after the custom definitions. This allows you to include further preprocessor directives that are related to configuration. Some of these directives may use the custom definitions to conditionally issue further preprocessor directives. Due to a bug in some versions of autoheader, if the strings @TOP@ and @BOTTOM@ do appear in your `acconfig.h' file, then you must make sure that there is at least one line appearing before @TOP@ and one line after @BOTTOM@, even if it has to be a comment. Otherwise, autoheader may not work correctly.
With `autotools' we distribute a utility called `acconfig' which will build `acconfig.h' automatically. By default it will always make sure that
#undef PACKAGE
#undef VERSION
are there. Additionally, if you install macros that are `acconfig' friendly
then `acconfig' will also install entries for these macros.
The `acconfig' program may be revised in the future, and perhaps it might be eliminated. There is an unofficial patch to Autoconf that will automate the maintenance of `acconfig.h', eliminating the need for a separate program. I am not yet certain if that patch will be part of the official next version of Autoconf, but I very much expect it to be. Until then, if you are interested, see:

http://www.clark.net/pub/dickey/autoconf/autoconf.html
This situation creates a bit of a dilemma about whether I should document and encourage `acconfig' in this tutorial or not. I believe that the Autoconf patch is a superior solution. However, since I am not the one maintaining Autoconf, my hands are tied. For now, let's say that if you confine yourself to using only the macros provided by `autoconf', `automake' and `autotools', then `acconfig.h' will be completely taken care of for you by `acconfig'. In the future, I hope that `acconfig.h' will be generated by `configure' and be the sole responsibility of Autoconf.
You may be wondering whether it is worth using `config.h' files in the programs you develop if there aren't all that many macros being defined. My personal recommendation is yes. Use `config.h' files, because in the future your `configure' might need to define even more macros. So get started on the right foot from the beginning. Also, it is nice to just have a `config.h' file lying around, because you can keep all your configuration-specific C preprocessor directives in one place. In fact, if you are one of these people writing peculiar system software where you get to #include 20 header files in every single source file you write, you can have them all thrown into `config.h' once and for all.
In the next chapter we will tell you about the LF macros that get distributed with `autotools' and this tutorial. These macros do require you to use the `config.h' file. The bottom line is: `config.h' is your friend; trust the `config.h'.
FIXME: write about VPATH builds and how to modify optimization
In software engineering, people start from a precise, well-designed specification and proceed to implementation. In research, the specification is fluid and immaterial, and the goal is to be able to solve a slightly different problem every day. Having the flexibility to go from variation to variation with the least amount of fuss is the name of the game. By fuss we refer to debugging, testing and validation. Once you have code that you know gives the right answer to a specific set of problems, you want to be able to move on to a different set of similar problems while reinventing, debugging and testing as little as possible. These are the two distinct situations that computer programmers get to confront in their lives.
Software engineers can take good care of themselves in both situations. It's part of their training. However, people whose specialty is the scientific problem and not software engineering must confront the harder of the two cases, the second one, with very little training in software engineering. As a result, they develop code that's clumsy in implementation, clumsy in usage, and whose only redeeming quality is that it gives the right answer. This way they do get the work of the day done, but they leave behind no legacy to do the work of tomorrow. No general-purpose tools, no documentation, no reusable code.
The key to better software engineering is to focus away from developing monolithic applications that do only one job, and to focus on developing libraries. One way to think of a library is as a program with multiple entry points. Every library you write becomes a legacy that you can pass on to other developers. Just as in mathematics you develop little theorems and use them to hide the complexity in proving bigger theorems, in software engineering you develop libraries to take care of low-level details once and for all, so that they are out of the way every time you make a different implementation for a variation of the problem.
On a higher level, you still don't create just one application; you create many little applications that work together. The centralized all-in-one approach is, in my experience, far less flexible than the decentralized approach, in which a set of applications work together as a team to accomplish the goal. In fact, this is the fundamental principle behind the design of the Unix operating system. Of course, it is still important to glue together the various components to do the job. This you can do either with scripting or by building a suite of specialized monolithic applications derived from the underlying tools.
The name of the game is like this:
Break down the program into parts. And the parts into smaller parts, until you get down to simple subproblems that can be easily tested, and from which you can construct variations of the original problem. Implement each one of these as a library, write test code for each library, and make sure that the library works. It is very important for your library to have a complete test suite: a collection of programs that are supposed to run silently and return normally (exit(0);) if they execute successfully, and return abnormally (assert(false); exit(1);) if they fail. The purpose of the test suite is to detect bugs in the library, and to convince you, the developer, that the library works. The best time to write a test program is as soon as it is possible! Don't be lazy. Don't just keep throwing in code after code after code. The minute there is enough new code in there to put together some kind of test program, just do it! I cannot emphasize that enough. When you write new code you have the illusion that you are producing work, only to find out tomorrow that you need an entire week to debug it. As a rule, internalize the reality that you know you have produced new work every time you write a working test program for the new features, and not a minute before.
Another time when you should definitely write a test is when you find a bug while ordinarily using the library. Then, before you even fix the bug, write a test program that detects the bug. Then go fix it. This way, as you add new features to your libraries, you have insurance that they won't reawaken old bugs.
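As a sketch, a test program in this style might look like the following; `barf_add' is a hypothetical library function, not part of any real library:

/* test1.c -- test a hypothetical library function barf_add() */
#include <assert.h>
#include <stdlib.h>

extern int barf_add(int a, int b);  /* hypothetical; provided by the library */

int main(void)
{
 assert(barf_add(2, 2) == 4);  /* failure calls abort(): abnormal return */
 exit(0);                      /* success: silent, normal return */
}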
Please keep documentation up to date as you go. The best time to write documentation is right after you get a few new test programs working. You might feel that you are too busy to write documentation, but the truth of the matter is that you will always be too busy. After long hours debugging these seg faults, think of it as a celebration of triumph to fire up the editor and document your brand-spanking new cool features.
Please make sure that computational code is completely separated from I/O code, so that someone else can reuse your computational code without being forced to also follow your I/O model. Then write programs that invoke your collection of libraries to solve various problems. By dividing and conquering the problem library by library, with a test suite for each step along the way, you can write good and robust code. Also, if you are developing numerical software, please don't expect that other users of your code will be getting a high while entering data for your input files. Instead, write an interactive utility that will allow users to configure input files in a user-friendly way. Granted, this is too much work in Fortran. Then again, you do know more powerful languages, don't you?
Examples of useful libraries are things like linear algebra libraries, general ODE solvers, interpolation algorithms, and so on. As a result you end up with two packages. A package of libraries complete with a test suite, and a package of applications that invoke the libraries. The package of libraries is well-tested code that can be passed down to future developers. It is code that won't have to be rewritten if it's treated with respect. The package of applications is something that each developer will probably rewrite since different people will probably want to solve different problems. The effect of having a package of libraries is that C++ is elevated to a Very High Level Language that's closer to the problems you are solving. In fact a good rule of thumb is to make the libraries sufficiently sophisticated so that each executable that you produce can be expressed in one source file. All this may sound like common sense, but you will be surprised at how many scientific developers maintain just one does-everything-program that they perpetually hack until it becomes impossible to maintain. And then you will be even more surprised when you find that some professors don't understand why a "simple mathematical modification" of someone else's code is taking you so long.
Every library must have its own directory and `Makefile'. So a library package will have many subdirectories, each directory being one library. And perhaps, if you have too many of them, you might want to group them even further down. Then there are the applications. If you've done everything right, there should be enough stuff in your libraries to enable you to have one source file per application, which means that all the source files can probably go under the same directory.
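As an illustration, the directory layout of such a package might look like this (all the names here are made up):

package/
  src/
    libfoo1/     one library per directory, each with its own Makefile
    libfoo2/
    apps/        the applications, one source file per application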
Very often you will come to a situation where there's something that your libraries to date can't do, so you implement it and stick it into the source file for the application. If you find yourself cutting and pasting that implementation into other source files, then this means that you have to put it in a library somewhere. And if it doesn't belong in any library you've written so far, maybe it belongs in a new library. When you are in a deadline crunch, there's a tendency not to do this, since it's easier to cut and paste. The problem is that if you don't take action right then, eventually your code will degenerate into a hard-to-use mess. Keeping the entropy down is something that must be done on a daily basis.
Finally, a word about the age-old issue of language choice. The GNU coding standards encourage you to program in C and avoid using languages other than C, such as C++ or Fortran. The main advantage of C over C++ and Fortran is that it produces object files that can be linked by any C or C++ compiler. In contrast, C++ object files can only be linked by the compiler that produced them. As for Fortran, aside from the fact that Fortran 90 and 95 have no free compilers, it is not trivial to mix Fortran 77 with C/C++, so it makes no sense to invite all that trouble without a compelling reason. Nevertheless, my suggestion is to code in C++. The main benefit you get with C++ is robustness. Having constructors, destructors and references can go a long way towards helping you avoid memory errors, if you know how to make them work for you.
Now we get into the gory details of software organization. I'll tell you one way to do it. This is advice, not divine will. It's simply a way that works well in general, and a way that works well with `autoconf' and `automake' in particular.
The first principle is to maintain the package of libraries separately from the package of applications. This is not an ironclad rule. In software engineering, where you have a crystal-clear specification, it makes no sense to keep these two separate. I have found from experience that it makes a lot more sense in research. Either of these two packages must have a toplevel directory under which live all of its guts. Now what do the guts look like?
First of all you have the traditional set of information files that we described in Chapter 1:
README, AUTHORS, NEWS, ChangeLog, INSTALL, COPYING
You also have the following subdirectories:

`src': The source code that gets compiled, possibly organized in further subdirectories.

`include': You can have the `configure' script link all public header files in all the subdirectories under `src' to this directory. This way it will only be necessary to pass one `-I' flag to test suites that want to access the include files of other libraries in the distribution. We will discuss this later.

`doc': The documentation of the package.

`m4': Any `autoconf' macros that the package distributes.
Together with these subdirectories you need to put a `Makefile.am' and a `configure.in' file. I also suggest that you put a shell script, which you can call `reconf', that contains the following:
#!/bin/sh
rm -f config.cache
rm -f acconfig.h
touch acconfig.h
aclocal -I m4
autoconf
autoheader
acconfig
automake -a
exit
This will generate `configure' and `Makefile.in', and needs to be called whenever you change a `Makefile.am' or a `configure.in', as well as when you change something under the `m4' directory. It will also call `acconfig', which automatically generates `acconfig.h', and call `autoheader' to make `config.h.in'.
The `acconfig' utility is part of `autotools', and if you are maintaining `acconfig.h' by hand, then you want to use this script instead:
#!/bin/sh
rm -f config.cache
aclocal -I m4
autoconf
autoheader
automake -a
exit
At the toplevel directory, you need to put a `Makefile.am' that will tell the computer that all the source code is under the `src' directory. The way to do it is to put the following lines in `Makefile.am':
EXTRA_DIST = reconf
SUBDIRS = m4 doc src
The EXTRA_DIST line tells `automake' that the `reconf' script is part of the distribution and must be included when you do make dist. The SUBDIRS line tells `automake' that the rest of the distribution is in the subdirectories `m4', `doc' and `src'. It instructs `make' to recursively call itself in these subdirectories. It is important to include the `doc' and `m4' subdirectories here, and to enhance them with a `Makefile.am', so that make dist includes them in the distribution.
If you are also using a `lib' subdirectory, then it should be built before `src':
EXTRA_DIST = reconf
SUBDIRS = m4 doc lib src
The `lib' subdirectory should build a static library that is linked by your executables in `src'. There should be no need to install that library.
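A sketch of what such a `lib/Makefile.am' could look like, using Automake's `noinst' location so that the library is built but never installed (the names here are hypothetical):

noinst_LIBRARIES = libcompat.a
libcompat_a_SOURCES = compat1.c compat2.c compat.h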
At the toplevel directory you also need to put the `configure.in' file. That should look like this:
AC_INIT
AM_INIT_AUTOMAKE(packagename,versionnumber)
[...put your tests here...]
AC_OUTPUT(Makefile \
doc/Makefile \
m4/Makefile \
src/Makefile \
src/dir1/Makefile \
src/dir2/Makefile \
src/dir3/Makefile \
src/dir1/foo1/Makefile \
............ \
)
You will not need another `configure.in' file. However, every directory level in your tree must have a `Makefile.am'. When you call `automake' on the top-level directory, it looks at AC_OUTPUT in your `configure.in' to decide which other directories have a `Makefile.am' that needs parsing. As you can see from the above, a `Makefile.am' file is needed even under the `doc' and `m4' directories. How to set that up is up to you. If you aren't building anything, but just have files and directories hanging around, you must declare these files and directories in the `Makefile.am' like this:
SUBDIRS = dir1 dir2 dir3
EXTRA_DIST = file1 file2 file3
Doing that will cause make dist to include these files and directories in the package distribution.
This tedious setup work needs to be done every time you create a new package. If you create enough packages to get sick of it, then you want to look into the `acmkdir' utility that is distributed with Autotools. We will describe it in the next chapter.
Next we explain how to develop `Makefile.am' files for the source code directory levels. A `Makefile.am' is a set of assignments. These assignments imply the Makefile, a set of targets, dependencies and rules, and the Makefile implies the execution of building.
The first set of assignments going at the beginning look like this:
INCLUDES = -I/dir1 -I/dir2 -I/dir3 ....
LDFLAGS = -L/dir1 -L/dir2 -L/dir3 ....
LDADD = -llib1 -llib2 -llib3 ...
INCLUDES contains the `-I' flags that you need to pass to your compiler. If the stuff in this directory depends on a library in another directory of the same package, then the `-I' flag must point to that directory.

LDFLAGS contains the `-L' flags that are needed by the compiler when it links all the object files into an executable.

LDADD lists the libraries to be linked in. Use the `-l' flag only for installed libraries. You can list libraries that have been built but not installed yet as well, but do this only by providing the full path to these libraries.
If your package contains subdirectories with libraries and you want to link these libraries in another subdirectory you need to put `-I' and `-L' flags in the two variables above. To express the path to these other subdirectories, use the `$(top_srcdir)' variable. For example if you want to access a library under `src/libfoo' you can put something like:
INCLUDES = ... -I$(top_srcdir)/src/libfoo ...
LDFLAGS = ... -L$(top_srcdir)/src/libfoo ...
on the `Makefile.am' of every directory level that wants access to these libraries. Also, you must make sure that the libraries are built before the directory level is built. To guarantee that, list the library directories in `SUBDIRS' before the directory levels that depend on it. One way to do this is to put all the library directories under a `lib' directory and all the executable directories under a `bin' directory and on the `Makefile.am' for the directory level that contains `lib' and `bin' list them as:
SUBDIRS = lib bin
This will guarantee that all the libraries are available before building any executables. Alternatively, you can simply order your directories in such a way so that the library directories are built first.
Next we list the things that are to be built in this directory level:
bin_PROGRAMS = prog1 prog2 prog3 ....
lib_LIBRARIES = libfoo1.a libfoo2.a libfoo3.a ....
check_PROGRAMS = test1 test2 test3 ....
TESTS = $(check_PROGRAMS)
include_HEADERS = header1.h header2.h ....
The programs listed in bin_PROGRAMS are built with make and installed with make install under `/prefix/bin', where `prefix' is usually `/usr/local'.

The libraries listed in lib_LIBRARIES are built with make and installed with make install under `/prefix/lib'.

The programs listed in check_PROGRAMS are not built with a plain make, but only with a make check. These programs serve as tests that you, the user, can use to test the library.

The programs listed in TESTS are actually run by make check. These programs constitute the test suite, and they are indispensable when you develop a library. It is common to just set

TESTS = $(check_PROGRAMS)

This way, by commenting the line in and out, you can modify the behaviour of make check. While debugging your test suite, you will want to comment out this line so that make check doesn't run it. However, in the end product you will want to comment it back in.

The header files listed in include_HEADERS are installed under `/prefix/include'. You must list a header file here if you want to cause it to be installed. You can also list it under libfoo_a_SOURCES for the library that it belongs to, but it is imperative to list public headers here so that they can be installed.
It is good programming practice to keep libraries and executables under separate directory levels. However, it is okay to keep the library and the check executables that test the library under the same directory level, because that makes it easier for you to link them with the library.
For each of these types of targets, we must state information that will allow `automake' and `make' to infer the building process.
prog1_SOURCES = foo1.cc foo2.cc ... header1.h header2.h ....
prog1_LDADD = -lbar1 -lbar2 -lbar3
prog1_LDFLAGS = -L/dir1 -L/dir2 -L/dir3 ...
prog1_DEPENDENCIES = dep1 dep2 dep3 ...

In each assignment, substitute `prog1' with the name of the program that you are building, as it appeared in `bin_PROGRAMS' or `check_PROGRAMS'.
In prog1_SOURCES you list all the source files, including header files, needed to build the program; listing the header files there causes them to be included in the distribution by make dist. To cause header files to be installed you must also put them in `include_HEADERS'.

In prog1_LDADD you list the `-l' flags for linking whatever libraries are needed by your code. You may also list object files, which have been compiled in an exotic way, as well as paths to libraries that have not yet been installed.

In prog1_LDFLAGS you list the `-L' flags that are needed to resolve the libraries you passed in prog1_LDADD. Certain flags that need to be passed to every program can be expressed on a global basis by assigning them to `LDFLAGS'.
lib_LIBRARIES = ... libfoo1.a ...
libfoo1_a_SOURCES = foo1.cc foo2.cc private1.h private2.h ...
libfoo1_a_LIBADD = obj1.o obj2.o obj3.o
libfoo1_a_DEPENDENCIES = dep1 dep2 dep3 ...

Note that if the name of the library is `libfoo1.a', the prefix that appears in the variables related to that library is `libfoo1_a_'.
If you have already listed your public header files in include_HEADERS, it is not required to repeat them a second time here.
In the previous section we described how to use Automake to compile programs, libraries and test suites. To exploit the full power of Automake however, it is important to understand the fundamental ideas behind it.
The simplest way to look at a `Makefile.am' is as a collection of assignments which infer a set of Makefile rules, which in turn infer the building process. There are three types of such assignments:
The first type defines installation locations, such as the standard ones:

bindir = $(prefix)/bin
libdir = $(prefix)/lib
includedir = $(prefix)/include

These are the directories where you install executables, libraries and public header files. You can override the defaults by inserting different assignments in your `Makefile.am', but please don't do that. Instead you can define new assignments. For example, if you do

foodir = $(prefix)/foo

then that makes writing `foo_PROGRAMS' and `foo_LIBRARIES' install in the `$(prefix)/foo' directory instead. The symbols `check' and `noinst' have special meanings, and you should not ever try to assign to `checkdir' and `noinstdir'.

The second type are primitive assignments, which say what gets built and where it gets installed. For example, if you write

bin_PROGRAMS = hello

this means that you can then say:

hello_SOURCES = ...
hello_LDADD = ...

and so on. These are the third type, property assignments: the `SOURCES' and `LDADD' are properties of `hello', which is a `PROGRAMS' primitive.
In addition to all this, you may include ordinary targets in a `Makefile.am', just as you would in an ordinary `Makefile.in'. If you do that, however, then please check at some point that your distribution can properly build with `make distcheck'. It is very important, when you define your own rules to build whatever you want to build, to follow these guidelines:
When referring to source files, always prefix them with $(srcdir). This variable points to the directory where your source code is located during the current `make', which is not necessarily the same directory as the one returned by `pwd`. It is possible to do what is called a VPATH build, where the generated files are created in a separate directory tree from the source code. What `pwd` would return to you in that case would be the directory in which files are written, not the directory from which files are read. If you mess this up, then you will know when make distcheck fails, since it attempts to do a VPATH build. The directory in which files are written can be accessed with the dot, for example `./foo'. (See the sketch after this list.)
When referring to files at the top-level directory, use $(top_srcdir) for files which you wrote (and your compiler tools read), and $(top_builddir) for files which the compiler wrote.
In your rules, you may assume the presence of only the following programs:

ar cat chmod cmp cp diff echo egrep expr false
grep ls mkdir mv pwd rm rmdir sed sleep sort
tar test touch true

Any other programs that you want to use, you must invoke through make variables. That includes programs such as these:

awk bash bison cc flex install latex ld ldconfig
lex ln make makeinfo perl ranlib shar texi2dvi yacc

The make variables are defined through Autoconf in your `configure.in'.
For special-purpose tools, use the AC_PATH_PROGS macro. For example:
AC_PATH_PROGS(BASH, bash sh)
AC_PATH_PROGS(PERL, perl perl5.005 perl5.004 perl5.003 perl5.002 perl5.001)
AC_PATH_PROGS(SHAR, shar)
AC_PATH_PROGS(BISON, bison)

Some special tools have their own macros:
AC_PROG_MAKE_SET -> $(MAKE)   -> make
AC_PROG_RANLIB   -> $(RANLIB) -> ranlib | (do-nothing)
AC_PROG_AWK      -> $(AWK)    -> mawk | gawk | nawk | awk
AC_PROG_LEX      -> $(LEX)    -> flex | lex
AC_PROG_YACC     -> $(YACC)   -> 'bison -y' | byacc | yacc
AC_PROG_LN_S     -> $(LN_S)   -> ln -s

Before using any of these macros, consult the Autoconf documentation to see exactly what it is that they do.
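Here is a sketch of a hand-written rule that respects these guidelines; the file names are hypothetical, and `sed' may be invoked directly because it is on the list of always-available programs:

foo.out: $(srcdir)/foo.in
	sed -e 's/xxx/yyy/' $(srcdir)/foo.in > ./foo.out

Note that the rule reads from `$(srcdir)' but writes to the build directory, so it survives a VPATH build and `make distcheck'.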
A real life example of a `Makefile.am' for libraries is the one I use to build the Blas-1 library. It looks like this:
* `blas1/Makefile.am'
SUFFIXES = .f
.f.o:
	$(F77) $(FFLAGS) -c $<

lib_LIBRARIES = libblas1.a
libblas1_a_SOURCES = f2c.h caxpy.f ccopy.f cdotc.f cdotu.f crotg.f cscal.f \
 csrot.f csscal.f cswap.f dasum.f daxpy.f dcabs1.f dcopy.f ddot.f dnrm2.f \
 drot.f drotg.f drotm.f drotmg.f dscal.f dswap.f dzasum.f dznrm2.f icamax.f \
 idamax.f isamax.f izamax.f sasum.f saxpy.f scasum.f scnrm2.f scopy.f \
 sdot.f snrm2.f srot.f srotg.f srotm.f srotmg.f sscal.f sswap.f zaxpy.f \
 zcopy.f zdotc.f zdotu.f zdrot.f zdscal.f zrotg.f zscal.f zswap.f
Because the Blas library is written in Fortran, I need to declare the Fortran suffix at the beginning of the `Makefile.am' with the `SUFFIXES' assignment, and then insert an implicit rule for building object files from Fortran files. The variables `F77' and `FFLAGS' are defined by Autoconf, using the Fortran support provided by Autotools. For C or C++ files there is no need to include implicit rules. We discuss Fortran support in a later chapter.
Another important thing to note is the use of the symbol `$<'. We introduced these symbols in Chapter 2, where we mentioned that `$<' stands for the dependency that changed, causing the target to need to be rebuilt. If you've been paying attention, you may be wondering why we didn't say `$(srcdir)/$<' instead. The reason is that for VPATH builds, `make' is sufficiently intelligent to substitute `$<' with the Right Thing.
Now consider the `Makefile.am' for building a library for solving linear systems of equations in a nearby directory:
* `lin/Makefile.am'
SUFFIXES = .f
.f.o:
	$(F77) $(FFLAGS) -c $<

INCLUDES = -I../blas1 -I../mathutil
lib_LIBRARIES = liblin.a
include_HEADERS = lin.h
liblin_a_SOURCES = dgeco.f dgefa.f dgesl.f f2c.h f77-fcn.h lin.h lin.cc
check_PROGRAMS = test1 test2 test3
TESTS = $(check_PROGRAMS)
LDADD = liblin.a ../blas1/libblas1.a ../mathutil/libmathutil.a $(FLIBS) -lm
test1_SOURCES = test1.cc f2c-main.cc
test2_SOURCES = test2.cc f2c-main.cc
test3_SOURCES = test3.cc f2c-main.cc
In this case, we have a library that contains mixed Fortran and C++ code. We also have an example of a test suite, which in this case contains three test programs. What's new here is that in order to link the test suite properly, we need to link in libraries that have already been built in other directories but haven't been installed yet. Because every test program must be linked against the same libraries, we set these libraries globally with an `LDADD' assignment for all executables. Because the libraries have not been installed yet, we specify them with their full paths. This will allow Automake to track dependencies correctly: if `libblas1.a' is modified, it will cause the test suite to be rebuilt. Also, the variable `INCLUDES' is globally assigned to make the header files of the other two libraries accessible to the source code in this directory. The variable `$(FLIBS)' is assigned by Autoconf to link the run-time Fortran libraries, and then we link the installed `libm.a' library. Because that library is installed, it must be linked with the `-l' flag. Another peculiarity in this example is the file `f2c-main.cc', which is shared by all three executables. As we will explain later, when you link executables that are derived from mixed Fortran and C or C++ code, you need to link this kludge file with the executable.
The test-suite files for numerical code will usually invoke the library to perform a computation for which an exact result is known, and then verify that the result is correct. For non-numerical code, the library will need to be tested in different ways, depending on what it does.
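For instance, a numerical test might look like the following sketch; `lin_solve' is a hypothetical entry point, not the actual interface of the `lin' library above:

// test0.cc -- verify the solver on a system with a known exact solution
#include <assert.h>
#include <math.h>

extern void lin_solve(const double *A, const double *b,
                      double *x, int n);  // hypothetical interface

int main()
{
 double A[4] = { 2.0, 0.0,
                 0.0, 4.0 };           // a diagonal 2x2 matrix
 double b[2] = { 2.0, 8.0 };
 double x[2];
 lin_solve(A, b, x, 2);                // exact solution is x = (1, 2)
 assert(fabs(x[0] - 1.0) < 1e-12);     // abnormal exit on failure
 assert(fabs(x[1] - 2.0) < 1e-12);
 return 0;                             // silent, normal return on success
}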
In some complicated packages, you want to generate part of the source code by executing a program at compile time. For example, in one of the packages that I wrote for an assignment, I had to generate a file `incidence.out' that contained a lot of hairy matrix definitions that were too ugly to compute and write by hand. That file was then included by `fem.cc', which was part of a library that I wrote to solve simple finite element problems, with a preprocessor statement:
#include "incidence.out"
All source code files that are to be generated during compile time should be listed in the global definition of `BUILT_SOURCES'. This will make sure that these files get compiled before anything else. In our example, the file `incidence.out' is computed by running a program called `incidence' which of course also needs to be compiled before it is run. So the `Makefile.am' that we used looked like this:
noinst_PROGRAMS = incidence
lib_LIBRARIES = libpmf.a

incidence_SOURCES = incidence.cc mathutil.h
incidence_LDADD = -lm

incidence.out: incidence
	./incidence > incidence.out

BUILT_SOURCES = incidence.out

libpmf_a_SOURCES = laplace.cc laplace.h fem.cc fem.h mathutil.h

check_PROGRAMS = test1 test2
TESTS = $(check_PROGRAMS)
test1_SOURCES = test1.cc
test1_LDADD = libpmf.a -lm
test2_SOURCES = test2.cc
test2_LDADD = libpmf.a -lm
Note that because the executable `incidence' is created at compile time, the correct path is `./incidence'. Always keep in mind that the correct path to source files, such as `incidence.cc', is `$(srcdir)/incidence.cc'. Because the `incidence' program is used temporarily, only for the purposes of building the `libpmf.a' library, there is no reason to install it. So we use the `noinst' prefix to instruct Automake not to install it.
Previously, we mentioned that the symbols `bin', `lib' and `include' refer to installation locations that are defined respectively by the variables `bindir', `libdir' and `includedir'. For completeness, we will now list the installation locations made available by Automake by default and describe their purpose.
All installation locations are placed under one of the following directories:
`$(prefix)': the location for machine-independent files. The default is `/usr/local', and the user can override it when invoking `configure':

configure --prefix=/home/lf

`$(exec_prefix)': the location for machine-dependent files, such as executables and libraries. By default it is the same as `$(prefix)', and it too can be overridden at configure time:

configure --prefix=/home/lf --exec-prefix=/home/lf/gnulinux

The purpose of using a separate location for machine-dependent files is that it makes it possible to install the software on a networked file server and make it available to machines with different architectures. To do that, there must be separate copies of all the machine-dependent files for each architecture in use.
Executable files are installed in one of the following locations:
bindir = $(exec_prefix)/bin
sbindir = $(exec_prefix)/sbin
libexecdir = $(exec_prefix)/libexec
Library files are installed under
libdir = $(exec_prefix)/lib
Include files are installed under
includedir = $(prefix)/include
Data files are installed in one of the following locations:
datadir = $(prefix)/share
sysconfdir = $(prefix)/etc
sharedstatedir = $(prefix)/com
localstatedir = $(prefix)/var
Autoconf macros should be installed in `$(datadir)/aclocal'. There is no symbol defined for this location, so you need to define it yourself:
m4dir = $(datadir)/aclocal
FIXME: Emacs Lisp files?
FIXME: Documentation?
Automake, to encourage tidiness, also provides the following locations, such that each package can keep its stuff under its own subdirectory:
pkglibdir = $(libdir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
pkgdatadir = $(datadir)/@PACKAGE@
There are a few other such `pkg' locations, but they are not practically useful.
Sometimes you may feel the need to implement some of your programs in a scripting language, like Bash or Perl. For example, the `autotools' package is exclusively a collection of shell scripts. Theoretically, a script does not need to be compiled. However, there are still issues pertaining to scripts, such as:

Scripts need to be installed with make install, uninstalled with make uninstall, and distributed with make dist.

If the script is written in a language like Bash or Perl, the path to the interpreter varies from system to system, so you need to get the #! line right.
To let Automake deal with all this, you need to use the `SCRIPTS' primitive. By listing a file under a `SCRIPTS' primitive assignment, you are telling Automake that this file needs to be built, and must be allowed to be installed in a location where executable files are normally installed. Automake by default will not clean scripts when you invoke the `clean' target. To force Automake to clean all the scripts, you need to add the following line to your `Makefile.am':
CLEANFILES = $(bin_SCRIPTS)
You also need to write your own targets for building the script by hand.
For example:
# -* bash *-
echo "Howdy, world!"
exit 0
# -* perl *-
print "Howdy, world!\n";
exit(0);
bin_SCRIPTS = hello1 hello2
CLEANFILES = $(bin_SCRIPTS)
EXTRA_DIST = hello1.sh hello2.pl

hello1: $(srcdir)/hello1.sh
	rm -f hello1
	echo "#! " $(BASH) > hello1
	cat $(srcdir)/hello1.sh >> hello1
	chmod ugo+x hello1

hello2: $(srcdir)/hello2.pl
	$(PERL) -c hello2.pl
	rm -f hello2
	echo "#! " $(PERL) > hello2
	cat $(srcdir)/hello2.pl >> hello2
	chmod ugo+x hello2
AC_INIT
AM_INIT_AUTOMAKE(hello,0.1)
AC_PATH_PROGS(BASH, bash sh)
AC_PATH_PROGS(PERL, perl perl5.004 perl5.003 perl5.002 perl5.001 perl5)
AC_OUTPUT(Makefile)
Note that in the "source" files `hello1.sh' and `hello2.pl' we do not include a line like
#!/bin/bash
#!/usr/bin/perl
Instead, we let Autoconf pick up the correct path, and then we insert it during make. Since we omit the #! line, we leave a comment instead that indicates what kind of file this is. In the special case of perl, we also invoke

% perl -c hello2.pl

This checks the perl script for correct syntax. If your scripting language supports this feature, I suggest that you use it to catch errors at "compile" time.
The AC_PATH_PROGS macro looks for a specific utility and returns the full path. If you wish to conform to the GNU coding standards, you may want your script to support the --help and --version flags, and you may want --version to pick up the version number from AM_INIT_AUTOMAKE.
Here are enhanced hello world scripts:
VERSION=@VERSION@
$VERSION="@VERSION@";
# -* bash *-
function usage {
cat << EOF
Usage: % hello [OPTION]
Options:
  --help       Print this message
  --version    Print version information
Bug reports to: monica@whitehouse.gov
EOF
}
function version {
cat << EOF
hello $VERSION - The friendly hello world program
Copyright (C) 1997 Monica Lewinsky <monica@whitehouse.gov>
This is free software, and you are welcome to redistribute it and modify
it under certain conditions. There is ABSOLUTELY NO WARRANTY for this
software. For legal details see the GNU General Public License.
EOF
}
function invalid {
 echo "Invalid usage. For help:"
 echo "% hello --help"
}
# -------------------------
if test $# -ne 0
then
 case $1 in
  --help)
   usage
   exit
   ;;
  --version)
   version
   exit
   ;;
  *)
   invalid
   exit
   ;;
 esac
fi
# ------------------------
echo "Howdy world"
exit
# -* perl *-
sub usage {
 print <<"EOF";
Usage: % hello [OPTION]
Options:
  --help       Print this message
  --version    Print version information
Bug reports to: monica@whitehouse.gov
EOF
 exit(1);
}
sub version {
 print <<"EOF";
hello $VERSION - The friendly hello world program
Copyright (C) 1997 Monica Lewinsky <monica@whitehouse.gov>
This is free software, and you are welcome to redistribute it and modify
it under certain conditions. There is ABSOLUTELY NO WARRANTY for this
software. For legal details see the GNU General Public License.
EOF
 exit(1);
}
sub invalid {
 print "Invalid usage. For help:\n";
 print "% hello --help\n";
 exit(1);
}
# --------------------------
if ($#ARGV == 0) {
 do version() if ($ARGV[0] eq "--version");
 do usage() if ($ARGV[0] eq "--help");
 do invalid();
}
# --------------------------
print "Howdy world\n";
exit(0);
bin_SCRIPTS = hello1 hello2
CLEANFILES = $(bin_SCRIPTS)
EXTRA_DIST = hello1.sh hello2.pl

hello1: $(srcdir)/hello1.sh $(srcdir)/version.sh
	rm -f hello1
	echo "#! " $(BASH) > hello1
	cat $(srcdir)/version.sh $(srcdir)/hello1.sh >> hello1
	chmod ugo+x hello1

hello2: $(srcdir)/hello2.pl $(srcdir)/version.pl
	$(PERL) -c hello2.pl
	rm -f hello2
	echo "#! " $(PERL) > hello2
	cat $(srcdir)/version.pl $(srcdir)/hello2.pl >> hello2
	chmod ugo+x hello2
AC_INIT
AM_INIT_AUTOMAKE(hello,0.1)
AC_PATH_PROGS(BASH, bash sh)
AC_PATH_PROGS(PERL, perl perl5.004 perl5.003 perl5.002 perl5.001 perl5)
AC_OUTPUT(Makefile version.sh version.pl)
Basically, the idea with this approach is that when `configure' calls AC_OUTPUT, it will substitute the files `version.sh' and `version.pl' with the correct version information. Then, during building, the version files are merged with the scripts. The scripts themselves need some standard boilerplate code to handle the options. I've included that code here as a sample implementation, which I hereby place in the public domain.
This approach can be easily generalized with other scripting languages as well, like Python and Guile.
To install data files, you should use the `DATA' primitive instead of the `SCRIPTS'. The main difference is that `DATA' will allow you to install files in data installation locations, whereas `SCRIPTS' will only allow you to install files in executable installation locations.
Normally it is assumed that the files listed in `DATA' are not derived, so they are not cleaned. If you do want to derive them however from an executable file, then you can do so like this:
bin_PROGRAMS = mkdata
mkdata_SOURCES = mkdata.cc
pkgdata_DATA = thedata
CLEANFILES = $(pkgdata_DATA)

thedata: mkdata
	./mkdata > thedata
In general however, data files are boring. You just write them, and list them in a `DATA' assignment:
pkgdata_DATA = foo1.dat foo2.dat foo3.dat ...
If your package requires you to edit a certain type of file, you might want to write an Emacs editing mode for that file type. Emacs modes are written in Elisp files that are suffixed with `.el', as in `foo.el'. Automake will byte-compile and install Elisp files using Emacs for you. You need to invoke the AM_PATH_LISPDIR macro in your `configure.in' and list your Elisp files under the `LISP' primitive:
lisp_LISP = mymode.el
The `LISP' primitive also accepts the `noinst' location.
There is also support for installing Autoconf macros, installing documentation, and dealing with shared libraries. These issues however are complicated, and they will be discussed in separate chapters.
At the moment, Autotools distributes the following additional utilities: the `gpl' utility, the `acmkdir' utility, the `acconfig' utility, and the LF macros, which introduce mainly support for C++, Fortran and embedded text.
We have already discussed the `gpl' utility in Chapter 1. In this chapter we will focus mainly on the LF macros and the `acmkdir' utility, but we will postpone our discussion of Fortran support until the next chapter.
The LF macros

In the last chapter we explained that a minimal `configure.in' file looks like this:
AC_INIT
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(package,version)
AC_PROG_CXX
AC_PROG_RANLIB
AC_OUTPUT(Makefile ... )
If you are not building libraries, you can omit AC_PROG_RANLIB.
Alternatively, you can use the following macros, which are distributed with Autotools and made accessible through the `aclocal' utility. All of them are prefixed with `LF' to distinguish them from the standard macros:
LF_CONFIGURE_CC: This macro invokes

AC_PROG_CC
AC_PROG_CPP
AC_AIX
AC_ISC_POSIX
AC_MINIX
AC_HEADER_STDC

which is a traditional Autoconf idiom for setting up the C compiler.
LF_CONFIGURE_CXX: This macro invokes

AC_PROG_CXX
AC_PROG_CXXCPP

and then invokes the portability macro LF_CPP_PORTABILITY. This is the recommended way for configuring your C++ compiler.
To use the portability workarounds, the only file your sources need to include is `config.h':

#include <config.h>

In the past it used to be necessary to include a file called `cpp.h'. I've sent this file straight to hell.
LF_SET_WARNINGS: This macro enables compiler warnings when the user asks for them at configure time:

$ configure ... --with-warnings ...

Warnings can help you find many bugs, as well as help you improve your coding habits. On the other hand, in many cases many of these warnings are false alarms, which is why the default behaviour of the compiler is to not show them to you. You are probably interested in warnings if you are the developer, or a paranoid end-user.
The minimal recommended `configure.in' file for a pure C++ project is:
AC_INIT
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(package,version)
LF_CONFIGURE_CXX
AC_PROG_RANLIB
AC_OUTPUT(Makefile .... )
A full-blown `configure.in' file for projects that mix Fortran and C++ (and may need the C compiler also if using `f2c') invokes all of the above macros:
AC_INIT
AM_INIT_AUTOMAKE(package,version)
LF_CANONICAL_HOST
LF_CONFIGURE_CC
LF_CONFIGURE_CXX
LF_CONFIGURE_FORTRAN
LF_SET_WARNINGS
AC_PROG_RANLIB
AC_CONFIG_SUBDIRS(fortran/f2c fortran/libf2c)
AC_OUTPUT(Makefile ...)
In order for LF_CPP_PORTABILITY to work correctly, you need to append certain things at the bottom of your `acconfig.h'. This is done for you automatically by acmkdir.
When the LF_CPP_PORTABILITY macro is invoked by `configure.in', the following portability problems are checked:
If your C++ compiler does not support the `bool' data type, the macro CXX_HAS_NO_BOOL is defined. It is possible to emulate `bool' with the following C preprocessor directives:

#ifdef CXX_HAS_NO_BOOL
#define bool int
#define true 1
#define false 0
#endif

To make your code portable, through this workaround, to compilers that don't support `bool', you must follow one rule: never overload your functions in a way in which the only distinguishing feature is `bool' vs `int'.
This workaround is included after the `@BOTTOM@' marker in the default `acconfig.h' that gets installed by acmkdir.
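For instance, with `bool' defined away as `int', a hypothetical pair of overloads like the following would collide into a redeclaration, so only one of them may exist:

void set_flag(bool value);   /* becomes set_flag(int) under the workaround */
void set_flag(int value);    /* error: same signature once bool == int */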
Another common problem is the scoping of variables declared inside for-loops. Consider the following code:

#include <iostream.h>

main()
{
 for (int i=0;i<10;i++) { }
 for (int i=0;i<10;i++) { }
}

This is legal C++, and the variable `i' is supposed to have scope only inside the for-loop braces and parentheses. Unfortunately, many C++ compilers implement an obsolete draft of the standard in which the scope of `i' is the entire `main' in this example.
The workaround we use is as follows:
#ifdef CXX_HAS_BUGGY_FOR_LOOPS
#define for if(1) for
#endif

By nesting the for-loop inside an if-statement, the variable `i' is assigned the correct scope. Now, if your if-statement scoping is also broken, then you really need to get another compiler.
The macro CXX_HAS_BUGGY_FOR_LOOPS is defined for you if appropriate, and the code for the workaround is included with the default `acconfig.h'.
In addition to these workarounds, the following additional features are introduced at the end of the default `acconfig.h'. The features are enabled only if your `configure.in' calls LF_CPP_PORTABILITY.
A macro `loop' is defined such that

loop(i,a,b)

is equivalent to

for (int i = a; i <= b; i++)

This is syntactic sugar that makes it easier on the hand to write nested loops like:
int Ni,Nj,Nk;
loop(i,0,Ni)
 loop(j,0,Nj)
  loop(k,0,Nk)
   {
    ...
   }

minimizing the probability of making a spelling bug. If you need to do more unusual looping, you can use one of the following macros:
inverse_loop(i,a,b)   <--> for (int i = a; i >= b; i--)
integer_loop(i,a,b,s) <--> for (int i = a; i <= b; i += s)

This feature depends on having correct scoping in `for', which fortunately is easily taken care of.
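The definitions are presumably along these lines (a sketch only; the actual definitions live in the generated `acconfig.h'):

#define loop(i,a,b)           for (int i = (a); i <= (b); i++)
#define inverse_loop(i,a,b)   for (int i = (a); i >= (b); i--)
#define integer_loop(i,a,b,s) for (int i = (a); i <= (b); i += (s))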
The following shorthands for the access specifiers are defined:

#define pub public:
#define pro protected:
#define pri private:

Now you can declare a class prototype in a Java-like style like this:
class foo {
 pri double a,b;
 pub double c,d;
 pub foo();
 pub virtual ~foo();
 pri void method1(void);
 pub void method2(void);
};

Personally, I find this notation more lucid than the standard C++ syntax, because this way I can see the protection level of each variable and method without having to scroll up to see what it is. It is also less bug-prone.
The mathematical constant `pi' is defined:

const double pi = 3.14159265358979324;
An `assert' facility is provided. The idea behind `assert' is simple. Suppose that at a certain point in your code, you expect two variables to be equal. If this expectation is a precondition that must be satisfied in order for the subsequent code to execute correctly, you must `assert' it with a statement like this:
assert(var1 == var2);

In general, `assert' takes a boolean expression as its argument. If the boolean expression is true, execution continues. Otherwise the `abort' system call is invoked and the program execution is stopped.
If a bug prevents the precondition from being true, then you
can trace the bug at the point where the precondition breaks down instead
of further down in execution or not at all. The `assert' call is
implemented as a C preprocessor macro, so it can be enabled or disabled
at will.
One way to enable assertions is to include `assert.h':

#include <assert.h>

Assertions can then be disabled by defining the `NDEBUG' macro. Alternatively, because it is easy to provide our own `assert', if your `configure.in' invokes `LF_CPP_PORTABILITY' then `assert' will be conditionally defined for you in the `config.h' file. By default, the `configure' script will enable assertions. You can disable assertions at configure time like this:
% configure ... --disable-assert ...

During debugging and testing it is a good idea to leave assertions enabled. However, for production runs it's best to disable them. If your program crashes at an assertion, then the first thing you should do is find out where the error happens. To do this, run the program under the `gdb' debugger. First invoke the debugger:
% gdb ...copyright notice...Then load the executable and set a breakpoint at the `abort' system call:
(gdb) file "executable" (gdb) break abortNow run the program:
(gdb) runInstead of crashing, under the debugger the program will be paused when the `abort' system call is invoked, and you will get back the debugger prompt. Now type:
(gdb) whereto see where the crash happened. You can use the `print' command to look at the contents of variables and you can use the `up' and `down' commands to navigate the stack. For more information, see the GDB documentation or type `help' at the prompt of gdb. Another suggestion is to never call the
abort
system call directly.
Instead, please do this:
assert(false);
exit(1);

This way, if assertions are enabled, the program will stop and the stack will be retained. Otherwise the program will simply exit.
The C++ language has been standardized very recently. As a result, not all
compilers fully support all the features that the ANSI C++ standard requires,
including the g++
compiler itself. Some of the problems commonly
encountered, such as incorrect scoping in for-loops and lack of the
bool
data type can be easily worked around. In this section we
give some tips for avoiding more portability problems. I welcome people on
the net reading this to email me their tips, to be included in this
tutorial.
Watch out when allocating arrays of pointers. Consider the following code:

int n = 10;
double **foo;
foo = new (double *)[n];

The g++ compiler will parse this and do the right thing, but other compilers are more picky. The correct way to do it is:

int n = 10;
double **foo;
foo = new double * [n];
FIXME: I need to add some stuff here.
Putting all of this together, we will now show you how to create a super Hello World package, using the LF macros and the utilities distributed with the `autotools' package.
The first step is to build a directory tree for the new project. Instead of doing it by hand, use the `acmkdir' utility. Type:
% acmkdir hello
`acmkdir' prompts you with the current directory pathname. Make sure that this is indeed the directory where you want to create the directory tree for the new package. You will be prompted for some information about the newly created package. When you are done, `acmkdir' will ask you if you really want to go for it. Say `y'. Then `acmkdir' will do the following:
It creates a `configure.in' file:

AC_INIT
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(test,0.1)
LF_HOST_TYPE
LF_CONFIGURE_CXX
LF_SET_WARNINGS
AC_PROG_RANLIB
AC_OUTPUT(Makefile doc/Makefile m4/Makefile src/Makefile)

You can edit this and customize it to your needs. More specifically, you will need to update the version number here every time you cut a new distribution.
It creates the `Makefile.am' files. The toplevel `Makefile.am' contains:

EXTRA_DIST = reconf configure
SUBDIRS = m4 doc src

The ones in the `src' and `doc' subdirectories are empty. The one in `m4' contains a template `Makefile.am' which you should edit if you want to add new macros.
It creates the `reconf' script:

#!/bin/sh
rm -f config.cache
rm -f acconfig.h
aclocal -I m4
autoconf
acconfig
autoheader
automake -a
exit

This makes sure that all the utilities are invoked, and in the right order. Before `acmkdir' exits, it will call the `reconf' script for you once to set things up.
It must be obvious that having to do these tasks manually for every package you write can get to be tiring. With `acmkdir' you can slap together all this grunt-work in a matter of seconds.
Now enter the directory `hello-0.1/src' and start coding:
% cd hello-0.1/src
% gpl -cc hello.cc
% vi hello.cc
% vi Makefile.am
This time we will use the following modified hello world program:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <iostream.h>

main()
{
 cout << "Welcome to " << PACKAGE << " version " << VERSION;
 cout << " for " << YOUR_OS << endl;
 cout << "Hello World!" << endl;
}
and for `Makefile.am' the same old thing:
bin_PROGRAMS = hello
hello_SOURCES = hello.cc
Now back to the toplevel directory:
% cd ..
% reconf
% configure
% make
% src/hello
Welcome to test version 0.1 for i486-pc-linux-gnulibc1
Hello World!
Note that by using the special macros PACKAGE, VERSION and YOUR_OS, the program can identify itself, its version number, and the operating system for which it was compiled. PACKAGE and VERSION are defined by AM_INIT_AUTOMAKE, and YOUR_OS by LF_HOST_TYPE.
Now you can experiment with the various options that configure offers. You can do:
% make distclean
and reconfigure the package with one of the following variations in options:
% configure --disable-assert
% configure --with-warnings
or a combination of the above. You can also build a distribution of your hello world and feel cool about yourself:
% make distcheck
The important thing is that you can write extensive programs like this and stay focused on writing code, instead of maintaining stupid header files, scripts, makefiles and all that.
The `acmkdir' utility can be invoked in the simple manner that we showed in the last chapter to prepare the directory tree for writing C++ code. Alternatively, it can be instructed to create directory trees for Fortran/C++ code as well as documentation directories.
In general, you invoke `acmkdir' in the following manner:
% acmkdir [OPTIONS] "dirname"
If you are creating a toplevel directory, then everything will appear under `dirname-0.1'. Otherwise, the name `dirname' will be used instead.
`acmkdir' supports the following options:
The `-doc' option creates a `texidoc' documentation directory. What to put under that directory will be explained in more detail in a separate chapter about documentation. If your package will have more than one documentation text, you usually want to invoke this under the `doc' subdirectory:
% cd doc
% acmkdir -doc tutorial
% acmkdir -doc manual

Of course, the `Makefile.am' under the `doc' directory will need to refer to these subdirectories with a SUBDIRS entry:

SUBDIRS = tutorial manual

Alternatively, if you decide to use the `doc' directory itself for documentation (and you are massively sure about this), then you can:

% rm -rf doc
% acmkdir -doc doc

Note that this is not the FSF standard way of handling documentation. This is an Autotools feature.
Another option creates a `latex' documentation directory. Again, the details of how to do this will be explained in a separate chapter. The disadvantage of using LaTeX for your documentation is that you can only produce a printed book; you can not also generate on-line documentation. The advantage is that you can typeset very complex mathematics, something which you can not do under Texinfo since it only uses plain TeX. If you are documenting mathematical software, you may prefer to write the documentation in LaTeX. Autotools will provide you with LaTeX macros for making your documentation look like Texinfo documentation.
The `-t TYPE' option selects the type of the toplevel directory tree. The types available are: `default', `traditional' and `fortran'. Eventually I may implement two additional types: `f77' and `f90'.
Now, a brief description of these toplevel types:
With the `default' type, the `configure.in' file uses the LF macros installed by Autotools. The `acconfig.h' file is automagically generated, and a custom `INSTALL' file is installed. The defaults reflect my own personal habits.
With the `traditional' type, the `acconfig.h' file contains only the entries

#undef PACKAGE
#undef VERSION

which are required by Automake.
With the `fortran' type, the package is set up to bundle the `f2c' translator. The software is configured such that if a Fortran compiler is not available, `f2c' is built instead, and then used to compile the Fortran code. We will explain all about Fortran in the next chapter.
In some cases, we want to embed text in the executable file of an application. This may be on-line help pages, or it may be a script of some sort that we intend to execute by an interpreter library that we are linking with, like Guile or Tcl. Whatever the reason, if we want to compile the application as a stand-alone executable, it is necessary to embed the text in the source code. Autotools provides the build tools necessary to do this painlessly.
As a tutorial example, we will write a simple program that prints the contents of the GNU General Public License. First create the directory tree for the program:
% acmkdir copyleft
Enter the directory and create a copy of the txtc
compiler:
% cd copyleft-0.1
% mktxtc
Then edit the file `configure.in' and add a call to the LF_PROG_TXTC macro. This macro depends on

AC_PROG_CC
AC_PROG_AWK

so make sure that these are invoked as well. Finally, add `txtc.sh' to your AC_OUTPUT.
The end-result should look like this:
AC_INIT(reconf)
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(copyleft,0.1)
LF_HOST_TYPE
LF_CONFIGURE_CC
LF_CONFIGURE_CXX
LF_SET_OPTIMIZATION
LF_SET_WARNINGS
AC_PROG_RANLIB
AC_PROG_AWK
LF_PROG_TXTC
AC_OUTPUT(Makefile txtc.sh doc/Makefile m4/Makefile src/Makefile)
Then, enter the `src' directory and create the following files:
% cd src
% gpl -l gpl.txt
% gpl -cc gpl.h
% gpl -cc copyleft.cc
The `gpl.txt' file is the text that we want to print. You can substitute
it with any text you want. This file will be compiled into `gpl.o'
during the build process. The `gpl.h' file is a header file that gives
access to the symbols defined by `gpl.o'. The file `copyleft.cc'
is where the main
will be written.
Next, add content to these files as follows:
In `gpl.h':

extern int gpl_txt_length;
extern char *gpl_txt[];
In `copyleft.cc':

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <iostream.h>
#include "gpl.h"

main()
{
 loop(i,1,gpl_txt_length)
  {
   cout << gpl_txt[i] << endl;
  }
}
And in `Makefile.am':

SUFFIXES = .txt
.txt.o:
        $(TXTC) $<

bin_PROGRAMS = copyleft
copyleft_SOURCES = copyleft.cc gpl.h gpl.txt
and now you're set to build. Go back to the toplevel directory and go for it:
$ cd ..
$ reconf
$ configure
$ make
$ src/copyleft | less
To verify that this works properly, do the following:
$ cd src
$ ./copyleft > copyleft.out
$ diff gpl.txt copyleft.out
The two files should be identical. Finally, convince yourself that you can make a distribution:
$ make distcheck
and there you are.
Note that, in general, the text file as encoded by the text compiler will not always be identical to the original. There is one and only one modification being made: if any line has blank spaces at the end, they are trimmed off. This feature was introduced to deal with a bug in the Tcl interpreter, and it is in general a good idea: it conserves a few bytes, it never hurts, and trailing whitespace at the end of a line shouldn't really be there.
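As a minimal sketch of that trimming rule (the real `txtc' is implemented as a shell script, not in C):

#include <string.h>

/* Trim trailing blanks and tabs from a line, in place. */
void trim_trailing(char *line)
{
    size_t n = strlen(line);
    while (n > 0 && (line[n-1] == ' ' || line[n-1] == '\t'))
        line[--n] = '\0';
}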
This magic is put together from many different directions. It begins with the LF_PROG_TXTC macro, which defines the variable TXTC to point to a Text-to-C compiler. To create a copy of the compiler at the toplevel directory of your source code, use the mktxtc command:
% mktxtc

The compiler is implemented as a shell script, and it depends on `sed', `awk' and the C compiler, so you should call the following two macros before invoking LF_PROG_TXTC:
AC_PROG_CC
AC_PROG_AWK

The compiler is intended to be used as follows:

$(TXTC) text1.txt text2.txt text3.txt ...

such that, given the files `text1.txt', `text2.txt', etc., object files `text1.o', `text2.o', etc. are generated containing the text from these files.
From the Automake point of view, you need to add the following two lines to your `Makefile.am':

SUFFIXES = .txt
.txt.o:
        $(TXTC) $<
assuming that your text files end in the `.txt' suffix. The first line informs Automake that there exist source files using non-standard suffixes. Then we describe, in terms of an abstract Makefile rule, how to build an object file from these non-standard suffixes. Recall the use of the symbol $<. Also note that it is not necessary to use $(srcdir) on $< for VPATH builds.
If you embed more than one type of files, then you may want to use more
than one suffixes. For example, you may have `.hlp' files containing
online help and `.scm' files containing Guile code. Then you
want to write a rule for each suffix as follows:
SUFFIXES = .hlp .scm
.hlp.o:
        $(TXTC) $<
.scm.o:
        $(TXTC) $<
It is important to put these lines before mentioning any SOURCES
assignments. Automake is smart enough to parse these abstract makefile
rules and recognize that files ending in these suffixes are valid source
code that can be built to object code. This allows you to simply list
`gpl.txt' with the other source files in the SOURCES
assignment:
copyleft_SOURCES = copyleft.cc gpl.h gpl.txt
In order for this to work however, Automake must be able to see your abstract rules first.
When you "compile" a text file `foo.txt' this makes an object file that defines the following two symbols:
int foo_txt_length;
char *foo_txt[];
Note that the dot characters are converted into underscores. To make these symbols accessible, you need to define an appropriate header file with the following general form:
extern int foo_txt_length;
extern char *foo_txt[];
When you include this header file into your other C or C++ files then:
The name of the original text file is stored in

foo_txt[0];

and you can use it to print diagnostic messages.
The lines of the text are stored in

foo_txt[1] -> first line
foo_txt[2] -> second line
...
foo_txt_length is defined such that

foo_txt[foo_txt_length+1] == NULL

The last line of the text is

foo_txt[foo_txt_length];

You can use a `for' loop (or the `loop' macro defined by LF_CPP_PORTABILITY) together with foo_txt_length to loop over the entire text, or you can exploit the fact that the entry past the last line points to NULL and do a `while' loop.
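For example, a while-loop over the embedded text might look like this (a sketch, assuming the hypothetical `foo.txt' and the header shown above):

#include <iostream.h>
#include "foo.h"   /* declares foo_txt and foo_txt_length */

main()
{
 /* walk the line array until the NULL entry past the last line */
 int i = 1;
 while (foo_txt[i] != NULL)
  {
   cout << foo_txt[i] << endl;
   i++;
  }
}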
and that's all there is to it.
When making a package, you can organize it as a flat package or
a deep package. In a flat package, all the source files are placed
under src
without any subdirectory structure. In a deep package,
libraries and groups of executables are separated by a subdirectory
structure. The perennial problem with deep packages is dealing with
interdirectory dependencies. What do you do if, to compile one library, you need header files from another library in another directory? What do you do if, to compile the test suite of your library, you need to link in another library that has just been compiled in a different directory?
One approach is to just put all these interdependent things in the same
directory. This is not very unreasonable since the Makefile.am
can document quite thoroughly where each file belongs, in case you need to
split them up in the future. On the other hand, this solution becomes less
and less preferable as your project grows. You may not want to clutter
a directory with source code for too many different things. What do you
do then?
The second approach is to be careful about these dependencies and just invoke the necessary features of Automake to make everything work out.
For `*.a' files (library binaries), the recommended thing to do is to link them in by giving the full relative pathname. Doing that allows Automake to work out the dependencies correctly across multiple directories. It also allows you to easily upgrade to shared libraries with Libtool.
To retain some flexibility it may be best to list these interdirectory
link sequences in variables and then use these variables. This way, when you
move things around you minimize the amount of editing you have to do.
In fact, if all you need these library binaries for is to build a test suite, you can simply assign them to LDFLAGS. To make these assignments more uniform, you may want to start your pathnames with $(top_builddir).
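For example, a test program's `Makefile.am' might link an uninstalled library by its full relative pathname like this (the directory and library names here are hypothetical):

check_PROGRAMS = test1
test1_SOURCES = test1.cc
test1_LDADD = $(top_builddir)/src/libfoo/libfoo.a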
For *.h
files (header files), you can include an
INCLUDES = -I../dir1 -I../dir2 -I../dir3 ...
assignment on every `Makefile.am' of every directory level listing
the directories that contain include files that you want to use. If your
directory tree is very complicated, you may want to make these assignments
more uniform by starting your pathnames from $(top_srcdir)
.
In your source code, you should use the syntax
#include "foo.h"
for include files in the current directory and
#include <foo.h>
for include files in other directories.
There is a better third approach, provided by Autotools, but it only
applies to include files. There is nothing more that can be done with
library binaries; you simply have to give the path. But with header files,
it is possible to arrange at configure-time that all header files are
symlinked under the directory $(top_srcdir)/include. Then you will
. Then you will
only need to list one directory instead of many.
Autotools provides two Autoconf macros: LF_LINK_HEADERS
and
LF_SET_INCLUDES
, to handle this symlinking.
LF_LINK_HEADERS(src/dir1 src/dir2 src/dir3 ... src/dirN)

When this macro is invoked for the first time, the directory `$(top_srcdir)/include' is erased. Then, for each directory `src/dirK' listed, we look for the file `src/dirK/Headers' and link the public header files mentioned in that file under `$(top_srcdir)/include'. The link will be either symbolic or hard, depending on the capabilities of your operating system. If possible, a symbolic link will be preferred. You can invoke the same macro by passing an optional argument that specifies a directory name. For example:
LF_LINK_HEADERS(src/dir1 src/dir2 ... src/dirN , foo)

Then the symlinks will be created under the `$(top_srcdir)/include/foo' directory instead. This can be especially useful if you have very many header files to install and you'd like them to be included as something like:
#include <foo/file1.h>

The companion macro LF_SET_INCLUDES arranges for the variable $(default_includes) to contain the correct collection of -I flags, such that the include files are accessible during compilation. If you invoke it with no arguments, as

LF_SET_INCLUDES

then the following assignment will take place:

default_includes = -I$(prefix) -I$(top_srcdir)/include

If you invoke it with arguments:

LF_SET_INCLUDES(dir1 dir2 ... dirN)

then the following assignment will take place instead:

default_includes = -I$(prefix) -I$(top_srcdir)/include/dir1 \
                   -I$(top_srcdir)/include/dir2 ... \
                   -I$(top_srcdir)/include/dirN

You may use this variable as part of your INCLUDES assignment.
A typical use of this system involves invoking
LF_LINK_HEADERS(src/dir1 src/dir2 ... src/dirN)
LF_SET_INCLUDES
in your `configure.in' and adding the following two lines in your `Makefile.am':
INCLUDES = $(default_includes)
EXTRA_DIST = Headers
The variable $(default_includes) will be assigned by the configure script to point to the Right Thing. You will also need to include a file called `Headers' in every directory level that you mention in LF_LINK_HEADERS, listing the public header files that you wish to symlink, one filename per line. You also need to mention these public header files in an

include_HEADERS = foo1.h foo2.h ...

assignment in your `Makefile.am', to make sure that they are installed.
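For instance, a directory providing the two (hypothetical) public headers above would carry a `Headers' file containing simply:

foo1.h
foo2.h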
With this usage, other programs can access the installed header files as:
#include <foo1.h>
Other directories within the same package can access the not-yet-installed header files in exactly the same manner. Finally, in the same directory, you should access the header files as

#include "foo1.h"

This will force the header file in the current directory to be used, even when there is a similar header file already installed. This is very important when you are rebuilding a new version of an already installed library; otherwise the build might get confused and include the already installed, and not up-to-date, header files from the older version.
Alternatively, you can categorize the header files under a directory, by invoking
LF_LINK_HEADERS(src/dir1 src/dir2 , name1)
LF_LINK_HEADERS(src/dir3 src/dir4 , name2)
LF_SET_INCLUDES(name1 name2)
in your `configure.in'. In your `Makefile.am' files you still add the same two lines:
INCLUDES = $(default_includes)
EXTRA_DIST = Headers
and maintain the `Headers' file as before. However, now the header files will be symlinked to subdirectories of `$(top_srcdir)/include'. This means that although uninstalled header files in all directories must be included by code in the same directory as:
#include "header.h"
code in other directories must access these uninstalled header files as
#include <name1/header.h>
if the header file is under `src/dir1' or `src/dir2' or as
#include <name2/header.h>
if the header file is under `src/dir3' or `src/dir4'. It follows that you probably intend for these header files to be installed correspondingly in such a manner so that other programs can also include them the same way. To accomplish that, under `src/dir1' and `src/dir2' you should list the header files in your `Makefile.am' like this:
name1dir = $(includedir)/name1
name1_HEADERS = header.h ...
and under `src/dir3' and `src/dir4' like this:
name2dir = $(includedir)/name2
name2_HEADERS = header.h
One disadvantage of this approach is that the source tree is modified
during configure-time, even during a VPATH build. Some may not like that, but
it suits me just fine.
Unfortunately, because Automake requires the GNU compiler to compute dependencies, the header files need to be placed in a constant location with respect to the rest of the source code. If a mkdep utility were distributed by Automake, computing dependencies when the installer builds the software rather than when the developer builds a source code distribution, then it would be possible to allow the location of the header files to be dynamic. If that development ever takes place in Automake, Autotools will immediately follow. If you really don't like this, then don't use this feature.
Usually, if you are installing one or two header files per library you
want them to be installed under $(includedir)
and be includable
with
#include <foo.h>
On the other hand, there are many applications that install a lot of header files, just for one library. In that case, you should put them under a prefix and let them be included as:
#include <prefix/foo.h>
Examples of libraries doing this are X11 and Mesa.
This mechanism for tracking include files is most useful for very large projects. You may not want to bother for simple homework-like throwaway hacks. When a project starts to grow, it is very easy to switch.
This chapter is devoted to Fortran. We will show you how to build programs that combine Fortran and C or C++ code in a portable manner. The main reason for wanting to do this is because there is a lot of free software written in Fortran. If you browse `http://www.netlib.org/' you will find a repository of lots of old, archaic, but very reliable free sources. These programs encapsulate a lot of experience in numerical analysis research over the last couple of decades, which is crucial to getting work done. All of these sources have been written in Fortran. As a developer today, if you know other programming languages, it is unlikely that you will want to write original code in Fortran. You may need, however, to use legacy Fortran code, or the code of a neighbour who still writes in Fortran.
The most portable way to mix Fortran with your C/C++ programs is to translate the Fortran code to C with the `f2c' compiler and compile everything with a C/C++ compiler. The `f2c' compiler is available at `http://www.netlib.org/', but as we will soon explain, it is also distributed with the `autotools' package. Another alternative is to use the GNU Fortran compiler `g77' with `g++' and `gcc'. This compiler is portable among many platforms, so if you want to use a native Fortran compiler without sacrificing portability, this is one way to do it. Yet another way is to use your OS's native Fortran compiler, which is usually called `f77', provided that it is compatible with `g77' and `f2c'. Because performance is also very important in numerical codes, a good strategy is to prefer the native compiler if it is compatible, and support `g77' as a fall-back option. Because many sysadmins don't install `g77', supporting `f2c' as a third fall-back is also a good idea.
Autotools provides support for configuring and building source code written in part or in whole in Fortran. The implementation is based on the build system used by GNU Octave, which has been generalized for use by any program.
The traditional Hello world program in Fortran looks like this:
c....:++++++++++++++
      PROGRAM MAIN
      PRINT*,'Hello World!'
      END
All lines that begin with `c' are comments. The first line is the equivalent of main() in C. The second line says hello, and the third line indicates the end of the code. It is important that all statement lines start at column 7, otherwise the compiler will issue a syntax error. Also, if you want to be ANSI compliant, you must write your code all in caps. Nowadays most compilers don't care, but some may still do.
To compile this with `g77' (or `f77') you do something like:
% g77 -o hello hello.f
% hello
To compile it with the f2c translator:
% f2c hello.f
% gcc -o hello hello.c -lf2c -lm
where `-lf2c' links in the translator's system library.
In order for this to work, you will have to make sure that the header file
f2c.h
is present since the translated code in `hello.c' includes
it with a statement like
#include "f2c.h"
which explicitly requires it to be present in the current working directory.
In this case, the `main' is written in Fortran. However most of the Fortran you will be using will actually be subroutines and functions. A subroutine looks like this:
c....:++++++++++++++
      SUBROUTINE FHELLO (C)
      CHARACTER *(*) C
      PRINT*,'From Fortran: ',C
      RETURN
      END
This is the analog of a `void' function in C, because it takes arguments but doesn't return anything. The prototype declaration is K&R style: you list all the arguments in parentheses, separated by commas, and you declare the types of the variables in the subsequent lines.
Suppose that this subroutine is saved as `fhello.f'. To call it from C you need to know what it looks like from the point of the C compiler. To find out type:
% f2c -P fhello.f
% cat fhello.P
You will find that this subroutine has the following prototype declaration:
extern int fhello_(char *c__, ftnlen c_len);
It may come as a surprise, and this is a moment of revelation, but although in Fortran it appears that the subroutine takes one argument, in C it appears that it takes two! And this is what makes it difficult to link code in a portable manner between C and Fortran. In C, everything is what it appears to be: if a function takes two arguments, then this means that, down to the machine language level, there are two arguments being passed around. In Fortran, things are hidden from you and done in a magic fashion. The Fortran programmer thinks that he is passing one argument, but the compiler compiles code that actually passes two arguments around. In this particular case, the reason is that the argument being passed is a string. In Fortran, strings are not null-terminated, so the `f2c' compiler passes the length of the string as an extra hidden argument. This is called the linkage method of the compiler. Unfortunately, linkage in Fortran is not standard, and there exist compilers that handle strings differently. For example, some compilers will prepend the string with a few bytes containing the length and pass a pointer to the whole thing. This problem is not limited to strings; it happens in many other instances. The `f2c' and `g77' compilers follow compatible linkage, and we will use this linkage as the ad-hoc standard. A few proprietary Fortran compilers, like the Dec Alpha `f77' and the Irix `f77', are also `f2c'-compatible. The reason for this is that most of the compiler developers derived their code from `f2c'. So although a standard was not really intended, there we have one anyway.
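In other words, under `f2c'-style linkage a Fortran call like CALL FHELLO('HI') behaves as if the C compiler saw something along these lines (a sketch, not actual generated code):

/* the hidden second argument carries the length of the string */
fhello_("HI", (ftnlen)2);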
A few things to note about the above prototype declaration is that the symbol `fhello' is in lower-case, even though in Fortran we write everything uppercase, and it is appended with an underscore. On some platforms, the proprietary Fortran compiler deviates from the `f2c' standard either by forcing the name to be in upper-case or by omitting the underscore. Fortunately, these cases can be detected with Autoconf and can be worked around with conditional compilation. However, beyond this, other portability problems, such as the strings issue, are too involved to deal with and it is best in these cases that you fall back to `f2c' or `g77'. A final thing to note is that although `fhello' doesn't return anything, it has return type `int' and not `void'. The reason for this is that `int' is the default return type for functions that are not declared. Therefore, to prevent compilation problems, in case the user forgets to declare a Fortran function, `f2c' uses `int' as the return type for subroutines.
In Fortran parlance, a subroutine is what we'd call a `void' function. To Fortran programmers, in order for something to be a function, it has to return something back. This reflects on the syntax. For example, here's a function that adds two numbers and returns the result:
c....:++++++++++++++++
      DOUBLE PRECISION FUNCTION ADD(A,B)
      DOUBLE PRECISION A,B
      ADD = A + B
      RETURN
      END
The name of the function is also the name of the return variable. If you run this one through `f2c -P' you will find that the C prototype is:
extern doublereal add_(doublereal *a, doublereal *b);
There's plenty of things to note here. The type `doublereal' comes from `f2c.h', which maps the Fortran types to C types as follows:

integer       -> int
real          -> float
doublereal    -> double
complex       -> struct { real r,i; };
doublecomplex -> struct { doublereal r,i; };

Note also that the arguments appear as pointers: Fortran passes all of its arguments by reference.
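For instance, here is a sketch of calling ADD from C++ under these conventions (the `f2c'-compatible linkage and the symbol name `add_' are assumed from the prototype above):

#include "f2c.h"

extern "C" {
 extern doublereal add_(doublereal *a, doublereal *b);
}

int main()
{
 doublereal a = 1.5, b = 2.5;
 doublereal sum = add_(&a, &b);   /* Fortran takes its arguments by reference */
 return (sum == 4.0) ? 0 : 1;
}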
A more interesting case is when we deal with complex numbers. Consider a function that multiplies two complex numbers:
c....:++++++++++++++++++++++++++++++
      COMPLEX*16 FUNCTION MULT(A,B)
      COMPLEX*16 A,B
      MULT = A*B
      RETURN
      END
As it turns out, the prototype for this function is:
extern Z_f mult_(doublecomplex *ret_val, doublecomplex *a, doublecomplex *b);
Because complex numbers are not a native type in C, they can not be returned efficiently without going through at least one copy. Therefore, for this special case, the return value is placed as the first argument of the prototype! Actually, despite many people's feeling that Fortran must die, it is still the best tool to use for writing optimized functions that are heavy on complex arithmetic.
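Calling such a function from C or C++ therefore means passing the return slot explicitly. A sketch under the same assumed conventions (the `Z_f' and `doublecomplex' types come from `f2c.h'):

#include "f2c.h"

extern "C" {
 extern Z_f mult_(doublecomplex *ret_val, doublecomplex *a, doublecomplex *b);
}

int main()
{
 doublecomplex a, b, c;
 a.r = 1.0; a.i = 2.0;
 b.r = 3.0; b.i = -1.0;
 mult_(&c, &a, &b);   /* the result comes back through the first argument */
 return 0;
}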
Now that we have brought up some of the issues about Fortran linkage, let's show you how to work around them. We will write a simple Fortran function, and a C program that calls it, and then show you how to turn these two into a GNU-like package, enhanced with a configure script and the works. This discussion assumes that you have installed the utilities in `autotools', the package with which this tutorial is being distributed.
First, begin by building a directory tree for your new package. Because this project will involve Fortran, you need to pass the `-t fortran' option to `acmkdir':
% acmkdir -t fortran foo
The `-t' flag directs `acmkdir' to unpack a copy of the `f2c' translator and to build proper toplevel `configure.in' and `Makefile.am' files. This will take a while, so relax and stretch a little bit.
Now enter the `foo-0.1' directory and look around:
% cd foo-0.1
% cat configure.in
AC_INIT
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(hello,0.1)
LF_CONFIGURE_CC
LF_CONFIGURE_CXX
AC_PROG_RANLIB
LF_HOST_TYPE
LF_PROG_F77_PREFER_F2C_COMPATIBILITY
dnl LF_PROG_F77_PREFER_NATIVE_VERSION
LF_PROG_F77
LF_SET_WARNINGS
AC_CONFIG_SUBDIRS(fortran/f2c fortran/libf2c)
AC_OUTPUT([Makefile
fortran/Makefile
f2c_comp
doc/Makefile
m4/Makefile
src/Makefile
])
% cat Makefile.am
EXTRA_DIST = reconf configure
SUBDIRS = fortran m4 doc src
There are some new macros in `configure.in' and a new subdirectory: `fortran'. There is also a file that looks like a shell script called `f2c_comp.in'. We will discuss the gory details about all this in the next section. Now let's write the code. Enter the `src' directory and type:
$ cd src
$ mkf2c
This creates the files `f2c.h' and `f2c-main.c'. The latter looks like this:

#ifdef __cplusplus
extern "C" {
#endif

#if defined (sun)
int MAIN_ () { return 0; }
#elif defined (linux) && defined (__ELF__)
int MAIN__ () { return 0; }
#endif

#ifdef __cplusplus
}
#endif
Now, time to write some code:
$ vi fhello.f
$ vi hello.cc
with
c....:++++++++++++++++++++++++++++++
      SUBROUTINE FHELLO (C)
      CHARACTER *(*) C
      PRINT*,'From Fortran: ',C
      RETURN
      END
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <string.h>
#include "f2c.h"
#include "f77-fcn.h"

extern "C"
{
 extern int f77func(fhello,FHELLO)(char *c__, ftnlen c_len);
}

main()
{
 char s[30];
 strcpy(s,"Hello world!");
 f77func(fhello,FHELLO)(s,ftnlen(strlen(s)));
}
The definition of the f77func
macro is included in `acconfig.h'
automatically for you if the LF_CONFIGURE_FORTRAN
macro is included
in your `configure.in'. The definition is as follows:
#ifndef f77func
#if defined (F77_APPEND_UNDERSCORE)
# if defined (F77_UPPERCASE_NAMES)
#  define f77func(f, F) F##_
# else
#  define f77func(f, F) f##_
# endif
#else
# if defined (F77_UPPERCASE_NAMES)
#  define f77func(f, F) F
# else
#  define f77func(f, F) f
# endif
#endif
#endif
Recall that we said that the issues of whether to append an underscore and whether to capitalize the name of the routine can be dealt with by conditional compilation. This macro is where that conditional compilation happens. The LF_PROG_F77 macro will define

F77_APPEND_UNDERSCORE
F77_UPPERCASE_NAMES

appropriately, so that f77func does the right thing.
To compile this, create a `Makefile.am' as follows:
SUFFIXES = .f
.f.o:
        $(F77) -c $<

bin_PROGRAMS = hello
hello_SOURCES = hello.cc fhello.f f2c.h f2c-main.c
hello_LDADD = $(FLIBS)
Note that the above `Makefile.am' is only compatible with version 1.3 of Automake or newer. Previous versions don't grok Fortran filenames in `hello_SOURCES', so you may want to upgrade.
Now you can compile and run the program:
$ cd ..
$ reconf
$ configure
$ make
$ src/hello
From Fortran: Hello world!
If a native `f77' compiler or the portable `g77' compiler was used, you missed out on the coolness of using `f2c'. To check that out, do:
$ make distclean
$ configure --with-f2c
$ make
and witness the beauty! The package will begin by building an `f2c' binary for your system. Then it will build the Fortran libraries. And finally, it will build the hello world program which you can run as before:
$ src/hello
It may seem overkill to carry around a Fortran compiler. On the other hand, you will find it very convenient, and the `f2c' compiler isn't really that big. If you are spoiled on a system that is well equipped and has a good system administrator, you may find it a nasty surprise one day when you discover that the rest of the world is not necessarily like that.
If you download a real Fortran package from Netlib you might find it very
annoying having to enter the filenames for all the Fortran files in
`*_SOURCES'. A work-around is to put all these files in their own
directory and then do this awk
trick:
% ls *.f | awk '{ printf("%s ", $1) }' > tmp
The awk filter will line up the output of `ls' in one line. You can use your editor to insert its contents into your `Makefile.am'. Eventually I may come around to writing a utility for doing this automagically.
The best way to get started is by building the initial directory tree with `acmkdir' like this:
% acmkdir -t fortran <directory-filename>
This will install all the standard stuff. It will also install a directory called `fortran' containing a copy of the `f2c' compiler, and `f2c_comp', a shell script that invokes the translator in a way that looks the same as invoking a real Fortran compiler.
The file `configure.in' uses the following special macros:
The LF_PROG_F77_PREFER_F2C_COMPATIBILITY macro states that you prefer `f2c' compatibility over performance. In general, Fortran programmers are willing to sacrifice everything for the sake of performance. However, if you want to use Fortran code with C and C++ code, you will have many reasons to also give importance to `f2c' compatibility. Use this macro to state this preference. The effect is that if the installer's platform has a native Fortran compiler installed, it will be used only if it is `f2c'-compatible. This macro must be invoked before LF_PROG_F77.
The LF_PROG_F77_PREFER_NATIVE_VERSION macro states instead that you prefer the native Fortran compiler, at the expense of `f2c' compatibility. You may want to invoke this if your entire program is written in Fortran. This macro must also be invoked before LF_PROG_F77.
The LF_PROG_F77 macro configures the Fortran compiler and defines

F77_APPEND_UNDERSCORE
F77_UPPERCASE_NAMES

appropriately for the compiler's linkage. If you want to see what a Fortran routine looks like from the C side, run

% f2c -P foo.f

on the file containing the subroutine and examine the file `foo.P'. In order for this macro to work properly, you must precede it with calls to

AC_PROG_CC
AC_PROG_RANLIB
LF_HOST_TYPE

You also need to call one of the two *_PREFER_* macros. The default is to prefer `f2c' compatibility.
In addition to invoking all of the above, you need to make provision for the bundled fortran compiler by adding the following lines at the end of your `configure.in':
AC_CONFIG_SUBDIRS(fortran/f2c fortran/libf2c)
AC_OUTPUT([Makefile
fortran/Makefile
f2c_comp
doc/Makefile
m4/Makefile
src/Makefile])
The AC_CONFIG_SUBDIRS macro directs `configure' to execute the configure scripts in `fortran/f2c' and `fortran/libf2c'. The entries in AC_OUTPUT that are important to Fortran support are `fortran/Makefile' and `f2c_comp'. Because `f2c_comp' is mentioned in AC_OUTPUT, Automake will automagically bundle it when you build a source code distribution.
If you have originally set up your directory tree for a C or C++ only project and later you realize that you need to also use Fortran, you can upgrade your directory tree to Fortran as follows:
Install the `fortran' directory by invoking

% mkfortran

and the `f2c_comp' script by invoking

% mkf2c_comp

both at the toplevel directory.
Then make sure that your `configure.in' invokes

AC_PROG_CC
AC_PROG_RANLIB
LF_HOST_TYPE
LF_PROG_F77_PREFER_F2C_COMPATIBILITY
LF_PROG_F77

If you have invoked LF_CONFIGURE_CC, then there is no need to invoke AC_PROG_CC again.
Also add the following line before AC_OUTPUT:

AC_CONFIG_SUBDIRS(fortran/f2c fortran/libf2c)

and add the following files to AC_OUTPUT:

fortran/Makefile
f2c_comp
Finally, rebuild the package:

% make distclean
% ./reconf
% ./configure
% make

It is important to call `reconf' for the changes to take effect.
If a directory level contains Fortran source code, then it is important to let Automake know about it by adding the following lines at the beginning of its `Makefile.am':
SUFFIXES = .f
.f.o:
        $(F77) -c $<
This is pretty much the same idea as with the embedded text compiler.
You can list the Fortran source code filenames in the SOURCES
assignments together with your C and C++ code. To link executables,
you must add $(FLIBS)
to LDADD
and link against
`f2c-main.c' just as in the hello world example. Please do
not include `f2c-main.c' in any libraries however.
Now consider the file `hello.cc' line by line. First we include the standard configuration stuff:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <string.h>
Then we include the Fortran-related header files:

#include "f2c.h"
#include "f77-fcn.h"
Then we declare the prototypes for the Fortran subroutine:
extern "C" { extern int f77func(fhello,FHELLO)(char *c__, ftnlen c_len); }
There are a few things to note here:
extern "C" { }The C++ language uses name mangling to support function overloading. This means that if you have two C++ functions called:
int foo(double x); int foo(double x,double y);the C++ compiler internally assigns them different names in an intelligent fashion to avoid conflict. Just like the Fortran compiler does things behind your back, so does the C++ compiler to support some of its special features. Any code written between `extern "C"' is compiled with name mangling disabled. This is necessary for the Fortran declarations because we don't want the names of the Fortran subroutines to be mangled.
Note that when the subroutine is called, the string length is explicitly cast to `ftnlen':

f77func(fhello,FHELLO)(s,ftnlen(strlen(s)));

This may seem pedantic, but it is necessary for the C++ compiler, and it is a good habit even for C programmers. Since Fortran routines are supposed to be wrapped, this is not too much to ask.
Note also the Fortran `integer' type, which `f2c.h' defines explicitly. Unfortunately, the standard header file distributed with `f2c' defines `integer' as `long int' to account for 16-bit machines. That's a bad idea, and on the 64-bit Dec Alpha it is a bug. The header file distributed with `mkf2c' does the right thing.
Remember to list the files `f2c.h' and `f2c-main.c' in the SOURCES assignments of your `Makefile.am' to make sure that they are included in the source code distribution.
Fortran is infested with portability problems. There exist two important
Fortran standards: one that was written in 1966 and one that was written
in 1977. The 1977 standard is considered to be the standard Fortran.
Most of the Fortran code is written by scientists who have never had any
formal training in computer programming. As a result, they often write
code that is dependent on vendor-extensions to the standard, and not
necessarily easy to port. The standard itself is to blame as well, since
it is sorely lacking in many aspects. For example, even though standard
Fortran has both REAL
and DOUBLE PRECISION
data types
(corresponding to float
and double
) the standard only
supports single precision complex numbers (COMPLEX
). Since many
people will also want double precision complex numbers, many vendors provided
extensions. Most commonly, the double precision complex number is called
COMPLEX*16
but you might also see it called DOUBLE COMPLEX
.
Other such vendor extensions include providing a `flush' operation of some sort for file I/O, and similar esoteric things.
To make things worse (or better), there are now two more standards out there: the 1990 standard and the 1995 standard, and a 2000 standard is in the works. Fortran 90 and its successors try to make Fortran more like C and C++, and even though there are no free compilers for these variants, they are becoming alarmingly popular with the scientific community. In fact, I think that the main reason why these variants of Fortran are being developed is to make more business for proprietary compiler developers. So far as I know, Fortran 90 does not provide any features that C++ can not support with a class library extension. Moreover, Fortran 90 does not have the comprehensive foundation that allows C++ to be a self-extensible language. This makes it less worthwhile to invest effort in Fortran 90, because it means that eventually people will want features that can only be implemented by redefining the language and rewriting the compilers for it. In C++, instead, you can add features to the language simply by writing C++ code, because it has enough core features to allow virtually unlimited self-extensibility.
If your primary interest is portability and free software, you should stay
away from Fortran
90 as well as Fortran 95, until someone writes a free compiler for them.
You will be better off developing in C++
and only migrating to
Fortran 77 the parts that are performance critical. This way you get the
best of both worlds.
On the flip side, if you limit your Fortran code just to number-crunching,
then it becomes much easier to write portable code. There are still a few
things you should take into account however.
Some Fortran code has been written in the archaic 1966 style. An example
of such code is the fftpack
package from netlib
. The main
problems with such code are the following:
Such code relies on Fortran's implicit typing convention: variables whose names begin with I,J,...,N are of type INTEGER, and all others are REAL. To compile this code with modern compilers, it is necessary to add the following line to every source file:
IMPLICIT DOUBLE PRECISION (A-H,O-Z)

This instructs the compiler to do the right thing, which is to implicitly assume that all variables starting with A-H and O-Z are double precision and all other variables are integers. Alternatively, you can say

IMPLICIT REAL (A-H,O-Z)

but it is very rare that you will ever want to go with single precision. Occasionally, you may find that the programmer breaks the rules. For example, in `fftpack' the array IFAC is supposed to be a double even though implicitly it is suggested to be an int. Such inconsistencies will probably show up as compiler errors. To fix them, declare the type of these variables explicitly. If it's an array, then you do it like this:
DOUBLE PRECISION IFAC(*)

If the variable also appears in a DIMENSION declaration, then you should remove it from the declaration, since the two can't coexist in some compilers.
In the archaic style, a declaration like

DIMENSION C(1)

means that C has an unknown length, instead of meaning that it has length 1. In modern Fortran this is an unacceptable notation, and modern compilers do get confused by it, so all such instances must be replaced with the correct form, which is:

DIMENSION C(*)

Such "arrays" are in reality just pointers. The user can reference the array as far out as he likes but, of course, if he takes it too far, the program will either do the Wrong Thing or crash with a segmentation fault.
Also keep in mind how Fortran types its constants: a constant like `1' is always of type INTEGER, while `9.435784839284958' is always of type REAL (even if the additional precision specified is lost, and even when it is used in a `DOUBLE PRECISION' context such as being assigned to a `DOUBLE PRECISION' variable!). On the other hand, `1E0' is always REAL and `1D0' is always `DOUBLE PRECISION'. If you want your code to be exclusively double precision, then you should scan the entire source for constants and make sure that they all have the `D0' suffix at the end. Many compilers will tolerate this omission, while others will go ahead and introduce single precision error into your computations, leading to hard-to-find bugs.
In general, the code at `http://www.netlib.org/' is very reliable and portable, but you do need to keep your eyes open for little problems like the above.
The appendices
The GNU development tools were written primarily to aid the development of free software. Even though software development is mainly a technical issue, the free software movement has always been driven by many philosophical concerns as well.
In this appendix we include a few articles written by Richard Stallman that discuss these concerns. The text of these articles is copyrighted and is included here with permission, under the following terms:
Copying Notice
Copyright (C) 1998 Free Software Foundation Inc
59 Temple Place, Suite 330, Boston, MA 02111, USA
Verbatim copying and distribution is permitted in any medium, provided this notice is preserved.
With the advent of the Linux movement, many people nowadays use free software without being informed of the philosophy and culture behind it and their importance. It is our hope that by including some of these articles here, we'll help spread the word.
All of these articles, and others are also distributed on the web at:
@uref{http://www.gnu.org/philosophy/index.html}
Digital information technology contributes to the world by making it easier to copy and modify information. Computers promise to make this easier for all of us.
Not everyone wants it to be easier. The system of copyright gives software programs "owners", most of whom aim to withhold software's potential benefit from the rest of the public. They would like to be the only ones who can copy and modify the software that we use.
The copyright system grew up with printing--a technology for mass production copying. Copyright fit in well with this technology because it restricted only the mass producers of copies. It did not take freedom away from readers of books. An ordinary reader, who did not own a printing press, could copy books only with pen and ink, and few readers were sued for that.
Digital technology is more flexible than the printing press: when information has digital form, you can easily copy it to share it with others. This very flexibility makes a bad fit with a system like copyright. That's the reason for the increasingly nasty and draconian measures now used to enforce software copyright. Consider these four practices of the Software Publishers Association (SPA):
All four practices resemble those used in the former Soviet Union, where every copying machine had a guard to prevent forbidden copying, and where individuals had to copy information secretly and pass it from hand to hand as "samizdat". There is of course a difference: the motive for information control in the Soviet Union was political; in the US the motive is profit. But it is the actions that affect us, not the motive. Any attempt to block the sharing of information, no matter why, leads to the same methods and the same harshness.
Owners make several kinds of arguments for giving them the power to control how we use information:
As a computer user today, you may find yourself using a proprietary program. If your friend asks to make a copy, it would be wrong to refuse. Cooperation is more important than copyright. But underground, closet cooperation does not make for a good society. A person should aspire to live an upright life openly with pride, and this means saying "No" to proprietary software.
You deserve to be able to cooperate openly and freely with other people who use software. You deserve to be able to learn how the software works, and to teach your students with it. You deserve to be able to hire your favorite programmer to fix it when it breaks.
You deserve free software.
Here is a glossary of various categories of software that are often mentioned in discussions of free software. It explains which categories overlap or are part of other categories.
There are a number of words and phrases which we recommend avoiding, either because they are ambiguous or because they imply an opinion that we hope you may not entirely agree with.
To copyleft or not to copyleft? That is one of the major controversies in the free software community. The idea of copyleft is that we should fight fire with fire--that we should use copyright to make sure our code stays free. The GNU GPL is one example of a copyleft license.
Some free software developers prefer non-copyleft distribution. Non-copyleft licenses such as the XFree86 and BSD licenses are based on the idea of never saying no to anyone--not even to someone who seeks to use your work as the basis for restricting other people. Non-copyleft licensing does nothing wrong, but it misses the opportunity to actively protect our freedom to change and redistribute software. For that, we need copyleft.
For many years, the X Consortium was the chief opponent of copyleft. It exerted both moral suasion and pressure to discourage free software developers from copylefting their programs. It used moral suasion by suggesting that it is not nice to say no. It used pressure through its rule that copylefted software could not be in the X Distribution.
Why did the X Consortium adopt this policy? It had to do with their definition of success. The X Consortium defined success as popularity--specifically, getting computer companies to use X Windows. This definition put the computer companies in the driver's seat. Whatever they wanted, the X Consortium had to help them get it.
Computer companies normally distribute proprietary software. They wanted free software developers to donate their work for such use. If they had asked for this directly, people would have laughed. But the X Consortium, fronting for them, could present this request as an unselfish one. "Join us in donating our work to proprietary software developers," they said, suggesting that this is a noble form of self-sacrifice. "Join us in achieving popularity", they said, suggesting that it was not even a sacrifice.
But self-sacrifice is not the issue: tossing away the defenses of copyleft, which protect the freedom of everyone in the community, is sacrificing more than yourself. Those who granted the X Consortium's request entrusted the community's future to the good will of the X Consortium.
This trust was misplaced. In its last year, the X Consortium made a plan to restrict the forthcoming X11R6.4 release so that it will not be free software. They decided to start saying no, not only to proprietary software developers, but to our community as well.
There is an irony here. If you said yes when the X Consortium asked you not to use copyleft, you put the X Consortium in a position to license and restrict its version of your program, along with its own code.
The X Consortium did not carry out this plan. Instead it closed down and transferred X development to the Open Group, whose staff are now carrying out a similar plan. To give them credit, when I asked them to release X11R6.4 under the GNU GPL in parallel with their planned restrictive license, they were willing to consider the idea. (They were firmly against staying with the old X11 distribution terms.) Before they said yes or no to this proposal, it had already failed for another reason: the XFree86 group follows the X Consortium's old policy, and will not accept copylefted software.
Even if the X Consortium and the Open Group had never planned to restrict X, someone else could have done it. Non-copylefted software is vulnerable from all directions; it lets anyone make a non-free version dominant, if he will invest sufficient resources to add some important feature using proprietary code. Users who choose software based on technical characteristics, rather than on freedom, could easily be lured to the non-free version for short term convenience.
The X Consortium and Open Group can no longer exert moral suasion by saying that it is wrong to say no. This will make it easier to decide to copyleft your X-related software.
When you work on the core of X, on programs such as the X server, Xlib, and Xt, there is a practical reason not to use copyleft. The XFree86 group does an important job for the community in maintaining these programs, and the benefit of copylefting our changes would be less than the harm done by a fork in development. So it is better to work with the XFree86 group and not copyleft our changes on these programs. Likewise for utilities such as xset and xrdb, which are close to the core of X, and which do not need major improvements. At least we know that the XFree86 group has a firm commitment to developing these programs as free software.
The issue is different for programs outside the core of X: applications, window managers, and additional libraries and widgets. There is no reason not to copyleft them, and we should copyleft them.
In case anyone feels the pressure exerted by the criteria for inclusion in X Distributions, the GNU project will undertake to publicize copylefted packages that work with X. If you would like to copyleft something, and you worry that its omission from X Distributions will impede its popularity, please ask us to help.
At the same time, it is better if we do not feel too much need for popularity. When a businessman tempts you with "more popularity", he may try to convince you that his use of your program is crucial to its success. Don't believe it! If your program is good, it will find many users anyway; you don't need to feel desperate for any particular users, and you will be stronger if you do not. You can get an indescribable sense of joy and freedom by responding, "Take it or leave it--that's no skin off my back." Often the businessman will turn around and accept the program with copyleft, once you call the bluff.
Friends, free software developers, don't repeat a mistake. If we do not copyleft our software, we put its future at the mercy of anyone equipped with more resources than scruples. With copyleft, we can defend freedom, not just for ourselves, but for our whole community.
The biggest deficiency in free operating systems is not in the software--it is the lack of good free manuals that we can include in these systems. Many of our most important programs do not come with full manuals. Documentation is an essential part of any software package; when an important free software package does not come with a free manual, that is a major gap. We have many such gaps today.
Once upon a time, many years ago, I thought I would learn Perl. I got a copy of a free manual, but I found it hard to read. When I asked Perl users about alternatives, they told me that there were better introductory manuals--but those were not free.
Why was this? The authors of the good manuals had written them for O'Reilly Associates, which published them with restrictive terms--no copying, no modification, source files not available--terms which excluded them from the free software community.
That wasn't the first time this sort of thing had happened, and (to our community's great loss) it was far from the last. Proprietary manual publishers have enticed a great many authors to restrict their manuals since then. Many times I have heard a GNU user eagerly tell me about a manual he was writing, with which he expected to help the GNU project--and then had my hopes dashed as he proceeded to explain that he had signed a contract with a publisher that would restrict it so that we could not use it.
Given that writing good English is a rare skill among programmers, we can ill afford to lose manuals this way.
Free documentation, like free software, is a matter of freedom, not price. The problem with these manuals was not that O'Reilly Associates charged a price for printed copies--that in itself is fine. (The Free Software Foundation sells printed copies of free GNU manuals, too.) But GNU manuals are available in source code form, while these manuals are available only on paper. GNU manuals come with permission to copy and modify; the Perl manuals do not. These restrictions are the problem.
The criterion for a free manual is pretty much the same as for free software: it is a matter of giving all users certain freedoms. Redistribution (including commercial redistribution) must be permitted, so that the manual can accompany every copy of the program, on-line or on paper. Permission for modification is crucial too.
As a general rule, I don't believe that it is essential for people to have permission to modify all sorts of articles and books. The issues for writings are not necessarily the same as those for software. For example, I don't think you or I are obliged to give permission to modify articles like this one, which describe our actions and our views.
But there is a particular reason why the freedom to modify is crucial for documentation for free software. When people exercise their right to modify the software, and add or change its features, if they are conscientious they will change the manual too--so they can provide accurate and usable documentation with the modified program. A manual which forbids programmers to be conscientious and finish the job, or more precisely requires them to write a new manual from scratch if they change the program, does not fill our community's needs.
While a blanket prohibition on modification is unacceptable, some kinds of limits on the method of modification pose no problem. For example, requirements to preserve the original author's copyright notice, the distribution terms, or the list of authors, are ok. It is also no problem to require modified versions to include notice that they were modified, even to have entire sections that may not be deleted or changed, as long as these sections deal with nontechnical topics. (Some GNU manuals have them.)
These kinds of restrictions are not a problem because, as a practical matter, they don't stop the conscientious programmer from adapting the manual to fit the modified program. In other words, they don't block the free software community from making full use of the program and the manual together.
However, it must be possible to modify all the technical content of the manual; otherwise, the restrictions do block the community, the manual is not free, and so we need another manual.
Unfortunately, it is often hard to find someone to write another manual when a proprietary manual exists. The obstacle is that many users think that a proprietary manual is good enough--so they don't see the need to write a free manual. They do not see that the free operating system has a gap that needs filling.
Why do users think that proprietary manuals are good enough? Some have not considered the issue. I hope this article will do something to change that.
Other users consider proprietary manuals acceptable for the same reason so many people consider proprietary software acceptable: they judge in purely practical terms, not using freedom as a criterion. These people are entitled to their opinions, but since those opinions spring from values which do not include freedom, they are no guide for those of us who do value freedom.
Please spread the word about this issue. We continue to lose manuals to proprietary publishing. If we spread the word that proprietary manuals are not sufficient, perhaps the next person who wants to help GNU by writing documentation will realize, before it is too late, that he must above all make it free.
We can also encourage commercial publishers to sell free, copylefted manuals instead of proprietary ones. One way you can help is to check the distribution terms of a manual before you buy it, and to prefer copylefted manuals over non-copylefted ones.