
10. Portable Shell Programming

When writing your own checks, there are some shell-script programming techniques you should avoid in order to make your code portable. The Bourne shell and upward-compatible shells like the Korn shell and Bash have evolved over the years, but to prevent trouble, do not take advantage of features that were added after UNIX version 7, circa 1977. You should not use shell functions, aliases, negated character classes, or other features that are not found in all Bourne-compatible shells; restrict yourself to the lowest common denominator. Even unset is not supported by all shells! Also, include a space after the exclamation point in interpreter specifications, like this:

#! /usr/bin/perl

If you omit the space before the path, then 4.2BSD based systems (such as Sequent DYNIX) will ignore the line, because they interpret `#! /' as a 4-byte magic number. Some old systems have quite small limits on the length of the `#!' line too, for instance 32 bytes (not including the newline) on SunOS 4.

The set of external programs you should run in a configure script is fairly small. See section `Utilities in Makefiles' in GNU Coding Standards, for the list. This restriction allows users to start out with a fairly small set of programs and build the rest, avoiding too many interdependencies between packages.

Some of these external utilities have a portable subset of features; see 10.9 Limitations of Usual Tools.

10.1 Shellology    A zoology of shells

10.2 Here-Documents    Quirks and tricks

10.3 File Descriptors    FDs and redirections

10.4 File System Conventions    File- and pathnames

10.5 Shell Substitutions    Variable and command expansions

10.6 Assignments    Varying side effects of assignments

10.7 Special Shell Variables    Variables you should not change

10.8 Limitations of Shell Builtins    Portable use of not so portable /bin/sh

10.9 Limitations of Usual Tools    Portable use of portable tools

10.10 Limitations of Make    Portable Makefiles


10.1 Shellology

There are several families of shells, most prominently the Bourne family and the C shell family which are deeply incompatible. If you want to write portable shell scripts, avoid members of the C shell family.

Below we describe some of the members of the Bourne shell family.

Ash

ash is often used on GNU/Linux and BSD systems as a light-weight Bourne-compatible shell. Ash 0.2 has some bugs that are fixed in the 0.3.x series, but portable shell scripts should work around them, since version 0.2 is still shipped with many GNU/Linux distributions.

To be compatible with Ash 0.2:

don't use `$?' after expanding empty or unset variables:

foo=
false
$foo
echo "Don't use it: $?"

don't use command substitution within variable expansion:

cat ${FOO=`bar`}

beware that single builtin substitutions are not performed by a sub-shell, hence their effect applies to the current shell! See section 10.5 Shell Substitutions, item "Command Substitution".

Bash

To detect whether you are running bash, test if BASH_VERSION is set. To disable its extensions and require POSIX compatibility, run `set -o posix'. See section `Bash POSIX Mode' in The GNU Bash Reference Manual, for details.

Bash 2.05 and later

Versions 2.05 and later of bash use a different format for the output of the set builtin, designed to make evaluating this output easier. However, this output is not compatible with earlier versions of bash (or with many other shells, probably). So if you use bash 2.05 or higher to execute configure, you'll need to use bash 2.05 for all other build tasks as well.

/usr/xpg4/bin/sh on Solaris

The POSIX-compliant Bourne shell on a Solaris system is /usr/xpg4/bin/sh and is part of an extra optional package. There is no extra charge for this package, but it is also not part of a minimal OS install and therefore some folks may not have it.

Zsh

To detect whether you are running zsh, test if ZSH_VERSION is set. By default zsh is not compatible with the Bourne shell: you have to run `emulate sh' and set NULLCMD to `:'. See section `Compatibility' in The Z Shell Manual, for details.

Zsh 3.0.8 is the native /bin/sh on Mac OS X 10.0.3.
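
The detection advice given above for Bash and Zsh can be combined into a short preamble. The following is only a minimal sketch, not necessarily what Autoconf itself emits; the variable detected_shell is a hypothetical name introduced for illustration.

```shell
# Sketch: put Bash or Zsh into its most Bourne-compatible mode.
# `detected_shell' is a hypothetical name used only in this example.
if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
  emulate sh
  NULLCMD=:
  detected_shell=zsh
elif test -n "${BASH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then
  set -o posix
  detected_shell=bash
else
  detected_shell=other
fi
```

Each mode switch is first tried in a sub-shell, so a shell that sets the version variable but lacks the option does not abort the script.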

The following discussion between Russ Allbery and Robert Lipe is worth reading:

Russ Allbery:

The GNU assumption that /bin/sh is the one and only shell leads to a permanent deadlock. Vendors don't want to break users' existing shell scripts, and there are some corner cases in the Bourne shell that are not completely compatible with a POSIX shell. Thus, vendors who have taken this route will never (OK..."never say never") replace the Bourne shell (as /bin/sh) with a POSIX shell.

Robert Lipe:

This is exactly the problem. While most (at least most System V's) do have a Bourne shell that accepts shell functions most vendor /bin/sh programs are not the POSIX shell.
So while most modern systems do have a shell _somewhere_ that meets the POSIX standard, the challenge is to find it.


10.2 Here-Documents

Don't rely on `\' being preserved just because it has no special meaning together with the next symbol. In the native /bin/sh on OpenBSD 2.7, `\"' expands to `"' in here-documents with an unquoted delimiter. As a general rule, if `\\' expands to `\', use `\\' to get `\'.

With OpenBSD 2.7's /bin/sh

$ cat <<EOF
> \" \\
> EOF
" \

and with Bash:

bash-2.04$ cat <<EOF
> \" \\
> EOF
\" \

Many older shells (including the Bourne shell) implement here-documents inefficiently. Users can generally speed things up by using a faster shell, e.g., by using the command `bash ./configure' rather than plain `./configure'.

Some shells can be extremely inefficient when there are a lot of here-documents inside a single statement. For instance if your `configure.ac' includes something like:

if <cross_compiling>; then
  assume this and that
else
  check this
  check that
  check something else
  ...
  on and on forever
  ...
fi

A shell parses the whole if/fi construct, creating temporary files for each here-document in it. Some shells create links for such here-documents on every fork, so that the clean-up code they had installed correctly removes them. It is creating these links that can take the shell forever.

Moving the tests out of the if/fi, or creating multiple if/fi constructs, would improve the performance significantly. Anyway, this kind of construct is not exactly the typical use of Autoconf. In fact, it is not even recommended, because M4 macros can't look into shell conditionals: we may fail to expand a macro because it was already expanded earlier in a conditional path, and if that condition turns out to be false at run time, we end up not executing the macro at all.


10.3 File Descriptors

Some file descriptors shall not be used, since some systems, admittedly arcane, use them for special purposes:

3 --- some systems may open it to `/dev/tty'.
4 --- used on the Kubota Titan.

Don't redirect the same file descriptor several times, as you are doomed to failure under Ultrix.

ULTRIX V4.4 (Rev. 69) System #31: Thu Aug 10 19:42:23 GMT 1995
UWS V4.4 (Rev. 11)
$ eval 'echo matter >fullness' >void
illegal io
$ eval '(echo matter >fullness)' >void
illegal io
$ (eval '(echo matter >fullness)') >void
Ambiguous output redirect.

In each case the expected result is of course `fullness' containing `matter' and `void' being empty.

Don't try to redirect the standard error of a command substitution: it must be done inside the command substitution: when running `: `cd /zorglub` 2>/dev/null' expect the error message to escape, while `: `cd /zorglub 2>/dev/null`' works properly.

It is worth noting that Zsh (but not Ash nor Bash) makes it possible in assignments though: `foo=`cd /zorglub` 2>/dev/null'.
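
As an illustration of keeping the redirection inside the command substitution, here is a minimal sketch. It assumes, as in the text, that `/zorglub' does not exist; the fallback word `unknown' is a hypothetical choice made for this example.

```shell
# Sketch: the redirection lives inside the backquotes, so the error
# message from cd is discarded instead of escaping to the terminal.
# `/zorglub' is assumed not to exist; `unknown' is a hypothetical fallback.
dir=`(cd /zorglub && pwd) 2>/dev/null || echo unknown`
```

The `|| echo unknown' part also keeps the exit status of the assignment out of the picture, since that status is not something to rely on.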

Most shells, if not all (including Bash, Zsh, Ash), output traces on stderr, even for sub-shells. This might result in undesired content if you meant to capture the standard-error output of the inner command:

$ ash -x -c '(eval "echo foo >&2") 2>stderr'
$ cat stderr
+ eval echo foo >&2
+ echo foo
foo
$ bash -x -c '(eval "echo foo >&2") 2>stderr'
$ cat stderr
+ eval 'echo foo >&2'
++ echo foo
foo
$ zsh -x -c '(eval "echo foo >&2") 2>stderr'
# Traces on startup files deleted here.
$ cat stderr
+zsh:1> eval echo foo >&2
+zsh:1> echo foo
foo

You'll appreciate the various levels of detail...

One workaround is to grep out uninteresting lines, hoping not to remove good ones...
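
The grep workaround might look like the following sketch. It assumes that every trace line starts with `+', which holds for the default trace prefix of the shells shown above; a genuine error message starting with `+' would be dropped as well.

```shell
# Sketch: capture the standard error of a traced sub-shell, then
# filter out the lines that look like traces (assumed to start with `+').
captured=`sh -x -c '(eval "echo foo >&2") 2>&1' 2>/dev/null | grep -v '^+'`
```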

Don't try to move or delete open files, as in `exec >foo; mv foo bar'. See section 10.8 Limitations of Shell Builtins, mv, for more details.


10.4 File System Conventions

While autoconf and friends will usually be run on some Unix variety, they can and will be used on other systems, most notably DOS variants. This impacts several assumptions regarding file and path names.

For example, the following code:

case $foo_dir in
  /*) # Absolute
     ;;
  *)
     foo_dir=$dots$foo_dir ;;
esac

will fail to properly detect absolute paths on those systems, because they can use a drivespec, and will usually use a backslash as directory separator. The canonical way to check for absolute paths is:

case $foo_dir in
  [\\/]* | ?:[\\/]* ) # Absolute
     ;;
  *)
     foo_dir=$dots$foo_dir ;;
esac

Make sure you quote the brackets if appropriate and keep the backslash as first character (see section 10.8 Limitations of Shell Builtins).
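
To see how the canonical pattern classifies a few inputs, consider the sketch below. It is plain shell, so the brackets are not quoted for M4; the sample paths and the variable kinds are hypothetical and serve only as illustration.

```shell
# Sketch: classify sample paths with the canonical absolute-path pattern.
# The sample paths and the `kinds' variable are made up for this example.
kinds=
for foo_dir in /usr/lib 'C:\TOOLS' c:/tools src/lib; do
  case $foo_dir in
    [\\/]* | ?:[\\/]* ) kind=absolute ;;
    *) kind=relative ;;
  esac
  kinds="$kinds $kind"
done
```

Only the last sample, `src/lib', should be reported as relative; both drivespec forms count as absolute.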

Also, because the colon is used as part of a drivespec, these systems don't use it as path separator. When creating or accessing paths, use the PATH_SEPARATOR output variable instead. configure sets this to the appropriate value (`:' or `;') when it starts up.

File names need extra care as well. While DOS-based environments that are Unixy enough to run autoconf (such as DJGPP) will usually be able to handle long file names properly, there are still limitations that can seriously break packages. Several of these issues can be easily detected by the doschk package.

A short overview follows; problems are marked with SFN/LFN to indicate where they apply: SFN means the issues are only relevant to plain DOS, not to DOS boxes under Windows, while LFN identifies problems that exist even under Windows.

No multiple dots (SFN)

DOS cannot handle multiple dots in filenames. This is an especially important thing to remember when building a portable configure script, as autoconf uses a .in suffix for template files.

This is perfectly OK on Unices:

AC_CONFIG_HEADER(config.h)
AC_CONFIG_FILES([source.c foo.bar])
AC_OUTPUT

but it causes problems on DOS, as it requires `config.h.in', `source.c.in' and `foo.bar.in'. To make your package more portable to DOS-based environments, you should use this instead:

AC_CONFIG_HEADER(config.h:config.hin)
AC_CONFIG_FILES([source.c:source.cin foo.bar:foobar.in])
AC_OUTPUT

No leading dot (SFN)

DOS cannot handle filenames that start with a dot. This is usually not a very important issue for autoconf.

Case insensitivity (LFN)

DOS is case insensitive, so you cannot, for example, have both a file called `INSTALL' and a directory called `install'. This also affects make; if there's a file called `INSTALL' in the directory, make install will do nothing (unless the `install' target is marked as PHONY).

The 8+3 limit (SFN)

Because the DOS file system only stores the first 8 characters of the filename and the first 3 of the extension, those must be unique. That means that `foobar-part1.c', `foobar-part2.c' and `foobar-prettybird.c' all resolve to the same filename (`FOOBAR-P.C'). The same goes for `foo.bar' and `foo.bartender'.

Note: This is not usually a problem under Windows, as it uses numeric tails in the short version of filenames to make them unique. However, a registry setting can turn this behaviour off. While this makes it possible to share file trees containing long file names between SFN and LFN environments, it also means the above problem applies there as well.

Invalid characters

Some characters are invalid in DOS filenames, and should therefore be avoided. In a LFN environment, these are `/', `\', `?', `*', `:', `<', `>', `|' and `"'. In a SFN environment, other characters are also invalid. These include `+', `,', `[' and `]'.


10.5 Shell Substitutions

Contrary to a persistent urban legend, the Bourne shell does not systematically split variables and backquoted expressions, in particular on the right-hand side of assignments and in the argument of case. For instance, the following code:

case "$given_srcdir" in
.)  top_srcdir="`echo "$dots" | sed 's,/$,,'`" ;;
*)  top_srcdir="$dots$given_srcdir" ;;
esac

is more readable when written as:

case $given_srcdir in
.)  top_srcdir=`echo "$dots" | sed 's,/$,,'` ;;
*)  top_srcdir=$dots$given_srcdir ;;
esac

and in fact it is even more portable: in the first case of the first attempt, the computation of top_srcdir is not portable, since not all shells properly understand "`..."..."...`". Worse yet, not all shells understand "`...\"...\"...`" the same way. There is just no portable way to use double-quoted strings inside double-quoted backquoted expressions (phew!).

$@

One of the most famous shell-portability issues is related to `"$@"': when there are no positional arguments, it is supposed to be equivalent to nothing. But some shells, for instance under Digital Unix 4.0 and 5.0, will then replace it with an empty argument. To be portable, use `${1+"$@"}'.
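
The following sketch counts how many words `${1+"$@"}' expands to, first with no positional arguments and then with two. The counter names are hypothetical. Note that the example overwrites the positional parameters of the shell that runs it.

```shell
# Sketch: count the words that ${1+"$@"} expands to.
# `count_none' and `count_two' are hypothetical names.
set x; shift                  # leave no positional parameters
count_none=0
for arg in ${1+"$@"}; do
  count_none=`expr $count_none + 1`
done

set x 'a b' c; shift          # two parameters, one containing a space
count_two=0
for arg in ${1+"$@"}; do
  count_two=`expr $count_two + 1`
done
```

With no arguments the loop body should never run, and `a b' should be kept as a single word.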

${var:-value}

Old BSD shells, including the Ultrix sh, don't accept the colon for any shell substitution, and complain and die.

${var=literal}

Be sure to quote:

: ${var='Some words'}

otherwise some shells, such as on Digital Unix V 5.0, will die because of a "bad substitution".

Solaris' /bin/sh has a frightening bug in its interpretation of this. Imagine you need to set a variable to a string containing `}'. This `}' character confuses Solaris' /bin/sh when the affected variable was already set. This bug can be exercised by running:

$ unset foo
$ foo=${foo='}'}
$ echo $foo
}
$ foo=${foo='}'   # no error; this hints to what the bug is
$ echo $foo
}
$ foo=${foo='}'}
$ echo $foo
}}
 ^ ugh!

It seems that `}' is interpreted as matching `${', even though it is enclosed in single quotes. The problem doesn't happen using double quotes.

${var=expanded-value}

On Ultrix, running

default="yu,yaa"
: ${var="$default"}

will set var to `M-yM-uM-,M-yM-aM-a', i.e., the 8th bit of each char will be set. You won't observe the phenomenon using a simple `echo $var' since apparently the shell resets the 8th bit when it expands $var. Here are two means to make this shell confess its sins:

$ cat -v <<EOF
$var
EOF

and

$ set | grep '^var=' | cat -v

One classic incarnation of this bug is:

default="a b c"
: ${list="$default"}
for c in $list; do
  echo $c
done

You'll get `a b c' on a single line. Why? Because there are no spaces in `$list': there are `M- ', i.e., spaces with the 8th bit set, hence no IFS splitting is performed!!!

One piece of good news is that Ultrix works fine with `: ${list=$default}'; i.e., if you don't quote. The bad news is that QNX 4.25 then sets list to the last item of default!

The portable way out consists in using a double assignment, to switch the 8th bit twice on Ultrix:

list=${list="$default"}

...but beware of the `}' bug from Solaris (see above). For safety, use:

test "${var+set}" = set || var={value}

`commands`

While in general it makes no sense, do not substitute a single builtin with side effects, as Ash 0.2, trying to optimize, does not fork a sub-shell to perform the command.

For instance, if you wanted to check that cd is silent, do not use `test -z "`cd /`"' because the following can happen:

$ pwd
/tmp
$ test -z "`cd /`" && pwd
/

The result of `foo=`exit 1`' is left as an exercise to the reader.

$(commands)

This construct is meant to replace ``commands`'. Unlike back quotes, which cannot be nested portably, it can be nested. Unfortunately it is not yet widely supported. Most notably, even recent releases of Solaris don't support it:

$ showrev -c /bin/sh | grep version
Command version: SunOS 5.8 Generic 109324-02 February 2001
$ echo $(echo blah)
syntax error: `(' unexpected

nor does IRIX 6.5's Bourne shell:

$ uname -a
IRIX firebird-image 6.5 07151432 IP22
$ echo $(echo blah)
$(echo blah)
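
Where nesting would be tempting, one way to stay with back quotes is to take one step per assignment, as in this sketch. The input path and the variable names are hypothetical.

```shell
# Sketch: compute the directory two levels up without nesting
# command substitutions.  All names and the path are hypothetical.
path=/usr/local/lib/libfoo.a
parent=`echo "$path" | sed 's,/[^/]*$,,'`
grandparent=`echo "$parent" | sed 's,/[^/]*$,,'`
```

Each back-quoted expression is left unquoted on the right-hand side of the assignment, so the double quotes inside it cause no trouble.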


10.6 Assignments

When setting several variables in a row, be aware that the order of the evaluation is undefined. For instance `foo=1 foo=2; echo $foo' gives `1' with sh on Solaris, but `2' with Bash. You must use `;' to enforce the order: `foo=1; foo=2; echo $foo'.

Don't rely on the exit status of an assignment: Ash 0.2 does not change the status and propagates that of the last statement:

$ false || foo=bar; echo $?
1
$ false || foo=`:`; echo $?
0

and to make things even worse, QNX 4.25 just sets the exit status to 0 in any case:

$ foo=`exit 1`; echo $?
0

To assign default values, follow this algorithm:

If the default value is a literal and does not contain any closing brace, use:
: ${var='my literal'}
If the default value contains no closing brace, has to be expanded, and the variable being initialized will never be IFS-split (i.e., it's not a list), then use:
: ${var="$default"}
If the default value contains no closing brace, has to be expanded, and the variable being initialized will be IFS-split (i.e., it's a list), then use:
var=${var="$default"}
If the default value contains a closing brace, then use:
test "${var+set}" = set || var='${indirection}'

In most cases `var=${var="$default"}' is fine, but in case of doubt, just use the latter. See section 10.5 Shell Substitutions, items `${var:-value}' and `${var=value}' for the rationale.
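
The four cases of the algorithm above can be sketched side by side as follows. All variable names are hypothetical and are assumed to be unset when the snippet starts.

```shell
# Sketch of the four default-assignment cases.  The variables my_literal,
# my_scalar, my_list and my_brace are hypothetical and assumed unset.
default='a b c'

# 1. Literal default without a closing brace.
: ${my_literal='some words'}

# 2. Expanded default, variable never IFS-split.
: ${my_scalar="$default"}

# 3. Expanded default, variable IFS-split later: double assignment.
my_list=${my_list="$default"}

# 4. Default containing a closing brace.
test "${my_brace+set}" = set || my_brace='{value}'

# Splitting my_list on IFS should yield three items.
n=0
for item in $my_list; do
  n=`expr $n + 1`
done
```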


10.7 Special Shell Variables

Some shell variables should not be used, since they can have a deep influence on the behavior of the shell. In order to recover a sane behavior from the shell, some variables should be unset, but unset is not portable (see section 10.8 Limitations of Shell Builtins) and a fallback value is needed. We list these values below.

CDPATH

When this variable is set cd is verbose, so idioms such as `abs=`cd $rel && pwd`' break because abs receives the path twice.

Setting CDPATH to the empty value is not enough for most shells. A simple path separator is enough except for zsh, which prefers a leading dot:

zsh-3.1.6$ mkdir foo && (CDPATH=: cd foo)
/tmp/foo
zsh-3.1.6$ (CDPATH=:. cd foo)
/tmp/foo
zsh-3.1.6$ (CDPATH=.: cd foo)
zsh-3.1.6$

(of course we could just unset CDPATH, since it also behaves properly if set to the empty string).

Life wouldn't be so much fun if bash and zsh had the same behavior:

bash-2.02$ mkdir foo && (CDPATH=: cd foo)
bash-2.02$ (CDPATH=:. cd foo)
bash-2.02$ (CDPATH=.: cd foo)
/tmp/foo

Of course, even better style would be to use PATH_SEPARATOR instead of a `:'. Therefore, a portable solution to neutralize CDPATH is

CDPATH=${ZSH_VERSION+.}$PATH_SEPARATOR

Note that since zsh supports unset, you may unset CDPATH using PATH_SEPARATOR as a fallback, see 10.8 Limitations of Shell Builtins.
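
The sketch below neutralizes CDPATH and then uses the `cd && pwd' idiom on a relative directory. Outside of configure, PATH_SEPARATOR is usually not set, so the example falls back to `:', which is an assumption; the scratch directory under `/tmp' (or TMPDIR) is hypothetical as well.

```shell
# Sketch: neutralize CDPATH, then resolve a relative directory.
# Assumption: PATH_SEPARATOR defaults to `:' when configure has not set it.
test "${PATH_SEPARATOR+set}" = set || PATH_SEPARATOR=:
CDPATH=${ZSH_VERSION+.}$PATH_SEPARATOR

# Hypothetical scratch directory used only for this demonstration.
tmp=${TMPDIR-/tmp}/cdpath$$
mkdir "$tmp" "$tmp/sub"
abs=`cd "$tmp" && cd sub && pwd`
rmdir "$tmp/sub" "$tmp"
```

If cd were verbose here, abs would receive the path twice, on two lines, instead of once.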

IFS

Don't set the first character of IFS to backslash. Indeed, Bourne shells use the first character (backslash) when joining the components in `"$@"' and some shells then re-interpret (!) the backslash escapes, so you can end up with backspace and other strange characters.

LANG

LC_ALL

LC_COLLATE

LC_CTYPE

LC_MESSAGES

LC_NUMERIC

LC_TIME

Autoconf-generated scripts normally set all these variables to `C' because so much configuration code assumes the C locale and POSIX requires that LC_ALL be set to `C' if the C locale is desired. However, some older, nonstandard systems (notably SCO) break if LC_ALL is set to `C', so when running on these systems Autoconf-generated scripts first try to unset the variables instead.

LANGUAGE

LANGUAGE is not specified by POSIX, but it is a GNU extension that overrides LC_ALL in some cases, so Autoconf-generated scripts set it too.

LINENO

Most modern shells provide the current line number in LINENO. Its value is the line number of the beginning of the current command. Autoconf attempts to execute configure with a modern shell. If no such shell is available, it attempts to implement LINENO with a Sed prepass that replaces each instance of the string $LINENO (not followed by an alphanumeric character) with the line's number.

You should not rely on LINENO within eval, as the behavior differs in practice. Also, the possibility of the Sed prepass means that you should not rely on $LINENO when quoted, when in here-documents, or when in long commands that cross line boundaries. Subshells should be OK, though. In the following example, lines 1, 6, and 9 are portable, but the other instances of LINENO are not:

$ cat lineno
echo 1. $LINENO
cat <<EOF
3. $LINENO
4. $LINENO
EOF
( echo 6. $LINENO )
eval 'echo 7. $LINENO'
echo 8. '$LINENO'
echo 9. $LINENO '
10.' $LINENO
$ bash-2.05 lineno
1. 1
3. 2
4. 2
6. 6
7. 1
8. $LINENO
9. 9
10. 9
$ zsh-3.0.6 lineno
1. 1
3. 2
4. 2
6. 6
7. 7
8. $LINENO
9. 9
10. 9
$ pdksh-5.2.14 lineno
1. 1
3. 2
4. 2
6. 6
7. 0
8. $LINENO
9. 9
10. 9
$ sed '=' <lineno |
>   sed '
>     N
>     s,$,-,
>     : loop
>     s,^\([0-9]*\)\(.*\)[$]LINENO\([^a-zA-Z0-9_]\),\1\2\1\3,
>     t loop
>     s,-$,,
>     s,^[0-9]*\n,,
>   ' |
>   sh
1. 1
3. 3
4. 4
6. 6
7. 7
8. 8
9. 9
10. 10

NULLCMD

When executing the command `>foo', zsh executes `$NULLCMD >foo'. The Bourne shell behaves as if NULLCMD were `:', while zsh, even in Bourne shell compatibility mode, sets NULLCMD to `cat'. If you forget to set NULLCMD, your script might be suspended waiting for data on its standard input.

status

This variable is an alias to `$?' for zsh (at least 3.1.6), hence read-only. Do not use it.

PATH_SEPARATOR

If it is not set, configure will detect the appropriate path separator for the build system and set the PATH_SEPARATOR output variable accordingly.

On DJGPP systems, the PATH_SEPARATOR environment variable can be set to either `:' or `;' to control the path separator bash uses to set up certain environment variables (such as PATH). Since this only works inside bash, you want configure to detect the regular DOS path separator (`;'), so it can be safely substituted in files that may not support `;' as path separator. So it is recommended to either unset this variable or set it to `;'.

RANDOM

Many shells provide RANDOM, a variable that returns a different integer when used. Most of the time, its value does not change when it is not used, but on IRIX 6.5 the value changes all the time. This can be observed by using set.


10.8 Limitations of Shell Builtins

No, no, we are serious: some shells do have limitations! :)

You should always keep in mind that any built-in or command may support options, and therefore have a very different behavior with arguments starting with a dash. For instance, the innocent `echo "$word"' can give unexpected results when word starts with a dash. It is often possible to avoid this problem using `echo "x$word"', taking the `x' into account later in the pipe.
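
A minimal sketch of the `x' trick follows; the variable names are hypothetical.

```shell
# Sketch: protect a value starting with a dash from echo's option
# parsing, then strip the guard character later in the pipe.
# `word' and `safe' are hypothetical names.
word=-n
safe=`echo "x$word" | sed 's/^x//'`
```

Here safe ends up holding `-n', whereas a plain `echo "$word"' could have treated the value as an option and printed nothing.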

.

Use . only with regular files (use `test -f'). Bash 2.03, for instance, chokes on `. /dev/null'. Also, remember that . uses PATH if its argument contains no slashes, so if you want to use . on a file `foo' in the current directory, you must use `. ./foo'.

!

You can't use !; you'll have to rewrite your code.

break

The use of `break 2', etc., is safe.

case

You don't need to quote the argument; no splitting is performed.

You don't need the final `;;', but you should use it.

Because of a bug in its fnmatch, bash fails to properly handle backslashes in character classes:

bash-2.02$ case /tmp in [/\\]*) echo OK;; esac
bash-2.02$

This is extremely unfortunate, since you are likely to use this code to handle UNIX or MS-DOS absolute paths. To work around this bug, always put the backslash first:

bash-2.02$ case '\TMP' in [\\/]*) echo OK;; esac
OK
bash-2.02$ case /tmp in [\\/]*) echo OK;; esac
OK

Some shells, such as Ash 0.3.8, are confused by empty case/esac:

ash-0.3.8 $ case foo in esac;
error-->Syntax error: ";" unexpected (expecting ")")

Many shells still do not support parenthesized cases, which is a pity for those of us using tools that rely on balanced parentheses. For instance, Solaris 2.8's Bourne shell:

$ case foo in (foo) echo foo;; esac
error-->syntax error: `(' unexpected

echo

The simple echo is probably the most surprising source of portability troubles. It is not possible to use `echo' portably unless both options and escape sequences are omitted. New applications which are not aiming at portability should use `printf' instead of `echo'.

Don't expect any option. See section 4.7.1 Preset Output Variables, ECHO_N etc. for a means to simulate `-n'.

Do not use backslashes in the arguments, as there is no consensus on their handling. For `echo '\n' | wc -l', the sh of Digital Unix 4.0 and of MIPS RISC/OS 4.52 answer 2, but Solaris' sh, Bash and Zsh (in sh emulation mode) report 1. Please note that the problem is truly echo: all the shells understand `'\n'' as the string composed of a backslash and an `n'.

Because of these problems, do not pass a string containing arbitrary characters to echo. For example, `echo "$foo"' is safe if you know that foo's value cannot contain backslashes and cannot start with `-', but otherwise you should use a here-document like this:

cat <<EOF
$foo
EOF

exit

The default value of exit is supposed to be $?; unfortunately, some shells, such as the DJGPP port of Bash 2.04, just perform `exit 0'.

bash-2.04$ foo=`exit 1` || echo fail
fail
bash-2.04$ foo=`(exit 1)` || echo fail
fail
bash-2.04$ foo=`(exit 1); exit` || echo fail
bash-2.04$

Using `exit $?' restores the expected behavior.

Some shell scripts, such as those generated by autoconf, use a trap to clean up before exiting. If the last shell command exited with nonzero status, the trap also exits with nonzero status so that the invoker can tell that an error occurred.

Unfortunately, in some shells, such as Solaris 8 sh, an exit trap ignores the exit command's status. In these shells, a trap cannot determine whether it was invoked by plain exit or by exit 1. Instead of calling exit directly, use the AC_MSG_ERROR macro that has a workaround for this problem.

export

The builtin export dubs a shell variable an environment variable. Each update of exported variables corresponds to an update of the environment variables. Conversely, each environment variable received by the shell when it is launched should be imported as a shell variable marked as exported.

Alas, many shells, such as Solaris 2.5, IRIX 6.3, IRIX 5.2, AIX 4.1.5 and DU 4.0, forget to export the environment variables they receive. As a result, two variables are coexisting: the environment variable and the shell variable. The following code demonstrates this failure:

#! /bin/sh
echo $FOO
FOO=bar
echo $FOO
exec /bin/sh $0

When run with `FOO=foo' in the environment, these shells will print alternately `foo' and `bar', although they should only print `foo' and then a sequence of `bar's.

Therefore you should export again each environment variable that you update.
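
In practice this means following each update with an explicit export, as in this sketch. FOO is the variable from the example above, and seen_by_child is a hypothetical name.

```shell
# Sketch: update a variable that came from the environment, export it
# again, and check what a child shell sees.
# `seen_by_child' is a hypothetical name.
FOO=bar
export FOO
seen_by_child=`sh -c 'echo "$FOO"'`
```

The assignment and the export are kept as two separate commands, since combining them in one is not something every Bourne shell accepts.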

false

Don't expect false to exit with status 1: in the native Bourne shell of Solaris 8, it exits with status 255.

for

To loop over positional arguments, use:

for arg
do
  echo "$arg"
done

You may not leave the do on the same line as for, since some shells improperly grok:

for arg; do
  echo "$arg"
done

If you want to explicitly refer to the positional arguments, given the `$@' bug (see section 10.5 Shell Substitutions), use:

for arg in ${1+"$@"}; do
  echo "$arg"
done

if

Using `!' is not portable. Instead of:

if ! cmp -s file file.new; then
  mv file.new file
fi

use:

if cmp -s file file.new; then :; else
  mv file.new file
fi

There are shells that do not reset the exit status from an if:

$ if (exit 42); then true; fi; echo $?
42

whereas a proper shell should have printed `0'. This is especially bad in Makefiles since it produces false failures. This is why properly written Makefiles, such as Automake's, have such hairy constructs:

if test -f "$file"; then
  install "$file" "$dest"
else
  :
fi

set

This builtin faces the usual problem with arguments starting with a dash. Modern shells such as Bash or Zsh understand `--' to specify the end of the options (any argument after `--' is a parameter, even `-x' for instance), but most shells simply stop the option processing as soon as a non-option argument is found. Therefore, use `dummy' or simply `x' to end the option processing, and use shift to pop it out:

set x $my_list; shift
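
For instance, with a list whose first word looks like an option, the idiom behaves as sketched below. The list contents and the variables first and count are hypothetical, and the example overwrites the positional parameters of the shell that runs it.

```shell
# Sketch: load a list starting with a dash into the positional
# parameters.  The list and the names `first' and `count' are made up.
my_list='-x -e last'
set x $my_list; shift
first=$1
count=$#
```

The leading `x' ends option processing, so `-x' is stored as the first parameter rather than turning on tracing.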

shift

Not only is shifting a bad idea when there is nothing left to shift, but in addition it is not portable: the shell of MIPS RISC/OS 4.52 refuses to do it.

source

This command is not portable, as POSIX does not require it; use . instead.

test

The test program is the way to perform many file and string tests. It is often invoked by the alternate name `[', but using that name in Autoconf code is asking for trouble since it is an M4 quote character.

If you need to make multiple checks using test, combine them with the shell operators `&&' and `||' instead of using the test operators `-a' and `-o'. On System V, the precedence of `-a' and `-o' is wrong relative to the unary operators; consequently, POSIX does not specify them, so using them is nonportable. If you combine `&&' and `||' in the same statement, keep in mind that they have equal precedence.

You may use `!' with test, but not with if: `test ! -r foo || exit 1'.
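
The sketch below combines two checks with `&&' rather than `-a', and also shows the equal precedence of `&&' and `||'. The file `/dev/null' and the variables usable and grouped are chosen only for illustration.

```shell
# Sketch: combine checks with the shell operators, not with -a/-o.
# `file', `usable' and `grouped' are hypothetical names.
file=/dev/null
if test -r "$file" && test ! -d "$file"; then
  usable=yes
else
  usable=no
fi

# `&&' and `||' have equal precedence and group from left to right:
# this reads as (true || false) && assignment, so the assignment runs.
grouped=c-style
true || false && grouped=shell-style
```

With C-like precedence the last line would read as `true || (false && ...)' and the assignment would be skipped.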

test (files)

To enable configure scripts to support cross-compilation, they shouldn't do anything that tests features of the build system instead of the host system. But occasionally you may find it necessary to check whether some arbitrary file exists. To do so, use `test -f' or `test -r'. Do not use `test -x', because 4.3BSD does not have it. Do not use `test -e' either, because Solaris 2.5 does not have it.

test (strings)

Avoid `test "string"', in particular if string might start with a dash, since test might interpret its argument as an option (e.g., `string = "-n"').

Contrary to a common belief, `test -n string' and `test -z string' are portable; nevertheless, many shells (such as Solaris 2.5, AIX 3.2, UNICOS 10.0.0.6, Digital Unix 4 etc.) have bizarre precedence and may be confused if string looks like an operator:

$ test -n =
test: argument expected

If there are risks, use `test "xstring" = x' or `test "xstring" != x' instead.
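
A short sketch of the `x' prefix applied to a value that looks like an operator, and to an empty value; all variable names are hypothetical.

```shell
# Sketch: emptiness tests that survive operator-like values.
# `string', `empty', `nonempty' and `is_empty' are hypothetical names.
string='='
if test "x$string" != x; then nonempty=yes; else nonempty=no; fi

empty=
if test "x$empty" = x; then is_empty=yes; else is_empty=no; fi
```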

It is frequent to find variations of the following idiom:

test -n "`echo $ac_feature | sed 's/[-a-zA-Z0-9_]//g'`" && action

to take an action when a token matches a given pattern. Such constructs should always be avoided by using:

echo "$ac_feature" | grep '[^-a-zA-Z0-9_]' >/dev/null 2>&1 && action

Use case where possible since it is faster, being a shell builtin:

case $ac_feature in *[!-a-zA-Z0-9_]*) action;; esac

Alas, negated character classes are probably not portable, although no shell is known to not support the POSIX.2 syntax `[!...]' (when in interactive mode, zsh is confused by the `[!...]' syntax and looks for an event in its history because of `!'). Many shells do not support the alternative syntax `[^...]' (Solaris, Digital Unix, etc.).

One solution can be:

expr "$ac_feature" : '.*[^-a-zA-Z0-9_]' >/dev/null && action

or better yet

expr "x$ac_feature" : '.*[^-a-zA-Z0-9_]' >/dev/null && action

`expr "Xfoo" : "Xbar"' is more robust than `echo "Xfoo" | grep "^Xbar"', because it avoids problems when `foo' contains backslashes.
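
As a worked example, the expr form can be run over a few sample feature names. The sample values and the variable invalid are hypothetical.

```shell
# Sketch: collect the names that contain a character outside
# [-a-zA-Z0-9_].  The sample names and `invalid' are made up.
invalid=
for ac_feature in shared-libs 'bad name' with_x 'semi;colon'; do
  if expr "x$ac_feature" : '.*[^-a-zA-Z0-9_]' >/dev/null; then
    invalid="$invalid<$ac_feature>"
  fi
done
```

Only `bad name' and `semi;colon' should be collected, since expr reports a match length of 0, and hence failure, for the two valid names.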

trap

It is safe to trap at least the signals 1, 2, 13 and 15. You can also trap 0, i.e., have the trap run when the script ends (either via an explicit exit, or the end of the script).

Although POSIX is not absolutely clear on this point, it is widely admitted that when entering the trap `$?' should be set to the exit status of the last command run before the trap. The ambiguity can be summarized as: "when the trap is launched by an exit, what is the last command run: that before exit, or exit itself?"

Bash considers exit to be the last command, while Zsh and Solaris 8 sh consider that when the trap is run it is still in the exit, hence it is the previous exit status that the trap receives:

$ cat trap.sh
trap 'echo $?' 0
(exit 42); exit 0
$ zsh trap.sh
42
$ bash trap.sh
0

The portable solution is then simple: when you want to `exit 42', run `(exit 42); exit 42', the first exit being used to set the exit status to 42 for Zsh, and the second to trigger the trap and pass 42 as exit status for Bash.
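
The idiom can be tried out with a throwaway script. The sketch below assumes that `sh' is on the PATH and that a temporary directory is writable; the file name and the variables `script', `out' and `status' are made up for the example.

```shell
# Write a tiny script that exits through the portable idiom.
script=${TMPDIR-/tmp}/trap_demo.$$
cat >"$script" <<'EOF'
trap 'echo "trap saw $?"' 0
(exit 42); exit 42
EOF

# Run it in a child shell; collect what the trap printed and the
# exit status of the script.
if out=`sh "$script"`; then
  status=0
else
  status=$?
fi
rm -f "$script"

echo "$out (script status: $status)"
```

Whichever interpretation the shell follows, the trap and the caller should both see 42.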

The shell in FreeBSD 4.0 has the following bug: `$?' is reset to 0 by empty lines if the code is inside trap.

$ trap 'false

echo $?' 0
$ exit
0

Fortunately, this bug only affects trap.

true

Don't worry: as far as we know true is portable. Nevertheless, it's not always a builtin (e.g., Bash 1.x), and the portable shell community tends to prefer using :. This has a funny side effect: when asked whether false is more portable than true Alexandre Oliva answered:

In a sense, yes, because if it doesn't exist, the shell will produce an exit status of failure, which is correct for false, but not for true.

unset

You cannot assume that unset is supported. Nevertheless, because it is extremely useful for disabling embarrassing variables such as CDPATH, you can test for its existence and use it, provided you give a neutralizing value when unset is not supported:

if (unset FOO) >/dev/null 2>&1; then
  unset=unset
else
  unset=false
fi
$unset CDPATH || CDPATH=:

See section 10.7 Special Shell Variables, for some neutralizing values. Also, see 10.8 Limitations of Shell Builtins, documentation of export, for the case of environment variables.

10.9 Limitations of Usual Tools

The small set of tools you can expect to find on any machine can still include some limitations you should be aware of.

awk

Don't leave white space before the parentheses in user function calls; GNU awk will reject it:

$ gawk 'function die () { print "Aaaaarg!" }
        BEGIN { die () }'
gawk: cmd. line:2:         BEGIN { die () }
gawk: cmd. line:2:                      ^ parse error
$ gawk 'function die () { print "Aaaaarg!" }
        BEGIN { die() }'
Aaaaarg!

If you want your program to be deterministic, don't depend on the order in which for traverses an array:

$ cat for.awk
END {
  arr["foo"] = 1
  arr["bar"] = 1
  for (i in arr)
    print i
}
$ gawk -f for.awk </dev/null
foo
bar
$ nawk -f for.awk </dev/null
bar
foo
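
One possible way to get a stable result, sketched here under the assumption that the order of the keys does not matter to the rest of the script, is to sort the output. The `AWK' default and the variable `keys' are illustrative.

```shell
# Use $AWK if the caller set it, plain awk otherwise.
AWK=${AWK-awk}

# Whatever order `for (i in arr)' happens to use, sort fixes it.
keys=`$AWK 'END {
  arr["foo"] = 1
  arr["bar"] = 1
  for (i in arr)
    print i
}' </dev/null | sort`

echo "$keys"
```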

Some AWK, such as HPUX 11.0's native one, have regex engines fragile to inner anchors:

$ echo xfoo | $AWK '/foo|^bar/ { print }'
$ echo bar | $AWK '/foo|^bar/ { print }'
bar
$ echo xfoo | $AWK '/^bar|foo/ { print }'
xfoo
$ echo bar | $AWK '/^bar|foo/ { print }'
bar

Either do not depend on such patterns (i.e., use `/^(.*foo|bar)/'), or use a simple test to reject such AWK.
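
A rejection test could look like the following sketch. It merely replays the first failing command from the session above; the variable names `awk_out' and `awk_inner_anchors' are invented for the example.

```shell
# Use $AWK if the caller set it, plain awk otherwise.
AWK=${AWK-awk}

# A sane awk prints `xfoo' here; a fragile one prints nothing.
awk_out=`echo xfoo | $AWK '/foo|^bar/ { print }'`

if test "x$awk_out" = xxfoo; then
  awk_inner_anchors=yes
else
  awk_inner_anchors=no
fi
echo "inner anchors work: $awk_inner_anchors"
```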

cat

Don't rely on any option. The option `-v', which displays non-printing characters, seems portable, though.

cc

When a compilation such as `cc foo.c -o foo' fails, some compilers (such as CDS on Reliant UNIX) leave a `foo.o'.

HP-UX cc doesn't accept `.S' files to preprocess and assemble. `cc -c foo.S' will appear to succeed, but in fact does nothing.

cmp

cmp performs a raw data comparison of two files, while diff compares two text files. Therefore, if you might compare DOS files, even if only checking whether two files are different, use diff to avoid spurious differences due to differences of newline encoding.

cp

SunOS cp does not support `-f', although its mv does. It's possible to deduce why mv and cp are different with respect to `-f'. mv prompts by default before overwriting a read-only file. cp does not. Therefore, mv requires a `-f' option, but cp does not. mv and cp behave differently with respect to read-only files because the simplest form of cp cannot overwrite a read-only file, but the simplest form of mv can. This is because cp opens the target for write access, whereas mv simply calls link (or, in newer systems, rename).

date

Some versions of date do not recognize special % directives, and unfortunately, instead of complaining, they just pass them through, and exit with success:

$ uname -a
OSF1 medusa.sis.pasteur.fr V5.1 732 alpha
$ date "+%s"
%s
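
Since the exit status is of no help, one has to look at the output itself. A minimal sketch, with made-up variable names `stamp' and `date_epoch':

```shell
# Ask for the number of seconds since the Epoch; hide any complaint.
stamp=`date "+%s" 2>/dev/null`

# A date that understands %s prints digits only; one that passes the
# directive through prints `%s', which the pattern rejects.
if expr "x$stamp" : 'x[0-9][0-9]*$' >/dev/null; then
  date_epoch=supported
else
  date_epoch=unsupported
fi
echo "date +%s is $date_epoch"
```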

diff

Option `-u' is nonportable.

Some implementations, such as Tru64's, fail when comparing to `/dev/null'. Use an empty file instead.

dirname

Not all hosts have a working dirname, and you should instead use AS_DIRNAME (see section 8.4 Programming in M4sh). For example:

dir=`dirname "$file"`       # This is not portable.
dir=`AS_DIRNAME(["$file"])` # This is more portable.

This handles a few subtleties in the standard way required by POSIX. For example, under UN*X, should `dirname //1' give `/'? Paul Eggert answers:

No, under some older flavors of Unix, leading `//' is a special path name: it refers to a "super-root" and is used to access other machines' files. Leading `///', `////', etc. are equivalent to `/'; but leading `//' is special. I think this tradition started with Apollo Domain/OS, an OS that is still in use on some older hosts.
POSIX allows but does not require the special treatment for `//'. It says that the behavior of dirname on path names of the form `//([^/]+/*)?' is implementation defined. In these cases, GNU dirname returns `/', but it's more portable to return `//' as this works even on those older flavors of Unix.

egrep

The empty alternative is not portable; use `?' instead. For instance with Digital Unix v5.0:

> printf "foo\n|foo\n" | egrep '^(|foo|bar)$'
|foo
> printf "bar\nbar|\n" | egrep '^(foo|bar|)$'
bar|
> printf "foo\nfoo|\n|bar\nbar\n" | egrep '^(foo||bar)$'
foo
|bar

egrep also suffers the limitations of grep.
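
To illustrate the `?' replacement for the empty alternative, here is a sketch that counts the lines accepted by `^(foo|bar)?$'. The probe that picks between `egrep' and `grep -E' is an assumption made so that the example runs on hosts where egrep is missing; `EGREP' and `count' are illustrative names.

```shell
# Pick a spelling of egrep that exists on this host.
if echo a | (egrep a) >/dev/null 2>&1; then
  EGREP=egrep
else
  EGREP='grep -E'
fi

# `foo', the empty line and `bar' should match; `|foo' and `baz'
# should not.
count=`printf 'foo\n|foo\n\nbar\nbaz\n' | $EGREP -c '^(foo|bar)?$'`
echo "matching lines: $count"
```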

expr

No expr keyword starts with `x', so use `expr x"word" : 'xregex'' to keep expr from misinterpreting word.

Don't use length, substr, match and index.
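
A short sketch of the `x' prefix at work, using a hypothetical `word' that collides with one of the keywords listed above:

```shell
# Without the prefix, some expr would read this as the `length' keyword.
word=length

# Both the string and the pattern start with `x'; the parenthesized
# part captures the first three characters of the word.
prefix=`expr "x$word" : 'x\(...\)'`
echo "prefix: $prefix"
```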

expr (`|')

You can use `|'. Although POSIX does require that `expr ''' return the empty string, it does not specify the result when you `|' together the empty string (or zero) with the empty string. For example:

expr '' \| ''

GNU/Linux and POSIX.2-1992 return the empty string for this case, but traditional Unix returns `0' (Solaris is one such example). In the latest POSIX draft, the specification has been changed to match traditional Unix's behavior (which is bizarre, but it's too late to fix this). Please note that the same problem does arise when the empty string results from a computation, as in:

expr bar : foo \| foo : bar

Avoid this portability problem by avoiding the empty string.

expr (`:')

Don't use `\?', `\+' and `\|' in patterns; they are not supported on Solaris.

The POSIX.2-1992 standard is ambiguous as to whether `expr a : b' (and `expr 'a' : '\(b\)'') output `0' or the empty string. In practice, it outputs the empty string on most platforms, but portable scripts should not assume this. For instance, the QNX 4.25 native expr returns `0'.

You may believe that one means to get a uniform behavior would be to use the empty string as a default value:

expr a : b \| ''

Unfortunately this behaves exactly as the original expression; see the `expr (`|')' entry for more information.

Older expr implementations (e.g. SunOS 4 expr and Solaris 8 /usr/ucb/expr) have a silly length limit that causes expr to fail if the matched substring is longer than 120 bytes. In this case, you might want to fall back on `echo|sed' if expr fails.

Don't leave, there is some more!

The QNX 4.25 expr, in addition to preferring `0' to the empty string, has a funny behavior in its exit status: it's always 1 when parentheses are used!

$ val=`expr 'a' : 'a'`; echo "$?: $val"
0: 1
$ val=`expr 'a' : 'b'`; echo "$?: $val"
1: 0
$ val=`expr 'a' : '\(a\)'`; echo "$?: $val"
1: a
$ val=`expr 'a' : '\(b\)'`; echo "$?: $val"
1: 0

In practice this can be a big problem if you are ready to catch failures of expr programs with some other method (such as using sed), since you may get twice the result. For instance

$ expr 'a' : '\(a\)' || echo 'a' | sed 's/^\(a\)$/\1/'

will output `a' on most hosts, but `aa' on QNX 4.25. A simple workaround consists in testing expr and using a variable set to expr or to false according to the result.
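
That workaround might be sketched as follows. The names `as_expr', `file' and `stem' are chosen for the example, and the file name is hypothetical.

```shell
# Probe once: trust expr only if it succeeds on a parenthesized match.
if expr a : '\(a\)' >/dev/null 2>&1; then
  as_expr=expr
else
  as_expr=false
fi

# Strip a `.c' suffix.  When as_expr is `false' nothing is printed by
# the left-hand side, so the sed fallback never duplicates the result.
file=hello.c
stem=`$as_expr "X$file" : 'X\(.*\)\.c$' 2>/dev/null ||
  echo "X$file" | sed 's/^X\(.*\)\.c$/\1/'`
echo "stem: $stem"
```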

find

The option `-maxdepth' seems to be GNU specific. Tru64 v5.1, NetBSD 1.5 and Solaris 2.5 find commands do not understand it.

The replacement of `{}' is guaranteed only if the argument is exactly {}, not if it's only a part of an argument. For instance on DU, and HP-UX 10.20 and HP-UX 11:

$ touch foo
$ find . -name foo -exec echo "{}-{}" \;
{}-{}

while GNU find reports `./foo-./foo'.
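
One way around this, sketched below, is to pass `{}' as an argument of its own and build the longer string afterwards. The scratch directory and the variable names `dir' and `pairs' are assumptions of the example.

```shell
# Work in a scratch directory holding a single file named foo.
dir=${TMPDIR-/tmp}/find_demo.$$
mkdir "$dir" && touch "$dir/foo"

# `{}' stands alone, so every find replaces it; sed then doubles the
# name, `&' standing for the whole matched line.
pairs=`cd "$dir" && find . -name foo -exec echo {} \; | sed 's,.*,&-&,'`

rm -rf "$dir"
echo "$pairs"
```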

grep

Don't use `grep -s' to suppress output, because `grep -s' on System V does not suppress output, only error messages. Instead, redirect the standard output and standard error (in case the file doesn't exist) of grep to `/dev/null'. Check the exit status of grep to determine whether it found a match.
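
A sketch of the recommended form, with invented variable names and a file path that is assumed not to exist:

```shell
# A match: only the exit status is used, all output is discarded.
if echo "hello world" | grep world >/dev/null 2>&1; then
  found_in_stream=yes
else
  found_in_stream=no
fi

# A missing file: the error message is discarded too, and the exit
# status still reports a failure.
if grep world /nonexistent/file >/dev/null 2>&1; then
  found_in_missing=yes
else
  found_in_missing=no
fi

echo "stream: $found_in_stream, missing file: $found_in_missing"
```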

Don't use multiple regexps with `-e', as some grep will only honor the last pattern (e.g., IRIX 6.5 and Solaris 2.5.1). Anyway, Stardent Vistra SVR4 grep lacks `-e'. Instead, use alternation and egrep.

ln

Don't rely on ln having a `-f' option. Symbolic links are not available on old systems; use `ln' as a fallback.

For versions of the DJGPP before 2.04, ln emulates soft links for executables by generating a stub that in turn calls the real program. This feature also works with nonexistent files like in the Unix spec. So `ln -s file link' will generate `link.exe', which will attempt to call `file.exe' if run. But this feature only works for executables, so `cp -p' is used instead for these systems. DJGPP versions 2.04 and later have full symlink support.

mv

The only portable options are `-f' and `-i'.

Moving individual files between file systems is portable (it was in V6), but it is not always atomic: when doing `mv new existing', there's a critical section where neither the old nor the new version of `existing' actually exists.

Moving directories across mount points is not portable; use cp and rm.

Moving/Deleting open files isn't portable. The following can't be done on DOS/WIN32:

exec > foo
mv foo bar

nor can

exec > foo
rm -f foo

sed

Patterns should not include the separator (unless escaped), even as part of a character class. In conformance with POSIX, the Cray sed will reject `s/[^/]*$//': use `s,[^/]*$,,'.
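
For instance, stripping the last component of a path with `,' as the separator might look like this; `path' and `dir' are hypothetical names:

```shell
# A path that contains the usual `/' separator several times.
path=/usr/local/bin/tool

# With `,' as the separator, `/' needs no escaping, not even inside
# the bracket expression.
dir=`echo "$path" | sed 's,[^/]*$,,'`
echo "$dir"
```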

Sed scripts should not use branch labels longer than 8 characters and should not contain comments.

Don't include extra `;', as some sed, such as NetBSD 1.4.2's, try to interpret the second as a command:

$ echo a | sed 's/x/x/;;s/x/x/'
sed: 1: "s/x/x/;;s/x/x/": invalid command code ;

Input should have reasonably long lines, since some sed have an input buffer limited to 4000 bytes.

Alternation, `\|', is common but POSIX.2 does not require its support, so it should be avoided in portable scripts. Solaris 8 sed does not support alternation; e.g. `sed '/a\|b/d'' deletes only lines that contain the literal string `a|b'.
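
When the alternation only selects lines, it can usually be replaced by one command per alternative. A sketch, with an invented variable `kept':

```shell
# Delete lines containing `a' or `b' without using `\|': each
# alternative gets its own d command.
kept=`printf 'a1\nc\nb2\n' | sed -e '/a/d' -e '/b/d'`
echo "$kept"
```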

Anchors (`^' and `$') inside groups are not portable.

Nested parenthesization in patterns (e.g., `\(\(a*\)b*\)') is quite portable to modern hosts, but is not supported by some older sed implementations like SVR3.

Of course the option `-e' is portable, but it is not needed. No valid Sed program can start with a dash, so it does not help disambiguating. Its sole use is to help enforce indentation, as in:

sed -e instruction-1 \
    -e instruction-2

as opposed to

sed instruction-1;instruction-2

Contrary to yet another urban legend, you may portably use `&' in the replacement part of the s command to mean "what was matched". All descendants of Bell Labs' V7 sed (at least; we don't have first hand experience with older seds) have supported it.

sed (`t')

Some old systems have sed that "forget" to reset their `t' flag when starting a new cycle. For instance on MIPS RISC/OS, and on IRIX 5.3, if you run the following sed script (the line numbers are not actually part of the texts):

s/keep me/kept/g  # a
t end             # b
s/.*/deleted/g    # c
: end             # d

on

delete me         # 1
delete me         # 2
keep me           # 3
delete me         # 4

you get

deleted
delete me
kept
deleted

instead of

deleted
deleted
kept
deleted

Why? When processing line 1, c matches, therefore sets the t flag, and the output is produced. When processing line 2, the t flag is still set (this is the bug). Command a fails to match, but sed is not supposed to clear the t flag when a substitution fails. Command b sees that the flag is set, therefore it clears it and jumps to d, hence you get `delete me' instead of `deleted'. When processing line 3, the t flag is clear, a matches, so the flag is set, hence b clears the flag and jumps. Finally, since the flag is clear, line 4 is processed properly.

There are two things one should remember about `t' in sed. Firstly, always remember that `t' jumps if some substitution succeeded, not only the immediately preceding substitution. Therefore, always use a fake `t clear; : clear' to reset the t flag where needed.

Secondly, you cannot rely on sed to clear the flag at each new cycle.

One portable implementation of the script above is:

t clear
: clear
s/keep me/kept/g
t end
s/.*/deleted/g
: end
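
The portable script above can be replayed on the four input lines. In this sketch each command is passed with its own `-e' and printf feeds the input; the variable `out' is illustrative.

```shell
# Feed the four sample lines through the portable version of the script.
out=`printf 'delete me\ndelete me\nkeep me\ndelete me\n' |
  sed -e 't clear' -e ': clear' \
      -e 's/keep me/kept/g' -e 't end' \
      -e 's/.*/deleted/g' -e ': end'`
echo "$out"
```

Only the line `keep me' should survive, as `kept'; the other three should read `deleted'.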

touch

On some old BSD systems, touch or any command that results in an empty file does not update the timestamps, so use a command like echo as a workaround.

GNU touch 3.16r (and presumably all before that) fails to work on SunOS 4.1.3 when the empty file is on an NFS-mounted 4.2 volume.

10.10 Limitations of Make

Make itself suffers a great number of limitations, only a few of which are listed here. First of all, remember that since commands are executed by the shell, all its weaknesses are inherited...

$<

POSIX says that the `$<' construct in makefiles can be used only in inference rules and in the `.DEFAULT' rule; its meaning in ordinary rules is unspecified. Solaris 8's make for instance will replace it with the argument.

Leading underscore in macro names

Some Make don't support leading underscores in macro names, such as on NEWS-OS 4.2R.

$ cat Makefile
_am_include = #
_am_quote =
all:; @echo this is test
$ make
Make: Must be a separator on rules line 2.  Stop.
$ cat Makefile2
am_include = #
am_quote =
all:; @echo this is test
$ make -f Makefile2
this is test

VPATH

Don't use it! For instance any assignment to VPATH causes Sun make to only execute the first set of double-colon rules.

This document was generated by Dirk Vermeir on May, 8 2002 using texi2html
