The Bash shell contains no debugger, nor even any debugging-specific commands or constructs. [1] Syntax errors or outright typos in the script generate cryptic error messages that are often of no help in debugging a non-functional script.
Example 30-1. A buggy script
#!/bin/bash
# ex74.sh

# This is a buggy script.
# Where, oh where is the error?

a=37

if [$a -gt 27 ]
then
  echo $a
fi

exit 0
Output from script:
./ex74.sh: [37: command not found
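The message becomes less cryptic once you recall that [ is itself a command and must be separated from its arguments by whitespace. Without the space, the shell expands [$a to the single word [37 and looks for a command by that name. The following is a minimal sketch of the corrected test; the function name check_threshold is hypothetical and was introduced only to make the fix easy to try out.

```shell
#!/bin/bash
# Corrected form of the test in the buggy script above.
# check_threshold is a hypothetical helper, not part of the original example.

check_threshold ()
{
  local a=$1
  if [ "$a" -gt 27 ]    # Whitespace after '[' and before ']' is required.
  then
    echo "$a"
  fi
}

check_threshold 37      # Echoes 37.
check_threshold 12      # Echoes nothing.
```

Quoting "$a" is not what fixes the bug, but it guards against a different error should the variable ever be empty.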
Example 30-2. Missing keyword
#!/bin/bash
# missing-keyword.sh: What error message will this generate?

for a in 1 2 3
do
  echo "$a"
# done     # Required keyword 'done' commented out in line 7.

exit 0
Output from script:
missing-keyword.sh: line 10: syntax error: unexpected end of file
Error messages may disregard comment lines in a script when reporting the line number of a syntax error.
What if the script executes, but does not work as expected? This is the all too familiar logic error.
Example 30-3. test24, another buggy script
#!/bin/bash

#  This script is supposed to delete all filenames in current directory
#+ containing embedded spaces.
#  It doesn't work.
#  Why not?

badname=`ls | grep ' '`

# Try this:
# echo "$badname"

rm "$badname"

exit 0
Try to find out what's wrong with Example 30-3 by uncommenting the echo "$badname" line. Echo statements are useful for seeing whether what you expect is actually what you get.
In this particular case, rm "$badname" will not give the desired results because $badname should not be quoted. Placing it in quotes ensures that rm has only one argument (it will match only one filename). A partial fix is to remove the quotes from $badname and to reset $IFS to contain only a newline, IFS=$'\n'. However, there are simpler ways of going about it.
# Correct methods of deleting filenames containing spaces.
rm *\ *
rm *" "*
rm *' '*
# Thank you. S.C.
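The glob patterns work because the shell expands each matching filename to a single word, so embedded spaces never get split. The sketch below applies the same idea inside a loop, which makes it easy to add a check or a message per file. The scratch directory, the file names, and the helper remove_spaced_names are all invented for this demonstration.

```shell
#!/bin/bash
# Sketch: delete only the files whose names contain a space.
# The scratch directory and file names are invented for this demo.

scratch=$(mktemp -d)            # Work in a throwaway directory.
touch "$scratch/plain.txt" "$scratch/has space.txt" "$scratch/two  spaces"

remove_spaced_names ()          # Hypothetical helper.
{
  local dir=$1 f
  for f in "$dir"/*\ *          #  The glob expands to one word per filename,
  do                            #+ so embedded spaces survive intact.
    [ -e "$f" ] || continue     # Skip the literal pattern if nothing matched.
    rm -- "$f"
  done
}

remove_spaced_names "$scratch"
ls "$scratch"                   # Only plain.txt remains.
```

Remove the scratch directory with rm -rf "$scratch" when finished experimenting.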
Summarizing the symptoms of a buggy script,
It bombs with a "syntax error" message, or
It runs, but does not work as expected (logic error).
It runs, works as expected, but has nasty side effects (logic bomb).
Tools for debugging non-working scripts include
echo statements at critical points in the script to trace the variables, and otherwise give a snapshot of what is going on.
using the tee filter to check processes or data flows at critical points.
setting option flags -n -v -x
sh -n scriptname checks for syntax errors without actually running the script. This is the equivalent of inserting set -n or set -o noexec into the script. Note that certain types of syntax errors can slip past this check.
sh -v scriptname echoes each command before executing it. This is the equivalent of inserting set -v or set -o verbose in the script.
The -n and -v flags work well together. sh -nv scriptname gives a verbose syntax check.
sh -x scriptname echoes the result of each command, but in an abbreviated manner. This is the equivalent of inserting set -x or set -o xtrace in the script.
Inserting set -u or set -o nounset in the script runs it, but gives an unbound variable error message at any attempt to use an unset variable; a non-interactive script exits at that point.
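The following sketch exercises three of these options from a single script, so the effect of each can be seen without editing the script under test. It uses bash rather than sh, and the temporary file and variable names are invented for the demonstration.

```shell
#!/bin/bash
# Sketch: the -n, -x and -u options in action.
# The file and variable names are invented for this demo.

demo=$(mktemp)
cat > "$demo" << 'EOF'
for a in 1 2 3
do
  echo "$a"
# done
EOF

# -n: syntax check only; nothing in the file is executed.
if bash -n "$demo" 2>/dev/null
then syntax_ok=yes
else syntax_ok=no
fi
echo "Syntax check passed? $syntax_ok"

# -x: each command is echoed to stderr after expansion, prefixed by $PS4.
trace=$( (PS4='+ '; set -x; total=$((2 + 3)); echo "$total" >/dev/null) 2>&1 )
echo "$trace"

# -u: referencing an unset variable is an error.
if bash -u -c 'echo "$no_such_variable"' 2>/dev/null
then unset_ok=yes
else unset_ok=no
fi
echo "Unset variable tolerated? $unset_ok"

rm -f "$demo"
```

Running the traced commands in a subshell keeps set -x from leaking into the rest of the script, which is often handy when only one section needs tracing.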
Using an "assert" function to test a variable or condition at critical points in a script. (This is an idea borrowed from C.)
Example 30-4. Testing a condition with an "assert"
#!/bin/bash
# assert.sh

assert ()                 #  If condition false,
{                         #+ exit from script with error message.
  E_PARAM_ERR=98
  E_ASSERT_FAILED=99

  if [ -z "$2" ]          # Not enough parameters passed.
  then
    return $E_PARAM_ERR   # No damage done.
  fi

  lineno=$2

  if [ ! $1 ]
  then
    echo "Assertion failed:  \"$1\""
    echo "File \"$0\", line $lineno"
    exit $E_ASSERT_FAILED
  # else
  #   return
  #   and continue executing script.
  fi
}

a=5
b=4
condition="$a -lt $b"     #  Error message and exit from script.
                          #  Try setting "condition" to something else,
                          #+ and see what happens.

assert "$condition" $LINENO
# The remainder of the script executes only if the "assert" does not fail.

# Some commands.
# ...
echo "This statement echoes only if the \"assert\" does not fail."
# ...
# Some more commands.

exit 0
trapping at exit.
The exit command in a script triggers a signal 0, terminating the process, that is, the script itself. [2] It is often useful to trap the exit, forcing a "printout" of variables, for example. The trap must be set before the script reaches the point where it exits, so it is safest to make it the first command in the script.
The trap command specifies an action on receipt of a signal; it is also useful for debugging.
trap '' 2
# Ignore interrupt 2 (Control-C), with no action specified.

trap 'echo "Control-C disabled."' 2
# Message when Control-C pressed.
Example 30-5. Trapping at exit
#!/bin/bash
# Hunting variables with a trap.

trap 'echo Variable Listing --- a = $a  b = $b' EXIT
#  EXIT is the name of the signal generated upon exit from a script.
#
#  The command specified by the "trap" doesn't execute until
#+ the appropriate signal is sent.

echo "This prints before the \"trap\" --"
echo "even though the script sees the \"trap\" first."
echo

a=39

b=36

exit 0
#  Note that commenting out the 'exit' command makes no difference,
#+ since the script exits in any case after running out of commands.
Example 30-6. Cleaning up after Control-C
#!/bin/bash
# logon.sh: A quick 'n dirty script to check whether you are on-line yet.
TRUE=1
LOGFILE=/var/log/messages
# Note that $LOGFILE must be readable
#+ (as root, chmod 644 /var/log/messages).
TEMPFILE=temp.$$
# Create a "unique" temp file name, using process id of the script.
KEYWORD=address
# At logon, the line "remote IP address xxx.xxx.xxx.xxx"
# appended to /var/log/messages.
ONLINE=22
USER_INTERRUPT=13
CHECK_LINES=100
# How many lines in log file to check.
trap 'rm -f $TEMPFILE; exit $USER_INTERRUPT' TERM INT
# Cleans up the temp file if script interrupted by control-c.
echo
while [ $TRUE ]  # Endless loop.
do
  tail -n $CHECK_LINES $LOGFILE > $TEMPFILE
  #  Saves last 100 lines of system log file as temp file.
  #  Necessary, since newer kernels generate many log messages at log on.
  search=`grep $KEYWORD $TEMPFILE`
  #  Checks for presence of the "IP address" phrase,
  #+ indicating a successful logon.

  if [ ! -z "$search" ]  # Quotes necessary because of possible spaces.
  then
    echo "On-line"
    rm -f $TEMPFILE      # Clean up temp file.
    exit $ONLINE
  else
    echo -n "."          #  The -n option to echo suppresses newline,
                         #+ so you get continuous rows of dots.
  fi

  sleep 1
done
# Note: if you change the KEYWORD variable to "Exit",
#+ this script can be used while on-line
#+ to check for an unexpected logoff.
# Exercise: Change the script, per the above note,
# and prettify it.
exit 0
# Nick Drage suggests an alternate method:
while true
do ifconfig ppp0 | grep UP 1> /dev/null && echo "connected" && exit 0
echo -n "." # Prints dots (.....) until connected.
sleep 2
done
# Problem: Hitting Control-C to terminate this process may be insufficient.
#+ (Dots may keep on echoing.)
# Exercise: Fix this.
# Stephane Chazelas has yet another alternative:
CHECK_INTERVAL=1
while ! tail -1 "$LOGFILE" | grep -q "$KEYWORD"
do echo -n .
sleep $CHECK_INTERVAL
done
echo "On-line"
#  Exercise: Discuss the relative strengths and weaknesses
#+ of each of these various approaches.
Of course, the trap command has other uses aside from debugging.
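One common non-debugging use is cleanup: a trap on EXIT can remove temporary files no matter how a section of a script ends. The following is a minimal sketch under stated assumptions; run_job and its temp file are hypothetical, and the job body is only a stand-in for real work.

```shell
#!/bin/bash
# Sketch: using trap to remove a temp file however the job ends.
# run_job and the temp file are hypothetical, invented for this demo.

run_job ()
(                                   #  Body runs in a subshell,
  tmpfile=$(mktemp)                 #+ so the EXIT trap fires when it ends.
  trap 'rm -f "$tmpfile"' EXIT
  echo "$tmpfile"                   # Report the name for inspection.
  date > "$tmpfile"                 # Stand-in for real work.
  exit 3                            # Even an early exit triggers the trap.
)

leftover=$(run_job)
status=$?
echo "Job exit status: $status"
if [ -e "$leftover" ]
then echo "Temp file still present."
else echo "Temp file removed."
fi
```

Writing the function body in parentheses rather than braces confines both the trap and the exit to a subshell, so the calling script keeps running and can inspect the exit status.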
Example 30-8. Running multiple processes (on an SMP box)
#!/bin/bash
# multiple-processes.sh: Run multiple processes on an SMP box.
# Script written by Vernia Damiano.
# Used with permission.
# Must call script with at least one integer parameter
#+ (number of concurrent processes).
# All other parameters are passed through to the processes started.
INDICE=8 # Total number of processes to start
TEMPO=5 # Maximum sleep time per process
E_BADARGS=65 # No arg(s) passed to script.
if [ $# -eq 0 ] # Check for at least one argument passed to script.
then
echo "Usage: `basename $0` number_of_processes [passed params]"
exit $E_BADARGS
fi
NUMPROC=$1 # Number of concurrent processes
shift
PARAMETRI=( "$@" ) # Parameters of each process
function avvia() {
  local temp
  local index
  temp=$RANDOM
  index=$1
  shift
  let "temp %= $TEMPO"
  let "temp += 1"
  echo "Starting $index Time:$temp" "$@"
  sleep ${temp}
  echo "Ending $index"
  kill -s SIGRTMIN $$
}

function parti() {
  if [ $INDICE -gt 0 ] ; then
    avvia $INDICE "${PARAMETRI[@]}" &
    let "INDICE--"
  else
    trap : SIGRTMIN
  fi
}

trap parti SIGRTMIN

while [ "$NUMPROC" -gt 0 ]; do
  parti;
  let "NUMPROC--"
done
wait
trap - SIGRTMIN
exit $?
: << SCRIPT_AUTHOR_COMMENTS
I had the need to run a program, with specified options, on a number of
different files, using a SMP machine. So I thought [I'd] keep running
a specified number of processes and start a new one each time . . . one
of these terminates.
The "wait" instruction does not help, since it waits for a given process
or *all* process started in background. So I wrote [this] bash script
that can do the job, using the "trap" instruction.
--Vernia Damiano
SCRIPT_AUTHOR_COMMENTS
Note: trap '' SIGNAL (two adjacent apostrophes) disables SIGNAL for the remainder of the script. trap SIGNAL restores the functioning of SIGNAL once more. This is useful to protect a critical portion of a script from an undesirable interrupt.
trap '' 2  # Signal 2 is Control-C, now disabled.
command
command
command
trap 2     # Reenables Control-C
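To see the protection at work without pressing any keys, the sketch below has a child shell send the interrupt signal to itself while the signal is ignored. It uses the signal name INT instead of the number 2, and trap - INT, which is the explicit form of resetting a signal to its default handling. The function name critical_section is hypothetical.

```shell
#!/bin/bash
# Sketch: shielding a critical section from Control-C (SIGINT).
# critical_section is a hypothetical name, invented for this demo.

critical_section ()
{
  bash -c '
    trap "" INT            # Ignore SIGINT from here on.
    kill -s INT $$         # Simulate Control-C arriving mid-section.
    echo "survived"        # Reached, because the signal was ignored.
    trap - INT             # Restore default handling of SIGINT.
  '
}

critical_section
```

The section runs in a separate bash process so that the self-sent signal can never reach the shell that launched the demonstration.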
[1] Rocky Bernstein's Bash debugger partially makes up for this lack.
[2] By convention, signal 0 is assigned to exit.