UnixReview.com
June 2006
Shell Corner: Bash Dynamically Loadable Built-in Commands
Hosted by Ed Schaefer
Bash shell programmers can improve the efficiency of their scripts by using the shell's dynamically loadable built-in commands. This month, Chris F.A. Johnson shows us how to use them.
Bash Dynamically Loadable Built-In Commands
by Chris F.A. Johnson
If you use the shell for serious programming, as I do, speed of
execution is a serious issue. A script should not appear sluggish;
it should not be noticeably slower than a program written in Perl or
Python — or even C. One of the major contributors to slowdown
of scripts is starting a new process, whether it is an external
command or command substitution. (All shells except Korn Shell 93 create a new process for command substitution.) I first noticed how long command substitution takes when
I was writing a script to print a form on the screen. A function
converted dates in ISO form (YYYY-MM-DD) to a "human friendly" DD Month YYYY with a_date=$(format_date "$date"). There were only four date fields in the form, but with the conversion there was a noticeable delay; without it, the screen was redrawn immediately.
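The cost described above comes from the $(...) call, not from the conversion itself. As a minimal sketch (this is a hypothetical stand-in, not the format_date function from the original script), the same conversion can be done with parameter expansion alone and the result left in a variable, so no new process is created. The variable name _FORMAT_DATE and the _fd_ helper names are assumptions made for illustration.

```shell
# Hypothetical stand-in for the format_date function mentioned above.
# It leaves its result in _FORMAT_DATE (an assumed name) instead of printing it,
# so the caller does not need command substitution.
format_date() {
    _fd_year=${1%%-*}          # YYYY: everything before the first hyphen
    _fd_rest=${1#*-}           # MM-DD: everything after the first hyphen
    _fd_month=${_fd_rest%-*}   # MM
    _fd_day=${_fd_rest#*-}     # DD
    case $_fd_month in
        01) _fd_name=January ;;   02) _fd_name=February ;;
        03) _fd_name=March ;;     04) _fd_name=April ;;
        05) _fd_name=May ;;       06) _fd_name=June ;;
        07) _fd_name=July ;;      08) _fd_name=August ;;
        09) _fd_name=September ;; 10) _fd_name=October ;;
        11) _fd_name=November ;;  12) _fd_name=December ;;
        *)  return 1 ;;           # not a valid month number
    esac
    _FORMAT_DATE="${_fd_day#0} $_fd_name $_fd_year"   # drop a leading zero from the day
}

# The function runs in the current shell; the result is read from the variable.
format_date 2006-04-05 && printf '%s\n' "$_FORMAT_DATE"
```

Called this way, the function runs entirely inside the current shell, which is the same idea the article later applies to a built-in command.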
When I started writing Unix shell scripts, I used a Bourne shell. It
was far more powerful than the Amiga or MS-DOS shells I had used
previously, but it still relied on external commands for most useful
work. There was no arithmetic in the shell; I used expr and awk
for calculations. I used expr, cut, tr, basename, and various other
commands to manipulate strings.
With the Korn shell, and the later POSIX/SUS standardization, string
chopping (via parameter expansion: ${var%PATTERN}, ${var#PATTERN}, etc.)
and integer arithmetic were
brought into the shell itself, speeding up many operations. It
became possible to write many useful programs without
calling any external commands.
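As a short illustration of those expansions (the path and variable names here are made up for the example), the following lines do the work that once needed basename, dirname, and expr, without starting any process:

```shell
path=/usr/local/share/doc/readme.txt   # example value, chosen for illustration

file=${path##*/}     # strip the longest prefix ending in "/": like basename
dir=${path%/*}       # strip the shortest suffix starting with "/": like dirname
ext=${file##*.}      # text after the last dot
stem=${file%.*}      # text before the last dot
count=$(( 7 * 6 ))   # integer arithmetic in the shell, no expr needed

printf '%s\n' "$file" "$dir" "$ext" "$stem" "$count"
```

These forms are part of the POSIX shell language, so they should behave the same way in bash, ksh, and other conforming shells.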
This still left many trivial operations requiring external commands
(converting uppercase letters to lowercase, for example).
Bash has a solution: commands that can be compiled and
loaded at run time if and when needed.
Compiling and Loading Bash Built-Ins
The bash source package has a directory full of
examples ready to be compiled. To do that, download the source from
ftp://ftp.cwru.edu/pub/bash/bash-3.1.tar.gz. Unpack the tarball,
cd into the top-level directory, and run the configure script:
    wget ftp://ftp.cwru.edu/pub/bash/bash-3.1.tar.gz
    gunzip bash-3.1.tar.gz
    tar xf bash-3.1.tar
    cd bash-3.1
    ./configure
The configure script creates Makefiles throughout the source tree, including one in examples/loadables. In that directory are the source files for built-in versions of a number of standard commands "whose execution time is dominated by process startup time". You can cd into that directory and run make:
    cd examples/loadables
    make -k ## I use -k because I get some errors.
You'll now have a number of commands ready to load into your bash shell. These include:
    logname  basename  dirname  tee
    head     mkdir     rmdir    uname
    ln       cat       id       whoami
There are also some useful new commands:
    print     ## Compatible with the ksh print command
    finfo     ## Print file information
    strftime  ## Format date and time
These built-ins can be loaded into a running shell with:
    enable -f filename built-in-name
They include documentation, and the help command can be used with them, just as with other built-in commands:
    $ enable -f ./strftime strftime
    $ help strftime
    strftime: strftime format [seconds]
        Converts date and time format to a string and displays it on the
        standard output. If the optional second argument is supplied, it
        is used as the number of seconds since the epoch to use in the
        conversion, otherwise the current time is used.
Modifying Loadable Built-Ins
With the strftime command, I can now do date arithmetic without external commands. For example, to get yesterday's date (a very frequently asked question in the newsgroups):
    strftime %Y-%m-%d $(( $(strftime %s) - 86400 ))
That script has one drawback: it uses command substitution. The timings of commands must not be taken too literally (they can vary a great deal even on the same system, depending on what else is running at the time), but they give a useful basis for comparison. The difference between using the built-in strftime (with command substitution) and the GNU date command is surprisingly small:
    $ time strftime %Y-%m-%d $(( $(strftime %s) - 86400 ))
    2006-04-04
    real 0m0.006s
    user 0m0.000s
    sys 0m0.005s
    $ time date -d yesterday +%Y-%m-%d
    2006-04-04
    real 0m0.007s
    user 0m0.000s
    sys 0m0.007s
In absolute terms, it's not very long, but in a script there may be many such commands and they may be repeated many times. Since built-in commands are executed in the current shell, why not have it set a variable instead of printing the result? I added an option to strftime to store the result in a variable rather than printing it on stdout. The difference was significant:
    $ time {
        strftime -p now %s
        strftime %Y-%m-%d $(( $now - 86400 ))
      }
    2006-04-04
    real 0m0.000s
    user 0m0.000s
    sys 0m0.000s
The changes to strftime.c are relatively minor. I included the header for bash's internal options parser:
    #include "bashgetopt.h"
Then I declared two variables:
    int ch;
    char *var = NULL;
The longest piece of code parses the options, which are passed as a linked list and parsed by bash's own function:
    reset_internal_getopt ();
    while ((ch = internal_getopt (list, "p:")) != -1)
      switch(ch) {
      case 'p':
        var = list_optarg;
        /* should add check for valid variable name */
        break;
      default:
        builtin_usage();
        return (EX_USAGE);
      }
    list = loptend;
The bind_variable function stores the result in a shell variable if the -p option was used:
    if ( var )
      bind_variable (var, tbuf, 0);
    else
The else leads into the existing code that prints the result. Finally, there are two lines to add to the documentation. The first is added to the array of strings that are printed when the help strftime command is used:
    "OPTION: -p VAR - Store the result in shell variable VAR",
The second is the short documentation or usage string that modifies the existing string:
    "strftime [-p VAR] format [seconds]", /* usage synopsis; becomes short_doc */
The final strftime.c file is shown in Listing 1.
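A script that relies on the modified built-in may still have to run where the loadable has not been compiled. The following is a sketch of one way to handle that, not something taken from the listings: it tries to load strftime and defines a yesterday function either way. The path in strftime_so, the function name, and the _YESTERDAY and _now variables are all assumptions; the fallback uses the GNU date command from the timing comparison above.

```shell
# Assumed location of the modified loadable; adjust to wherever you built it.
strftime_so=${STRFTIME_SO:-$HOME/src/loadables/strftime}

if enable -f "$strftime_so" strftime 2>/dev/null
then
    # Built-in loaded: both calls run in the current shell, no new process.
    yesterday() {
        strftime -p _now %s
        strftime -p _YESTERDAY %Y-%m-%d $(( _now - 86400 ))
    }
else
    # Fallback: one command substitution and one external process per call.
    # Assumes GNU date, which understands "-d yesterday".
    yesterday() {
        _YESTERDAY=$(date -d yesterday +%Y-%m-%d)
    }
fi

yesterday                       # result is stored in _YESTERDAY, not printed
printf '%s\n' "$_YESTERDAY"
```

Either branch leaves the date in the same variable, so the rest of the script does not need to know which one was used.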
Writing New Bash Built-Ins
To write your own loadable commands, create a directory for them and copy the Makefile and the template.c files from bash-3.1/examples/loadables into it. The Makefile, which was created by running ./configure at the root of the bash source tree, contains the location of that source so that header files can be found. Make sure that top_dir points to the same place as BUILD_DIR. I also strip out all that I don't need. My resulting Makefile is shown in Listing 2, and template.c is shown in Listing 3.
The template.c file is compilable and has the bare bones necessary to write a dynamically loadable built-in, plus the skeleton for adding command-line options. There are three necessary sections:
- The function that implements the built-in,
- An array of strings containing the documentation to be printed with the help command, and
- A struct telling bash where to find the built-in and its documentation, and a short documentation or usage string.
A minimal example is examples/loadables/hello.c. See Listing 4.
To write a new built-in command, I use this script to change the references to template in template.c to the name of my built-in and add it to the Makefile. See Listing 5.
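For readers without the listing to hand, the renaming step can be sketched as below. This is not the script from Listing 5; the function name new_builtin is hypothetical, and the sketch deliberately leaves the Makefile alone because the edit needed there depends on how your Makefile is laid out.

```shell
# new_builtin NAME: copy template.c to NAME.c, replacing every "template".
# Hypothetical helper, run from the directory that holds template.c.
new_builtin() {
    name=$1
    [ -n "$name" ] || { echo "usage: new_builtin NAME" >&2; return 1; }
    [ -f template.c ] || { echo "template.c not found" >&2; return 1; }
    [ ! -e "$name.c" ] || { echo "$name.c already exists" >&2; return 1; }
    # NAME is assumed to be a plain identifier (no "/" or "&" characters).
    sed "s/template/$name/g" template.c > "$name.c"
}
```

Running new_builtin lcase in that directory would produce lcase.c with the function, documentation, and struct names already changed, ready for the new code to be filled in.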
One of the scripts most frequently requested in the Unix and Linux
newsgroups converts filenames from uppercase (or partly uppercase)
to lowercase. This usually means calling tr once for every file.
(ksh has typeset -l, and typeset -u for the reverse, but it's
non-standard and not implemented in bash.)
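The usual approach looks something like the following sketch. The function name lower_names is made up for this example, and it assumes a case-sensitive filesystem; it is meant to show where the per-file cost comes from, not to be a finished renaming tool.

```shell
# Rename every regular file in the current directory to its lowercase name.
lower_names() {
    for file in *
    do
        [ -f "$file" ] || continue
        # Command substitution plus tr: new processes for every single file.
        lower=$(printf '%s\n' "$file" | tr '[:upper:]' '[:lower:]')
        [ "$file" = "$lower" ] && continue          # already lowercase
        if [ -e "$lower" ]
        then
            echo "skipping $file: $lower exists" >&2
            continue
        fi
        mv -- "$file" "$lower"
    done
}
```

In a directory with thousands of files, the time spent starting those processes is likely to dominate the run.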
A shell function is quite efficient for converting short strings:
    lcase() {
        word=$1
        while [ -n "$word" ]
        do
            temp=${word#?}
            case ${word%"$temp"} in
                A*) _LWR=a ;; B*) _LWR=b ;; C*) _LWR=c ;; D*) _LWR=d ;;
                E*) _LWR=e ;; F*) _LWR=f ;; G*) _LWR=g ;; H*) _LWR=h ;;
                I*) _LWR=i ;; J*) _LWR=j ;; K*) _LWR=k ;; L*) _LWR=l ;;
                M*) _LWR=m ;; N*) _LWR=n ;; O*) _LWR=o ;; P*) _LWR=p ;;
                Q*) _LWR=q ;; R*) _LWR=r ;; S*) _LWR=s ;; T*) _LWR=t ;;
                U*) _LWR=u ;; V*) _LWR=v ;; W*) _LWR=w ;; X*) _LWR=x ;;
                Y*) _LWR=y ;; Z*) _LWR=z ;;
                *) _LWR=${word%"$temp"} ;;
            esac
            printf "%s" "$_LWR"
            word=$temp
        done
    }
...but it drags when used for long words, and approaches the execution time of tr. A built-in command would be an order of magnitude faster, so I wrote lcase. See Listing 6.
Having done that, I added the inverse, ucase, to convert
lowercase to uppercase. Then I added icase to convert upper to
lower and lower to upper. Next came ncase, which creates a pattern
to match either upper or lower case:
    $ icase "John Doe"
    jOHN dOE
    $ ncase qwerty
    [Qq][Ww][Ee][Rr][Tt][Yy]
Finally, I added cap, to capitalize the first letters of words. I amalgamated all of these into a single file (Listing 7, case.c), and they are all enabled with a single command:
    enable -f $HOME/src/loadables/case ucase lcase icase ncase cap
Chris Johnson is the author of Shell Scripting Recipes: A Problem Solution Approach, Apress (2005). When not pushing shell scripting to the limits, Chris composes cryptic crosswords and teaches chess.