AWK Scripts: Last updated 97-04-02
While doing my other jobs, I often write AWK scripts to help me accomplish
certain tasks. The scripts here are the ones which I think might be useful
to other people, maybe even you.
Feel free to browse through my collection and take any scripts which might
help you. If you want to know more about AWK, you can borrow the book
Awk Words from my bookshelf.
Here is a list of the scripts. Following the list is the Extract script,
which allows you to extract an AWK script from this page, followed by the
other scripts in alphabetical order.
(96-09-19) extract.awk - Extract an AWK script from this page
This script extracts a single AWK script from this HTML page. Use Save
to save a copy of this page. (In the example below, the page is saved as
scripts.html.) Then cut and paste the extract script into
your favourite editor, or just type it in, and save the file as
extract.awk. Then run it on the saved page. For example, to
extract the AWK script called double from this page, you
would enter:
awk extract script=double scripts.html >double.awk
Note: This particular script has been written without comments and on as few
lines as possible in order to help you get started with extracting scripts
as quickly as possible.
#File: extract.awk
BEGIN { inListing = 0; copy = 0; script = "extract" }
/^[<]LISTING>/ { inListing = 1; copy = 0 }
/^<\/LISTING>/ { inListing = 0; copy = 0 }
copy { print }
inListing && /^#*File: / { if ( $0 ~ script ".awk" ) copy = 1 }
Note that the [<] on the third line is just there to convince
some browsers (like Netscape) to display the script properly. The script
will work as written, or you can replace [<] with <.
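How the script name reaches extract.awk is worth spelling out. BEGIN sets script to "extract" as a default, and a script=double operand on the command line replaces it, because AWK applies such operands after BEGIN has run. (A -v assignment would be applied before BEGIN and then overwritten by the default.) The sketch below replays the same rules on a made-up two-listing page. It assumes a POSIX shell and a POSIX awk that takes the program inline; make_page, extract, and the sample listings are hypothetical names and data used only for illustration.

```shell
# Hypothetical miniature of the saved page: two listings, each tagged "#File:".
make_page() {
cat <<'EOF'
<LISTING>
#File: double.awk
{ print; print "" }
</LISTING>
<LISTING>
#File: sum.awk
{ count += $1 }
</LISTING>
EOF
}

# Same rules as extract.awk, given inline. "$1" names the script to pull out.
extract() {
    awk '
        BEGIN { inListing = 0; copy = 0; script = "extract" }
        /^[<]LISTING>/ { inListing = 1; copy = 0 }
        /^<\/LISTING>/ { inListing = 0; copy = 0 }
        copy { print }
        inListing && /^#*File: / { if ( $0 ~ script ".awk" ) copy = 1 }
    ' "script=$1"
}

make_page | extract double
```

Because the copy rule sits above the "#File:" rule, the "#File:" line itself is not printed, and the closing tag clears copy before it can be printed.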
(96-12-15) double.awk - Double-space a GREP result
This script double-spaces the results of a GREP search. It ignores the
"File" lines, and double-spaces the other lines. You might use this script as
part of a text processing batch file to create a Table of Contents by
extracting and then spacing the Contents entries.
#File: double.awk
# DOUBLE.AWK
#
# Written 96-07-09
# By Rob Ewan
#
# This is an AWK program to double-space a text file.
# Since it is intended for use on GREP results, it will ignore the lines
# containing the file names.
#
# Usage:
# AWK DOUBLE file(s)
#
# Index entries:
!/^File [a-zA-Z0-9.]*:/ {
    # Double-space for table of contents
    print
    print ""
}
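As a quick illustration of the effect, the sketch below feeds the same pattern-action pair a made-up search result. It assumes a POSIX shell and awk, and it assumes the search tool prints a "File name:" header line before the matching lines; sample_result and double_space are hypothetical names.

```shell
# Hypothetical search result: one "File" header line, then the matching lines.
sample_result() {
    printf '%s\n' 'File MANUAL.TXT:' 'Contents: Introduction' 'Contents: Installing'
}

# The same pattern-action pair as double.awk, given inline.
double_space() {
    awk '!/^File [a-zA-Z0-9.]*:/ { print; print "" }'
}

sample_result | double_space
```

The header line is dropped and each remaining line is followed by an empty one.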
(96-12-15) givenext.awk - Update a version-based build
This is an interesting script which I wrote quite a while ago. We had a
Build program which built a version of our software. Since we were building
versions very quickly, we wanted an automated way of updating the version
number.
We created a BuildNxt batch file to run the Build program. If we were about
to build version 16 of THEPROG, the BuildNxt batch file would look like
this:
Call Build THEPROG 16
Then we created an outer batch file which would run BuildNxt and
automatically update the version number. It did this by calling the AWK
script, Givenext. Notice that Givenext uses 'printf' to control the format
of the version number in the created file.
NEXT.BAT:
BuildNxt
Awk GiveNext BuildNxt.Bat >Tmp.tmp
Copy Tmp.tmp BuildNxt.Bat
#File: givenext.awk
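/Build/ {
    # The version number is the last field, e.g. "Call Build THEPROG 16"
    newNumber = $NF + 1
    # Copy the leading fields unchanged, then the zero-padded new version
    for ( i = 1; i < NF; i++ ) printf( "%s ", $i )
    printf( "%03d\n", newNumber )
}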
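To see the version bump in isolation, here is a minimal sketch, assuming a POSIX shell and awk. It takes the version number from the last field ($NF) rather than from a fixed field position, so it copes with the leading "Call" in the batch line shown above; give_next is a hypothetical name.

```shell
# Bump the trailing version number on the "Build" line, zero-padded to 3 digits.
give_next() {
    awk '/Build/ {
        # Copy every field except the last one unchanged.
        for ( i = 1; i < NF; i++ ) printf( "%s ", $i )
        # Print the incremented version; %03d pads it with leading zeros.
        printf( "%03d\n", $NF + 1 )
    }'
}

echo 'Call Build THEPROG 16' | give_next    # prints: Call Build THEPROG 017
```

Feeding the output back in works as well, since "017" is read as the number 17.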
(97-01-26) histolog.awk - Convert timestamped log to histogram
One of the most common diagnostic tools I have used is a timestamped log of
events. Whenever one of my programs creates such a log, it consists of one
line per event, with the first thing on that line being a timestamp, followed
by the event identification information (severity, what actually happened,
relevant values, etc).
Since the first few columns of the log are the time, and since the log is
automatically time sorted (since it was written that way), I can find out
how many events happened in any given hour (or minute, or whatever).
This script simply automates the conversion from a log file (or a filtered
log file) into a histogram. The output shows the number of events (lines)
for each time. By setting variables on the AWK command line, you can change
the time resolution (I typically use hours, but set width=5 if I want
minutes), or switch to a line plot with error bars (fuzz=1).
Generally, if you keep a log of all the interesting events, you will want to
use GREP first to extract the interesting events before using this script to
convert to a histogram. You could easily write a batch file which allows
you to filter the log before passing it to this script.
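The counting idea can be shown in a few lines before reading the full script. This is a stripped-down sketch, not the script itself: it assumes a POSIX shell and awk, always takes the time from the start of the line, and has no scale, field, column, or fuzz options. The sample log and the names sample_log and count_by_time are made up for illustration.

```shell
# Hypothetical log: the timestamp (HH:MM:SS) is the first thing on each line.
sample_log() {
    printf '%s\n' \
        '09:15:02 WARN disk nearly full' \
        '09:47:30 INFO retry scheduled' \
        '10:01:11 ERROR request timed out' \
        '10:05:00 INFO recovered' \
        '10:59:59 INFO heartbeat'
}

# Count consecutive lines that share the same leading "width" characters.
# This relies on the log already being in time order.
count_by_time() {
    awk '
        function show(   bar, i) {
            bar = ""
            for ( i = 0; i < n; i++ ) bar = bar "*"
            printf( "%s - %3d: %s\n", old, n, bar )
        }
        BEGIN { width = 2 }                 # default resolution: hours
        {
            t = substr( $0, 1, width )
            if ( t == old ) { n++ }
            else { if ( old != "" ) show(); old = t; n = 1 }
        }
        END { if ( n > 0 ) show() }
    ' "$@"
}

sample_log | count_by_time            # one bar per hour
sample_log | count_by_time width=5    # one bar per minute
```

With the default width the sample gives two bars (2 events for 09, 3 for 10); with width=5 every line falls in its own minute.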
#File: histolog.awk
# HISTOLOG.AWK
#
# Written 97-01-07
# By Rob Ewan
#
# This is an AWK program to convert a timestamped log into a chart of number
# of messages versus time.
# Normally, the log file will have been preparsed to extract the messages
# which are of interest.
# The script assumes that each line in the log will begin with a timestamp,
# but provides options for the timestamp to be elsewhere on the line.
#
# Usage:
# AWK HISTOLOG [options] logFile(s)
#
# Options:
# scale=nn   Set the scaling factor for the display            ( 1 )
# field=n    Select field which contains the 'time'            (0 - whole line)
# column=n   Set where the 'time' starts within the field      ( 1 )
# width=n    Set the width (characters) of the 'time' input    ( 2 )
# fuzz=n     Display either bars (0) or lines with error (1)   ( 0 )
#
# Put initialization code here
BEGIN {
    scale = 1       # Each '*' corresponds to this count
    field = 0       # Which field contains the 'time'
    column = 1      # Start of 'time' within the field
    width = 2       # Width of the 'time' defining field
    fuzz = 0        # Flag: Turn on to see 'error bars' on readings
    totalCounts = 0
}
# DUMP HISTOGRAM BAR - Draw one bar of the histogram
function DumpHistogramBar() {
    histoBar = ""
    # If we want 'error bars',
    if ( fuzz ) {
        # Generate lower/upper bound for reading
        lbound = nCounts - sqrt(nCounts);
        ubound = nCounts + sqrt(nCounts);
        # Generate a two-colour bar to show the reading
        for ( i = 0; i < ((ubound+scale-1)/scale); i++ ) {
            if ( (i+1)*scale <= lbound ) {
                histoBar = histoBar " "
            } else if ( i*scale < nCounts ) {
                histoBar = histoBar "*"
            } else {
                histoBar = histoBar "+"
            }
        }
    # Otherwise, just a straight histogram
    } else {
        for ( i = 0; i < ((nCounts+scale-1)/scale); i++ ) {
            histoBar = histoBar "*"
        }
    }
    printf( "%" width "s - %3d: %s\n", oldTime, nCounts, histoBar );
    totalCounts += nCounts;
}
# SHOW OPTIONS - List the current values of the options (for heading)
function ShowOptions( fileName ) {
    printf( "%s - Scale: %d, Time width: %d", fileName, scale, width );
    if ( fuzz ) printf( ", with fuzz bars" );
    printf( "\n" );
}
# Now put the active code. pattern { action }
{ used = 0 }
/^File/ {
    ShowOptions( $0 );
    used = 1
}
!used {
    timeStr = substr( $field, column, width );
    if ( timeStr == oldTime ) {
        nCounts++;
    } else {
        if ( oldTime != "" ) DumpHistogramBar();
        oldTime = timeStr;
        nCounts = 1;
    }
}
END {
    if ( nCounts > 0 ) DumpHistogramBar();
    print "Total counts = " totalCounts
}
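The fuzz display is easier to picture with one bar drawn on its own. The sketch below repeats the arithmetic of the fuzz branch for a single count: blanks up to n - sqrt(n), '*' from there up to the count, and '+' from the count up to n + sqrt(n). It assumes a POSIX shell and awk; fuzz_bar is a hypothetical name, and the square brackets are added here only to make the leading blanks visible.

```shell
# Draw one bar the way the fuzz branch does, for count $1 and scale $2 (default 1).
fuzz_bar() {
    awk -v n="$1" -v scale="${2:-1}" 'BEGIN {
        lbound = n - sqrt( n )      # lower edge of the error band
        ubound = n + sqrt( n )      # upper edge of the error band
        bar = ""
        for ( i = 0; i < ( ubound + scale - 1 ) / scale; i++ ) {
            if ( ( i + 1 ) * scale <= lbound ) {
                bar = bar " "       # below the band: blank
            } else if ( i * scale < n ) {
                bar = bar "*"       # inside the band, up to the count
            } else {
                bar = bar "+"       # inside the band, above the count
            }
        }
        printf( "[%s]\n", bar )
    }'
}

fuzz_bar 9    # prints [      ***+++]
```

For a count of 9 the band runs from 6 to 12, so the bar is six blanks, three '*', and three '+'.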
(96-12-15) oneline.awk - One-line AWK scripts (for command line use)
These scripts are all intended for use in command-line applications of AWK.
Most simply provide examples of using AWK at the command line. Some you may
want to capture as scripts.
I use NARROW to reduce the width of grep output so that it won't wrap
when printed, and SUM (with grep -c) to get the total number of
matches.
#File: oneline.awk
# (96-11-28) NOBLANK.AWK - Remove all the blank lines from the input
!/^ *$/
# (96-11-28) FILES.AWK - Keep only the file lines from a full DIR listing
$0 != "" && !/^ / { print }
# (97-02-28) NARROW.AWK - Removes excess white space to make listings narrower
{ gsub( / +/, " " ); print }
# (97-02-28) SUM.AWK - Add the counts from a 'grep -c' listing (2 line script)
!/^File/ { count += $1 }
END { print "Grand total " count }
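For anyone who wants to try these at a prompt, here is a small sketch of three of them wrapped as shell functions, assuming a POSIX shell and awk. The function names are hypothetical, and the input given to the SUM example is only a guess at the layout a 'grep -c' listing might have (a "File" header line followed by a count line).

```shell
# NOBLANK: drop empty and space-only lines.
noblank() { awk '!/^ *$/'; }

# NARROW: squeeze each run of spaces down to a single space.
narrow() { awk '{ gsub( / +/, " " ); print }'; }

# SUM: add up the count lines, skipping the "File" header lines.
sum_counts() { awk '!/^File/ { count += $1 } END { print "Grand total " count }'; }

printf 'alpha\n\n   \nbeta\n' | noblank
printf 'name      size   date\n' | narrow
printf 'File A.TXT:\n3\nFile B.TXT:\n4\n' | sum_counts
```

A pattern with no action, as in NOBLANK, prints every line that matches it, which is why that script needs no braces at all.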
Do you have a script which I might find useful? Did you find a problem in
one of my scripts? Write to Rob to let me know.
Without tools he is nothing, with tools he is all.
Thomas Carlyle, 1834
Page maintained by Rob.