AWK Scripts: Last updated 97-04-02

While doing my other jobs, I often write AWK scripts to help me accomplish certain tasks. The scripts here are the ones which I think might be useful to other people, maybe even you.

Feel free to browse through my collection and take any scripts which might help you. If you want to know more about AWK, you can borrow the book Awk Words from my bookshelf.

Here is a list of the scripts. Following the list is the Extract script, which allows you to extract an AWK script from this page, followed by the other scripts in alphabetical order.

(96-09-19) extract.awk - Extract an AWK script from this page

This script will extract a single AWK script from this HTML page. Use Save to save a copy of this page. (In the example below, the page is saved as scripts.html.) Then cut-and-paste the extract script into your favourite editor, or just type it in, and save it as extract.awk. Then run it on the saved page. For example, to extract the AWK script called double from this page, you would enter:
awk extract script=double scripts.html >double.awk

Note: This particular script has been written without comments and on as few lines as possible in order to help you get started with extracting scripts as quickly as possible.

#File: extract.awk
BEGIN { inListing = 0; copy = 0; script = "extract" }
/^[<]LISTING>/ { inListing = 1; copy = 0 }
/^<\/LISTING>/ { inListing = 0; copy = 0 }
copy { print }
inListing && /^#*File: / { if ( $0 ~ script ".awk" ) copy = 1 }

Note that the [<] on the third line is just there to convince some browsers (like Netscape) to display the script properly. The script will work as is, or you can replace [<] by <.
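Here is a small sketch of the extract script at work. It assumes a POSIX shell and an awk that takes its program with `-f` (rather than the `awk extract` form shown above), and it uses a made-up two-script page in place of the real one. The `script=double` argument is applied after the BEGIN block has run, which is why it overrides the default of "extract".

```shell
# Work in a scratch directory so nothing is overwritten.
tmp=$(mktemp -d)

# extract.awk, as listed above.
cat > "$tmp/extract.awk" <<'EOF'
#File: extract.awk
BEGIN { inListing = 0; copy = 0; script = "extract" }
/^[<]LISTING>/ { inListing = 1; copy = 0 }
/^<\/LISTING>/ { inListing = 0; copy = 0 }
copy { print }
inListing && /^#*File: / { if ( $0 ~ script ".awk" ) copy = 1 }
EOF

# A hypothetical saved page holding two listings.
cat > "$tmp/scripts.html" <<'EOF'
<P>Some text</P>
<LISTING>
#File: double.awk
{ print; print "" }
</LISTING>
<LISTING>
#File: other.awk
{ print NR }
</LISTING>
EOF

# Pull out only the listing whose File line names double.awk.
extracted=$(awk -f "$tmp/extract.awk" script=double "$tmp/scripts.html")
echo "$extracted"

rm -r "$tmp"
```

Only the body of the double listing is printed. The `#File:` line itself is not copied, because the rule that prints comes before the rule that turns copying on.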

(96-12-15) double.awk - Double-space a GREP result

This script double-spaces the results of a GREP search. It ignores the "File" lines, and double-spaces the other lines. You might use this script as part of a text processing batch file to create a Table of Contents by extracting and then spacing the Contents entries.

#File: double.awk
# DOUBLE.AWK
#
# Written 96-07-09
# By Rob Ewan
#
# This is an AWK program to double-space a text file.
# Since it is intended for use on GREP results, it will ignore the lines
# containing the file names.
#
# Usage:
#     AWK DOUBLE file(s)
#
# Index entries:
!/^File [a-zA-Z0-9.]*:/ {
    # Double-space for table of contents
    print
    print ""
}
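To see what the rule does, here is a sketch that feeds it a few lines shaped like GREP output. The sample lines are invented, and a POSIX shell with awk on the path is assumed.

```shell
# Hypothetical GREP result: one "File" header line, then the matching lines.
spaced=$(printf 'File manual.txt:\nContents: Introduction\nContents: Installation\n' |
    awk '!/^File [a-zA-Z0-9.]*:/ { print; print "" }')
echo "$spaced"
```

The header line is dropped, and each remaining line is followed by an empty line.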

(96-12-15) givenext.awk - Update a version-based build

This is an interesting script which I wrote quite a while ago. We had a Build program which built a version of our software. Since we were building versions very quickly, we wanted an automated way of updating the version number.

We created a BuildNxt batch file to run the Build program. If we were about to build version 16 of THEPROG, the BuildNxt batch file would look like this:

Call Build THEPROG 16

Then we created an outer batch file which would run BuildNxt and automatically update the version number. It did this by calling the AWK script, Givenext. Notice that Givenext uses 'printf' to control the format of the version number in the created file.

NEXT.BAT:
BuildNxt
Awk GiveNext BuildNxt.Bat >Tmp.tmp
Copy Tmp.tmp BuildNxt.Bat

#File: givenext.awk
# Fields on the BuildNxt line: Call Build <program> <version>
/Build/ {
    newNumber = $4 + 1
    printf("%s %s %s %03d\n", $1, $2, $3, newNumber)
}
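The update step can be tried out on its own. This is a sketch only: it assumes a POSIX shell instead of the batch files above, and it assumes the four-field line shown earlier (`Call Build THEPROG 16`), so the version number is read from the fourth field.

```shell
tmp=$(mktemp -d)

# Stand-in for BuildNxt.Bat, holding the line for version 16.
printf 'Call Build THEPROG 16\n' > "$tmp/buildnxt.bat"

# Rewrite the line with the version bumped and zero-padded to three digits.
awk '/Build/ {
    newNumber = $4 + 1
    printf("%s %s %s %03d\n", $1, $2, $3, newNumber)
}' "$tmp/buildnxt.bat" > "$tmp/tmp.tmp"

# Replace the old file with the new one, as NEXT.BAT does with Copy.
cp "$tmp/tmp.tmp" "$tmp/buildnxt.bat"

next_line=$(cat "$tmp/buildnxt.bat")
echo "$next_line"

rm -r "$tmp"
```

The `%03d` in the format is what pads 17 out to 017. Writing to a temporary file first matters, because redirecting straight onto the input file would empty it before awk could read it.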

(97-01-26) histolog.awk - Convert timestamped log to histogram

One of the most common diagnostic tools I have used is a timestamped log of events. Whenever one of my programs creates such a log, it consists of one line per event, with the first thing on that line being a timestamp, followed by the event identification information (severity, what actually happened, relevant values, etc).

Since the first few columns of the log are the time, and the log is automatically sorted by time (it was written in that order), I can find out how many events happened in any given hour (or minute, or whatever) by counting the lines which share the same leading characters.

This script simply automates the conversion from a log file (or a filtered log file) into a histogram. The output shows the number of events (lines) for each time. By setting variables on the AWK command line, you can change the time resolution (I typically use hours, but set width=5 if I want minutes), or set fuzz=1 to change to a line plot with error bars.

Generally, if you keep a log of all the events, you will want to use GREP first to extract the interesting ones before using this script to convert them to a histogram. You could easily write a batch file which allows you to filter the log before passing it to this script.
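The filter-then-count idea can be sketched in a few lines. This is not the full script, only a cut-down illustration of its core. The log contents, the `bar` and `emit` helper names, and the use of a POSIX shell with grep are all assumptions made for the example.

```shell
# Hypothetical log: "HH:MM severity message", already in time order.
log='09:01 INFO start
09:15 ERROR disk full
09:40 ERROR disk full
10:02 ERROR retry failed
10:30 INFO done'

histogram=$(printf '%s\n' "$log" | grep 'ERROR' | awk -v width=2 '
    # Build a bar of n stars.
    function bar(n,    s, i) { s = ""; for (i = 0; i < n; i++) s = s "*"; return s }
    # Print the finished bucket in the same layout as histolog.awk.
    function emit() { printf("%" width "s - %3d: %s\n", old, n, bar(n)) }
    {
        t = substr($0, 1, width)    # leading characters are the time key
        if (t != old) {             # key changed: close the previous bucket
            if (old != "") emit()
            old = t
            n = 0
        }
        n++
    }
    END { if (n > 0) emit() }
')
echo "$histogram"
```

With width=2 the key is the hour, so the two 09:xx errors land in one bar. Passing width=5 instead would key on the whole HH:MM and give one bar per minute.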

#File: histolog.awk
# HISTOLOG.AWK
#
# Written 97-01-07
# By Rob Ewan
#
# This is an AWK program to convert a timestamped log into a chart of number
# of messages versus time.
# Normally, the log file will have been preparsed to extract the messages
# which are of interest.
# The script assumes that each line in the log will begin with a timestamp,
# but provides options for the timestamp to be elsewhere on the line.
#
# Usage:
#     AWK HISTOLOG [options] logFile(s)
#
# Options:
#     scale=nn    Set the scaling factor for the display ( 1 )
#     field=n     Select field which contains the 'time' (0 - whole line)
#     column=n    Set where the 'time' starts within the field ( 1 )
#     width=n     Set the width (characters) of the 'time' input ( 2 )
#     fuzz=n      Display either bars (0) or lines with error (1) ( 0 )
#

# Put initialization code here
BEGIN {
    scale = 1       # Each '*' corresponds to this count
    field = 0       # Which field contains the 'time'
    column = 1      # Start of 'time' within the field
    width = 2       # Width of the 'time' defining field
    fuzz = 0        # Flag: Turn on to see 'error bars' on readings
    totalCounts = 0
}

# DUMP HISTOGRAM BAR - Draw one bar of the histogram
function DumpHistogramBar() {
    histoBar = ""
    # If we want 'error bars',
    if ( fuzz ) {
        # Generate lower/upper bound for reading
        lbound = nCounts - sqrt(nCounts);
        ubound = nCounts + sqrt(nCounts);
        # Generate a two-colour bar to show the reading
        for ( i = 0; i < ((ubound+scale-1)/scale); i++ ) {
            if ( (i+1)*scale <= lbound ) {
                histoBar = histoBar " "
            } else if ( i*scale < nCounts ) {
                histoBar = histoBar "*"
            } else {
                histoBar = histoBar "+"
            }
        }
    # Otherwise, just a straight histogram
    } else {
        for ( i = 0; i < ((nCounts+scale-1)/scale); i++ ) {
            histoBar = histoBar "*"
        }
    }
    printf( "%" width "s - %3d: %s\n", oldTime, nCounts, histoBar );
    totalCounts += nCounts;
}

# SHOW OPTIONS - List the current values of the options (for heading)
function ShowOptions( fileName) {
    printf( "%s - Scale: %d, Time width: %d", fileName, scale, width );
    if ( fuzz ) printf( ", with fuzz bars" );
    printf( "\n" );
}

# Now put the active code.   pattern { action }
{ used = 0 }
/^File/ { ShowOptions( $0 ); used = 1 }
!used {
    timeStr = substr( $field, column, width );
    if ( timeStr == oldTime ) {
        nCounts++;
    } else {
        if ( oldTime != "") DumpHistogramBar();
        oldTime = timeStr;
        nCounts = 1;
    }
}
END {
    if ( nCounts > 0 ) DumpHistogramBar();
    print "Total counts = " totalCounts
}
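The error-bar drawing is easier to follow with numbers in hand. The sketch below lifts the fuzz loop out of DumpHistogramBar and runs it for a single reading of 9 events with scale = 1, where the bounds come out exactly as 9 - 3 = 6 and 9 + 3 = 12. The square brackets are added here only to make the leading blanks visible, and a POSIX shell is assumed.

```shell
fuzzbar=$(awk 'BEGIN {
    nCounts = 9; scale = 1
    lbound = nCounts - sqrt(nCounts)    # 9 - 3 = 6
    ubound = nCounts + sqrt(nCounts)    # 9 + 3 = 12
    histoBar = ""
    for ( i = 0; i < ((ubound+scale-1)/scale); i++ ) {
        if ( (i+1)*scale <= lbound ) {
            histoBar = histoBar " "     # below the lower bound: blank
        } else if ( i*scale < nCounts ) {
            histoBar = histoBar "*"     # lower bound up to the reading
        } else {
            histoBar = histoBar "+"     # reading up to the upper bound
        }
    }
    printf("[%s]\n", histoBar)
}')
echo "$fuzzbar"
```

The result is six blanks, three stars and three plus signs, so the reading sits where the stars change to plus signs and the marked stretch shows the spread of one square root either side.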

(96-12-15) oneline.awk - One-line AWK scripts (for command line use)

These scripts are all intended for use in command-line applications of AWK. Most simply provide examples of using AWK at the command line. Some you may want to capture as scripts.

I use NARROW to reduce the width of grep output so that it won't wrap when printed, and SUM (with grep -c) to get the total number of matches.

#File: oneline.awk
# (96-11-28) NOBLANK.AWK - Remove all the blank lines from the input
!/^ *$/
# (96-11-28) FILES.AWK - Keep only the file lines from a full DIR listing
$0 != "" && !/^ / { print }
# (97-02-28) NARROW.AWK - Removes excess white space to make listings narrower
{ gsub( / +/, " " ); print }
# (97-02-28) SUM.AWK - Add the counts from a 'grep -c' listing (2 line script)
!/^File/ { count += $1 }
END { print "Grand total " count }

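Here is a sketch of each one-liner run from the command line. It assumes a POSIX shell, and every input is invented for the example. In particular, the SUM input assumes a GREP that prints a "File" line followed by a line beginning with the count, and the FILES input assumes a DIR listing whose header and summary lines begin with a blank.

```shell
# NOBLANK: empty and all-blank lines are dropped.
noblank=$(printf 'one\n\n   \ntwo\n' | awk '!/^ *$/')

# FILES: keep only non-empty lines that do not start with a blank.
files=$(printf ' Volume in drive C is DATA\n\nDOUBLE   AWK   412\n   1 file(s)\n' |
    awk '$0 != "" && !/^ / { print }')

# NARROW: squeeze every run of blanks down to a single blank.
narrow=$(printf 'main.c:   42:      int    x;\n' |
    awk '{ gsub( / +/, " " ); print }')

# SUM: add up the first field of every line that is not a File line.
total=$(printf 'File a.txt:\n3\nFile b.txt:\n5\n' |
    awk '!/^File/ { count += $1 } END { print "Grand total " count }')

echo "$noblank"
echo "$files"
echo "$narrow"
echo "$total"
```

NOBLANK has no action at all: a pattern on its own prints every line it matches, which is what keeps it so short.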
Do you have a script which I might find useful? Did you find a problem in one of my scripts? Write to Rob to let me know.

Without tools he is nothing, with tools he is all.
Thomas Carlyle, 1834

Page maintained by Rob.