Matt J. Gumbley | Website: http://www.gumbley.me.uk
Blog: On The Edge of Occam's Razor

webmacro - Matt's HTML Macro Expander

Abstract

WebMacro is Matt's HTML Macro Expander. It is a small filter program written in Perl 5 which provides some time-saving features for HTML writers. The main benefits it gives to the writer are tables of contents, standardised headers and footers, user-defined tags with macro expansion and variable interpolation, user-defined indices, keyword/subject index, and inclusion of other documents. Matt uses it to build all his home pages, technical documentation, and other HTML design.

Table Of Contents

Contents



1 Introduction

WebMacro is a filter program which takes a pseudo-HTML file ("HTML plus") as its input, processes it, and creates an HTML file as its output.

To use the macro expander, create a file containing WebMacro tags (let's say, mydoc.htp) and pass it through the expander:


   webmacro < mydoc.htp > mydoc.html

The mydoc.html file now contains your HTML, which can be viewed with a browser, e.g. Netscape Communicator.

You could call your master document mydoc.htp, and build a makefile which converts .htp files to .html files by passing them through WebMacro.
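As a rough illustration of that build step, here is a minimal sketch in Python rather than a real makefile. It assumes a webmacro executable on your PATH that reads standard input and writes standard output, as in the command above. The helper names (html_target, needs_rebuild, build, build_all) are made up for this example and are not part of WebMacro.

```python
import subprocess
from pathlib import Path


def html_target(source: Path) -> Path:
    """Map mydoc.htp to mydoc.html."""
    return source.with_suffix(".html")


def needs_rebuild(source: Path) -> bool:
    """True when the .html file is missing or older than its .htp source."""
    target = html_target(source)
    return not target.exists() or target.stat().st_mtime < source.stat().st_mtime


def build(source: Path) -> None:
    """Run `webmacro < source > target`; assumes webmacro is on the PATH."""
    with source.open("rb") as src, html_target(source).open("wb") as dst:
        subprocess.run(["webmacro"], stdin=src, stdout=dst, check=True)


def build_all(directory: Path) -> None:
    """Rebuild every out-of-date .htp file in `directory`."""
    for source in sorted(directory.glob("*.htp")):
        if needs_rebuild(source):
            build(source)
```

Calling build_all(Path(".")) would then do roughly what a makefile rule for .htp to .html does: only stale documents are passed through the expander.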



2 Copyright, licensing, availability and acknowledgements

WebMacro is © 1998, 1999, 2000 Matthew James Gumbley. It is distributed under the terms of the GNU General Public License, a copy of which you will find in the file 'COPYING'.

This software was partly developed while working for my employer, Petroleum Exploration Computer Consultants, and Enigma Data Systems. Thanks to Dr. Ugur Algan for allowing me to release WebMacro under the GPL.

Thanks also to the following users of WebMacro for providing several excellent suggestions for improvements:

  • Mark Clegg
  • David Crosta

To download the latest version, click this link: ( bytes)

webmacro-0.03.tar.gz

Please let me know if you'd like any changes made to WebMacro. I maintain a small mailing list for announcements and discussions about it; mail me if you'd like to be added to it.



3 Revision History

  • 19th June 1998: Released version 0.01 via my home page.
  • 12th January 1999: Released version 0.02 via my home page. Added the BOX macro to the standard distribution, added the INCLUDE directive, added the ability to perform macro expansion with variable interpolation.
  • 7th July 2000: Released version 0.03. Added TOCLINKS tag from Mark Clegg, some useful quickie tags from me, and EVAL, the possibilities of which are just too staggering to comprehend. INDEX and INDEXREF were added after a suggestion from David Crosta. IDX and the subject/keyword index now work.



4 Using the macro expander

An "HTML plus" file is a normal HTML file with special tags embedded in it. WebMacro basically does a search and replace on these tags.

4.1 Building a Table of Contents

In a structured HTML document, you would have several sections, subsections, and sub-subsections ad nauseam. These are identified by section headings, and in HTML, a section heading is given a level number:

   <H1>First main section</H1>
   blah...
   <H2>First subsection</H2>
   more blah...
   <H2>Second subsection</H2>
   more blah...
   <H1>Second main section</H1>

Building a table of contents by hand in HTML is tedious. It should contain nested lists of sections, each one being a hypertext reference to the actual section in the main body of the document. Wherever a section heading occurs in the document, a hypertext fragment identifier must be inserted, to provide a place to jump to when the user selects a link in the table of contents.

This is an ideal candidate for automation :-)

WebMacro will automatically scan through your document for heading tags, and build them into a table of contents. It will add section numbers automatically to both the table of contents entries, and the section headings in the main document, should you so desire.

To add a table of contents, simply place a <TOC> tag at the place you want the table to appear.

To turn on section numbering, place a <TOCNUMBERS> tag before the first section heading you use.

To disable section numbering - for instance, just before your appendices - place a <NOTOCNUMBERS> tag before the next section heading. The following appendix section headings should then be declared with <H1>A. Whatever title</H1> - WebMacro does not yet perform automatic appendix numbering (or should that be lettering?).

Numbered sections are off by default.

To place a section heading in your document that will be ignored by the heading scanner, place a <TOCIGNORE> tag immediately before the <Hx> tag. This is useful for the heading called "Table Of Contents". You might want a heading called this, but you might not want it to appear in the table of contents itself. Usually, you would write the following:


   <TOCNUMBERS>
   <TOCIGNORE><H1>Table of Contents</H1>
   <TOC>
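To make the heading scan concrete, here is a minimal sketch in Python of how such a scanner could collect and number headings. It is not WebMacro's actual Perl code: the function name collect_toc is hypothetical, numbering is always on, and the handling of skipped levels (an <H3> straight after an <H1>) is my own assumption.

```python
import re

HEADING = re.compile(r"(<TOCIGNORE>\s*)?<H([1-6])>(.*?)</H\2>",
                     re.IGNORECASE | re.DOTALL)


def collect_toc(text: str):
    """Return (level, number, title) for every heading not marked <TOCIGNORE>."""
    counters = []          # counters[i] is the current number at level i + 1
    entries = []
    for match in HEADING.finditer(text):
        if match.group(1):             # heading preceded by <TOCIGNORE>
            continue
        level = int(match.group(2))
        # Forget counters for deeper levels, then pad up to this level.
        del counters[level:]
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1
        number = ".".join(str(n) for n in counters)
        entries.append((level, number, match.group(3).strip()))
    return entries
```

Run over the example headings from earlier in this section, this gives the numbers 1, 1.1, 1.2 and 2, and leaves the "Table of Contents" heading out of the list.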
  

4.2 Adding horizontal rules after sections of a given level

To separate sections in your document, you may want to place a horizontal rule at the start of every second-level heading. To do this, use:

   <SECTIONRULE 2>

Or, you could have rules after every first-level heading, by replacing the 2 with a 1, etc. Section rules are off by default.

4.3 Adding links back to the table of contents

To create links at the start of each section heading, which point back to the table of contents, you need to use the <TOCLINKS ! text !> tag. Simply place this tag before the <TOC> tag, supplying whatever you want between the exclamation marks.

For example, this document uses:

   <TOCLINKS !<P ALIGN="RIGHT">Contents</P>!>

This produces the right-aligned "Contents" links in this document. You could place right-justified graphics in the links, which looks good.

4.4 Creating indices

In your document, you may want a table of contents, as described above, but you might also want other tables, such as a table of figures, equations, or UML diagrams. These are called indices.

WebMacro can automate the building of these indices. You still have to add the IMG SRC="....." yourself whenever a picture is included in your document, but to make WebMacro add an entry to a specific index, use the tags <INDEXREF iname>some caption</INDEXREF>.

Here, iname is an index name. For example:

   ... some complex looking bit of maths ...
   <INDEXREF EQUATIONS>Schrödinger's Wave Equation</INDEXREF>

At the point where you would like the index to appear, use the tag:

   <INDEX EQUATIONS>

For example, an index of key illustrations in a text on quantum mechanics might look like:

Quantum Mechanical Illustrations
4.0 Schrödinger's Wave Equation
4.1 Dirac's Equation
4.2 A Feynman Diagram
4.3 Hawking Radiation
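The collection step behind a named index can be sketched as follows. This is a minimal Python illustration of the idea, not WebMacro's implementation: the function name collect_indices is hypothetical, and I'm assuming index names are single words, as in the EQUATIONS example.

```python
import re
from collections import defaultdict

INDEXREF = re.compile(r"<INDEXREF\s+(\w+)>(.*?)</INDEXREF>", re.DOTALL)


def collect_indices(text: str):
    """Group <INDEXREF iname>caption</INDEXREF> captions by index name.

    Captions keep the order in which they appear in the document.
    """
    indices = defaultdict(list)
    for name, caption in INDEXREF.findall(text):
        indices[name].append(caption.strip())
    return dict(indices)
```

Each <INDEX iname> tag would then be replaced by a list built from the captions stored under that name.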

4.5 Adding entries to a subject or keyword index

In addition to user-defined indices of figures, equations, and the like, a master index may be created, just like the index at the back of a book - a subject index.

To create this, you need to place index entries throughout your document, at places where you're referring to new concepts. These index entries will be collected, and built into a subject index, which is usually placed at the end of the document.

To insert the subject index, use the <INDEX> tag. You may want to define a section heading before it (usually containing the phrase "Index") - you may want to use the <NOTOCNUMBERS> tag before this, so that the index doesn't get given a section number.

Defining individual entries to be collected into the index is a little more complicated. There are several forms, and the syntax is tricky.

Suppose you want an entry in the index indicating where in your document you start to talk about left-handed flange benders (!). You would write your first words thus:

   <IDX>left-handed flange bender</IDX>
   In 1879, Pierre la Vache invented the 
   left-handed flange bender.  It was an instant success....
When viewed, the text will appear as you would expect:
   In 1879, Pierre la Vache invented the 
   left-handed flange bender. 
   It was an instant success....
In the index, an entry for "left-handed flange bender" will appear, with a hypertext link back to the place where you introduced the <IDX> tag.

Your treatise on the left-handed flange bender will probably require several index entries, relating to different facets of this intriguing device. To support the creation of grouped index entries, you may supply a context for the index reference. The following example will help clarify this:

   <IDX>left-handed flange bender(invention)</IDX>
   In 1879, Pierre la Vache invented the 
   left-handed flange bender.
   It was an instant success....
   ... several pages later ...
   <IDX>left-handed flange bender(handedness)</IDX>
   Mass adoption of the device did not take place until 1892, when the
   handedness of the device was questioned by John Cuckson. 
   <IDX>left-handed flange bender(history)</IDX>
   In his landmark text on the subject, A Natural History of Flange 
   Deforming Devices......
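To show how the term(context) form could be turned into grouped entries, here is a minimal Python sketch. It is an assumption about the grouping logic, not WebMacro's own code, and the name group_idx_entries is hypothetical. An entry with no parenthesised context is recorded with an empty context.

```python
import re
from collections import defaultdict

IDX = re.compile(r"<IDX>(.*?)</IDX>", re.DOTALL)
WITH_CONTEXT = re.compile(r"^(.*?)\((.*)\)$", re.DOTALL)


def group_idx_entries(text: str):
    """Map each index term to the contexts it was tagged with ("" = none)."""
    grouped = defaultdict(list)
    for raw in IDX.findall(text):
        entry = raw.strip()
        match = WITH_CONTEXT.match(entry)
        if match:
            term, context = match.group(1).strip(), match.group(2).strip()
        else:
            term, context = entry, ""
        grouped[term].append(context)
    return dict(grouped)
```

For the example above, "left-handed flange bender" would end up as one main entry with the sub-entries invention, handedness and history beneath it.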

4.6 Creating per-user settings files

Before your HTML file is read, WebMacro reads a configuration file, if you create one. This enables you to predefine your own frequently-used tags and use them in all your documents, without having to define them in every document you write. The name of this file is .webmacros, and it should be created in your home directory. The file is read in the same manner as the HTML files you normally use.

One use for this is to define a standardised header and footer for all your documents.

Note that this option is per-user, so when one user builds a document, they'll get their .webmacros, whereas another user would get their set, which will probably be different.

So if you want a library of macros used throughout all of your organisation's documents, it may be better to create this in a separate file, and use the <INCLUDE> tag.
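The reading order described in this section can be sketched in a few lines of Python. This is only an illustration of "settings file first, if it exists, then the document"; the function name input_files is hypothetical and the home directory is passed in explicitly so the example is easy to try out.

```python
from pathlib import Path


def input_files(document: Path, home: Path):
    """Return the files to read, settings file first (if the user created one)."""
    settings = home / ".webmacros"
    files = [settings] if settings.is_file() else []
    files.append(document)
    return files
```

For a real run you would call input_files(Path("mydoc.htp"), Path.home()), which is why two users building the same document can get different results.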

4.7 Adding a header and footer

To define a header for the document, enclose the header text (and any tags you want) inside <HEADER> and </HEADER> tags.

Similarly, for the footer, use <FOOTER> and </FOOTER> tags.

Note: these tags must appear on their own, on a separate line all to themselves. The contents of the header and footer can span multiple lines.
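The "on a line of its own" rule suggests a simple line-by-line scan. Here is a minimal Python sketch of that idea. It is not WebMacro's code: extract_block is a hypothetical name, and ignoring surrounding whitespace on the tag lines is my own assumption.

```python
def extract_block(lines, tag):
    """Split lines into (block contents, remaining lines) for <TAG> ... </TAG>.

    The opening and closing tags must each sit alone on a line; a tag that
    shares its line with other text is not recognised.
    """
    opening, closing = f"<{tag}>", f"</{tag}>"
    block, rest, inside = [], [], False
    for line in lines:
        stripped = line.strip()
        if not inside and stripped == opening:
            inside = True
        elif inside and stripped == closing:
            inside = False
        elif inside:
            block.append(line)
        else:
            rest.append(line)
    return block, rest
```

The same function would serve for the footer by passing "FOOTER" as the tag name.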

4.8 Using the built-in "quickie" tags

One of the time-savers is the library of predefined tags. Suppose you want to present a piece of text in green. You might do this:

   <FONT COLOR="#00C000">green text</FONT>

If you frequently write in green, this can quickly become painful. To help, you can use the built-in tags:
   <GREEN>green text</GREEN>
When WebMacro sees these tags, it converts the text into the long-winded version above. There are several of these tags pre-defined for you:
  • <RED>
  • <GREEN>
  • <BLUE>
  • <YELLOW>
  • <CYAN>
  • <PURPLE>
Some other predefined tags are <PAPER>, <ABSTRACT> and <NOTE>.

Use <PAPER> in place of <BODY> and </PAPER> in place of </BODY> to create a document with a white background with nicely-coloured links.

Note: you can't use <PAPER> on its own alongside <HEADER>. Instead, use <HEADER> and place <PAPER> inside your header. The same applies to the ending tags.

Use <ABSTRACT> to create a formatted abstract, like the one at the start of this document.

Use <NOTE> to create indented notes, with a "Note:" at their start, like the one above.

If you want to put text in nice boxes, you can use the <BOX>...</BOX> directives. The box will be created with a black border and white background. Sorry if these colours aren't your style. You can always redefine the BOX macro.

Boxes will expand
to fit - if you want
a narrow column
(like this)
you have to add BR line
breaks yourself

4.9 Defining your own "quickie" tags

To define your own tags, use the <DEFTAG> tag. The syntax of this is rather inscrutable, since the whole tag has to be on one line. This shouldn't present a great problem: tags can be nested - you can reference pre-defined tags, and tags you created earlier. To define a tag, you need to give its name, and one or two strings, which define the pre- and post-strings of this tag. To take the above green example, if this was defined using <DEFTAG>, the line would look like:
   <DEFTAG NAME=!GREEN! PRE=!<FONT COLOR="#00C000">!  POST=!</FONT>!>
The name, and the pre- and post-strings must all be enclosed in exclamation marks. When WebMacro encounters the <GREEN> tag, it replaces it with the pre-string, in this case, <FONT COLOR="#00C000">. When it encounters the ending tag, in this case, </GREEN>, it replaces it with the post-string, </FONT>.

You don't have to provide a post-string, if you don't need one. You can use this to create short-cuts for large commonly-used pieces of HTML.
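Here is a minimal Python sketch of how a one-line <DEFTAG> could be parsed and applied. It follows the description above (name, pre-string, optional post-string, all between exclamation marks) but it is not WebMacro's own parser: the names parse_deftag and apply_tag are hypothetical, and I'm assuming exact, case-sensitive tag matching.

```python
import re

DEFTAG = re.compile(
    r"<DEFTAG\s+NAME=!(\w+)!\s+PRE=!(.*?)!(?:\s+POST=!(.*?)!)?\s*>")


def parse_deftag(line: str):
    """Return (name, pre, post) from a one-line <DEFTAG ...>; post may be ""."""
    match = DEFTAG.search(line)
    if match is None:
        raise ValueError("not a DEFTAG line: " + line)
    name, pre, post = match.groups()
    return name, pre, post or ""


def apply_tag(text: str, name: str, pre: str, post: str) -> str:
    """Replace <NAME> with the pre-string and </NAME> with the post-string."""
    return text.replace(f"</{name}>", post).replace(f"<{name}>", pre)
```

Because the strings are delimited by exclamation marks, this sketch (like the syntax itself, as far as I can tell) has no way to put a literal ! inside a pre- or post-string.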

There is also another tag, <DEFBLK>. This allows you to create large, multi-line blocks of text, and use a name to abbreviate them. For instance:

   <DEFBLK logo>
   LL
   LL      ooo   ggg   ooo
   LL     o   o g   g o   o
   LLLLLL  ooo   gggg  ooo
                    g
                gggg
   </DEFBLK>
   ...
   <logo>

4.10 Using replacement macros

The real power of macros comes when you give them arguments. If you define a tag or block containing the magic characters $0 $1 $2 etc., then they will be replaced by the arguments you give when you call upon the macro. Like this:
   <DEFBLK MYBLOCK>
   This is a block of text
   containing the word `$0'
   which will be replaced 
   in the macro expansion.
   </DEFBLK>

   <DEFTAG NAME=!MYVAR1! PRE=!Oh what a lovely day!>
On its own, the block looks like this:
   <MYBLOCK>
But with macro expansion, you can do this:
   <MYBLOCK artichoke>
Or, if you want to reference another block inside the macro:
   <MYBLOCK $MYVAR1>
And here's a slightly more complex example...
   <DEFBLK ART>
   <TABLE BGCOLOR="BLACK" BORDER=0 CELLPADDING=1 CELLSPACING=0><TR><TD>
   <TABLE BGCOLOR="WHITE" BORDER=1 CELLPADDING=10 CELLSPACING=0>
   <TR><TD>$0</TD><TD>$1</TD></TR>
   <TR><TD>$2</TD></TR>
   </TABLE></TD></TR></TABLE>
   </DEFBLK>

   <ART fred bert sydney>
When processed, the results look like this:


On its own, the block looks like this:
This is a block of text
containing the word `$0'
which will be replaced 
in the macro expansion.


But with macro expansion, you can do this:
This is a block of text
containing the word `artichoke'
which will be replaced 
in the macro expansion.


Or, if you want to reference another block inside the macro:
This is a block of text
containing the word `Oh what a lovely day'
which will be replaced 
in the macro expansion.


And here's a slightly more complex example...


fred    bert
sydney

Note that in macro arguments, you can use a word, such as artichoke, or a previously-defined block, which you must prefix with a $, as in $MYVAR1 above. The expansion will give the PRE-part of the block, so if you define a tag as MYVAR1 is defined above, then referring to $MYVAR1 will yield whatever you defined in the PRE=!...! section of the DEFTAG. Or, it will yield the whole of a DEFBLK.

The macro arguments are split on whitespace; there is no quoting mechanism. You'd have to assign arguments containing spaces to a block, as in $MYVAR1 above. <MYBLOCK Oh what a lovely day> has five arguments, not one, and they'll be assigned to $0 through $4.
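The argument rules can be captured in a short Python sketch: split the call on whitespace, look up any argument that starts with $ in the table of earlier definitions, and substitute the results for $0, $1 and so on. This is not WebMacro's code; expand_call is a hypothetical name, and leaving an undefined $NAME argument unchanged is my own assumption.

```python
import re


def expand_call(template: str, call: str, definitions=None) -> str:
    """Expand a macro call such as "MYBLOCK artichoke" against its template.

    Arguments are split on whitespace; an argument starting with "$" is
    looked up in `definitions` (the name-to-text table of earlier macros).
    """
    definitions = definitions or {}
    _name, *args = call.split()
    values = [definitions.get(a[1:], a) if a.startswith("$") else a
              for a in args]

    def substitute(match):
        index = int(match.group(1))
        # Placeholders without a matching argument are left untouched.
        return values[index] if index < len(values) else match.group(0)

    return re.sub(r"\$(\d+)", substitute, template)
```

With the template "containing the word `$0'", the call "MYBLOCK" leaves $0 in place, "MYBLOCK artichoke" inserts artichoke, and "MYBLOCK $MYVAR1" inserts the whole phrase stored under MYVAR1 as a single argument.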

One of my favourite uses for this is to avoid having to type a URL twice. For instance, you can get to the Perl language home page with this link. However, if you print this document out, you'll have no idea where that link points to. So, if you want to do it the hard way, you would use the following HTML:

   <A HREF="http://www.perl.com">http://www.perl.com</A>

This gives a visible URL that works as a hypertext link. In WebMacro, you could define a new tag, <LINK>:

   <DEFTAG NAME=!LINK! PRE=!<A HREF="$0">$0</A>!>

So if you want to get to the Perl home page, you'll find it at http://www.perl.com.

In fact, I like <LINK> so much, I made it part of the built-in quickie tags in 0.03.

4.11 Inserting the date of last document translation

Simply place a <DATE> tag in your document somewhere. For instance, this document was last run through WebMacro on Fri Dec 22 00:28:35 2006.

4.12 Including other files

You may include other documents at any point in your document, and these may include other documents, and so on - there's no limit to the nesting level.

For example you might do:

   Here is my main document
   <INCLUDE bigmacros.htp>
   
There is no restriction on the filename used in the INCLUDE directive.

This, together with Perl's open command, which treats a filename ending in | as a command to run, can be exploited nicely:

   <INCLUDE ls -lh downloadable.tar.gz | awk '{ print $5 }'|>

This allows you to insert the output of pretty much any shell commands in the output of WebMacro (with macro expansions performed on them). Thanks to Mark Clegg for pointing that out.
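The trailing-pipe trick relies on Perl's two-argument open, which runs the text before a final | as a command and reads its output. Here is a minimal Python sketch of the same decision, for illustration only: read_include is a hypothetical name, and this is an imitation of the Perl behaviour, not WebMacro's code.

```python
import subprocess


def read_include(spec: str) -> str:
    """Return the text an <INCLUDE spec> would pull in.

    A spec ending in "|" is treated as a shell command whose output is read,
    mirroring Perl's two-argument open; anything else is a filename.
    """
    spec = spec.strip()
    if spec.endswith("|"):
        command = spec[:-1].strip()
        result = subprocess.run(command, shell=True, check=True,
                                capture_output=True, text=True)
        return result.stdout
    with open(spec, encoding="utf-8") as handle:
        return handle.read()
```

Since any command in an INCLUDE gets run, it's worth only processing documents you trust.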

This option can be used to build up a library of macros used in all your documentation. At the top of your documents, you'd do:

   <INCLUDE housestyle.htp>

4.13 Embedding Perl code

You may embed Perl code in your documents within <DEFEVAL ...> blocks, like this:
   <DEFEVAL CUBE>
   $0 * $0 * $0
   </DEFEVAL>

   The value of 27<SUP>3</SUP> is <CUBE 27>.
When run, this looks like this:

The value of 27³ is 19683.

This was used to show you the size of the current WebMacro downloadable archive above. Here's how, showing the size of the webmacro source code:

   <DEFEVAL FILESIZE>
   (stat("$0"))[7]
   </DEFEVAL>

   The size of webmacro is currently <FILESIZE webmacro>.
And in case you're wondering, it is 17362 bytes in size.

WebMacro will report the line number and error should your eval fail. Note that I supplied quotes around the argument in stat, above. You'll need to supply quotes too, or you'll get bareword problems.
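The need for quotes makes sense if the arguments are pasted into the code as plain text before it is evaluated. Here is a minimal Python sketch of that idea. It is an analogy, not WebMacro's implementation: Python's eval stands in for Perl's, so the bodies are Python expressions, and the name expand_eval is hypothetical.

```python
import re


def expand_eval(body: str, call: str):
    """Substitute $0, $1, ... textually into `body`, then evaluate the result.

    Python's eval stands in for Perl's here, so `body` must be a Python
    expression. Only use this with input you trust.
    """
    _name, *args = call.split()
    source = re.sub(r"\$(\d+)", lambda m: args[int(m.group(1))], body)
    try:
        return eval(source)
    except Exception as error:
        raise ValueError(f"eval of {source!r} failed: {error}") from error
```

With the body "$0 * $0 * $0", the call "CUBE 27" evaluates the text 27 * 27 * 27. With the body len("$0"), the call "LEN webmacro" works because the quotes turn the argument into a string; without them, the bare word webmacro is an undefined name and the eval fails, which is the Python cousin of Perl's bareword problem.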

If you know the internal operations of WebMacro, this option really opens up a whole world of possibilities.

Here's a slightly more advanced example. Given a numeric argument, build up a square out of # characters, with sides of that length.

   <DEFEVAL SQUARE>
   my $len = $0;
   my $out = "<PRE>\n" . ('#' x $len) . "\n";
   my $sqline = '#' . (' ' x ($len-2)) . "#\n";
   $out .= ($sqline x ($len-2));
   $out .= ('#' x $len) . "\n</PRE>\n";
   $out;
   </DEFEVAL>
Here's a square I built earlier with <SQUARE 8>...
########
#      #
#      #
#      #
#      #
#      #
#      #
########



Index

<ABSTRACT>
<BLUE>
<BOX>
<CYAN>
<DATE>
<FOOTER>...</FOOTER>
<GREEN>
<H1>...</H1>
<H2>...</H2>
<H3>...</H3>
<HEADER>...</HEADER>
<IDX>...(...)</IDX>
<IDX>...</IDX>
<INCLUDE ...>
<INDEX iname>
<INDEX>
<INDEXREF iname>...</INDEXREF>
<LINK>
<NOTE>
<NOTOCNUMBERS>
<PAPER>
<PURPLE>
<RED>
<SECTIONRULE>
<TOC>
<TOCIGNORE>
<TOCLINKS>
<TOCNUMBERS>
<YELLOW>
WebMacro
        acknowledgements
        copyright
        file naming conventions
        usage
        versions
block tags
        arguments
        defining
        macro replacement
customisation
        .webmacros file
        per-user settings
        with <INCLUDE...>
embedding Perl code
eval tags
        defining
footer
header
horizontal section rules
indices
        keyword
        subject
        user-defined
quickie tags
        arguments
        blocks
        colour
        defining
        macro replacement
table of contents
        ignoring certain headings
        links back to the table
        numbering
        un-numbered
translation
        date
(C) Matt J. Gumbley 1998-2005 - All Tights Reversed