15 October 2000
The current version is 3.0 (a release version)
Big change under the hood; not many user-noticeable changes, though.
and that's about it.
maintenance release
Some not-so-major changes, and one big one.
The re-write is complete. Doubtless version 2.0 still has bugs (and probably needs more woodshedding before it earns the name 2.0), but my available time for development has come to an end, and this is good enough.
...
The source for versions 0.x through 1.3 was so convoluted and un-architected that I've undertaken a complete rewrite, mostly from scratch. This version is larger, but runs faster and is more reliable.
...
I'm releasing version 1.0 of my program to the world. I haven't considered a license, but you may assume something like GPL. Enjoy.
mkhtmlindex is a program to simplify my web publishing. It automagically generates a hypertext file containing descriptive links to each hypertext file in a directory.
It does some nifty things with template files, and is scriptable. See the explanation section below for a better explanation (oddly enough...).
Right now, mkhtmlindex is a command-line program only. I'm developing it in C++ on a Linux system, but am trying to make it as portable as possible. So far it has compiled cleanly on any system using GNU's C compiler (GCC), including the Cygnus cygwin32 setup, as well as with Visual C++ 4.0/5.0/6.0.
I have a couple of websites which are just large collections of files in a directory, with a file that indexes them with links and such. Maintaining this file gets to be a real hassle when the number of files exceeds about thirty or so... and so I undertook to write a program to automate the process. I'm sure something like it exists on the web, but I couldn't find one that did exactly what I wanted.
I've tailored the program's behavior mostly to suit the needs of my humor page: sorting files by date (newest on top), separating them by category or date (eventually to be ``category and date''), printing the link with a pretty format (the meta info), et cetera. This has proven useful also for the University of Kentucky IEEE hardware contest page, on which the use of this program (hopefully) would make me (the maintainer) obsolete.
I envision that this program should be mildly useful to anyone who maintains a page that is just a list of links to other files.
The point of the program is to generate a hypertext file that contains a list of links to other hypertext files, an index of sorts. When you start the program, it scans the current directory (or, eventually, an arbitrary directory you specify) for files with the *.html or *.htm extensions. It creates for itself a list of all of these files, and reads the html title and several meta strings from each one. Then it creates a list of hypertext links (using the page title as the link text), and then exits. (``that doesn't sound so bad...'')
The program searches each *.html or *.htm file for the <title> tag --- if this tag is in the file, then the program will add a link to it in the page, using the page's title (from the title tag) as the link text. If that <title> tag is empty, the link text will be the filename.
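The title lookup above can be sketched in a few lines of C++. This is only an illustration, not the actual mkhtmlindex source: the names extractTitle, linkText and listItem are made up, and falling back to the filename when there is no <title> tag at all (not just an empty one) is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lowercase copy of s, so tags can be matched case-insensitively.
static std::string lowercase(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: the text between <title> and </title>,
// or an empty string if the pair is missing.
std::string extractTitle(const std::string& html)
{
    const std::string lower = lowercase(html);
    const std::string openTag = "<title>";
    const std::string::size_type open = lower.find(openTag);
    if (open == std::string::npos) return "";
    const std::string::size_type start = open + openTag.size();
    const std::string::size_type close = lower.find("</title>", start);
    if (close == std::string::npos) return "";
    return html.substr(start, close - start);
}

// Link text is the title, unless it is empty or all whitespace,
// in which case the filename is used instead (assumed fallback).
std::string linkText(const std::string& html, const std::string& filename)
{
    const std::string title = extractTitle(html);
    const bool blank = std::all_of(title.begin(), title.end(),
        [](unsigned char c) { return std::isspace(c) != 0; });
    return blank ? filename : title;
}

// One list item linking to filename.
std::string listItem(const std::string& html, const std::string& filename)
{
    return "<li><a href=\"" + filename + "\">" + linkText(html, filename) + "</a></li>";
}
```

Calling listItem with the contents of foo.html and the name "foo.html" gives one line of the index; a real implementation would also have to cope with titles that span lines or contain markup.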
Alternatively, you can use <meta> information in each source file as the link text (and eventually even as sorting keys). Words enclosed in vertical bars become link text, words in asterisks become bold (strong), and words enclosed in underscores become italicized (emphasis). To wit, the line:
<meta name="description" content="isn't |this| _really *cool*_?!?">
in a file called foo.html will result in this list item:
<li>isn't <a href="foo.html">this</a> <em>really <strong>cool</strong></em>?!?</li>
which looks like this:
- isn't this really cool?!?
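For the curious, the marker expansion described above could look something like this in C++. It is a sketch under assumptions, not the program's real code: expandMetaMarkup is a hypothetical name, each marker simply toggles its tag on and off, and nothing is done about unbalanced markers or HTML escaping.

```cpp
#include <string>

// Hypothetical helper (not from the mkhtmlindex source): expand the
// |link|, *strong* and _emphasis_ markers of a meta description.
// href is the file the link should point to.
std::string expandMetaMarkup(const std::string& text, const std::string& href)
{
    std::string out;
    bool inLink = false, inStrong = false, inEm = false;

    for (char c : text) {
        if (c == '|') {
            // Vertical bars open and close the link.
            if (inLink) out += "</a>";
            else        out += "<a href=\"" + href + "\">";
            inLink = !inLink;
        } else if (c == '*') {
            // Asterisks open and close bold (strong) text.
            out += inStrong ? "</strong>" : "<strong>";
            inStrong = !inStrong;
        } else if (c == '_') {
            // Underscores open and close italic (emphasis) text.
            out += inEm ? "</em>" : "<em>";
            inEm = !inEm;
        } else {
            out += c;  // everything else is copied verbatim
        }
    }
    return out;
}
```

Fed the description from the example and "foo.html", this produces the text that goes between <li> and </li> in the list item shown above.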
You aren't restricted to using unnumbered lists (<ul></ul>); there are options for choosing several different styles:
Unnumbered | <ul><li> [link text] </li></ul> |
Numbered | <ol><li> [link text] </li></ol> |
paragraph-based | <div><p> [link text] </p></div> |
``new!'' | <ul><li><em>---NEW!---</em> [link text]</li></ul> |
custom | (as of 2.0 this works only in templates) |
The custom liststyles enable you to do spiffy stuff... (more later)
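One way to picture the list styles is as four strings: what opens the list, what closes it, and what goes before and after each item. The sketch below assumes exactly that; the ListStyle struct and the function names are invented for illustration and are not taken from the mkhtmlindex source.

```cpp
#include <string>
#include <vector>

// Hypothetical representation of a list style: wrappers for the
// whole list and for each item.
struct ListStyle {
    std::string listOpen, listClose, itemOpen, itemClose;
};

// The built-in styles from the table above.
ListStyle unnumberedStyle() { return {"<ul>", "</ul>", "<li>", "</li>"}; }
ListStyle numberedStyle()   { return {"<ol>", "</ol>", "<li>", "</li>"}; }
ListStyle paragraphStyle()  { return {"<div>", "</div>", "<p>", "</p>"}; }
ListStyle newStyle()        { return {"<ul>", "</ul>", "<li><em>---NEW!---</em> ", "</li>"}; }

// Wrap each already-formatted link in the style's item strings,
// and the whole thing in the style's list strings.
std::string renderList(const ListStyle& style, const std::vector<std::string>& links)
{
    std::string out = style.listOpen + "\n";
    for (const std::string& link : links)
        out += style.itemOpen + link + style.itemClose + "\n";
    out += style.listClose + "\n";
    return out;
}
```

Seen this way, a custom style is just a fifth set of four strings supplied by the user instead of compiled in.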
It is also possible to use <meta> information in each source file as filters and sorting keys. As of release 1.1, mkhtmlindex can sort by ``category'' when using a template file. Release 2.0 supports some extensive sorting and filtering capability with template files (see below).
You can specify the page title (the text that appears as the heading and in the generated file's <title> tags). The program also supports that marvellous invention, Cascading Style Sheets. You can use an internal stylesheet (configurable at compile time), or specify an external stylesheet's URL. Or if you really hate the look of the internal page, you can use a template file of your own design.
With mkhtmlindex's file templates you can generate much more specialized html files. Basically, the template file is any html file containing some special, custom tags. For example:
<html>
<head>
<title>This is my template</title>
</head>
<body>
Here is a list of some files:
<!insertlist>
Here is another list, numbered:
<!insertlist numbered>
Here is yet another list, using meta info as the titles:
<!insertlist meta>
Here is yet another list, using meta info as the titles, unnumbered, and containing only files which contain a tag <meta name="category" content="foo">:
<!insertlist unnumbered meta category="foo">
Here is a list containing only files whose meta date is newer than 27-1-1998:
<!insertlist unnumbered meta newerthan="27-1-1998" liststyle="newstyle.sty">
The custom liststyle in the above tag is supplied in the file "newstyle.sty".
<hr noshade>
Here is an ad for this program:
<!insertcredits>
</body>
</html>
The above example showcases plenty of features. Read on for descriptions.
As you can see, the template file is just your ordinary, run-of-the-mill HTML file. The interesting part is the specialized comment tags (the old-style, single-tag comment). Currently, you can do two things, which correspond to two template tags --- inserting a list, and inserting the credits (an ad or banner for the program). The tag is replaced by the credits or the list, or whatever, in the output file.
The credits tag is the simpler of the two. There are only two options for the credits: show the date of file creation, and show the url of the program. This batch of credits appears in its own <div> and <p>.
Here's the syntax:
<!insertcredits [showdate] [showurl]>
The text in square brackets ([]) is optional. Note that the tag is case-insensitive, and the options can come in any order, with (just about) any amount of whitespace in between (but try to stay under 256 characters per tag --- otherwise, the program may barf, segfault, or something else nasty).
The insertlist tag has plenty of options, but the basic form works very well. To insert a vanilla list of all the files in the current directory (at this point in the file), use:
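To make the "any case, any order, any whitespace" rule concrete, here is a rough C++ sketch of reading a credits tag. It is not lifted from TemplateFile.cpp; CreditsOptions and parseCreditsTag are hypothetical names, and quietly ignoring unknown words is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical result of parsing an <!insertcredits ...> tag.
struct CreditsOptions {
    bool valid = false;     // was this an insertcredits tag at all?
    bool showDate = false;  // showdate option present
    bool showUrl = false;   // showurl option present
};

CreditsOptions parseCreditsTag(const std::string& tag)
{
    CreditsOptions opts;
    // A template tag looks like "<!" ... ">".
    if (tag.size() < 3 || tag[0] != '<' || tag[1] != '!' || tag[tag.size() - 1] != '>')
        return opts;

    // Lowercase everything between "<!" and ">" (case-insensitive tags).
    std::string inner = tag.substr(2, tag.size() - 3);
    std::transform(inner.begin(), inner.end(), inner.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Splitting on whitespace makes order and spacing irrelevant.
    std::istringstream words(inner);
    std::string word;
    if (!(words >> word) || word != "insertcredits")
        return opts;
    opts.valid = true;

    while (words >> word) {
        if (word == "showdate") opts.showDate = true;
        else if (word == "showurl") opts.showUrl = true;
        // anything else is ignored here (assumption)
    }
    return opts;
}
```

Using std::string instead of a fixed buffer also sidesteps the 256-character worry, at least in this sketch.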
<!insertlist>
The options can come in any order, and are case insensitive. (quoted arguments, however, are taken literally.) In essence, you want to describe the setup as follows:
<!insertlist [liststyle] [use meta titles?] [sort key] [sort direction] [filter key]>
The accepted/understood values for liststyles are:
The file supplying the custom liststyle follows this format:
The tags can be any string of characters. It will be inserted verbatim into the output file. For example, say you want the list to be in one centered paragraph, with a little image called bullet.gif as the bullet. Here's a file that would do the trick:
<p align=center>
</p>
<img src="bullet.gif" alt="o" height=10 width=10> 
<br>
Note that there's an extra space at the end of the third line. This will be included verbatim. Just try it out, and you'll get the hang of it.
Simply type meta to use meta titles (if they exist in the files). This option is off by default, so just leave it out to turn it off.
This selects the attribute by which to sort the list. Accepted values:
The default is to sort by filename.
Defaults to forward. Accepted values are sortreverse and sortforward.
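A single sort key plus a direction can be sketched like this in C++. The Entry struct, the SortKey enum and sortEntries are hypothetical, dates are assumed to have been reduced to a comparable integer already, and only the two keys mentioned in this document (filename and date) are shown.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one indexed file.
struct Entry {
    std::string filename;
    int dateKey;  // e.g. 19980127 for 27-1-1998 (assumed encoding)
};

enum class SortKey { Filename, Date };

// Sort by exactly one key; reverse=true corresponds to sortreverse.
void sortEntries(std::vector<Entry>& entries, SortKey key, bool reverse)
{
    auto less = [key](const Entry& a, const Entry& b) {
        return key == SortKey::Date ? a.dateKey < b.dateKey
                                    : a.filename < b.filename;
    };
    if (reverse) {
        // Swapping the arguments flips the ordering.
        std::stable_sort(entries.begin(), entries.end(),
                         [&less](const Entry& a, const Entry& b) { return less(b, a); });
    } else {
        std::stable_sort(entries.begin(), entries.end(), less);
    }
}
```

Sorting by date with reverse set puts the newest files on top, which is what the humor page wants.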
Basic syntax:
Accepted values:
(datestring is of the format dd-mm-yy or dd/mm/yy.)
to be written...
In version 2.0, sorting really only works from template files...
Unfortunately, there can be only one sorting algorithm in effect at a time. In other words, you can choose to sort either by filename or by date, but you can't have all files with the same date sorted by filename. This is due to some limitations with the internal architecture of the engine... and should change at some point in the future.
filtering only works from template files...
like sorting, the current filtering scheme is limited to one filter, such as newerthan or category. i would like to be able to filter out all files in a certain category that are newer than a certain date, but this will require some relatively large changes in the code and i'm too lazy to do it just yet. this will change in the future.
currently available filter options (as of 2.1.1):
category=text | include only files whose category is text |
category!=text | exclude files whose category is text |
author=text | include only files whose author is text |
author!=text | exclude files whose author is text |
newerthan=datestring | include only files newer than datestring |
olderthan=datestring | include only files older than datestring |
these came straight from TemplateFile.cpp; eventually i'll change this to read the template files with a flex parser, and the template tag syntax will be much richer...
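As an illustration of the newerthan filter, here is a small C++ sketch. None of it is the real TemplateFile.cpp code: dateKey, FileInfo and newerThan are made-up names, two-digit years are assumed to mean 19yy, four-digit years are accepted as written, and files without a parsable date are simply dropped.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Turn "dd-mm-yy" or "dd/mm/yy" into a comparable number (yyyymmdd).
// Returns -1 if the string does not parse.
int dateKey(const std::string& s)
{
    int d = 0, m = 0, y = 0;
    char sep1 = 0, sep2 = 0;
    if (std::sscanf(s.c_str(), "%d%c%d%c%d", &d, &sep1, &m, &sep2, &y) != 5)
        return -1;
    if ((sep1 != '-' && sep1 != '/') || (sep2 != '-' && sep2 != '/'))
        return -1;
    if (d < 1 || d > 31 || m < 1 || m > 12 || y < 0)
        return -1;
    if (y < 100) y += 1900;  // assumption: two-digit years are 19yy
    return y * 10000 + m * 100 + d;
}

// Hypothetical record: a file and the date string from its meta tag.
struct FileInfo {
    std::string name;
    std::string date;
};

// Keep only files strictly newer than datestring.
std::vector<FileInfo> newerThan(const std::vector<FileInfo>& files,
                                const std::string& datestring)
{
    std::vector<FileInfo> kept;
    const int limit = dateKey(datestring);
    if (limit < 0) return kept;  // assumption: a bad filter date matches nothing
    for (const FileInfo& f : files) {
        // Undated files get -1 and therefore never pass.
        if (dateKey(f.date) > limit) kept.push_back(f);
    }
    return kept;
}
```

An olderthan filter would be the same loop with the comparison turned around, and the category and author filters would compare strings instead of numbers.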
to be completed...
to be written...
In version 2.0, custom list styles really only work from template files...
here's the usage message from version 2.0:
$ mkhtmlindex --help
Usage: mkhtmlindex [ options ]
  -h --help                    print this message
  -v --verbose                 verbose mode (writes messages to stderr)
  -q --quiet                   suppress everything (implies --overwrite)
  -V --version                 print version information
  -f --overwrite               force overwrite of output file
  -                            write output to stdout
  -o <filename>                use <filename> as the output file
  -u                           write unnumbered lists (default)
  -n                           write numbered lists
  -p                           write lists in a paragraph style
  -d --date                    show the date of file generation
  -m --usemeta                 use meta description string for the title
                               fall back is html title, then filename.
  -T --title <string>          use <string> as the output file's header and title
  -s --stylesheet [<url>]      use a Cascading Style Sheet to format the output
                               file. if <url> is specified, it is included as
                               an externally <link>ed sheet. It may be either
                               a local file or a fully-qualified URL.
  -t --template <filename>     use <filename> as a template for the output file
  -i[=<filename>]
  -i --ignoreold[=<filename>]  do not include a list item for "index.html" or
                               file specified by the optional <filename>
                               (this is the default)
  -I --no-ignore               do NOT ignore old output files.
http://www.asofyet.org/muppet/software/mkhtmlindex.html
There are indeed ``undocumented features,'' but since they're mostly buggy, they shall stay undocumented. If you're really curious, read the source code.
You're reading it.
The usage message (mkhtmlindex --help) and the source code are also rather helpful.
It's free. I don't hold much stock in paying for software (it's ones and zeroes, for cryin' out loud!), and I'd rather not be obligated to take care of it on account of making (very little) money from it, anyway. You can get the current version of mkhtmlindex from my homepage (i.e., http://www.asofyet.org/muppet/software/mkhtmlindex/)... I don't have all that much disk space on the server, so the source distribution is the de facto distribution. If you use a platform that I use, there might be a binary for you.
Otherwise, just mail me (scott arrington) at scott at asofyet dot org.