README for 'urlredir'
While this software is free, it is subject to the GNU General Public Licence,
see the copyright section at the bottom of this file.  I know I probably 
didn't need to worry about this, but I figure "what the hell"(tm) :)

Some parts of this program were extracted from other sources, such as the
search algorithm from "Algorithms in C" by Sedgewick.  As far as we can tell
copyright has been observed in all cases.



INSTALLATION


1.  Once you have unpacked the tar file, configure the package for your
system by running './configure'.  If you're using `csh' on an old version 
of System V, you might need to type `sh configure' instead to prevent 
`csh' from trying to execute `configure' itself.

The `configure' shell script attempts to guess correct values for
various system-dependent variables used during compilation, and
creates the Makefile(s) (one in each subdirectory of the source
directory).  In some packages it creates a C header file containing
system-dependent definitions.  It also creates a file `config.status'
that you can run in the future to recreate the current configuration.

Running `configure' takes a minute or two.  While it is running, it
prints some messages that tell what it is doing.  If you don't want to
see the messages, run `configure' with its standard output redirected
to `/dev/null'; for example, `./configure >/dev/null'.

Once you have configured the package, you can simply type 'make install' to
install the program, config file and documentation.  By default the binary
is installed in /usr/local/bin, and the config file in /usr/local/etc.
You can specify an installation prefix other than /usr/local by giving
'configure' the option '--prefix=PATH'.  Alternatively, you can do so by
consistently giving a value for the 'prefix' variable when you run 'make', e.g.,
   make prefix=/usr
   make prefix=/usr install

You can also specify independent locations for the binary file and the
config file, by giving 'configure' the options '--bindir=PATH' and/or
'--sysconfdir=PATH'.  If either is unspecified then the regular prefix is
used.

By default the documentation is installed in /usr/local/doc/urlredir.  
To change this, give 'configure' the option '--with-docdir=PATH'.

If your system requires unusual options for compilation or linking
that `configure' doesn't know about, you can give `configure' initial
values for some variables by setting them in the environment.  In
Bourne-compatible shells, you can do that on the command line like
this:
   CC='gcc -traditional' DEFS=-D_POSIX_SOURCE ./configure

The `make' variables that you might want to override with environment
variables when running `configure' are:
(For these variables, any value given in the environment overrides the
value that `configure' would choose:)
CC          C compiler program.
            Default is `cc', or `gcc' if `gcc' is in your PATH.
INSTALL     Program to use to install files.
            Default is `install' if you have it, `cp' otherwise.

(For these variables, any value given in the environment is added to
the value that `configure' chooses:)
DEFS        Configuration options, in the form `-Dfoo -Dbar ...'
            Do not use this variable in packages that create a
            configuration header file.
LIBS        Libraries to link with, in the form `-lfoo -lbar ...'



2.  Type `make' to compile the package.  If you want, you can override
the `make' variables CFLAGS and LDFLAGS like this:

   make CFLAGS=-O2 LDFLAGS=-s


3.  Type `make install' to install programs, data files, and
documentation.


4.  You can remove the program binaries and object files from the source 
directory by typing `make clean'.  To also remove the Makefile and all the 
files that `configure' created, type `make distclean'.


The file `configure.in' is used as a template to create `configure' by
a program called `autoconf'.  You will only need it if you want to
regenerate `configure' using a newer version of `autoconf'.



USAGE


As for actually using this program?  Well, that's left largely up to you,
but it is usually used as an addition to the squid proxy server.  As such
it reads a line containing the url from stdin, and writes to stdout either
a newline if the url is unmatched, or the new url if it was redirected.  The
redirection is controlled by the config file, explained below.

The format of the input string (as given by squid) is:

     <url> <source address> <other stuff....>
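
To make the exchange concrete, here is a rough Python sketch of the loop a
redirector of this kind runs.  It is only an illustration, not the urlredir
source: the function names are made up, and lookup() is a hypothetical
stand-in for the config-driven matching explained below.

```python
import sys

def parse_request(line):
    """Split one input line into (url, source address, other fields)."""
    fields = line.split()
    url = fields[0] if fields else ""
    source = fields[1] if len(fields) > 1 else ""
    return url, source, fields[2:]

def format_reply(new_url):
    """Reply with the new url, or a bare newline if the url is unmatched."""
    return (new_url or "") + "\n"

def serve(lookup, infile=sys.stdin, outfile=sys.stdout):
    """Answer one request per input line.

    lookup(url, source) is hypothetical: it returns the new url, or None
    when nothing in the config matched.
    """
    for line in infile:
        url, source, _other = parse_request(line)
        outfile.write(format_reply(lookup(url, source)))
        outfile.flush()  # reply straight away so the proxy is not kept waiting
```

The flush after every reply matters in a sketch like this because the proxy
sits waiting on the other end of the pipe for each answer.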



CONFIGURATION

This is the important section. 

The redirector uses a configuration file (normally /etc/urlredir.conf) to 
control the way it redirects proxy requests.  The configuration file uses a
C-style syntax: each command line either opens a subgroup (think of a
function body, enclosed in braces {}) or is terminated by a ';'.
There are two different command types in the config file:  search commands
and action commands.

First I'll define the search commands.  These are:

      contains <key>... {
      hostcontains <key>... {
      pathcontains <key>... {
      exempt <key>... ;
      hostexempt <key>... ;
      pathexempt <key>... ;


The "contains" commands search the url for any of the keys (more than 
one can be specified per line, note the ... notation), and then enter the
subgroup (note the opening '{') if one is found.  The "exempt" commands also 
search for any of the keys, and if one is found they exempt the url from any
further searching.

The prefixes "host" and "path" mean that only the host or the path of the
url is searched for the keys.
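
As a rough illustration of what "host" and "path" select, here is a small
Python sketch.  It is not how urlredir itself splits the url; search_space()
is a made-up helper, and leaving the port and the query string out of the
search space is an assumption on my part.

```python
from urllib.parse import urlsplit

def search_space(url, scope=""):
    """Return the part of url that a command with the given prefix searches.

    scope is "host", "path" or "" (no prefix: search the whole url).
    """
    parts = urlsplit(url)
    if scope == "host":
        return parts.hostname or ""  # assumption: port is not included
    if scope == "path":
        return parts.path  # assumption: the query string is not included
    return url
```

So a "hostcontains" key would be tried against the host name only, and a
"pathcontains" key against whatever follows it.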

A note on keys:  At the moment the only special characters that these keys
support are a ^ at the start, or a $ at the end, which anchor the match to
the start or end of the search space respectively.  It would be nice to
convert this to regular expression searching sometime (see wish list).
Also, if the command "file" is given instead of a key, then the following
key is interpreted as a filename, from which keys are read, one per line.

      e.g.  contains file "/var/lib/bann/banned.list" {
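
The ^ and $ rules can be sketched in a few lines of Python.  This is only a
model of the behaviour described above, not the search code the program
uses; key_matches() is a made-up name, and two details are assumptions:
matching is case-sensitive, and a key with both ^ and $ must equal the whole
search space.

```python
def key_matches(key, text):
    """Return True if key is found in text, honouring ^ and $ anchors."""
    anchored_start = key.startswith("^")
    anchored_end = key.endswith("$")

    # Strip the anchor characters to get the literal text to look for.
    body = key[1:] if anchored_start else key
    if anchored_end:
        body = body[:-1]

    if anchored_start and anchored_end:
        return text == body  # assumption: both anchors mean an exact match
    if anchored_start:
        return text.startswith(body)
    if anchored_end:
        return text.endswith(body)
    return body in text  # plain key: match anywhere in the search space
```

For instance, the key "^ads." would match the host "ads.example.com" but not
"www.ads.example.com", while the plain key "ads." would match both.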


Now, onto the actions.  The current actions supported are as follows:

      redirect <url>... ;
      logfile filename;
      randomness 0..100;


All these commands act only upon the subgroup they appear in, except for
logfile, which is also inherited by all child subgroups (unless a child
defines a new logfile).

First, redirect.  This specifies the urls to redirect the request to.  This is
most useful inside a "contains" class, to redirect requests that match certain
keys.  If more than one url is provided, a random one is chosen.  There
can also be multiple redirect lines in a group, with the same effect as a 
single line containing all the urls.
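
In other words, all the redirect lines of a group end up in one pool.  A
minimal Python sketch of that idea follows; pick_target() is a made-up name,
and picking uniformly from the pool is an assumption (the text above only
promises "a random one").

```python
import random

def pick_target(redirect_lines, rng=random):
    """Pick one url from all the redirect lines of a group.

    redirect_lines is a list of lists: one inner list of urls per
    redirect line.  Returns None if the group has no redirect lines.
    """
    pool = [url for line in redirect_lines for url in line]
    return rng.choice(pool) if pool else None
```

Passing [["a", "b"], ["c"]] therefore behaves the same as passing
[["a", "b", "c"]].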

The logfile entry specifies the file to log to when redirecting.  A log 
entry of the following format is made:

      <date & time> <source of request> <url requested>

As mentioned above, the active logfile is inherited into any child
subgroups, unless the logfile is set within the child.  This means that
you could set a logfile at the top of the config file, that would be used
for all redirect actions.  This is possible as the entire config file is
considered as a "contains" class, with a key that always matches.
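
The inheritance rule can be pictured as a walk up the tree of subgroups.
Here is a small Python sketch of that lookup; the Group class and the file
names are invented for the illustration and say nothing about how urlredir
stores its config internally.

```python
class Group:
    """One "contains" class; the whole config file is the root group."""

    def __init__(self, parent=None, logfile=None):
        self.parent = parent
        self.logfile = logfile  # None means "not set in this group"

    def effective_logfile(self):
        """Return the nearest logfile, looking at this group first."""
        group = self
        while group is not None:
            if group.logfile is not None:
                return group.logfile
            group = group.parent
        return None  # no logfile set anywhere up the chain


root = Group(logfile="/var/log/redir.log")          # hypothetical path
ads = Group(parent=root)                            # inherits from root
banners = Group(parent=ads, logfile="/var/log/banners.log")
```

Here ads logs to the file set at the top of the config, while banners uses
its own.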

Lastly, the randomness command specifies the percentage likelihood that a
particular redirecting class (i.e. a "contains" class) applies.  This is 
useful for redirecting only part of the time.  The default is 100%.
Currently, if a redirect is not applied, the url is exempted from further
matching; however, this is likely to change in later releases.
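
The percentage check itself amounts to very little code.  The Python sketch
below shows one way to do it; class_applies() is a made-up name and the use
of a uniform random draw is an assumption, not a description of the actual
source.

```python
import random

def class_applies(randomness=100, rng=random):
    """Return True with the given percentage likelihood (0..100).

    rng.random() is uniform on [0, 1), so 100 always applies and
    0 never does.
    """
    return rng.random() * 100 < randomness
```

With "randomness 25;" in a class, roughly one matching request in four would
be redirected, and the rest would be left alone.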

Also, see the archive example.tar.gz in the source directory or 
/usr/local/doc/urlredir for a further example.



WISH LIST

This program was put together rather quickly, and as such possibly has a 
few bugs.  It also has a few possible refinements that would have been nice
to implement.  Some of these are:

      Regular Expression searching.
      Man pages
      etc.

If anyone feels like adding any of these then go right ahead, and please 
send a copy back to us to add into the source.



COPYRIGHT

This utility has been put together by Chris Leishman and Trevor Cohn for the
Ormond College IT Department.  It is currently maintained by us.

   This utility is free software: you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by the 
   Free Software Foundation; version 2 dated June, 1991.

   This program is distributed in the hope that it will be useful, but 
   WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
   for more details.

   You should have received a copy of the GNU General Public License
   along with this program;  if not, write to the Free Software
   Foundation, Inc., 675 Mass Ave., Cambridge, MA 02139, USA.

On Debian GNU/Linux systems, the complete text of the GNU General
Public License can be found in `/usr/doc/copyright/GPL'.


Chris Leishman   <chris@ormond.unimelb.edu.au>
Trevor Cohn      <trev@ormond.unimelb.edu.au>
