% Documentation for: HydraLib --- (C interface to Hydra)
%                    !HydraFP --- Floating Point support for Hydra CPUs
%                    ParaBrot --- Parallel Mandelbrot set generator

\documentstyle[a4wide,11pt,titlepage]{article}

% Misc. bumf...

\title{HydraLib, HydraFP and ParaBrot\\
       User's Guide}
\author{Neil A Carson}
\date{Mon 29th April 1996 AD}

% The document :-)

\begin{document}

 \maketitle

 % Might as well have a TOC for now --- depends how big this thing gets...

 \tableofcontents

 % And the start on a fresh page please!

 \newpage

 \begin{abstract}
  This document aims to act as a guide to using, and in
  the case of ParaBrot, examining, the programs listed
  below. It is not intended to be a guide to programming
  the Hydra; for that, look at the official {\bf Simtec
  Hydra Multiprocessor System} documentation.
  \begin{description}
   \item[HydraLib:] A C library for interfacing with the
    RISC OS Hydra support module. It provides several
    useful types and functions, so that {\tt \_kernel\_swi}
    need not be called directly.
   \item[!HydraFP/!KillFP:] Floating point emulation and
    floating point accelerator support code for processors
    on the Hydra card. Processors with and without Floating
    Point Accelerators may be mixed, and support for the
    full FP instruction set is provided.
   \item[ParaBrot:] A parallel Mandelbrot generator which
    serves as an example of writing single tasking code
    for the Hydra, and is also interesting to explore!
  \end{description}
  This document was composed on an Acorn RiscPC, and printed
  on a VAX/VMS system, using Leslie Lamport's \LaTeX\ and
  Donald Knuth's \TeX\ typesetting system.
 \end{abstract}

 \section{About Hydra}
  \subsection{Introduction}
   Hydra is a hardware add--on for Acorn Computers' RiscPC
   ARM--based desktop computer systems. RiscPC machines
   have the ability to support more than one processor. As
   standard, they have two processor slots. One is normally
   occupied by the ARM processor card (the primary processor),
   and the other is free, allowing the addition of a second
   ARM, Intel, Motorola or other secondary processor.

   While the design of the primary processor card may be
   relatively simple, the second processor card must
   incorporate a certain amount of arbitration logic in
   order for it to be able to share the bus with the primary
   processor.

   Although there are different design requirements for
   primary and secondary processor cards, the processor slots
   on a standard RiscPC are electrically identical.

   Hydra interfaces with the RiscPC via one of the processor
   slots, and provides:
   \begin{itemize}
    \item Duplicates of the primary and secondary processor slots
    \item Four {\em further\/} processor slots, together with
     the necessary arbitration logic to support a further four
     ARM or other processor cards.
   \end{itemize}
   In addition to providing physical connections for a total of
   six processor cards, Hydra also implements arbitration logic
   which allows the four slave ARM processor cards to be `simple'
   cards like the primary processor card. This makes it possible
   to add up to four off--the--shelf ARM processor cards to
   any RiscPC system.

  \subsection{The Hydra API}
   With four slave processor cards fitted, a RiscPC with Hydra
   has, in theory, five times the processing power of a standard
   RiscPC. Unfortunately, the operating system, RISC OS, is not
   a multiprocessor OS, and therefore has no way of taking
   advantage of this increased processing power.

   One way to make effective use of Hydra is to switch to an
   operating system which {\em does\/} support multiprocessing,
   such as RiscBSD or Taos\footnote{Hydra versions of both
   RiscBSD and Taos are in development}. This has the advantage
   that any applications software which can multithread will
   automatically take advantage of any available extra processors.

   The other way to harness the power of Hydra is to write
   applications software which uses the {\bf Hydra API}, which
   is described in the document {\bf Simtec Hydra ARM Multiprocessor
   System}.

 \section{HydraLib}
  \subsection{Introduction}
   The C Library ``HydraLib'' provides a simple set of
   veneers to allow calls to the Hydra SWIs from the
   C language, without having to directly use the
   {\tt \_kernel\_swi} call, which looks ugly in code.
   It provides veneers for most of the {\tt Hydra\_} SWIs which
   anyone would want to use in an application. For example,
   the Hydra Floating Point Emulator and Floating Point
   Accelerator Support Code (FPE/FPASC) in fact uses this
   library to operate, as does the ParaBrot fractal plotter.

   The library interface is analogous
   to the SWI Interface as provided by the {\tt HydraSupport}
   module. More precise details can be found by looking at
   the {\verb|hydra.h|} header file. The root directory
   {\tt Hydra} should ideally be copied to the same place as
   your {\tt Risc\_OSLib}, {\tt CLib}, etc.

   Several types are defined which may be of use. The C source
   code for the library is also provided should you wish to
   modify it (or correct any bugs!).
  \subsection{Function Calls}
   In more detail, the following are provided:
   \begin{tabbing}
    {\em Name of HydraLib function\/\/\/} \= {\em Analogous Hydra SWI\/\/\/} \=
    {\em Description} \\
    hydra\_version         \> Hydra\_Version     \> Returns code version      \\
    hydra\_processors      \> Hydra\_Processors  \> Returns CPU type words    \\
    hydra\_reset           \> Hydra\_Reset       \> Resets some CPUs          \\
    hydra\_snapshot\_regs  \> Hydra\_SnapshotRegs\> Snapshot a given CPU      \\
    hydra\_new\_chunk      \> Hydra\_NewChunk    \> Create a new memory chunk \\
    hydra\_free\_chunk     \> Hydra\_FreeChunk   \> Free a memory chunk       \\
    hydra\_resize\_chunk   \> Hydra\_ResizeChunk \> Resize a memory chunk     \\
    hydra\_map\_chunk      \> Hydra\_MapChunk    \> Map a chunk onto a slave  \\
    hydra\_call            \> Hydra\_Call        \> {\tt *GO} for a given CPU \\
    hydra\_poll\_task      \> Hydra\_PollTask    \> Poll the state of a thread\\
    hydra\_new\_alias      \> Hydra\_NewAlias    \> Alias a dynamic area
   \end{tabbing}
   For more information on these calls, look in the main
   Hydra documentation under the corresponding SWI.
   Most of these functions return a standard
   {\tt \_kernel\_oserror} pointer, which is {\tt NULL} if
   no error occurred.
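
   As a rough illustration of this error convention, the fragment
   below checks a returned error pointer before carrying on. It is
   a sketch only and does not use the real HydraLib prototypes
   (see {\verb|hydra.h|} for those): {\tt sketch\_oserror} is a
   placeholder assumed to have the same general shape as
   {\tt \_kernel\_oserror}, while {\tt sketch\_new\_chunk} and
   {\tt make\_chunk} are hypothetical names invented for the example.
\begin{verbatim}
#include <stddef.h>
#include <stdio.h>

/* Placeholder for _kernel_oserror, declared here only so that the
   sketch compiles without the RISC OS headers. */
typedef struct {
    int  errnum;
    char errmess[252];
} sketch_oserror;

/* Hypothetical veneer: fails when asked for a zero-sized chunk. */
sketch_oserror *sketch_new_chunk(unsigned size, int *handle)
{
    static sketch_oserror bad_size = { 1, "Bad chunk size" };
    if (size == 0)
        return &bad_size;
    *handle = 42;           /* made-up handle value */
    return NULL;            /* NULL means no error  */
}

/* Returns 1 on success, 0 on failure, reporting any error text. */
int make_chunk(unsigned size, int *handle)
{
    sketch_oserror *e = sketch_new_chunk(size, handle);
    if (e != NULL) {
        fprintf(stderr, "Error %d: %s\n", e->errnum, e->errmess);
        return 0;
    }
    return 1;
}
\end{verbatim}
   The same test against {\tt NULL} applies after each call that
   returns an error pointer.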
  \subsection{Types}
   The following useful types, and one constant, are defined:
   \begin{tabbing}
    {\em Name of C language type\/}  \= {\em Usage} \\
    hydra\_chunk          \> Chunk handle                                     \\
    hydra\_thread         \> Thread handle                                    \\
    hydra\_Max\_CPUs      \> Maximum possible CPUs on a Hydra card            \\
    hydra\_cpu\_id\_block \> For holding processor ID words                   \\
    hydra\_cpu\_mask      \> A bit mask enumeration for selecting a given CPU \\
    hydra\_cpu\_snapshot  \> Large type for holding a CPU state snapshot      \\
    hydra\_access         \> Enumeration for determining ro/rw chunk state    \\
    hydra\_map\_type      \> For specifying whether to map or unmap a chunk   \\
    hydra\_chunk\_parms   \> Data needed by hydra\_new\_chunk                 \\
    hydra\_chunk\_data    \> Information returned when requesting chunk info  \\
    hydra\_thread\_state  \> Indicates the state of an executing thread
   \end{tabbing}
   For more information on exactly where these should be
   used, take a look at the arguments to the function
   calls above, in {\verb|hydra.h|}.
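
   To show the general idiom behind a bit mask enumeration such as
   {\tt hydra\_cpu\_mask}, here is a small sketch. Every name and
   bit assignment in it is an assumption made for illustration
   (including the limit of four CPUs); the real definitions are
   those in {\verb|hydra.h|}.
\begin{verbatim}
/* Hypothetical bit assignments: one bit per slave CPU. */
typedef enum {
    sketch_cpu_0   = 1 << 0,
    sketch_cpu_1   = 1 << 1,
    sketch_cpu_2   = 1 << 2,
    sketch_cpu_3   = 1 << 3,
    sketch_cpu_all = 0xF
} sketch_cpu_mask;

#define SKETCH_MAX_CPUS 4   /* plays the role of hydra_Max_CPUs */

/* Mask bit for CPU number n, or 0 if n is out of range. */
unsigned sketch_mask_for(int n)
{
    return (n >= 0 && n < SKETCH_MAX_CPUS) ? (1u << n) : 0u;
}

/* Count how many CPUs are selected by a mask. */
int sketch_count_cpus(unsigned mask)
{
    int n, count = 0;
    for (n = 0; n < SKETCH_MAX_CPUS; n++)
        if (mask & (1u << n))
            count++;
    return count;
}
\end{verbatim}
   Masks of this kind are combined with the bitwise OR operator, so
   that one call can address several CPUs at once.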

 \section{The Hydra FPE/FPASC}
  \subsection{Introduction}
   As it stands, the Hydra API described above makes no provision
   for executing ARM Floating Point instructions on slave
   processors which lack a Floating Point Accelerator (FPA)
   coprocessor. Even on processors which have one, the more
   exotic instructions in the instruction set cannot be
   executed by the hardware. !HydraFP exists to fill these gaps.
  \subsection{Overview}
   The Hydra FPE/FPASC (Floating Point Emulator/Floating Point
   Accelerator Support Code) is a piece of software which allows
   instructions in the ARM Floating Point Instruction Set
   to execute on CPUs on the Hydra multiprocessor card. It can
   be thought of as behaving in the same way as the RISC OS
   FPEmulator relocatable module. The code is invoked by
   double--clicking on the !HydraFP application; note, however,
   that !Hydra must be loaded first.
   Upon loading, it will:
   \begin{enumerate}
    \item Probe to find what processors are present.
    \item Create a memory chunk to hold the shared code.
    \item Map the shared code into each CPU's memory map.
    \item Execute initialisation code on each processor, which
     will set up a small amount of workspace for the code, in
     the lower kernel area of each processor.
    \item Detect whether or not an FPA is present, and output a
     string to the appropriate window in !HydraTerm reflecting
     the result of this probe.
    \item Claim the Undefined Instruction vector on each processor.
   \end{enumerate}

   For code executed on CPU cards with no Floating Point
   Accelerator IC, the code will emulate {\em all\/} floating
   point instructions.
   
   For CPU cards {\em with\/} Floating Point Accelerators,
   the code will emulate instructions like {\tt SIN}, which
   cannot currently be performed by the hardware. Cards with
   and without Floating Point Accelerators {\em can\/} be
   mixed at will.
   
   When run, several system variables will be set; in
   particular, if {\verb|HydraFP$ChunkHandle|} is set to
   something other than {\verb|""|} or {\verb|"0"|} then
   you can count on the code being loaded.
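
   A client program could apply this test from C roughly as
   follows. The function names are hypothetical, and the sketch
   assumes that {\tt getenv} reads RISC OS system variables, as it
   does in the standard Acorn C library.
\begin{verbatim}
#include <stdlib.h>
#include <string.h>

/* Interpret a value of HydraFP$ChunkHandle: unset, empty or "0"
   all mean that the code is not loaded. */
int fp_value_means_loaded(const char *value)
{
    if (value == NULL)
        return 0;
    return strcmp(value, "") != 0 && strcmp(value, "0") != 0;
}

/* Non-zero if the floating point code appears to be loaded. */
int hydra_fp_loaded(void)
{
    return fp_value_means_loaded(getenv("HydraFP$ChunkHandle"));
}
\end{verbatim}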
  \subsection{Traps}
   The standard IEEE floating point traps are catered for,
   namely:
   \begin{itemize}
    \item Invalid Operation
    \item Division by Zero
    \item Underflow
    \item Overflow
    \item Inexact Operation
   \end{itemize}
   When a trap (or ``exception'') occurs, a message to say
   what has happened is output to the given !HydraTerm window,
   and the task that went wrong is instantly terminated.
  \subsection{Resetting}
   The CPUs on the Hydra card can be reset at any time---you
   may need to do this if your code crashes. A reset will,
   however, pose a problem for any FP code which has been
   loaded onto the CPUs. To fix this, kill the floating point
   code, then restart it.

   On some early versions of the Hydra API this can cause some
   errors; these can be ignored.
  \subsection{Removal}
   It may not be appropriate to have floating point emulation
   and FPA support code loaded all the time, consuming
   32KB of memory. You can remove the code by running the
   !KillFP application. Upon doing so, a message should appear
   in the !HydraTerm windows stating that the code has been
   removed (it is a good idea to have !HydraTerm running when
   doing {\em any\/} work on the Hydra!).

   !HydraFP will keep track of the number of times it is run;
   if the code is already loaded then it will not re--load.
   In a similar vein, !KillFP will not in fact remove the
   code until the number of ``copies'' of it running reaches
   zero. This means that several different client programs
   can invoke !HydraFP and !KillFP without worrying whether
   or not anyone else is doing, or has done, the same.
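
   The counting behaviour described above can be modelled in a few
   lines of C. This is only a model: all of the names are
   hypothetical, and nothing here is taken from the actual !HydraFP
   or !KillFP sources.
\begin{verbatim}
/* Hypothetical state modelling the load count. */
static int fp_clients  = 0;  /* outstanding runs of !HydraFP  */
static int fp_resident = 0;  /* non-zero while code is loaded */

/* Model of running !HydraFP: load only on the first request. */
void sketch_hydrafp_run(void)
{
    if (fp_clients == 0)
        fp_resident = 1;     /* real code would load here */
    fp_clients++;
}

/* Model of running !KillFP: unload when the last client goes. */
void sketch_killfp_run(void)
{
    if (fp_clients == 0)
        return;              /* nothing loaded, nothing to do */
    fp_clients--;
    if (fp_clients == 0)
        fp_resident = 0;     /* real code would unload here */
}

/* Non-zero while the modelled code is still loaded. */
int sketch_fp_is_resident(void)
{
    return fp_resident;
}
\end{verbatim}
   With two clients, the code in this model stays resident after
   the first !KillFP and is removed only by the second.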

 \section{ParaBrot}
  \subsection{Introduction}
   The Mandelbrot set is a fractal figure, renowned for
   being computationally expensive; it is highly iterative,
   and does not require much memory access---a great candidate
   for porting to Hydra.

   The source code is present to look over, and it serves as
   a good example of how to write a single--tasking Hydra
   application (note that multi--tasking programs should use the
   ``thread'' interface provided, in order to avoid clashing
   with other tasks also using the card).

   The front end is driven by a small BASIC program (!RunBas)
   which handles zooming, exiting, etc. Every time a screen needs
   to be computed, it invokes the compiled C program !RunImage,
   passing it a set of command line arguments.
   
   !ParaBrot uses both the host ARM and any number of slave ones
   on the Hydra to compute the Mandelbrot set as quickly as possible.
   The actual assembly language code is contained in the
   HostRm and SlaveRM HydraApps within the application directory.
   Note that 32--bit variants (running at a higher precision) are
   also provided, for when you zoom a long way into the set.
   These HydraApps are assembled separately.
  \subsection{Algorithm}
   For the technically interested, the speedy algorithm is roughly
   as follows:
   \begin{enumerate}
    \item If the current screen is symmetrical (one half is a
     reflection of the other), compute only one half and mirror
     it into the other.
    \item Divide the remaining space into $9\times9$ squares.
     Recursively subdivide each $9\times9$ square into smaller
     squares: if the colour at a corner of the square is 0
     (black) then iterate it; if the colour at all four corners
     is the same, then fill the whole square with that colour.
     Carry on until the end of the list of squares is reached.
     Several $9\times9$ squares may be processed concurrently:
     as many as there are CPUs available, which is five in a
     fully loaded RiscPC.
   \end{enumerate}
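
   The subdivision step can be sketched in C as below. This is a
   simplified single--CPU illustration and not the code supplied
   with !ParaBrot: it leaves out the mirroring and the sharing of
   squares between CPUs, it uses {\tt double} arithmetic for
   brevity, and all of the names ({\tt square},
   {\tt mandel\_colour}, {\tt render\_square} and so on) are
   hypothetical.
\begin{verbatim}
#define SIDE  9      /* 9 = 2^3 + 1, so halving lands on pixels */
#define UNSET (-1)

typedef struct {
    double x0, y0, step;   /* position of pixel (0,0), spacing */
    int max_iter;
    int pix[SIDE][SIDE];   /* colour per pixel, UNSET if unknown */
} square;

/* Iteration count at which cx + i*cy escapes, or 0 (black) if it
   has not escaped after max_iter steps. */
int mandel_colour(double cx, double cy, int max_iter)
{
    double x = 0.0, y = 0.0, t;
    int i;
    for (i = 1; i <= max_iter; i++) {
        t = x * x - y * y + cx;
        y = 2.0 * x * y + cy;
        x = t;
        if (x * x + y * y > 4.0)
            return i;
    }
    return 0;
}

/* Colour of one pixel, computed at most once. */
static int corner(square *s, int px, int py)
{
    if (s->pix[py][px] == UNSET)
        s->pix[py][px] = mandel_colour(s->x0 + px * s->step,
                                       s->y0 + py * s->step,
                                       s->max_iter);
    return s->pix[py][px];
}

/* Fill the square (l,t)-(r,b) if its corners agree, else split. */
static void subdivide(square *s, int l, int t, int r, int b)
{
    int c0 = corner(s, l, t), c1 = corner(s, r, t);
    int c2 = corner(s, l, b), c3 = corner(s, r, b);
    int mx, my, x, y;
    if (c0 == c1 && c0 == c2 && c0 == c3) {
        for (y = t; y <= b; y++)
            for (x = l; x <= r; x++)
                if (s->pix[y][x] == UNSET)
                    s->pix[y][x] = c0;
        return;
    }
    if (r - l <= 1)
        return;            /* all four pixels were corners */
    mx = (l + r) / 2;
    my = (t + b) / 2;
    subdivide(s, l,  t,  mx, my);
    subdivide(s, mx, t,  r,  my);
    subdivide(s, l,  my, mx, b);
    subdivide(s, mx, my, r,  b);
}

/* Compute every pixel of one SIDE x SIDE square. */
void render_square(square *s)
{
    int x, y;
    for (y = 0; y < SIDE; y++)
        for (x = 0; x < SIDE; x++)
            s->pix[y][x] = UNSET;
    subdivide(s, 0, 0, SIDE - 1, SIDE - 1);
}
\end{verbatim}
   Filling a square from its corners alone is an approximation:
   detail lying wholly inside a square whose corners agree is lost,
   which is the price paid for the saving in iterations.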
   The number of squares to be calculated, the start address
   of the screen memory (in the case of the host this varies; for a
   slave it is mapped in as an aliased dynamic area at 16MB), and the
   maximum number of iterations desired for the Mandelbrot iterator all
   need to be ``poked'' into the code images as they are loaded. This is
   done by means of entries in the code headers showing where to put
   them, in a similar vein to a RISC OS module. The data is mapped in at
   8MB in the slave environment, and is at an arbitrary address on
   the host.
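
   The idea of header entries which say where each value belongs
   can be sketched as follows. The header layout used here (three
   leading words, each holding a word offset into the image) is
   purely an assumption for the sake of the example, as are all of
   the names; the real layout is defined by the HydraApp sources.
\begin{verbatim}
#include <stddef.h>

typedef unsigned long word;   /* at least 32 bits wide */

/* Hypothetical header: the first three words of the image give
   the word offsets at which the parameters must be stored. */
enum { HDR_SQUARES, HDR_SCREEN, HDR_MAXITER, HDR_WORDS };

/* Store the three parameters in a loaded image. Returns 0 on
   success, or -1 if a header entry points outside the image or
   into the header itself. */
int poke_params(word *image, size_t n_words,
                word squares, word screen, word max_iter)
{
    word values[HDR_WORDS];
    int i;

    if (n_words < (size_t) HDR_WORDS)
        return -1;
    values[HDR_SQUARES] = squares;
    values[HDR_SCREEN]  = screen;
    values[HDR_MAXITER] = max_iter;

    /* Validate every offset before writing anything. */
    for (i = 0; i < HDR_WORDS; i++)
        if (image[i] < (word) HDR_WORDS || image[i] >= n_words)
            return -1;
    for (i = 0; i < HDR_WORDS; i++)
        image[image[i]] = values[i];
    return 0;
}
\end{verbatim}
   For a slave image, the screen address passed in would be the
   16MB alias mentioned above, that is {\tt 0x1000000}.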
  \subsection{Code}
   The C language source is subdivided into a core and a veneer; this
   allows simple porting to other environments, such as the RiscBSD
   RiscPC Unix--like operating system.
   In the ``Module'' directory lies the source for all of the various
   code modules (the HydraApps within the !ParaBrot directory); these
   {\em must\/} be filetyped as {\tt HydraApp} before running. All
   of these sources are in assembly language, and can be reassembled
   with objasm.
\end{document}
