by Stewart Brodie
This article was going to be about manipulating Draw diagrams, in response to Paul's request in the November 1994 CAUGers. However, since Graham Jones is doing that, Paul asked if I wouldn't mind giving an overview of my current project: ArcWeb, a World Wide Web browser. I'm not going to explain the details of the Web here, although a brief overview is required.
The World Wide Web, or WWW for short, is a collection of documents,
scattered around the Internet. Many of these documents are in HTML
(HyperText Markup Language) format, which is plain ASCII text where some
characters ("markup") have special meaning. For example:
<em>hello</em> is an instruction that the word
"hello" should be emphasized. There are many other
"tags" to produce special effects such as headings, paragraph
breaks, bold, italics etc. However, the anchor tag
can contain an instruction that the following text is a link to another
document: it is a hypertext link (hyperlink). ArcWeb is a tool which
displays HTML documents by converting the raw HTML into a Draw diagram
which can then be displayed to the user, saved in a common data format and
easily printed (by Draw itself). When ArcWeb is displaying the page, you
can click on hyperlinks and the page which the link refers to will be
displayed.
So how does it do that? As soon as an HTML file is ready to be displayed, it is first parsed into an intermediate form which the rendering routines can easily decode and translate into instructions for creating Draw objects. I chose to write my own parser instead of using lex and yacc because HTML is context-sensitive.
I shall leave the non-trivial details of parser design for now and move on to the Draw object creation. Parsing results in instructions to create five different types of Draw object. Although RISC_OSLib contains functions for all the object manipulation, I chose to add objects "by hand" since it is much faster. However, I use the types declared in the RISC_OSLib header files, because these are useful. I use Acorn's new DrawFile module instead of the RISC_OSLib code to plot the Draw diagram because the module code is faster.
The first thing you need to do is to create an empty diagram. Use
malloc() to create a
draw_diag structure and then
flex_alloc() to grab memory for the actual data. It is
important to use
malloc() for the draw_diag structure
because flex requires that the pointer in it does not move. Using the flex
memory functions means that, as pages are built and destroyed, the Wimpslot
automatically increases and decreases. As each object is added to the
diagram, I call flex_extend() to increase the memory available,
then put the object data straight into the diagram and update the diagram
bounding box directly. There are more efficient ways of managing the
memory, such as allocating it in larger chunks to save on calls to
flex_extend(). This is important when there are many calls to
flex_extend(), since flex may have to reorganise its memory on
each call, which involves copying chunks of memory around.
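The chunked approach mentioned above can be sketched in portable C. This is only an illustration, not ArcWeb's code: realloc() stands in for flex_extend() so that the sketch runs anywhere, and the diagram type, the CHUNK size and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 4096  /* hypothetical growth step, in bytes */

/* Portable stand-in for a flex-backed diagram: realloc() plays the
   part of flex_extend() so that the sketch runs anywhere. */
typedef struct {
    char  *data;      /* diagram bytes */
    size_t used;      /* bytes holding valid object data */
    size_t capacity;  /* bytes currently allocated */
    int    extends;   /* how many times the block has been grown */
} diagram;

/* Make sure 'extra' more bytes fit, growing in whole chunks.
   Returns 1 on success, 0 if memory ran out. */
static int diagram_reserve(diagram *d, size_t extra)
{
    size_t need = d->used + extra;
    if (need > d->capacity) {
        size_t new_cap = ((need + CHUNK - 1) / CHUNK) * CHUNK;
        char *p = realloc(d->data, new_cap);
        if (p == NULL) return 0;
        d->data = p;
        d->capacity = new_cap;
        d->extends++;
    }
    return 1;
}

/* Append one object's bytes to the end of the diagram. */
static int diagram_append(diagram *d, const void *obj, size_t size)
{
    assert(size % 4 == 0);  /* Draw objects are a whole number of words */
    if (!diagram_reserve(d, size)) return 0;
    memcpy(d->data + d->used, obj, size);
    d->used += size;
    return 1;
}
```

Starting from an empty diagram, appending a thousand 24-byte objects this way grows the block six times rather than a thousand. The price is up to one chunk of unused memory at the end of each diagram.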
The font table object should always be the first object in your Draw file. The PRMs say that it needs to precede all text objects but, if it is the first, then you won't have a problem. (There is also a mistake in the PRMs, which say that each entry in the font table should be padded with zeroes to the next word boundary. This is not the case.) I have 20 or so fonts in the header since there are many different text styles that may be used in a document: six heading styles and then body text, hypertext and fixed-width text all with plain, bold, italic and bold italic styles, plus one or two others. These fonts do not need to be different: normally body text and hypertext are the same font and it is the colours which are different.
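To make the padding point concrete, here is a minimal sketch of building a font table object into a byte buffer. It assumes the layout as I understand it (a type word of 0, a size word, then one font-number byte and a zero-terminated name per entry, with padding only at the very end of the object), so check it against a file saved by Draw before relying on it. The function names and the sequential font numbering are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write a 32-bit little-endian word. */
static void put_word(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Build a font table object in 'out' (capacity 'cap' bytes).
   Font numbers are assigned 1..count in order.
   Returns the object size in bytes, or 0 if it does not fit. */
static size_t build_font_table(unsigned char *out, size_t cap,
                               const char *const names[], int count)
{
    size_t pos = 8;  /* leave room for the type and size words */
    int i;
    assert(count >= 1 && count <= 255);
    for (i = 0; i < count; i++) {
        size_t len = strlen(names[i]) + 1;    /* include the terminator */
        if (pos + 1 + len > cap) return 0;
        out[pos++] = (unsigned char)(i + 1);  /* font number */
        memcpy(out + pos, names[i], len);
        pos += len;                           /* no per-entry padding */
    }
    while (pos % 4 != 0) {   /* pad the whole object to a word boundary */
        if (pos >= cap) return 0;
        out[pos++] = 0;
    }
    put_word(out, 0);                       /* object type 0 = font table */
    put_word(out + 4, (unsigned long)pos);  /* object size in bytes */
    return pos;
}
```

With the two names "Trinity.Medium" and "Corpus.Medium", the entries occupy 16 and 15 bytes after the 8-byte header, and a single zero byte at the end rounds the object up to 40 bytes.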
The parser will not split up the text into individual words since this
would mean larger Draw files and slower rendering (more calls to
Font_Paint). Yet a block of text may not fit onto the rest of
the current line. Some words might fit, maybe none, or maybe all of them.
The text creator uses SWI
Font_ScanString to find out where to
split the string, if anywhere. It is told that spaces are the split
characters and how much space is left on the current line. Once that has
been decided, the text object is added to the diagram with the correct
bounding box (returned by SWI
Font_ScanString) and the rest of
the string is then handled the same way. If no more objects fit on the
current line then a new line is started. If the text is hypertext then its
colour is changed to blue and it is underlined (see Path Objects below).
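The splitting logic can be sketched without the Font Manager. In this illustration a fixed character width stands in for the measuring that Font_ScanString does, so it runs on any system. Both function names are hypothetical, and every line is assumed to start empty, which is simpler than the real case where part of the current line is already used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Font_ScanString: every character is
   'char_w' units wide. Returns how many leading characters of 's'
   fit within 'avail' units when the text may only be split at
   spaces. Returns 0 when not even the first word fits. */
static size_t split_at_space(const char *s, int avail, int char_w)
{
    size_t i, fit = 0;
    assert(char_w > 0);
    for (i = 0; ; i++) {
        if (s[i] == ' ' || s[i] == '\0') {
            /* s[0..i) ends on a word boundary: does it fit? */
            if ((long)i * char_w > avail) break;
            fit = i;
            if (s[i] == '\0') break;
        }
    }
    return fit;
}

/* Count the lines needed to lay out 's' in lines 'line_w' wide.
   Each loop pass corresponds to adding one text object. */
static int count_lines(const char *s, int line_w, int char_w)
{
    int lines = 0;
    while (*s == ' ') s++;          /* ignore leading spaces */
    while (*s != '\0') {
        size_t n = split_at_space(s, line_w, char_w);
        if (n == 0) {
            /* A single word wider than the line: place it anyway. */
            while (s[n] != ' ' && s[n] != '\0') n++;
        }
        s += n;
        while (*s == ' ') s++;      /* drop the spaces at the break */
        lines++;
    }
    return lines;
}
```

For the string "hello world again" with 10-unit characters, a 120-unit line takes "hello world" as the first object and "again" as the second, while a 50-unit line needs three objects.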
Transformed text objects are used when tags indicating scaled text are found. The objects are handled in exactly the same way as normal text except for the 7 extra words in the Draw object header, which contain a transformation matrix and a flag word.
HTML allows the inclusion of inlined images. Unfortunately, most of the images on the Web are in GIF or JPEG format. Currently, Draw files only allow RISC OS sprites to be embedded in them, so images need to be converted from the original format into sprites. ArcWeb runs ChangeFSI to convert any images it finds into sprites. Once a sprite is available, the sprite object is added just like a text object: it is checked against the space left on the current line, and so on.
HTML has a special tag
<hr> which stands for
Horizontal Rule. ArcWeb generates a line across the page using a simple
path object made of three components: move, line to and end. To speed
things up, when the program starts, it generates two composite Draw objects
(a line and a box) which it can add to a diagram when needed. These are
initialised with all the correct headers and the correct path commands. The
line and box routines need only fill in the bounding box, coordinates and
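The template idea for the line can be sketched as follows. The word layout is the Draw path object as I recall it (type 2, size, bounding box, fill colour, outline colour, outline width, style word, then move, line and end components tagged 2, 8 and 0) and should be checked against the PRMs. The names rule_template and make_rule are hypothetical, and int is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <string.h>

enum { PATH_WORDS = 17 };  /* 6 header + 4 style + 3 move + 3 line + 1 end */

/* Template built once at start-up. The values follow the assumed
   path object layout described in the lead-in. */
static const int rule_template[PATH_WORDS] = {
    2,                /* object type 2 = path */
    PATH_WORDS * 4,   /* object size in bytes */
    0, 0, 0, 0,       /* bounding box x0, y0, x1, y1 (filled in later) */
    -1,               /* fill colour: -1 = none */
    0,                /* outline colour: black */
    0,                /* outline width: 0 = thinnest */
    0,                /* path style word */
    2, 0, 0,          /* move to x, y (filled in later) */
    8, 0, 0,          /* line to x, y (filled in later) */
    0                 /* end of path */
};

/* Copy the template and fill in a horizontal line from x0 to x1 at
   height y, all in Draw units. */
static void make_rule(int out[PATH_WORDS], int x0, int x1, int y)
{
    assert(x0 <= x1);
    memcpy(out, rule_template, sizeof rule_template);
    out[2] = x0;  out[3] = y;   /* bounding box, bottom left */
    out[4] = x1;  out[5] = y;   /* bounding box, top right */
    out[11] = x0; out[12] = y;  /* move to */
    out[14] = x1; out[15] = y;  /* line to */
}
```

The bounding box here has zero height for simplicity. A real one would probably need to be widened to allow for the thickness of the line.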
The process of converting the HTML into a Draw diagram can be time
consuming for anything but a small document so it is desirable that you can
see the early parts of the diagram while the rest is still being created.
This creates many problems, particularly that the integrity of the Draw
diagram data must be guaranteed when an attempt at plotting the data is
made. I maintain a set of offsets into the Draw diagram including the
current "safe limit" which I pass as the diagram size to SWI
As each line of the diagram is built,
used to draw the current line on the screen. Every so often, the renderer
calls event_process() to allow the desktop to remain usable.
It keeps processing events until it receives a Null event code, so the
diagram remains up to date as the user scrolls up and down the document,
and the system remains as responsive as possible. Because the document will
not contain any overlapping objects and drawing is clipped to the current
redraw rectangle, redraws are fast and there is little to be gained from
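The "safe limit" idea can be sketched in portable C. This is an illustration under stated assumptions rather than ArcWeb's code: the page type and all three function names are hypothetical, count_objects() merely stands in for the real plot call, the 40-byte Draw file header is left out, and int is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char data[1024];  /* fixed buffer keeps the sketch short */
    size_t safe_limit;         /* bytes below this hold complete objects */
} page;

/* Return space for an object of 'size' bytes beyond the safe limit,
   or NULL if it will not fit. The renderer cannot see it yet. */
static unsigned char *page_begin(page *p, size_t size)
{
    if (size > sizeof p->data - p->safe_limit) return NULL;
    return p->data + p->safe_limit;
}

/* Publish a finished object by moving the safe limit past it. */
static void page_commit(page *p, size_t size)
{
    assert(size >= 8 && size % 4 == 0);
    assert(size <= sizeof p->data - p->safe_limit);
    p->safe_limit += size;
}

/* Hypothetical stand-in for the plot call: walks 'size' bytes of
   objects, reading each object's length from its second word, and
   returns how many complete objects it saw. */
static int count_objects(const unsigned char *data, size_t size)
{
    size_t pos = 0;
    int n = 0;
    while (pos + 8 <= size) {
        int hdr[2];                      /* type word, size word */
        memcpy(hdr, data + pos, sizeof hdr);
        if (hdr[1] < 8 || (size_t)hdr[1] > size - pos) break;
        pos += (size_t)hdr[1];
        n++;
    }
    return n;
}
```

An object that has been started with page_begin() but not yet passed to page_commit() lies beyond the safe limit, so a redraw that arrives halfway through building it sees only the complete objects before it.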
From CAUGers volume 2 issue 3. Comments to firstname.lastname@example.org