"Cartography and Typography with True BASIC"

1 - Overview

In this paper, written while he was a Research Affiliate at the Naval Postgraduate School, Monterey, CA, Dr. Hershey describes the versions of his typographic system that he rewrote in BASIC and adapted to run on microcomputer systems. (He ported it first to the HP 85 and then to the Macintosh Classic II; the report describes the latter version.) The report includes substantial listings of source code (is it the complete source of the system? I'm not sure) as well as a number of input files (handwritten) and their associated output. It does not, however, include the glyph repertories or the cartographic data.

After a discussion of the necessity of using white-out in the preparation of portable listings, the Introduction to this report concludes with the wonderful observation that:

The absence of interchangeability between machines and programs is pathetic. (2)

Coming from someone whose published programming activity goes back at least as far as 1960 on the Naval Ordnance Research Calculator (NORC), this has great resonance. In the decade since, nothing, really, has changed.

2 - NORC Encoding

Although the programs described are in BASIC for microcomputers, the data are in the format developed for the NORC (Naval Ordnance Research Calculator) some decades earlier. The description of the digitization and encoding of these data may be useful in understanding the repertory even in other encodings. The following is from pages 5 and 8 of this report.

TYPOGRAPHY

The characters in a character repertory can be simulated by polygons if the sides of the polygons are short enough. It is necessary only to list the coordinates of the corners of the polygons. Then a plotter can connect the corners to complete the polygons.

The printer's standard is 10-point type with 2 points of leading, where there are 72 points per inch. Polygonal simulation requires a finer raster.

The smallest circle is simulated by an octagon of radius 7 raster units because the square of 7 is 49 and nearly equal to 50 or twice the square of 5. Distinction between lower case and upper case requires a factor of 3 so the normal height of character is 21 raster units and the normal spacing between lines is 32 raster units. However, clarity is improved if 8 units are added to make the spacing 40 raster units. At the standard of 6 lines per inch there are 240 raster units per inch, which happens to be also the choice of IBM for their addressable raster.
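The arithmetic in the two paragraphs above can be checked in a few lines. The following is only a sketch in Python (not the report's True BASIC); the constant names and the explicit corner list for the octagon are assumptions made for illustration, chosen so that the axis corners lie at distance 7 and the diagonal corners at (5, 5).

```python
# Assumed corner list for the "octagon of radius 7": axis corners at
# distance 7, diagonal corners at (+/-5, +/-5).
OCTAGON = [(7, 0), (5, 5), (0, 7), (-5, 5), (-7, 0), (-5, -5), (0, -7), (5, -5)]


def squared_radii(corners):
    """Return the distinct squared distances of the corners from the origin."""
    return sorted({x * x + y * y for x, y in corners})


POINTS_PER_INCH = 72
TYPE_SIZE_POINTS = 10
LEADING_POINTS = 2

# 10-point type plus 2 points of leading gives 12 points per line,
# which is 6 lines per inch.
LINES_PER_INCH = POINTS_PER_INCH // (TYPE_SIZE_POINTS + LEADING_POINTS)

SMALL_CIRCLE_RADIUS = 7
CHARACTER_HEIGHT = 3 * SMALL_CIRCLE_RADIUS       # factor of 3 -> 21 raster units
LINE_SPACING = 32 + 8                            # 8 units added for clarity -> 40
UNITS_PER_INCH = LINE_SPACING * LINES_PER_INCH   # 40 * 6 = 240

print(squared_radii(OCTAGON))   # 49 on the axes, 50 on the diagonals
print(CHARACTER_HEIGHT, LINE_SPACING, UNITS_PER_INCH)
```

The two squared radii differ by one unit, which is the sense in which 49 is "nearly equal to 50".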

On the main frame there is an occidental repertory OCCRPY with 1642 characters and there is an oriental repertory ORIRPY with 705 characters. The occidental repertory has Old English which is not everywhere obtainable. The repertories are scalable to any multiple of their normal size.

For the digitization the characters were plotted on 10-to-the-inch graph paper. Convenience in digitization was achieved when complements were recorded for negative coordinates. Thus negative values ranged from 50 to 99 while positive values ranged from 00 to 50. It was not necessary to devote a digit to the sign. The X-coordinate is positive rightward and the Y-coordinate is positive downward. Each datum is a 4-digit word with the following format:

Digits Interpretation
1 - 2 X-coordinate
3 - 4 Y-coordinate

Each character occupies a block of data in NORC format. The first 11 digits in each record give the file number, the block number, and the record number. Each block begins at the beginning of a record and continues to the end of the block. The data in each block are preceded by a beginning-of-block word and are terminated by an end-of-block word. The beginning-of-block word and the end-of-block word give the number of words in each block. However, they are bypassed, because the end-of-line datum is 5000 and the end-of-character datum is 5050. The first datum for each character gives the distance to the left edge of the character block, and the second datum gives the distance to the right edge of the character block. The remainder of the data are the corners in the polygonal simulation with origin at the centroid of the character block. (5)
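As a reading aid, the corner data described above might be decoded along the following lines. This is a hedged sketch in Python, not Dr. Hershey's code. It assumes the words arrive as 4-digit strings, that the record headers, block words, and the two leading edge-distance data have already been stripped (the excerpt does not spell out how a single distance fills a 4-digit word), and that a value of exactly 50 is read as -50, since 50 appears in both ranges quoted above. The function names are hypothetical.

```python
END_OF_LINE = "5000"        # pen up: start a new polyline
END_OF_CHARACTER = "5050"   # no more corners for this character


def decode_coordinate(two_digits):
    """Undo the complement: 00..49 stay positive, 50..99 become -50..-1.

    Assumption: 50 itself is treated as negative.
    """
    value = int(two_digits)
    return value - 100 if value >= 50 else value


def decode_corners(words):
    """Split a character's corner words into polylines of (x, y) tuples."""
    polylines = []
    current = []
    for word in words:
        if word == END_OF_CHARACTER:
            break
        if word == END_OF_LINE:
            if current:
                polylines.append(current)
            current = []
            continue
        # Digits 1-2 are the X-coordinate, digits 3-4 the Y-coordinate.
        current.append((decode_coordinate(word[:2]), decode_coordinate(word[2:])))
    if current:
        polylines.append(current)
    return polylines


print(decode_corners(["0307", "9795", "5000", "0000", "0510", "5050"]))
```

Under the complement rule the two sentinel words are themselves coordinate pairs, (-50, 0) and (-50, -50), so they have to be tested for before a word is treated as a corner.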

[...]

For home computers each coordinate is expressed by a single byte with a bias of 64. Each datum is a two-byte word with the following format.

Byte Interpretation
1 X-coordinate
2 Y-coordinate

This is a two-fold compression of data. The end-of-line word is 0.64 and the end-of-character word is 0.0. From the biased data it is necessary only to subtract 64 to obtain the unbiased data. (8)
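The bias-64 scheme is simple enough to illustrate directly. The sketch below is in Python and uses hypothetical helper names. Reading "0.64" and "0.0" as the byte pairs (0, 64) and (0, 0) is an interpretation of the notation, and the size comparison assumes the NORC digits were stored one digit per character.

```python
BIAS = 64

END_OF_LINE = bytes([0, 64])      # assumed reading of "0.64"
END_OF_CHARACTER = bytes([0, 0])  # assumed reading of "0.0"


def encode_pair(x, y):
    """Add the bias to each coordinate and pack the pair into two bytes."""
    return bytes([x + BIAS, y + BIAS])


def decode_pair(word):
    """Subtract the bias from each byte to recover the signed coordinates."""
    return word[0] - BIAS, word[1] - BIAS


word = encode_pair(-3, 5)
print(list(word), decode_pair(word))

# A 4-digit NORC word such as "9705" takes four characters; the same
# corner takes two bytes here, hence the two-fold compression.
print(len("9705"), len(word))
```

Decoding the two sentinel words as if they were ordinary data gives (-64, 0) and (-64, -64), which is how they can share the byte stream with real corners.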

3 - Microcomputer Encoding

Dr. Hershey's report Cartography and Typography with True BASIC describes an encoding used on several different brands of microcomputers. I have yet to see the actual data of this distribution, though, so I do not wish to speculate too deeply as to its format. (Yes, I could deduce it from inspection of the BASIC language programs present in the report. I have not done so.) It would seem from Dr. Hershey's description that this was a "bias-64" encoding in single 8-bit bytes which employed the coordinate pair (-64,0) (encoded as (0,64)) to represent "pen up" and the coordinate pair (-64,-64) (encoded as (0,0)) to represent the end of a glyph.
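Taking that description at face value, and with the caveat that it has not been checked against the actual data files or the BASIC listings, a reader for one glyph's bytes could look something like this Python sketch. The function name and the flat two-bytes-per-point layout are assumptions.

```python
def split_glyph(data):
    """Split bias-64 glyph bytes into strokes of (x, y) tuples.

    Assumed layout: consecutive (x, y) byte pairs, with (0, 64) meaning
    "pen up" and (0, 0) meaning "end of glyph".
    """
    strokes = []
    current = []
    for i in range(0, len(data) - 1, 2):
        bx, by = data[i], data[i + 1]
        if (bx, by) == (0, 0):       # end of glyph
            break
        if (bx, by) == (0, 64):      # pen up: close the current stroke
            if current:
                strokes.append(current)
            current = []
        else:
            current.append((bx - 64, by - 64))
    if current:
        strokes.append(current)
    return strokes


glyph = bytes([67, 71, 61, 59, 0, 64, 64, 64, 69, 74, 0, 0])
print(split_glyph(glyph))
```

Each inner list is one pen-down stroke; a plotter routine would move to the first point of a stroke and draw through the rest.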
