The Peter Holzmann USENET Distribution of the Hershey Glyph data was posted to the former USENET newsgroup "mod.sources" in December 1986. It is archived in Volume 4 of the "comp.sources.unix" archive. That archive is available from several sources online; at the time of writing these included the Internet Software Consortium (http://sources.isc.org/) and uu.net (ftp://ftp.uu.net/usenet/comp.sources.unix/).
Note that the Holzmann USENET Hershey Glyph Distribution Cover Statement in this distribution is slightly shorter than the version distributed with the Hershey Fonts in Ghostscript®. I do not know why.
The distribution is packaged in five files, each compressed with the older "compress" program. Modern Linux® systems can still unpack this format.
To unpack them, I first uncompressed them:
    uncompress part1.Z
    uncompress part2.Z
    uncompress part3.Z
    uncompress part4.Z
    uncompress part5.Z
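The same step can be written as a loop. The sketch below is only one way to do it: it assumes gzip is installed and relies on "gzip -d", which also reads the older compress (.Z) format. The function name unpack_parts is hypothetical and not part of the distribution.

```shell
# unpack_parts is a hypothetical helper name, not part of the distribution.
# gzip -d decompresses "compress"-format (.Z) files and strips the .Z suffix,
# so part1.Z becomes part1.
unpack_parts() {
    for archive in "$@"; do
        gzip -d "$archive" || return 1
    done
}

# Usage, run in the directory holding the five downloaded files:
#   unpack_parts part1.Z part2.Z part3.Z part4.Z part5.Z
```

Stopping at the first failure keeps a damaged download from going unnoticed before the shell archives are run.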
The resulting five uncompressed files are "shell archives." This is a self-extracting format. As the files themselves indicate:
    # This is a shell archive, meaning:
    # 1. Remove everything above the #!/bin/sh line.
    # 2. Save the resulting text in a file.
    # 3. Execute the file with /bin/sh (not csh) to create the files:
Doing this for each of the five archives produced many files. Several of the extracted files triggered an error message, and I haven't yet determined why:
    $ sh part2
    shar: extracting 'hersh.oc1' (26458 characters)
    part2: test: 26458 hersh.oc1: integer expression expected
    shar: extracting 'hersh.oc2' (26561 characters)
    part2: test: 26561 hersh.oc2: integer expression expected
    $ sh part3
    shar: extracting 'hersh.oc3' (28258 characters)
    part3: test: 28258 hersh.oc3: integer expression expected
    shar: extracting 'hersh.oc4' (28235 characters)
    part3: test: 28235 hersh.oc4: integer expression expected
    $ sh part4
    shar: extracting 'hersh.or1' (26757 characters)
    part4: test: 26757 hersh.or1: integer expression expected
    shar: extracting 'hersh.or2' (27782 characters)
    part4: test: 27782 hersh.or2: integer expression expected
    $ sh part5
    shar: extracting 'hersh.or3' (28972 characters)
    part5: test: 28972 hersh.or3: integer expression expected
    shar: extracting 'hersh.or4' (25031 characters)
    part5: test: 25031 hersh.or4: integer expression expected
    $
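A plausible but unconfirmed cause is the size check that shell archives commonly run after extracting each file. The messages show "test" receiving a value such as "26458 hersh.oc1", which is what "wc -c" prints when it is given a file name rather than standard input. The sketch below is a hypothetical reconstruction of such a check, not the text of the actual archives; the file name sample.txt and the variable names are made up for illustration.

```shell
# Hypothetical reconstruction; the real check inside the shar files may differ.
demo_dir=$(mktemp -d)
printf 'abc\n' > "$demo_dir/sample.txt"      # a 4-byte sample file

# Given a file name, wc -c prints the count followed by the name.
with_name=$(cd "$demo_dir" && wc -c sample.txt)
# Reading standard input, it prints the count alone (some wc versions pad it).
count_only=$(wc -c < "$demo_dir/sample.txt" | tr -d ' ')

# test cannot parse "4 sample.txt" as an integer, so it reports an error.
with_name_status=0
test 4 -ne "$with_name" 2>/dev/null || with_name_status=$?

# With the bare count, the comparison is evaluated normally.
count_only_status=0
test 4 -ne "$count_only" || count_only_status=$?

rm -rf "$demo_dir"
echo "with file name: exit $with_name_status; count only: exit $count_only_status"
```

An exit status above 1 means "test" could not evaluate the expression at all. If this is what happened here, only the size verification failed and the extracted files themselves would be unaffected.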
Things seemed to work otherwise, though, and I ended up with the files:
    cyrilc.hmp    gothgbt.hmp   gothgrt.hmp   gothitt.hmp
    greekc.hmp    greekcs.hmp   greekp.hmp    greeks.hmp
    hershey.c     hershey.doc   hershey.f77
    hersh.oc1     hersh.oc2     hersh.oc3     hersh.oc4
    hersh.or1     hersh.or2     hersh.or3     hersh.or4
    italicc.hmp   italiccs.hmp  italict.hmp
    README
    romanc.hmp    romancs.hmp   romand.hmp    romanp.hmp
    romans.hmp    romant.hmp
    scriptc.hmp   scripts.hmp
The "README" file contains the Peter Holzmann distribution cover letter. The various *.hmp files are suggested mappings of the Hershey glyph numbers to ASCII fonts. The "hershey.doc" file contains information on the encoding format. hershey.c and hershey.f77 are sample programs for manipulating the data. Finally, hersh.oc* and hersh.or* are the files of occidental and oriental glyph data.
As unpacked, the occidental glyph data is split into four files. Within each file, glyph definitions which exceed 72 characters in length are split every 72 characters. I found it convenient to concatenate all of these data together and to rejoin these line splits so that I had a single file containing all of the Hershey glyph data with one glyph definition per line.
To do this, I first edited each data file (using vi, my favorite text editor) to remove any leading or trailing blank lines. (As I later rewrote my "joinhersh.awk" script to skip blank lines, this step would probably now be redundant.) Then I concatenated them together:
cat hersh.oc1 hersh.oc2 hersh.oc3 hersh.oc4 > hersh.cat
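Before rejoining the lines, the 72-character splitting can be checked by tallying line lengths in the concatenated file. The sketch below is one way to do that; the function name line_length_histogram is hypothetical. If the data is split as described, no length above 72 should appear.

```shell
# line_length_histogram is a hypothetical helper name.
# Prints "length count" pairs, one per line, sorted by length.
line_length_histogram() {
    awk '
        { count[length($0)]++ }
        END { for (len in count) print len, count[len] }
    ' "$1" | sort -n
}

# Usage:
#   line_length_histogram hersh.cat
```

A length of 0 in the output corresponds to the blank lines mentioned above.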
Then I wrote a small Awk script to join the lines. The script assumes a knowledge of the James Hurt encoding of the Hershey data, which is discussed in the next chapter ("Translating the Hershey Data"). Here's the script:
    {
        # assume the current line is start of a glyph's encoding

        # Several blank lines (newline only) appear in the USENET data
        # distribution. Check for these and skip as necessary.
        if (length($0) == 0) {
            next
        }

        # write out the glyph number
        printf ("%s", substr($0,1,5))

        # write out the data count
        printf ("%s", substr($0,6,3))

        # get the number of data pairs, including left/right margin data
        datapairs = substr($0,6,3) + 0

        # write out the data pairs, getting new lines of input as necessary
        pointpos = 9
        for (i = 1; i <= datapairs; i++) {
            a = substr($0,pointpos,1)
            b = substr($0,pointpos + 1,1)
            printf ("%c%c", a, b)
            # if at the end of the line, then...
            if (pointpos == 71) {
                if (i == datapairs) {
                    # either also at end of glyph
                    printf("\n")
                    next
                } else {
                    # or have more glyph data to read
                    getline
                    pointpos = 1
                }
            } else {
                pointpos = pointpos + 2
            }
        }
        printf ("\n")
    }
The full text of this script is available as joinhersh.awk.
To run this script, I did:
awk -f joinhersh.awk < hersh.cat > hersh.occ
The resulting file, hersh.occ, contains a version of the Hershey Glyph data suitable for my further processing. I found it interesting that there appear to be 1597 glyphs (perhaps less 27 blank glyphs) in the repertory, rather than the 1377 cited by Wolcott & Hilsenrath.
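The glyph count and the integrity of the joined file can be checked with a short script. The sketch below assumes the layout used by the join script: a 5-character glyph number, a 3-character pair count, then two characters per pair, so each joined line should be 8 + 2 × count characters long. The function name check_joined is hypothetical.

```shell
# check_joined is a hypothetical helper name.
# Reports the number of lines (glyphs) and how many of them do not have
# the length implied by their own pair count.
check_joined() {
    awk '
        {
            pairs = substr($0, 6, 3) + 0
            if (length($0) != 8 + 2 * pairs) {
                bad++
            }
        }
        END { printf "%d glyphs, %d with unexpected length\n", NR, bad + 0 }
    ' "$1"
}

# Usage:
#   check_joined hersh.occ
```

A nonzero second number would suggest that a line split was rejoined incorrectly, or that trailing characters were lost somewhere along the way.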
Wolcott, Norman M. and Joseph Hilsenrath. A Contribution to Computer Typesetting Techniques: Tables of Coordinates for Hershey's Repertory of Occidental Type Fonts and Graphic Symbols. Washington, D. C.: Office of Standard Reference Data, National Bureau of Standards, U.S. Department of Commerce, April 1976. NBS Special Publication 424. National Technical Information Service (NTIS) Order Number PB251845.
The data, files, text, and programs of the Holzmann USENET Hershey Glyph Distribution may be redistributed and used freely under their original terms as specified in the Holzmann USENET Hershey Glyph Distribution Cover Statement. The distribution here complies with these terms. The data of the Hershey Glyphs as transformed for use with VARKON® may be redistributed and used freely under these same terms. I assert no additional rights or conditions on the use of the transformed data. Some of the text and programs in the Holzmann USENET Hershey Font Distribution may be Copyright 1986 by Peter Holzmann and/or James Hurt. Their own terms either allow or require their redistribution with the Hershey data. The distribution of these texts, files, data, and programs here is subject to all of the disclaimers of warranty and liability noted herein.
The text of this document itself and of any linked program files insofar as their text is separable from any Hershey Glyph data they may contain are copyright © 2003 by David M. MacMillan.
Permission is granted to copy, distribute and/or modify copyrighted portions of this document (other than the portions the copyright of which is owned by Peter Holzmann and/or James Hurt, which are freely redistributable under their own terms) under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License."
Note: Those portions of this document which are in the public domain, if any, may be copied freely. The distribution of these public domain portions is subject to all of the disclaimers of warranty and liability noted herein.
This work is distributed WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Free Documentation License for more details.
You should have received a copy of the GNU Free Documentation License along with this work; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
This work is distributed "as-is," without any warranty of any kind, expressed or implied; without even the implied warranty of merchantability or fitness for a particular purpose.
In no event will the author(s), editor(s), or publisher(s) of this work be liable to you or to any other party for damages, including but not limited to any general, special, incidental or consequential damages arising out of your use of or inability to use this work or the information contained in it, even if you have been advised of the possibility of such damages.
In no event will the author(s), editor(s), or publisher(s) of this work be liable to you or to any other party for any injury, death, disfigurement, or other personal damage arising out of your use of or inability to use this work or the information contained in it, even if you have been advised of the possibility of such injury, death, disfigurement, or other personal damage.
Ghostscript is a registered trademark of artofcode LLC.
GNU is a registered trademark of the Free Software Foundation.
Linux is a registered trademark of Linus Torvalds.
VARKON is or was a trademark of Microform AB (Sweden).