From 5aef874a9d118863b0a27f83f883f6874e011ed9 Mon Sep 17 00:00:00 2001
From: Graeme Geldenhuys <graeme@mastermaths.co.za>
Date: Fri, 2 Oct 2009 16:58:23 +0200
Subject: documentation gathered from various sources regarding INF file
 format.

---
 docs/inf04.txt | 635 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 635 insertions(+)
 create mode 100644 docs/inf04.txt

diff --git a/docs/inf04.txt b/docs/inf04.txt
new file mode 100644
index 00000000..fcbd4175
--- /dev/null
+++ b/docs/inf04.txt
@@ -0,0 +1,635 @@
+  OS/2 2.0 Information Presentation Facility (IPF) Data Format - version 2
+  ----------------------------------------------------------------------- -
+
+  *** introduction to version 1 ***
+
+  Having become extremely frustrated by VIEW.EXE's penchant for windows
+  that come and go, without even opening large enough to see everything
+  in them, I thought I'd try to turn .INF files into something more
+  conventional.  While I don't have code to offer, I can tell you what I
+  learned about .INF format--it was enough to produce more-or-less
+  readable more-or-less plaintext from .INFs.
+
+  I offer this in the hope that somebody will give the community a
+  really nice, tasteful, convenient, doesn't-use-too-much-screen-real-estate
+  .INF browser to replace VIEW.EXE.
+
+  All of this was developed by looking at .INF files without any
+  documentation of the format except what VIEW.EXE showed for a
+  particular feature.
+
+  I don't have a lot of personal interest in refining this document with
+  additional escape sequences, etc., but I would be happy to correspond
+  with someone who wanted to fill in the details, or to clarify anything
+  that may be confusing.  If someone could point us to an official document
+  describing the format that would be most helpful.
+
+  -- Carl Hauser (chauser.parc@xerox.com)
+
+
+  *** introduction to version 2 ***
+
+  The original document contained most of the real tricky stuff in the file
+  format (especially the compression algorithm) so going on from there was
+  mainly a task of creating lots of help files using the IPFC and the
+  decompiling them again to see what came out.
+
+  I fixed a few minor bugs in the description of the header which was
+  extended to describe the entire structure I believe to be the header
+  (because variable data starts afterwards).
+
+  A number of escape codes have also been added and the descriptions of
+  others have been refined. There are still a lot of question marks about
+  the format, but this description already allows disassembling the text
+  into ASCII form in a fairly true-to-life format (including indentations
+  etc.).
+
+  Further research should go into the way multiple windows are handled
+  (I didn't work on that because I have never required multiple window
+  displays in my help files and therefore am not familiar with the concepts).
+  Font usage and graphics linking could also use some more fiddling around.
+
+  -- Marcus Groeber (marcusg@ph-cip.uni-koeln.de - Fidonet 2:243/8605.1)
+
+  *** introduction to version 3 ***
+
+  Just a bit of an update and flesh out ;-)
+
+  -- Peter Childs (pjchilds@apanix.apana.org.au)
+
+  *** Version 4 ****
+
+  Further additions as found while writing NewView
+
+  -- Aaron Lawrence
+
+  **** Types ****
+
+     All numeric quantities are least-significant-byte first in the file
+     (little-endian).
+
+     bit1       1 bit boolean           \  used only for explaining
+     int4       4 bit unsigned integer  /  packed structures
+     char8      8 bit character (ASCII more-or-less)
+     int8       8 bit unsigned integer
+     int16      16 bit unsigned integer
+     int32      32 bit unsigned integer
+
+  **** The File Header ****
+
+      Starting at file offset 0 the following structure can overlay the file
+      to provide some starting values:
+          {
+            int16 ID;           // ID magic word (5348h = "HS")
+            int8  unknown1;     // unknown purpose, could be third letter of ID
+            int8  flags;        // probably a flag word...
+                                //  bit 0: set if INF style file
+                                //  bit 4: set if HLP style file
+                                // patching this byte allows reading HLP files
+                                // using the VIEW command, while help files
+                                // seem to work with INF settings here as well.
+            int16 hdrsize;      // total size of header
+            int16 unknown2;     // unknown purpose
+            int16 ntoc;         // 16 bit number of entries in the tocarray
+            int32 tocstrtablestart;  // 32 bit file offset of the start of the
+                                     // toc entries (this is redundant info;
+                                     // the individual offsets are stored starting
+                                     // at tocstart)
+            int32 tocstrlen;    // number of bytes in file occupied by the
+                                // table-of-contents entries
+            int32 tocstart;     // 32 bit file offset of the start of tocarray
+            int16 nres;         // number of panels with ressource numbers
+            int32 resstart;     // 32 bit file offset of ressource number table
+            int16 nname;        // number of panels with textual name
+            int32 namestart;    // 32 bit file offset to panel name table
+            int16 nindex;       // number of index entries
+            int32 indexstart;   // 32 bit file offset to index table
+            int32 indexlen;     // size of index table
+            int8  unknown3[10]; // unknown purpose
+            int32 searchstart;  // 32 bit file offset of full text search table
+            int32 searchlen;    // size of full text search table
+            int16 nslots;       // number of "slots"
+            int32 slotsstart;   // file offset of the slots array
+            int32 dictlen;      // number of bytes occupied by the "dictionary"
+            int16 ndict;        // number of entries in the dictionary
+            int32 dictstart;    // file offset of the start of the dictionary
+            int32 imgstart;     // file offset of image data
+            int8  unknown4;     // unknown purpose
+            int32 nlsstart;     // 32 bit file offset of NLS table
+            int32 nlslen;       // size of NLS table
+            int32 extstart;     // 32 bit file offset of extended data block
+            int8  unknown5[12]; // unknown purpose
+            char8 title[48];    // ASCII title of database
+          }
+
+  **** The table of contents entries ****
+
+      Beginning at each file offset, tocentrystart[i]:
+          {
+             int8 len;          // length of the entry including this byte (but not including extended data?)
+             int8 flags;        // flag byte, description folows (MSB first)
+             // bit7 haschildren;  // following nodes are a higher level
+             // bit6 hidden;       // this entry doesn't appear in VIEW.EXE's
+                                   // presentation of the toc
+             // bit5 extended;     // extended entry format
+             // bit4         // ??
+             // int4 level;        // nesting level
+             int8 ntocslots;    // number of "slots" occupied by the text for
+                                // this toc entry
+          }
+
+      if the "extended" bit is not 1, this is immediately followed by
+
+          {
+             int16 tocslots[ntocslots]; // indices of the slots that make up
+                                        // the article for this entry
+             char8 title[];             // the remainder of the tocentry
+                                        // until len bytes have been used [not
+                                        // zero terminated]
+          }
+
+      if extended is 1 there are intervening bytes that describe
+      the kind, size and position of the window in which to display the
+      article. First, there are two flag bytes:
+          {
+             int8 w1;
+               // bit 3: Window controls are specified
+               // bit 2: Viewport
+               // bit 1: Size is specified.
+               // bit 0: Position is specified.
+             int8 w2;
+               // bit 3:
+               // bit 2: Group is specified.
+               // bit 1
+               // bit 0: Clear (all windows before display)
+          }
+      Then the following optional fields may appear, as specified by w1:
+
+      Origin ( 5 bytes )
+          {
+             int8 Flags;
+               // bits 4-7: X position type
+               // bits 0-3: Y position type
+             int16 XPosition; // meaning depends on type
+             int16 YPosition;
+          }
+
+      Position types are:
+        0 = absolute character
+        1 = relative %
+        2 = absolute pixel
+        3 = absolute points
+           For these types, the position is simply a number.
+           If one of the positions is not specified then the type will be 0
+           and the value will be -1 (65535)
+
+        4 = dynamic
+           For this type the position is one of the following values:
+             1: left
+             2: right
+             4: top;
+             8: bottom
+             16: center.
+
+      Size ( 5 bytes )
+          {
+             int8 Flags;
+               // bits 4-7: Width type
+               // bits 0-3: Height type
+             int16 Width;
+             int16 Height;
+          }
+
+      Width/height type are same as position types, above, except that dynamic is not used.
+
+      Window controls ( 2 bytes )
+        0, 112 means everything is turned off.
+        8, 103 means no scroll bars IIRC
+
+      Group ( 2 bytes )
+          {
+             int16 GroupNumber;
+          }
+        GroupNumber is basically a 'frame' or window number.
+
+
+      Here's a C code fragment for computing the number of bytes to skip
+          int bytestoskip = 0;
+          if (w1 & 0x8) { bytestoskip += 2 };
+          if (w1 & 0x1) { bytestoskip += 5 };
+          if (w1 & 0x2) { bytestoskip += 5 };
+          if (w2 & 0x4) { bytestoskip += 2 };
+
+      skip over bytestoskip bytes (after w2) and find the tocslots and title
+      as in the non-extended case.
+
+  **** The table of contents array ****
+
+      Beginning at file offset tocstart, this structure can overlay the
+      file:
+          {
+             int32  tocentrystart[ntoc];       // array of file offsets of
+                                               // tocentries (above)
+          }
+
+  **** The Slots array ****
+
+      Beginning at file offset slotsstart (provided by the file header) find
+          {
+             int32 slots[nslots];       // file offset of the article
+                                        // corresponding  to this slot
+          }
+
+  **** The Dictionary ****
+
+      Beginning at  file offset dictstart (provided by the file header) and
+      continuing until ndict entries have been read (and dictlen bytes have
+      been consumed from the file) find a sequence of length-preceeded
+      strings.  Note that the length includes the length byte (not Pascal
+      compatible!).  Build a table mapping i to the ith string.
+          {
+             char8*     strings[ndict];
+          }
+
+  **** The Article entries ****
+
+      Beginning at file offset slots[i] the following structure can overlay
+      the file:
+          {
+             int8       stuff;          // ?? [always seen 0]
+             int32      localdictpos;   // file offset of the local dictionary
+             int8       nlocaldict;     // number of entries in the local dict
+             int16      ntext;          // number of bytes in the text
+             int8       text[ntext];    // encoded text of the article
+          }
+
+  **** The Local dictionary ****
+
+      Beginning at file position localdictpos (for each article) there is an
+      array:
+          {
+             int16      localwords[nlocaldict];
+          }
+
+  **** The Text ****
+
+      The text for an article then consists of words obtained by referencing
+      strings[localwords[text[i]]] for i in (0..ntext), with the following
+      exceptions.  If text[i] is greater than nlocaldict it means
+
+         0xfa => end-of-paragraph, sets spacing to TRUE if not in monospace
+         0xfb => [unknown]
+         0xfc => spacing = !spacing
+         0xfd => line break (outside an example: ".br",
+                   sets spacing to TRUE if not in a
+                   monospace example)
+         0xfe => space
+         0xff => escape sequence                        // see below
+
+      When spacing is true, each word needs a space put after it.  When
+      false, the words are abutted and spaces are supplied using 0xfe or the
+      dictionary.  Examples are entered and left with 0xff escape sequences.
+      The variable "spacing" is initially (start of every article slot) TRUE.
+
+  **** 0xff escape sequences ****
+
+      These are used to change fonts, make cross references, enter and leave
+      examples, etc.  The general format is
+          {
+             int8       FF;             // always equals 0xff
+             int8       esclen;         // length of the sequence (including
+                                        // esclen but excluding FF)
+             int8       escCode;        // which escape function
+          }
+
+     escCodes I have partially deciphered are
+
+          0x01 =>               unknown
+
+          0x02 or 0x11 =>       (esclen==3) set left margin.
+               or 0x12          0x11 always starts a new line. Arguments
+                                  {
+                                     int8  margin;   // in spaces, 0=no margin
+                                  }
+                                note: in an IPF source, you must code
+                                :lm margin=256. to reset the left margin.
+
+          0x03 =>               (esclen==3) set right margin. Arguments
+                                  {
+                                     int8  margin;   // in spaces, 1=no margin
+                                  }
+
+          0x04 =>               (esclen==3) change style. Arguments
+                                  {
+                                     int8  style;    // 1,2,3: same as :hp#.
+                                                     // 4,5,6: same as :hp5,6,7.
+                                                     // 0 returns to plain text
+                                  }
+
+          0x05 =>               (esclen varies) beginning of cross
+                                reference.  The next two bytes of the
+                                escape sequence are an int16 index of
+                                the tocentrystart array.  The
+                                remaining bytes (if any) describe the size,
+                                position and characteristics of the
+                                window created when the
+                                cross-reference is followed by VIEW.
+                                Flag1 bit 7: 'split' window
+
+                                      bit 6: autolink
+                                      bit 3: window controls specified
+                                      bit 2: viewport
+                                      bit 1: target size supplied
+                                      bit 0: target position supplied
+                                Flag2 bit 0: ?
+                                      bit 1: dependent
+                                      bit 2: group supplied
+
+
+          0x06 =>               unknown
+
+          0x07 =>               (esclen==4) footnote start (:fn. tag). Arguments:
+                                  {
+                                     int16  toc;  // toc entry number of text
+                                  }
+                                footnotes end with 0x08
+
+          0x08 =>               (escLen==2) end of cross reference
+                                introduced by escape code 0x05 or 0x07
+
+          0x09 =>               unknown
+
+          0x0A =>               unknown
+
+          0x0B =>               (escLen==2) begin monosp. example.  set
+                                spacing to FALSE
+
+          0x0C =>               (escLen==2) end monosp. example.  set
+                                spacing to TRUE
+
+          0x0D =>               (escLen==2) special text colors. Arguments:
+                                  {
+                                     int8  color;   // 1,2,3: same as :hp4,8,9.
+                                                    // 0: default color
+                                  }
+
+          0x0E =>               Bitmap.
+                                  {
+                                     int8  flags;
+                                       4: runin flag
+                                       3: fit (scale) to window
+                                       2: align center
+                                       1: align right
+                                       0: always set?
+                                     int32 bitmapStartOffset;
+                                  }
+                                e.g. first bitmap always has offset 0
+
+          0x0F =>               if esclen==5 an inlined cross
+                                reference: the title of the referenced
+                                article becomes part of the text.
+                                This is probably the case even if
+                                esclen is not 5, but I don't know the
+                                decoding.  In the case that esclen is
+                                5, I don't know the purpose of the
+                                byte following the escCode, but the
+                                two bytes after that are an int16
+                                index of the tocentrystart array.
+
+          0x10 =>               [special link, reftype=launch]
+                                {
+                                  int8 unknown; ?
+                                  char launch_string[ esclen - 3 ];
+                                }
+
+
+          0x13 or 0x14 =>       (esclen==2) Set foreground (0x13)
+                                and background (0x14) color.  Arguments:
+                                  {
+                                     int8  color;
+                                       \\ 0 - default
+                                       \\ 1 - blue
+                                       \\ 2 - red
+                                       \\ 3 - ??
+                                       \\ 4 - green
+                                       \\ 5 - cyan
+                                       \\ 6 - yellow
+                                       \\ 7 - neutral
+                                  }
+
+          0x15 =>               unknown
+
+          0x16 =>               [special link, reftype=inform]
+
+          0x17 =>               hide text (:hide. tag). Arguments:
+                                  {
+                                     char8 key[];  // key required to show text
+                                  }
+
+          0x18 =>               end of hidden text (:ehide.)
+
+          0x19 =>               (esclen==3) change font. Arguments
+                                  {
+                                     int8  fontTableIndex (?);
+                                  }
+
+          0x1A =>               (escLen==3) begin :lines. sequence.  set
+                                spacing to FALSE.  Arguments
+                                  {
+                                     int8  alignment; // 1,2,4=left,right,center
+                                  }
+
+          0x1B =>               (escLen==2) end :lines. sequence.  set
+                                spacing to TRUE
+
+          0x1C =>               (escLen==2) Set left margin to current
+                                position.  Margin is reset at end of
+                                paragraph.
+
+          0x1F =>               [special link, reftype=hd database=...]
+
+          0x20 =>               (esclen==4) :ddf. tag. Arguments:
+                                  {
+                                     int16  res;  // value of res attribute
+                                  }
+
+      The font used in the text is the normal IBM extended character set,
+      including line graphics and some of the characters below 32.
+
+  **** The ressource number array ****
+
+      Beginning at file offset resstart, this structure can overlay the
+      file:
+          {
+              int16  res[nres];         // ressource number of panels
+              int16  toc[nres];         // toc entry number of panel
+          }
+
+  **** The text name array ****
+
+      Beginning at file offset namestart, this structure can overlay the
+      file:
+          {
+              int16  name[nres];        // index to panel name in dictionary
+              int16  toc[nres];         // toc entry number of panel
+          }
+
+  **** The index table ****
+
+      Beginning at file offset indexstart, a structure like the following
+      is stored for each of the nindex words (in alphabetical order).
+          {
+              int8  nword;              // size of name
+              int8  level;              // ? indent level
+                                        // bit 6 set: global entry
+                                        // bit 1 set: indent (:i2.)
+                                           bit 0 always set?
+              int8  number of roots;  // number of root references following
+              int16 toc;                // toc entry number of panel
+              char8 word[nword];        // index word [not zero-terminated]
+
+              there are n roots following:
+              int32 synonyms;           // 32 bit file offset to start of synonyms referencing this word
+          }
+
+  **** The extended data block ****
+
+      Not yet decoded.  This block has a size of 64 bytes and contains various
+      pointers to font names, names of externel databases etc.
+
+  **** The full text search table ****
+
+      Not yet decoded.  This table is supressed when "/S" is specified on
+      the IPFC command line.
+
+      In addition to data in...
+
+      RLE:
+
+      byte  RLEType; // ? always 1?
+
+      Then a sequence of blocks, until all data used:
+
+      byte Header;
+        // bits 0-6 are N
+        // bit 7:
+        //   0: there are N + 1 repeats of next byte.
+        //   1: N + 1 blocks of 'as is' data follow.
+        // except
+        //   value $80 means (?) the next byte contains the data byte,
+        //   and the next 2 bytes after that contain a 16 bit repeat number.
+
+
+      e.g. 04 00 means 5 repeats of 0
+           83 12 34 56 78 means the literal data 12 34 56 78
+           80 00 62 01 means $162 repeats of 0
+      byte DataByte; // with escapes
+           // bit 7 set means there are actually N+1 (=bits0-6) bytes of data to follow
+           // 0 means there is a single byte of data to follow (e.g. when the byte > 80)
+        ( optionally ) byte[ N+1 ] data
+      int16 Number of zeroes to follow
+  **** Image data ****
+
+      Beginning at file offset imgstart, this data is a series of compressed
+      OS/2 bitmaps.
+      Each starts with a BITMAPFILEHEADER:
+        {
+           int16 usType;     // 'bM' for bitmap
+           int32 cbSize;     // total bitmap size including header
+                             // BEFORE compression: not correct in this context
+           int16 xHotspot;   // only for icons/pointers, not relevant here?
+           int16 yHotspot;
+           int16 offBits;    // offset to the actual bitmap data bits
+           BITMAPINFOHEADER bmp; // further bitmap data:
+             int32 cbFix;    // length of bitmapinfo header structure (12)
+                             // (including this field)
+             int16 cx;       // bitmap width
+             int16 cy;       // bitmap height
+             int16 cPlanes;  // num bitplanes - always 1 AFAIK
+             int16 bitCount; // bits per pixel e.g. 4 = 16 colors
+
+           RGB  palette[ N ]; // 2 ^ bitCount * 3 bytes
+
+           bitmapData;       // in a special IPF format:
+              int32 totalLength; // not including this field, but including the next
+              int16 bitmapSize; // total size of memory required
+                                // for uncompressed bitmap i.e.
+                                // bytes per line rounded up to longword (4byte)
+                                // x rows
+                                // (This info is redundant)
+
+           Followed by a series of blocks each up to 64k uncompressed.
+           Blocks:
+              int16 dataLength; // length of data following (including data type field)
+
+              int8  dataType;   // 0 = uncompressed
+                                   2 = compressed
+              data...
+              Compression is LZW (Lempel Ziv XX?)
+
+        }
+
+  **** NLS table ****
+
+      Not yet decoded. This table contains informations specific to the
+      language and codepage the document was prepared in. It seems to contain
+      some bitfields as well that might be used for character classification.
+
+Appendix 1: Some useful translations from IBM Extended ASCII to normal ASCII
+
+      One other transformation I had to make was of the character box
+      characters of the IBM extended ASCII set.  These characters appear in strings
+      in the dicitonary.  They are given here in octal together with their translation.
+
+          020, 021      => blank seems satisfactory
+          037           => solid down arrow: used to give direction to
+                                a line in the syntax diagrams
+          0263          => vertical bar
+          0264          => left connector: vertical bar with short
+                                horizontal bar extending left from the
+                                center
+          0277, 0300    => top right or bottom left corner; one is
+                                one, the other is the other and I
+                                can't tell which from my translation
+          0301          => up connector: horizontal line with vertical
+                                line extending up from the center
+          0302          => down connector: horizontal line with
+                                vertical line extending down from the
+                                center
+          0303          => right connector: vertical bar with short
+                                horizontal bar extending right from
+                                the center
+          0304          => horizontal bar
+          0305          => cross connector, i.e. looks like + only
+                                slightly larger to connect with
+                                adjacent chars
+          0331, 0332    => top left or bottom right corner; one is
+                                one, the other is the other and I
+                                can't tell which from my translation
+
+
+Appendix 2:  Style codes for escCode 0x04 and 0x0D
+
+      escCode 0x04 implements font changes associated with the :hp# IPF source tag.
+
+          :hp1 is italic font
+          :hp2 is bold font
+          :hp3 is bold italic font
+          :hp5 is normal underlined font
+          :hp6 is italic underlined font
+          :hp7 is bold underlined font
+
+     tags :hp4, :hp8, and :hp9 introduce different colored text which is encoded in
+     the .inf or .hlp file using escCode 0x0D.  On my monitor normal text is dark blue.
+
+          :hp4 text is light blue
+          :hp8 text is red
+          :hp9 text is magenta
+
+
+
+History:
+October 22, 1992: version for initial posting (inf01.doc)
+July 12, 1993: second version (refer to introduction for changes) (inf02.doc)
+July 18, 1993: added appendices to the second version (inf02a.doc)
+
-- 
cgit v1.2.3-70-g09d2