Diffstat (limited to 'docview/docs/INF_article.txt')
-rw-r--r-- docview/docs/INF_article.txt 1186
1 files changed, 593 insertions, 593 deletions
diff --git a/docview/docs/INF_article.txt b/docview/docs/INF_article.txt
index d5c5ffd0..c6eb5f03 100644
--- a/docview/docs/INF_article.txt
+++ b/docview/docs/INF_article.txt
@@ -1,593 +1,593 @@
-
- Information on reading the INF file format
- ==========================================
-
-Author: unknown
-Date: unknown
-
-
-This article is intended to provide the reader with enough information
-to read and search the HLP/INF file format. Support is not provided
-for constructing your own INF files.
-
-The INF and HLP file formats are exactly the same except for the
-switching of two bytes in the header. Therefore all the information in
-this article applies to both HLP and INF files. The difference between
-the two will be explained later. I will, however, use the term "INF
-format" to distinguish between OS/2 HLP files and Windows HLP files.
-
-This article will be divided into three main parts. First there will
-be an overview of the file format. Second there will be information on
-accessing parts of this format, including code samples. Third will be
-information on searching the INF format.
-
-Note that to understand a lot of the concepts in displaying panels,
-an understanding of the IPF Guide and Reference is necessary. This
-will give an understanding of the ways in which panels can be modified
-in terms of sizes, styles, etc.
-
-
-Overview
-========
-
- [This is where I will put Stan's document]
-
-Accessing Information in the INF/HLP file
------------------------------------------
-The next part of this article is organized as if you are writing your
-own INF/HLP viewer. It will provide explanations on how to do the
-following things:
-
-1. Read in header information. This will allow you to display the
- title and access the rest of the information in the panel.
-
-2. Read in and index the vocabulary.
-
-3. Read in the Cell Offset Table. This will be used later to display
- panels.
-
-4. Read in the table of contents. Explanations will be given for two
- methods of accessing the table of contents. The first is from
- memory. This method is useful if you are reading the entire table
- of contents at once to display it, or if your application will
- provide primary access to panels through the table of contents. The
- second method of accessing table of contents entries is directly
- from the file. This method is useful for linking and for displaying
- a panel initially when a file is opened. In OS/2, VIEW uses the
- first method whereas displaying help for an application uses the
- second method.
-
-5. Display the titles of all panels in the file. Titles will not be
- stored because access to the table of contents is provided by
- index, not by title. There is a lot of extra information in the
- table of contents entries that will not be used until a cell is
- actually displayed.
-
-6. Display a panel. This will be the most involved explanation.
- Displaying a panel actually requires retrieving a lot of
- information from the table of contents and then reading and
- formatting the data within the panel.
-
-Note that this assumes you are going to use the file in much the same
-way as OS/2's VIEW program and Help Manager.
-
-
-Headers
-=======
-The first step in accessing information in the file is to read in the
-header. Figures 1 and 2 show the structures used to access the
-regular and extended headers. The extended header is not in every
-file. It can be detected by checking the ExtHeaderOffset in the
-DOCHEADER structure. If the ExtHeaderOffset is greater than 0, then it
-is the file offset of the extended header. The following code fragment
-opens the file and reads the DOCHEADER and the EXTHEADER if necessary.
-
-DOCHEADER DocHeader;
-EXTHEADER ExtHeader;
-FILE* fpointer;
-CHAR* FileName;   /* must point to the name of the file before use */
-
-fpointer = fopen(FileName,"rb");
-fread(&DocHeader,sizeof(DOCHEADER),1,fpointer);
-if (DocHeader.ExtHeaderOffset > 0) {
-    fseek(fpointer, DocHeader.ExtHeaderOffset, SEEK_SET);
-    fread(&ExtHeader,sizeof(EXTHEADER),1,fpointer);
-}
-
-The DOCHEADER contains all of the information needed to access data
-within the file. At this point, though, there are only a couple of
-fields we are concerned about. The first is the FileCTRLWord. This
-field indicates the type of file. If it is 0x01505348, it is an INF file;
-if it is 0x10505348, it is a HLP file. The other field of use right now
-is the Title; this is simply a null-terminated string containing
-the title of the document. One interesting note is that although
-the title for a HLP file is normally specified in an application,
-there still is a title in the HLP file if the writer specified
-a :title. tag.
-
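The file-type check described above can be sketched as follows. This is only an illustration: ReadLE32, ClassifyControlWord and the FileKind names are hypothetical helpers, not part of any OS/2 header, and the sketch assumes the four bytes of FileCTRLWord are stored least significant byte first, as on the Intel machines OS/2 runs on. Only the two constant values are taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTRLWORD_INF 0x01505348UL  /* value quoted in the text */
#define CTRLWORD_HLP 0x10505348UL  /* value quoted in the text */

/* Hypothetical result type for this sketch. */
typedef enum { KIND_UNKNOWN, KIND_INF, KIND_HLP } FileKind;

/* Assemble a 32-bit value from four bytes stored least significant
   byte first (assumed byte order). */
uint32_t ReadLE32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Map a FileCTRLWord value to the kind of file it identifies. */
FileKind ClassifyControlWord(uint32_t ctrlWord)
{
    if (ctrlWord == CTRLWORD_INF) return KIND_INF;
    if (ctrlWord == CTRLWORD_HLP) return KIND_HLP;
    return KIND_UNKNOWN;
}
```

A viewer might call ClassifyControlWord(ReadLE32(bytes)) on the raw bytes of the field and refuse to go further when the result is KIND_UNKNOWN.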
-
-Vocabulary
-==========
-Once the DOCHEADER is obtained, the next step is to read the vocabulary.
-All references to the vocabulary in the INF/HLP file are made via an
-index value. This index, however, is not in the file so it must be
-built. The following code fragment reads in the vocabulary and builds
-an index to it.
-
-PULONG pulVocabIndex;
-PCHAR pchVocab;
-ULONG ulVocabPointer;
-INT i;
-
-pchVocab = malloc(DocHeader.CLVTSize);
-pulVocabIndex = malloc(DocHeader.CLVTNumWords*sizeof(ULONG));
-
-fseek(fpointer, DocHeader.CLVTOffset,SEEK_SET);
-fread(pchVocab, DocHeader.CLVTSize, 1, fpointer);
-
-ulVocabPointer = 0;
-for (i=0;i< DocHeader.CLVTNumWords ;i++ ) {
-    pulVocabIndex[i] = ulVocabPointer;
-    /* The length byte is unsigned; cast it so values above 127 stay positive */
-    ulVocabPointer += (BYTE) pchVocab[ulVocabPointer];
-} /* endfor */
-
-Remember that when referencing the vocabulary, the first byte contains the
-length of the word, including the first byte. Here is the result of the
-above code sample with a vocabulary of {you, can, develop}.
-
-Example:
---------
-pchVocab      ->  4can8develop4you
-                  │   │       │
-                  │ ┌─┘┌──────┘
-                  │ │  │
-pulVocabIndex -> {0,4,12}
-
-Given any index into the vocabulary, you can then reference the
-appropriate word.
-
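As a self-contained illustration of the indexing step, the sketch below works on a byte buffer instead of a file and uses standard C types in place of the OS/2 typedefs. BuildVocabIndex and GetVocabWord are hypothetical names chosen for this example. The bounds checks are an addition that the fragment above does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill index[] with the offset of each length-prefixed word in vocab.
   Returns the number of words indexed; stops early if the data runs
   short or a zero length byte is found. */
size_t BuildVocabIndex(const unsigned char *vocab, size_t vocabSize,
                       size_t *index, size_t maxWords)
{
    size_t pos = 0, n = 0;
    assert(vocab != NULL && index != NULL);
    while (n < maxWords && pos < vocabSize) {
        size_t len = vocab[pos];          /* length includes this byte */
        if (len == 0 || pos + len > vocabSize) break;
        index[n++] = pos;
        pos += len;
    }
    return n;
}

/* Copy word number i into out and NUL-terminate it.
   Returns the length of the word. */
size_t GetVocabWord(const unsigned char *vocab, const size_t *index,
                    size_t i, char *out, size_t outSize)
{
    size_t len = (size_t)vocab[index[i]] - 1;   /* drop the length byte */
    assert(out != NULL && outSize > len);
    memcpy(out, vocab + index[i] + 1, len);
    out[len] = '\0';
    return len;
}
```

With the buffer from the example (4can8develop4you), BuildVocabIndex produces the offsets 0, 4 and 12, and GetVocabWord for index 1 yields "develop".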
-
-Cell Offset Table
-=================
-The Cell Offset Table will be read next. It will be used later to
-get the file offsets of the cells within each panel. The information
-needed to obtain the Cell Offset Table is contained in the DOCHEADER.
-The pertinent fields are NumCell and COTOffset. The following code
-fragment retrieves the Cell Offset Table from the file.
-
-PULONG pulCOT;
-
-pulCOT = malloc(DocHeader.NumCell*sizeof(ULONG));
-fseek(fpointer, DocHeader.COTOffset, SEEK_SET);
-fread(pulCOT, DocHeader.NumCell*sizeof(ULONG), 1, fpointer);
-
-
-Table Of Contents
-=================
-The next step is to read the table of contents into memory.
-The DOCHEADER contains all the values necessary to read the table of
-contents. TOCOffset contains the file offset of the table of contents
-and TOCSize contains the size. The table of contents is read in a
-similar manner to the vocabulary; that is, it is read into memory
-and then indexed. The following code fragment reads the table
-of contents into memory.
-
-PULONG pulTOCIndex;
-PBYTE pbTOC;
-ULONG ulTOCPointer;
-INT i;
-
-pbTOC = malloc(DocHeader.TOCSize);
-pulTOCIndex = malloc(DocHeader.NumTOCEntry*sizeof(ULONG));
-
-fseek(fpointer, DocHeader.TOCOffset,SEEK_SET);
-fread(pbTOC, DocHeader.TOCSize, 1, fpointer);
-
-/* Each index entry holds the address of a TOC entry (32-bit pointers) */
-ulTOCPointer = (ULONG) pbTOC;
-for (i=0;i< DocHeader.NumTOCEntry ;i++ ) {
-    pulTOCIndex[i] = ulTOCPointer;
-    /* The first byte of a TOC entry is its length */
-    ulTOCPointer += *(PBYTE) ulTOCPointer;
-} /* endfor */
-
-Each entry in the TOC index is an address of a TOC entry. Once the TOC
-is in memory, individual TOC items can be referenced. Note that this
-is not the only way to reference the TOC. There is also a TOC Offset
-Table which provides file offsets to individual TOC items. This is used
-when you need to reference a panel individually by its TOC index.
-This is the case when linking and when opening a HLP or INF file without
-displaying the table of contents first. The following code fragment
-retrieves the TOC Offset Table.
-
-PULONG pulTOCOffsetTable;
-
-pulTOCOffsetTable = malloc(DocHeader.NumTOCEntry*sizeof(ULONG));
-
-fseek(fpointer, DocHeader.OfsTiTICOfsTable,SEEK_SET);
-fread(pulTOCOffsetTable, DocHeader.NumTOCEntry*sizeof(ULONG), 1, fpointer);
-
-Once you have access to a table of contents entry, you must then read
-in the data contained there. This is not very straightforward because
-TOC entries can vary greatly in length. At this point, though, the
-important thing to read from a TOC entry is its title.
-You will probably not want to use the header information until you need
-to display a panel. The following code fragment just reads the title
-given that the table of contents is in memory and the entry we want
-to access is i. It also checks the extended header (if it exists)
-to determine whether or not the entry is a parent. This will allow
-you to display some sort of indicator that the entry can be expanded
-to display its children, similar to the way the help manager does.
-
-TOCIN TOCIn;
-USHORT NumBytes;
-BOOL fParent = FALSE;
-BOOL fExtHeader;
-PSZ pszTitle;
-USHORT TOCControlWord;
-PBYTE pbEntry = (PBYTE) pulTOCIndex[i];   /* address of TOC entry i */
-
-memcpy(&TOCIn, pbEntry, sizeof(TOCIN));
-NumBytes = sizeof(TOCIN);
-fExtHeader = TOCIn.HeadLevel&HIGH_ORDER_MASK;
-if (fExtHeader) {
-    memcpy(&TOCControlWord, pbEntry+sizeof(TOCIN), sizeof(USHORT));
-    NumBytes += sizeof(USHORT);
-    if (TOCControlWord&PANEL_EXTENDED_PARENT) {
-        fParent = TRUE;
-    }
-    if (TOCControlWord&PANEL_EXTENDED_X_Y) {
-        NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
-    }
-    if (TOCControlWord&PANEL_EXTENDED_CX_CY) {
-        NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
-    }
-    if (TOCControlWord&PANEL_EXTENDED_STYLE) {
-        NumBytes += sizeof(USHORT);
-    }
-    if (TOCControlWord&PANEL_EXTENDED_GROUP) {
-        NumBytes += sizeof(USHORT);
-    }
-    if (TOCControlWord&PANEL_EXTENDED_CTRLSINDEX) {
-        NumBytes += sizeof(USHORT);
-    }
-}
-/* Skip the cell list; each cell index is a 2-byte USHORT */
-NumBytes += TOCIn.NumCells*sizeof(USHORT);
-pszTitle = malloc(TOCIn.LengthEntry-NumBytes+1);
-memcpy(pszTitle, pbEntry+NumBytes, TOCIn.LengthEntry-NumBytes);
-pszTitle[TOCIn.LengthEntry-NumBytes] = '\0';
-
-If the table of contents entry is on disk, the following code fragment
-retrieves the title and status as a parent using the TOC Offset Table.
-
-TOCIN TOCIn;
-USHORT NumBytes;
-BOOL fParent = FALSE;
-BOOL fExtHeader;
-PSZ pszTitle;
-USHORT TOCControlWord;
-
-fseek(fpointer, pulTOCOffsetTable[i], SEEK_SET);
-fread(&TOCIn, sizeof(TOCIN), 1, fpointer);
-NumBytes = sizeof(TOCIN);
-fExtHeader = TOCIn.HeadLevel&HIGH_ORDER_MASK;
-if (fExtHeader) {
-    fread(&TOCControlWord, sizeof(USHORT), 1, fpointer);
-    NumBytes += sizeof(USHORT);
-    if (TOCControlWord&PANEL_EXTENDED_PARENT) {
-        fParent = TRUE;
-    }
-    if (TOCControlWord&PANEL_EXTENDED_X_Y) {
-        fseek(fpointer, sizeof(BYTE)+2*sizeof(USHORT), SEEK_CUR);
-        NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
-    }
-    if (TOCControlWord&PANEL_EXTENDED_CX_CY) {
-        fseek(fpointer, sizeof(BYTE)+2*sizeof(USHORT), SEEK_CUR);
-        NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
-    }
-    if (TOCControlWord&PANEL_EXTENDED_STYLE) {
-        fseek(fpointer, sizeof(USHORT), SEEK_CUR);
-        NumBytes += sizeof(USHORT);
-    }
-    if (TOCControlWord&PANEL_EXTENDED_GROUP) {
-        fseek(fpointer, sizeof(USHORT), SEEK_CUR);
-        NumBytes += sizeof(USHORT);
-    }
-    if (TOCControlWord&PANEL_EXTENDED_CTRLSINDEX) {
-        fseek(fpointer, sizeof(USHORT), SEEK_CUR);
-        NumBytes += sizeof(USHORT);
-    }
-}
-/* Skip the cell list; each cell index is a 2-byte USHORT */
-NumBytes += TOCIn.NumCells*sizeof(USHORT);
-fseek(fpointer, TOCIn.NumCells*sizeof(USHORT), SEEK_CUR);
-pszTitle = malloc(TOCIn.LengthEntry-NumBytes+1);
-fread(pszTitle, TOCIn.LengthEntry-NumBytes, 1, fpointer);
-pszTitle[TOCIn.LengthEntry-NumBytes] = '\0';
-
-IMPORTANT: The title is not null terminated. We had to explicitly
- add a null to the end of the title.
-
-The TOCIN structure and TOCControlWord provide a lot more information
-than we are using in the above fragment. One important piece of
-information is the head level. This is available in the HeadLevel
-field of the TOCIN structure. To actually get at the head level,
-you must AND the HeadLevel field with LOW_ORDER_MASK. The result is
-a byte indicating the head level (1-6). Most of the other information
-in the table of contents is not really usable until a panel is displayed.
-When we are displaying a panel, we will reaccess the table of contents
-entry to get the pertinent information.
-
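The walk through a table of contents entry can be sketched on an in-memory buffer as shown below. Several things here are assumptions made only for the sake of a runnable example: every mask and bit value (EXT_FLAG, LEVEL_MASK and the EXT_* bits) is a placeholder, since the real values live in a header file that this article does not reproduce; the entry is assumed to start with three bytes (length, HeadLevel, NumCells); and cell indexes are assumed to be 2-byte words. TocInfo, ParseTocEntry and CopyTocTitle are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder values; the real ones come from the format's header file. */
#define EXT_FLAG        0x20u   /* assumed "control word present" bit */
#define LEVEL_MASK      0x0Fu   /* assumed mask for the head level    */
#define EXT_X_Y         0x0001u /* assumed bit assignments; only the  */
#define EXT_CX_CY       0x0002u /* sizes of the extra fields are      */
#define EXT_STYLE       0x0004u /* taken from the text                */
#define EXT_GROUP       0x0008u
#define EXT_CTRLSINDEX  0x0400u

typedef struct {
    unsigned level;       /* head level                               */
    unsigned numCells;    /* number of cell indexes in the entry      */
    size_t   cellsOffset; /* offset of the cell list within the entry */
    size_t   titleOffset; /* offset of the title within the entry     */
    size_t   titleLen;    /* title length; it is not NUL-terminated   */
} TocInfo;

/* Walk one in-memory TOC entry. Returns 0 on success, -1 if malformed. */
int ParseTocEntry(const unsigned char *e, TocInfo *info)
{
    size_t pos = 3;                 /* length, HeadLevel, NumCells */
    size_t len;
    assert(e != NULL && info != NULL);
    len = e[0];
    if (len < pos) return -1;
    info->level = e[1] & LEVEL_MASK;        /* AND, not OR */
    info->numCells = e[2];
    if (e[1] & EXT_FLAG) {
        unsigned cw;
        if (len < pos + 2) return -1;
        cw = e[pos] | ((unsigned)e[pos + 1] << 8);
        pos += 2;
        if (cw & EXT_X_Y)        pos += 5;  /* byte, word, word */
        if (cw & EXT_CX_CY)      pos += 5;  /* byte, word, word */
        if (cw & EXT_STYLE)      pos += 2;  /* word */
        if (cw & EXT_GROUP)      pos += 2;  /* word */
        if (cw & EXT_CTRLSINDEX) pos += 2;  /* word */
    }
    info->cellsOffset = pos;
    pos += 2u * info->numCells;             /* assumed 2 bytes per cell */
    if (pos > len) return -1;
    info->titleOffset = pos;
    info->titleLen = len - pos;
    return 0;
}

/* Copy the title into out and NUL-terminate it. */
void CopyTocTitle(const unsigned char *e, const TocInfo *info,
                  char *out, size_t outSize)
{
    assert(out != NULL && outSize > info->titleLen);
    memcpy(out, e + info->titleOffset, info->titleLen);
    out[info->titleLen] = '\0';
}
```

Returning offsets instead of copies keeps the parser usable both for listing titles and, later, for finding the cell list when a panel is displayed.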
-
-Displaying a Panel
-==================
-The next step is displaying a panel. All that is necessary to display
-a panel is an index to a table of contents entry. We will use this
-index to get all the information about the panel. Table of contents
-indexes are obtained from various places including the index table,
-the panel number table, the panel name table, and the search table.
-In our viewer, we obtain the table of contents index by determining which
-table of contents entry the user selected. Remember that we did not save
-the title entries of the table of contents. The reason we did not is
-because all we need is an index to the table of contents entry. The
-title is not necessary for access.
-
-The first step is to get the TOCIN structure from memory or from the
-file. This is performed in the same manner as the above code
-fragments. Once we have the TOCIN structure, we can detect the presence
-of an extended table of contents header (hereafter referred to as the
-control word). To detect the presence of a control word, we AND the
-HeadLevel value with the HIGH_ORDER_MASK. If the resulting value
-is not zero, there is a control word present. The control word
-is defined as a USHORT. By ANDing the TOCControlWord with various
-constants defined in the header file, we can determine information
-about the panel including location, size, group, etc. The following
-constants are used to find that information.
-
-PANEL_EXTENDED_VIEWPORT
-PANEL_EXTENDED_NOSEARCH - Entry should not be searched
-PANEL_EXTENDED_NOPRINT - Entry should not be printed
-PANEL_EXTENDED_AUTO
-PANEL_EXTENDED_CHILD - Entry is a child
-PANEL_EXTENDED_CLEAR
-PANEL_EXTENDED_DEPENDENT
-PANEL_EXTENDED_PARENT - Entry is a parent
-PANEL_EXTENDED_TUTORIAL
-
-PANEL_EXTENDED_X_Y - lower left location of panel
-    Read in additional byte,word,word
-PANEL_EXTENDED_CX_CY - size of panel
-    Read in additional byte,word,word
-PANEL_EXTENDED_STYLE - style of window
-    Read in additional word
-PANEL_EXTENDED_GROUP - group number
-    Read in additional word
-PANEL_EXTENDED_CTRLSINDEX - control group index
-    Read in additional word
-
-To obtain a better understanding of things like groups and control
-group indexes, please consult the Information Presentation Facility
-Guide and Reference.
-
-The last five constants indicate that additional information needs
-to be read after the control word. You will note that in the
-above code fragments for reading the title, we had to process these
-values and skip bytes where appropriate. Later, in the code
-fragments to retrieve control word information, we will show you how
-to get these extra values.
-
-The other bit of information available in the HeadLevel field is,
-surprisingly, the head level! We can obtain the head level by ANDing
-the value in HeadLevel with LOW_ORDER_MASK.
-
-The following code fragment reads in the extended header information
-from memory and sets some variables based on the above constants.
-It also obtains the head level. Note that for your particular
-application, you might not need all of this information.
-
-/* INSERT CODE FRAGMENT THAT USES MEMCPY TO GET TOC AND PANEL INFO */
-
-If you are reading the table of contents from the file instead of
-memory, use the following code fragment.
-
-TOCIN TOCIn;
-BYTE HeadLevel;
-BOOL fExtHeader;
-USHORT TOCControlWord;
-BYTE bxyUnits, bcxcyUnits;
-USHORT usx, usy, uscx, uscy;
-USHORT usStyle, usGroupNumber, usControlGroupIndex;
-
-fseek(fpointer, pulTOCOffsetTable[i], SEEK_SET);
-fread(&TOCIn, sizeof(TOCIN), 1, fpointer);
-HeadLevel = TOCIn.HeadLevel&LOW_ORDER_MASK;
-fExtHeader = TOCIn.HeadLevel&HIGH_ORDER_MASK;
-if (fExtHeader) {
-    fread(&TOCControlWord, sizeof(USHORT), 1, fpointer);
-    if (TOCControlWord&PANEL_EXTENDED_X_Y) {
-        fread(&bxyUnits, sizeof(BYTE), 1, fpointer);
-        fread(&usx, sizeof(USHORT), 1, fpointer);
-        fread(&usy, sizeof(USHORT), 1, fpointer);
-    }
-    if (TOCControlWord&PANEL_EXTENDED_CX_CY) {
-        fread(&bcxcyUnits, sizeof(BYTE), 1, fpointer);
-        fread(&uscx, sizeof(USHORT), 1, fpointer);
-        fread(&uscy, sizeof(USHORT), 1, fpointer);
-    }
-    if (TOCControlWord&PANEL_EXTENDED_STYLE)
-        fread(&usStyle, sizeof(USHORT), 1, fpointer);
-    if (TOCControlWord&PANEL_EXTENDED_GROUP)
-        fread(&usGroupNumber, sizeof(USHORT), 1, fpointer);
-    if (TOCControlWord&PANEL_EXTENDED_CTRLSINDEX)
-        fread(&usControlGroupIndex, sizeof(USHORT), 1, fpointer);
-}
-
-The information from the table of contents header is generally only
-used to decide how the window that the help is in will be displayed.
-You can use it to position your window and to decide whether it has
-a border, minimize or maximize buttons, etc.
-
-Once you have all of the display information from the table of contents,
-you can begin actually getting the information that is in the panel.
-Don't forget to display the title of the panel. We obtained it
-again in the above code fragments.
-
-A panel consists of one or more cells that contain formatting information
-and the text of the panel. In most cases, panels have more than one
-cell, so you cannot make the assumption that panels have one cell.
-The number of cells in a panel can be found from the NumCells field
-in the TOCIN structure.
-
-After reading the extended header information in the above samples,
-the next bytes in the table of contents entry are the list of cells.
-Save a pointer to that place (or its file offset if you are reading
-from disk) in a value called pusBeginCell. These cell values actually
-index into the Cell Offset Table. Using the Cell Offset Table we can get
-the file offsets of the individual cells and display the information in them.
-
-The offsets in the COT are only used to retrieve the actual cell.
-In and of themselves, they provide no additional information. For this
-reason, they will only be used as a part of a code fragment to retrieve
-the actual cell.
-
-In retrieving the cell, the first step is to retrieve the cell header.
-The cell offset points to this header. Once we have the header, we
-can get the information to display the cell.
-
-The following code fragment loops through all cells in table of contents
-entry i, and reads in the cell headers. The dots indicate
-where you would actually process the information in the cell, which we
-will do later.
-
-INT j;
-USHORT usCOTIndex;
-CELL Cell;
-
-for (j=0; j<TOCIn.NumCells; j++) {
-/* If the table of contents is in memory, pusBeginCell is a pointer: */
-    memcpy(&usCOTIndex, (PBYTE)pusBeginCell+j*sizeof(USHORT), sizeof(USHORT));
-/* If the table of contents is on disk, pusBeginCell is a file offset; */
-/* use these two lines instead of the memcpy:                          */
-/*  fseek(fpointer, pusBeginCell+j*sizeof(USHORT), SEEK_SET);          */
-/*  fread(&usCOTIndex, sizeof(USHORT),1,fpointer);                     */
-/* What follows is the same for both cases */
-    fseek(fpointer, pulCOT[usCOTIndex], SEEK_SET);
-    fread(&Cell, sizeof(CELL),1,fpointer);
-    .
-    .
-    .
-}
-
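The lookup from a panel's cell list to file offsets can be sketched on in-memory data as follows. ResolveCellOffsets is a hypothetical helper; the sketch assumes each cell index is a 2-byte value stored least significant byte first, and it adds a range check against the size of the Cell Offset Table that the fragment above does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Resolve each cell index in a panel's cell list to a file offset.
   cells points at numCells 2-byte indexes (assumed little-endian).
   Returns the number of offsets written to out, or -1 if an index
   lies outside the Cell Offset Table. */
int ResolveCellOffsets(const unsigned char *cells, size_t numCells,
                       const uint32_t *cot, size_t cotCount,
                       uint32_t *out)
{
    size_t j;
    assert(cells != NULL && cot != NULL && out != NULL);
    for (j = 0; j < numCells; j++) {    /* j < numCells, not <= */
        size_t idx = cells[2 * j] | ((size_t)cells[2 * j + 1] << 8);
        if (idx >= cotCount) return -1;
        out[j] = cot[idx];
    }
    return (int)numCells;
}
```

Each offset in out is where the cell header for that cell starts in the file.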
-The cell information allows us to actually display the text itself.
-
-In the cell header, we have information about the CVT and the CDI.
-These two arrays give us the actual formatting information. The CDI
-contains formatting information and words. The words are represented
-as indexes into the CVT. The CVT elements are indexes into the
-vocabulary (CLVT). To use the CVT and CDI, we will read them into
-memory and process the CDI byte by byte. The following code fragment
-reads the CVT and the CDI into memory. Note that after reading
-the cell header from the file, we are pointing at the beginning of the
-CDI.
-
-PBYTE CDI;
-PUSHORT CVT;
-INT rc;
-
-CDI = malloc(Cell.CDISize);
-rc=fread(CDI,Cell.CDISize,1,fpointer);
-rc=fseek(fpointer,Cell.CVTOffset, SEEK_SET);
-CVT = malloc(Cell.CVTSize*sizeof(USHORT));
-rc=fread(CVT, Cell.CVTSize*sizeof(USHORT),1,fpointer);
-
-Even though the CVT is right after the CDI in the header, we cannot
-assume that after reading the CDI we are pointing at the CVT. We
-must fseek to the CVTOffset.
-
-Now we want to read the CDI byte by byte and process each item.
-The CDI values are either 0xFA through 0xFF or they are a number which
-indexes into the CVT which then indexes into the vocabulary.
-Whenever the CDI value is 0xFF, we need to read additional info.
-This additional info is formatting information, font changes, links,
-etc. The 0xFF escape code values are documented in appendix A.
-The following code fragment does some very basic formatting of a cell.
-Figure 5 provides the values for each of the BYTE_* values.
-
-INT m;
-INT k;
-INT l;
-CHAR String[255];
-BOOL Together = FALSE;
-
-for (m=0;m<Cell.CDISize;m++ ) {
- switch (CDI[m]) {
- /* The value indicates a new paragraph */
- case BYTE_PARA :
- printf("\n ");
- break;
- /* The value indicates that the text following should be centered. */
- /* Center all text until a BYTE_NEWLINE is encountered */
- case BYTE_CENTRE :
- printf("Center\n");
- break;
- /* Autoblanks are used to indicate when there should be no spaces */
- /* between words. When you encounter an autoblank, all words */
- /* following should be printed without spaces until another autoblank */
- /* is encountered. */
- case BYTE_AUTOBLANK :
- if (Together)
- Together = FALSE;
- else
- Together = TRUE;
- break;
- /* The value indicates to print a new line */
- case BYTE_NEWLINE :
- printf("\n");
- break;
- /* The value indicates a space */
- case BYTE_BLANK :
- printf(" ");
- break;
- /* The value is an escape code. We should do some processing within */
- /* This statement to handle the various cases documented in */
- /* Appendix A */
- case BYTE_ESC :
- break;
- /* Not sure */
- case WILD_CHAR :
- break;
- /* Not sure */
- case ESC_CHAR :
- break;
- /* It is a word. Index into the CVT to get the vocabulary offset */
- /* And print the word */
- default:
- l = pulVocabIndex[CVT[CDI[m]]]+1;
- for (k=0;k<(pchVocab[pulVocabIndex[CVT[CDI[m]]]]-1) ;k++,l++ )
- String[k] = pchVocab[l];
- /* Don't forget to null terminate the word. */
- String[k] = '\0';
- if (Together)
- printf("%s",String);
- else
- printf("%s ",String);
-
- break;
- } /* endswitch */
-} /* endfor */
-
-Those are the basics. You should now be able to display a cell.
-
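The same decoding loop can be sketched in a form that writes into a buffer, which makes it easy to try on synthetic data. The numeric values of the control bytes below are assumptions (the text only says they lie in the range 0xFA to 0xFF), DecodeCell and Append are hypothetical names, and the sketch simply stops at the first escape code because the escape layout is documented in Appendix A and not here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values; the text only gives the range 0xFA..0xFF. */
enum { B_PARA = 0xFA, B_CENTRE = 0xFB, B_AUTOBLANK = 0xFC,
       B_NEWLINE = 0xFD, B_BLANK = 0xFE, B_ESC = 0xFF };

/* Append n bytes of src to out if they fit; returns the new length. */
static size_t Append(char *out, size_t len, size_t cap,
                     const char *src, size_t n)
{
    if (len + n < cap) {
        memcpy(out + len, src, n);
        len += n;
        out[len] = '\0';
    }
    return len;
}

/* Decode CDI bytes to plain text in out. Stops at the first escape
   code and returns the number of CDI bytes consumed. */
size_t DecodeCell(const unsigned char *cdi, size_t cdiSize,
                  const unsigned short *cvt, size_t cvtCount,
                  const unsigned char *vocab, const size_t *vocabIndex,
                  char *out, size_t cap)
{
    size_t m, len = 0;
    int together = 0;                 /* toggled by each autoblank */
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (m = 0; m < cdiSize; m++) {
        unsigned char b = cdi[m];
        if (b == B_ESC) break;        /* escape codes are not handled */
        else if (b == B_PARA) len = Append(out, len, cap, "\n\n", 2);
        else if (b == B_NEWLINE) len = Append(out, len, cap, "\n", 1);
        else if (b == B_BLANK) len = Append(out, len, cap, " ", 1);
        else if (b == B_AUTOBLANK) together = !together;
        else if (b == B_CENTRE) continue;   /* layout only; ignored */
        else if (b < cvtCount) {
            /* word: CDI byte -> CVT entry -> vocabulary offset */
            const unsigned char *w = vocab + vocabIndex[cvt[b]];
            len = Append(out, len, cap, (const char *)w + 1,
                         (size_t)w[0] - 1);
            if (!together) len = Append(out, len, cap, " ", 1);
        }
    }
    return m;
}
```

The return value tells the caller where decoding stopped, so a fuller viewer could pick up at that position once it knows how to interpret the escape sequence.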
-Accessing other information within the INF file
------------------------------------------------
-Index table (index) - synonym table
-Index command table (icmd)
-Panel table (context sensitive help)
-Panel name table (link)
-
-Database names
-Fonts
-Country / Grammar
-Bitmaps (should be interesting due to different compression)
-Strings
-Control
-Super Search Table (aka FTS or Full Text Search)
-Child pages
-
-
-Searching
-=========
-If you just want to search an INF file, you will still have to read in
-the vocabulary and the table of contents. You search the vocabulary
-for a word, and then get the index of that word. This index can be
-used to index into the Super Search Table which allows you to
-determine which panels contain that word. What a pain!
-
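The first step of a search, finding the vocabulary index of a word, can be sketched as below. FindVocabWord is a hypothetical helper that scans the length-prefixed vocabulary directly; the comparison is exact and case-sensitive, which is a simplification. How the resulting index is used with the Super Search Table is not covered by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the vocabulary index of word, or -1 if it is not present.
   Each vocabulary item starts with a length byte that counts itself. */
long FindVocabWord(const unsigned char *vocab, size_t vocabSize,
                   const char *word)
{
    size_t pos = 0, wlen;
    long i = 0;
    assert(vocab != NULL && word != NULL);
    wlen = strlen(word);
    while (pos < vocabSize) {
        size_t len = vocab[pos];
        if (len == 0 || pos + len > vocabSize) break;   /* malformed */
        if (len - 1 == wlen && memcmp(vocab + pos + 1, word, wlen) == 0)
            return i;
        pos += len;
        i++;
    }
    return -1;
}
```

A linear scan is enough to show the idea; a viewer that has already built the vocabulary index could search through that instead.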
- ---------o0O0o---------
-
+
+ Information on reading the INF file format
+ ==========================================
+
+Author: unknown
+Date: unknown
+
+
+This article is intended to provide the reader with enough information
+to read and search the HLP/INF file format. Support is not provided
+for constructing your own INF files.
+
+The INF and HLP file formats are exactly the same except for the
+switching of two bytes in the header. Therefore all the information in
+this article applies to both HLP and INF files. The difference between
+the two will be explained later. I will, however, use the term "INF
+format" to distinguish between OS/2 HLP files and Windows HLP files.
+
+This article will be divided into three main parts. First there will
+be an overview of the file format. Second there will be information on
+accessing parts of this format, including code samples. Third will be
+information on searching the INF format.
+
+Note that to understand a lot of the concepts in displaying panels,
+an understanding of the IPF Guide and Reference is necessary. This
+will give an understanding of the ways in which panels can be modified
+in terms of sizes, styles, etc.
+
+
+Overview
+========
+
+ [This is where I will put Stan's document]
+
+Accessing Information in the INF/HLP file
+-----------------------------------------
+The next part of this article is organized as if you are writing your
+own INF/HLP viewer. It will provide explanations on how to do the
+following things:
+
+1. Read in header information. This will allow you to display the
+ title and access the rest of the information in the panel.
+
+2. Read in and index the vocabulary.
+
+3. Read in the Cell Offset Table. This will be used later to display
+ panels.
+
+4. Read in the table of contents. Explanations will be given for two
+ methods of accessing the table of contents. The first is from
+ memory. This method is useful if you are reading the entire table
+ of contents at once to display it, or if your application will
+ provide primary access to panels through the table of contents. The
+ second method of accessing table of contents entries is directly
+ from the file. This method is useful for linking and for displaying
+ a panel initially when a file is opened. In OS/2, VIEW uses the
+ first method whereas displaying help for an application uses the
+ second method.
+
+5. Display the titles of all panels in the file. Titles will not be
+ stored because access to the table of contents is provided by
+ index, not by title. There is a lot of extra information in the
+ table of contents entries that will not be used until a cell is
+ actually displayed.
+
+6. Display a panel. This will be the most involved explanation.
+ Displaying a panel actually requires retrieving a lot of
+ information from the table of contents and then reading and
+ formatting the data within the panel.
+
+Note that this assumes you are going to use the file in much the same
+way as OS/2's VIEW program and Help Manager.
+
+
+Headers
+=======
+The first step in accessing information in the file is to read in the
+header. Figures 1 and 2 show the structures used to access the
+regular and extended headers. The extended header is not in every
+file. It can be detected by checking the ExtHeaderOffset in the
+DOCHEADER structure. If the ExtHeaderOffset is greater than 0, then it
+is the file offset of the extended header. The following code fragment
+opens the file and reads the DOCHEADER and the EXTHEADER if necessary.
+
+DOCHEADER DocHeader;
+EXTHEADER ExtHeader;
+FILE* fpointer;
+CHAR* FileName;   /* must point to the name of the file before use */
+
+fpointer = fopen(FileName,"rb");
+fread(&DocHeader,sizeof(DOCHEADER),1,fpointer);
+if (DocHeader.ExtHeaderOffset > 0) {
+    fseek(fpointer, DocHeader.ExtHeaderOffset, SEEK_SET);
+    fread(&ExtHeader,sizeof(EXTHEADER),1,fpointer);
+}
+
+The DOCHEADER contains all of the information needed to access data
+within the file. At this point, though, there are only a couple of
+fields we are concerned about. The first is the FileCTRLWord. This
+field indicates the type of file. If it is 0x01505348, it is an INF file;
+if it is 0x10505348, it is a HLP file. The other field of use right now
+is the Title; this is simply a null-terminated string containing
+the title of the document. One interesting note is that although
+the title for a HLP file is normally specified in an application,
+there still is a title in the HLP file if the writer specified
+a :title. tag.
+
+
+Vocabulary
+==========
+Once the DOCHEADER is obtained, the next step is to read the vocabulary.
+All references to the vocabulary in the INF/HLP file are made via an
+index value. This index, however, is not in the file so it must be
+built. The following code fragment reads in the vocabulary and builds
+an index to it.
+
+PULONG pulVocabIndex;
+PCHAR pchVocab;
+ULONG ulVocabPointer;
+INT i;
+
+pchVocab = malloc(DocHeader.CLVTSize);
+pulVocabIndex = malloc(DocHeader.CLVTNumWords*sizeof(ULONG));
+
+fseek(fpointer, DocHeader.CLVTOffset,SEEK_SET);
+fread(pchVocab, DocHeader.CLVTSize, 1, fpointer);
+
+ulVocabPointer = 0;
+for (i=0;i< DocHeader.CLVTNumWords ;i++ ) {
+    pulVocabIndex[i] = ulVocabPointer;
+    /* The length byte is unsigned; cast it so values above 127 stay positive */
+    ulVocabPointer += (BYTE) pchVocab[ulVocabPointer];
+} /* endfor */
+
+Remember that when referencing the vocabulary, the first byte contains the
+length of the word, including the first byte. Here is the result of the
+above code sample with a vocabulary of {you, can, develop}.
+
+Example:
+--------
+pchVocab      ->  4can8develop4you
+                  │   │       │
+                  │ ┌─┘┌──────┘
+                  │ │  │
+pulVocabIndex -> {0,4,12}
+
+Given any index into the vocabulary, you can then reference the
+appropriate word.
+
+
+Cell Offset Table
+=================
+The Cell Offset Table will be read next. It will be used later to
+get the file offsets of the cells within each panel. The information
+needed to obtain the Cell Offset Table is contained in the DOCHEADER.
+The pertinent fields are NumCell and COTOffset. The following code
+fragment retrieves the Cell Offset Table from the file.
+
+PULONG pulCOT;
+
+pulCOT = malloc(DocHeader.NumCell*sizeof(ULONG));
+fseek(fpointer, DocHeader.COTOffset, SEEK_SET);
+fread(pulCOT, DocHeader.NumCell*sizeof(ULONG), 1, fpointer);
+
+
+Table Of Contents
+=================
+The next step is to read the table of contents into memory.
+The DOCHEADER contains all the values necessary to read the table of
+contents. TOCOffset contains the file offset of the table of contents
+and TOCSize contains the size. The table of contents is read in a
+similar manner to the vocabulary; that is, it is read into memory
+and then indexed. The following code fragment reads the table
+of contents into memory.
+
+PULONG pulTOCIndex;
+PBYTE pbTOC;
+ULONG ulTOCPointer;
+INT i;
+
+pbTOC = malloc(DocHeader.TOCSize);
+pulTOCIndex = malloc(DocHeader.NumTOCEntry*sizeof(ULONG));
+
+fseek(fpointer, DocHeader.TOCOffset,SEEK_SET);
+fread(pbTOC, DocHeader.TOCSize, 1, fpointer);
+
+/* Each index entry holds the address of a TOC entry (32-bit pointers) */
+ulTOCPointer = (ULONG) pbTOC;
+for (i=0;i< DocHeader.NumTOCEntry ;i++ ) {
+    pulTOCIndex[i] = ulTOCPointer;
+    /* The first byte of a TOC entry is its length */
+    ulTOCPointer += *(PBYTE) ulTOCPointer;
+} /* endfor */
+
+Each entry in the TOC index is an address of a TOC entry. Once the TOC
+is in memory, individual TOC items can be referenced. Note that this
+is not the only way to reference the TOC. There is also a TOC Offset
+Table which provides file offsets to individual TOC items. This is used
+when you need to reference a panel individually by its TOC index.
+This is the case when linking and when opening a HLP or INF file without
+displaying the table of contents first. The following code fragment
+retrieves the TOC Offset Table.
+
+PULONG pulTOCOffsetTable;
+
+pulTOCOffsetTable = malloc(DocHeader.NumTOCEntry*sizeof(ULONG));
+
+fseek(fpointer, DocHeader.OfsTiTICOfsTable,SEEK_SET);
+fread(pulTOCOffsetTable, DocHeader.NumTOCEntry*sizeof(ULONG), 1, fpointer);
+
+Once you have access to a table of contents entry, you must then read
+in the data contained there. This is not very straightforward because
+TOC entries can vary greatly in length. At this point, though, the
+important thing to read from a TOC entry is its title.
+You will probably not want to use the header information until you need
+to display a panel. The following code fragment just reads the title
+given that the table of contents is in memory and the entry we want
+to access is i. It also checks the extended header (if it exists)
+to determine whether or not the entry is a parent. This will allow
+you to display some sort of indicator that the entry can be expanded
+to display its children, similar to the way the help manager does.
+
+TOCIN TOCIn;
+USHORT NumBytes;
+BOOL fParent = FALSE;
+BOOL fExtHeader;
+PSZ pszTitle;
+USHORT TOCControlWord;
+PBYTE pbEntry = (PBYTE) pulTOCIndex[i];   /* address of TOC entry i */
+
+memcpy(&TOCIn, pbEntry, sizeof(TOCIN));
+NumBytes = sizeof(TOCIN);
+fExtHeader = TOCIn.HeadLevel&HIGH_ORDER_MASK;
+if (fExtHeader) {
+    memcpy(&TOCControlWord, pbEntry+sizeof(TOCIN), sizeof(USHORT));
+    NumBytes += sizeof(USHORT);
+    if (TOCControlWord&PANEL_EXTENDED_PARENT) {
+        fParent = TRUE;
+    }
+    if (TOCControlWord&PANEL_EXTENDED_X_Y) {
+        NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
+    }
+    if (TOCControlWord&PANEL_EXTENDED_CX_CY) {
+        NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
+    }
+    if (TOCControlWord&PANEL_EXTENDED_STYLE) {
+        NumBytes += sizeof(USHORT);
+    }
+    if (TOCControlWord&PANEL_EXTENDED_GROUP) {
+        NumBytes += sizeof(USHORT);
+    }
+    if (TOCControlWord&PANEL_EXTENDED_CTRLSINDEX) {
+        NumBytes += sizeof(USHORT);
+    }
+}
+/* Skip the cell list; each cell index is a 2-byte USHORT */
+NumBytes += TOCIn.NumCells*sizeof(USHORT);
+pszTitle = malloc(TOCIn.LengthEntry-NumBytes+1);
+memcpy(pszTitle, pbEntry+NumBytes, TOCIn.LengthEntry-NumBytes);
+pszTitle[TOCIn.LengthEntry-NumBytes] = '\0';
+
+If the table of contents entry is on disk, the following code fragment
+retrieves the title and status as a parent using the TOC Offset Table.
+
+TOCIN TOCIn;
+USHORT NumBytes;
+USHORT ExtHeader;
+BOOL fParent = FALSE;
+PSZ pszTitle;
+USHORT TOCControlWord;
+
+fseek(fpointer, pulTOCOffsetTable[i], SEEK_SET);
+fread(&TOCIn, sizeof(TOCIN), 1, fpointer);
+NumBytes = sizeof(TOCIN);
+ExtHeader = TOCIn.HeadLevel&HIGH_ORDER_MASK;
+if (ExtHeader) {
+ fread(&TOCControlWord, sizeof(USHORT), 1, fpointer);
+ NumBytes += sizeof(USHORT);
+ if (TOCControlWord&PANEL_EXTENDEDPARENT) {
+ fParent = TRUE;
+ }
+ if (TOCControlWord&PANEL_EXTENDED_X_Y) {
+ fseek(fpointer, sizeof(BYTE)+2*sizeof(USHORT), SEEK_CUR);
+ NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
+ }
+ if (TOCControlWord&PANEL_EXTENDED_CX_CY) {
+ fseek(fpointer, sizeof(BYTE)+2*sizeof(USHORT), SEEK_CUR);
+ NumBytes+=(sizeof(BYTE)+2*sizeof(USHORT));
+ }
+ if (TOCControlWord&PANEL_EXTENDED_STYLE) {
+ fseek(fpointer, sizeof(USHORT), SEEK_CUR);
+ NumBytes += sizeof(USHORT);
+ }
+ if (TOCControlWord&PANEL_EXTENDED_GROUP) {
+ fseek(fpointer, sizeof(USHORT), SEEK_CUR);
+ NumBytes += sizeof(USHORT);
+ }
+ if (TOCControlWord&PANEL_EXTENDED_CTRLSINDEX) {
+ fseek(fpointer, sizeof(USHORT), SEEK_CUR);
+ NumBytes += sizeof(USHORT);
+ }
+}
+NumBytes += TOCIn.NumCells*sizeof(BYTE);
+fseek(fpointer, TOCIn.NumCells*sizeof(BYTE), SEEK_CUR);
+pszTitle = malloc(TOCIn.LengthEntry-NumBytes+1);
+fread(pszTitle, TOCIn.LengthEntry-NumBytes, 1, fpointer);
+pszTitle[TOCIn.LengthEntry-NumBytes] = '\0';
+
+IMPORTANT: The title is not null terminated. We had to explicitly
+ add a null to the end of the title.
+
+The TOCIN structure and TOCControlWord provide a lot more information
+than we are using in the above fragments. One important piece of
+information is the head level. This is available in the HeadLevel
+field of the TOCIN structure. To actually get at the head level,
+you must AND the HeadLevel field with LOW_ORDER_MASK. The result is
+a byte indicating the head level (1-6). Most of the other information
+in the table of contents is not really usable until a panel is displayed.
+When we are displaying a panel, we will reaccess the table of contents
+entry to get the pertinent information.
+
+
+Displaying a Panel
+==================
+The next step is displaying a panel. All that is necessary to display
+a panel is an index to a table of contents entry. We will use this
+index to get all the information about the panel. Table of contents
+indexes are obtained from various places including the index table,
+the panel number table, the panel name table, and the search table.
+In our viewer, we obtain the table of contents index by determining
+which table of contents entry the user selected. Remember that we did
+not save the title entries of the table of contents, because all we
+need is an index to the table of contents entry. The title is not
+necessary for access.
+
+The first step is to get the TOCIN structure from memory or from the
+file. This is performed in the same manner as the above code
+fragments. Once we have the TOCIN structure, we can detect the presence
+of an extended table of contents header (hereafter referred to as the
+control word). To detect the presence of a control word, we AND the
+HeadLevel value with the HIGH_ORDER_MASK. If the resulting value
+is not zero, there is a control word present. The control word
+is defined as a USHORT. By ANDing the TOCControlWord with various
+constants defined in the header file, we can determine information
+about the panel including location, size, group, etc. The following
+constants are used to find that information.
+
+PANEL_EXTENDED_VIEWPORT
+PANEL_EXTENDED_NOSEARCH - Entry should not be searched
+PANEL_EXTENDED_NOPRINT - Entry should not be printed
+PANEL_EXTENDED_AUTO
+PANEL_EXTENDED_CHILD - Entry is a child
+PANEL_EXTENDED_CLEAR
+PANEL_EXTENDED_DEPENDENT
+PANEL_EXTENDED_PARENT - Entry is a parent
+PANEL_EXTENDED_TUTORIAL
+
+PANEL_EXTENDED_X_Y - lower left location of panel
+Read in additional byte,word,word
+PANEL_EXTENDED_CX_CY - size of panel
+Read in additional byte,word,word
+PANEL_EXTENDED_STYLE - style of window
+Read in additional word
+PANEL_EXTENDED_GROUP - group number
+Read in additional word
+PANEL_EXTENDED_CTRLSINDEX - control group index
+Read in additional word
+
+To obtain a better understanding of things like groups and control
+group indexes, please consult the Information Presentation Facility
+Guide and Reference.
+
+The last five constants indicate that additional information needs
+to be read after the control word. You will note that in the
+above code fragments for reading the title, we had to process these
+values and skip bytes where appropriate. Later, in the code
+fragments to retrieve control word information, we will show you how
+to get these extra values.
+
+The other bit of information available in the HeadLevel field is,
+surprisingly, the head level! We can obtain the head level by ANDing
+the value in HeadLevel with LOW_ORDER_MASK.
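+
+As a small illustration of the two mask operations, the sketch below
+splits a HeadLevel byte into its head level and a flag telling whether
+a control word follows. The mask values 0x0F and 0xF0 are assumptions
+made for the example, as is the helper name; use the LOW_ORDER_MASK
+and HIGH_ORDER_MASK definitions from your own header file.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mask values, for illustration only. */
#define SKETCH_LOW_ORDER_MASK  0x0Fu
#define SKETCH_HIGH_ORDER_MASK 0xF0u

/* Hypothetical helper: split the HeadLevel byte of a TOC entry.
   *level receives the low-order bits (the head level) and
   *has_control_word is set to 1 when any high-order bit is set. */
void split_head_level(uint8_t head_level_byte,
                      unsigned *level, int *has_control_word)
{
    assert(level != NULL && has_control_word != NULL);
    *level = head_level_byte & SKETCH_LOW_ORDER_MASK;
    *has_control_word =
        (head_level_byte & SKETCH_HIGH_ORDER_MASK) != 0;
}
```
+
+A bitwise OR would not work here: ORing with a nonzero mask sets every
+bit of the mask, so the result could never be zero and the test for a
+control word would always succeed.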
+
+The following code fragment reads in the extended header information
+from memory and sets some variables based on the above constants.
+It also obtains the head level. Note that for your particular
+application, you might not need all of this information.
+
+/* INSERT CODE FRAGMENT THAT USES MEMCPY TO GET TOC AND PANEL INFO */
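+
+Until that fragment is supplied, the following is only a sketch of
+what a memcpy-based version might look like. It parses the extended
+header from a byte buffer that starts just past the TOCIN structure.
+The SK_* bit values, the PanelInfo structure and the function names
+are all placeholders invented for this example; the real
+PANEL_EXTENDED_* values come from the header file. Fields are read in
+the same order as in the file-based fragments in this article.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder bit values, NOT the real PANEL_EXTENDED_* constants. */
enum {
    SK_X_Y        = 0x0001,
    SK_CX_CY      = 0x0002,
    SK_STYLE      = 0x0004,
    SK_GROUP      = 0x0008,
    SK_CTRLSINDEX = 0x0010
};

/* Hypothetical result structure; unused fields stay zero. */
typedef struct {
    uint16_t control;               /* the control word itself     */
    uint8_t  xy_units;              /* valid if SK_X_Y is set      */
    uint16_t x, y;
    uint8_t  cxcy_units;            /* valid if SK_CX_CY is set    */
    uint16_t cx, cy;
    uint16_t style, group, ctrls_index;
} PanelInfo;

/* Copy one 16-bit value (host byte order) and advance the cursor. */
static uint16_t take_u16(const uint8_t **p)
{
    uint16_t v;

    memcpy(&v, *p, sizeof v);
    *p += sizeof v;
    return v;
}

/* Parse the extended header at `ext`. Returns the number of bytes
   consumed, which is also the amount to skip to reach the cell list. */
size_t parse_extended_header(const uint8_t *ext, PanelInfo *out)
{
    const uint8_t *p = ext;

    assert(ext != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    out->control = take_u16(&p);
    if (out->control & SK_X_Y) {        /* byte, word, word */
        out->xy_units = *p++;
        out->x = take_u16(&p);
        out->y = take_u16(&p);
    }
    if (out->control & SK_CX_CY) {      /* byte, word, word */
        out->cxcy_units = *p++;
        out->cx = take_u16(&p);
        out->cy = take_u16(&p);
    }
    if (out->control & SK_STYLE)
        out->style = take_u16(&p);
    if (out->control & SK_GROUP)
        out->group = take_u16(&p);
    if (out->control & SK_CTRLSINDEX)
        out->ctrls_index = take_u16(&p);
    return (size_t)(p - ext);
}
```
+
+The sketch assumes the caller has already checked that the buffer is
+long enough; a real viewer should also validate the entry length.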
+
+If you are reading the table of contents from the file instead of
+memory, use the following code fragment.
+
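+TOCIN TOCIn;
+USHORT NumBytes;
+BOOL fParent = FALSE;
+PSZ pszTitle;
+BYTE HeadLevel;
+USHORT ExtHeader;
+USHORT TOCControlWord;
+BYTE bxyUnits, bcxcyUnits;
+USHORT usx, usy, uscx, uscy;
+USHORT usStyle, usGroupNumber, usControlGroupIndex;
+
+fseek(fpointer, pulTOCOffsetTable[i], SEEK_SET);
+fread(&TOCIn, sizeof(TOCIN), 1, fpointer);
+/* Low-order bits hold the head level, high-order bits the flags */
+HeadLevel = TOCIn.HeadLevel&LOW_ORDER_MASK;
+ExtHeader = TOCIn.HeadLevel&HIGH_ORDER_MASK;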
+if (ExtHeader) {
+ fread(&TOCControlWord, sizeof(USHORT), 1, fpointer);
+ if (TOCControlWord&PANEL_EXTENDED_X_Y) {
+ fread(&bxyUnits, sizeof(BYTE), 1, fpointer);
+ fread(&usx, sizeof(USHORT), 1, fpointer);
+ fread(&usy, sizeof(USHORT), 1, fpointer);
+ }
+ if (TOCControlWord&PANEL_EXTENDED_CX_CY) {
+ fread(&bcxcyUnits, sizeof(BYTE), 1, fpointer);
+ fread(&uscx, sizeof(USHORT), 1, fpointer);
+ fread(&uscy, sizeof(USHORT), 1, fpointer);
+ }
+ if (TOCControlWord&PANEL_EXTENDED_STYLE)
+ fread(&usStyle, sizeof(USHORT), 1, fpointer);
+ if (TOCControlWord&PANEL_EXTENDED_GROUP)
+ fread(&usGroupNumber, sizeof(USHORT), 1, fpointer);
+ if (TOCControlWord&PANEL_EXTENDED_CTRLSINDEX)
+ fread(&usControlGroupIndex, sizeof(USHORT), 1, fpointer);
+}
+
+The information from the table of contents header is generally only
+used to decide how the window that the help is in will be displayed.
+You can use it to position your window and to decide whether it has
+a border, minimize or maximize buttons, etc.
+
+Once you have all of the display information from the table of contents,
+you can begin actually getting the information that is in the panel.
+Don't forget to display the title of the panel. It can be obtained
+in the same way as in the earlier title-reading code fragments.
+
+A panel consists of one or more cells that contain formatting information
+and the text of the panel. In most cases, panels have more than one
+cell, so you cannot make the assumption that panels have one cell.
+The number of cells in a panel can be found from the NumCells field
+in the TOCIN structure.
+
+After reading all the extended header information and the title as in
+the above samples, we assume you have also saved a value called
+pusBeginCell. This is a pointer to the place in the table of contents
+entry where the list of cells begins. These cell values actually index
+into the Cell Offset Table (COT). Using the COT we can get the file
+offsets of the individual cells and display the information in them.
+
+The offsets in the COT are only used to retrieve the actual cell.
+In and of themselves, they provide no additional information. For this
+reason, they will only be used as a part of a code fragment to retrieve
+the actual cell.
+
+In retrieving the cell, the first step is to retrieve the cell header.
+The cell offset points to this header. Once we have the header, we
+can get the information to display the cell.
+
+The following code fragment loops through all cells in table of contents
+entry i, and reads in the cell headers. The dots indicate
+where you would actually process the information in the cell, which we
+will do later.
+
+INT j;
+USHORT usCOTIndex;
+CELL Cell;
+
+for (j=0; j<TOCIn.NumCells; j++) {
+/* If the table of contents is in memory, use */
+ memcpy(&usCOTIndex, pusBeginCell+j, sizeof(USHORT));
+/* If the table of contents is on disk, use (pusBeginCell is then */
+/* the file offset of the cell list rather than a pointer) */
+ fseek(fpointer, pusBeginCell+j*sizeof(USHORT), SEEK_SET);
+ fread(&usCOTIndex, sizeof(USHORT),1,fpointer);
+/* What follows is the same for both cases */
+ fseek(fpointer, COT[usCOTIndex], SEEK_SET);
+ fread(&Cell, sizeof(CELL),1,fpointer);
+ .
+ .
+ .
+}
+
+The cell information allows us to actually display the text itself.
+
+In the cell header, we have information about the CVT and the CDI.
+These two arrays give us the actual formatting information. The CDI
+contains formatting information and words. The words are represented
+as indexes into the CVT. The CVT elements are indexes into the
+vocabulary (CLVT). To use the CVT and CDI, we will read them into
+memory and process the CDI byte by byte. The following code fragment
+reads the CVT and the CDI into memory. Note that after reading
+the cell header from the file, we are pointing at the beginning of the
+CDI.
+
+PBYTE CDI;
+PUSHORT CVT;
+
+ CDI = malloc(Cell.CDISize);
+ rc=fread(CDI,Cell.CDISize,1,fpointer);
+ rc=fseek(fpointer,Cell.CVTOffset, SEEK_SET);
+ CVT = malloc(Cell.CVTSize*2);
+ rc=fread(CVT, Cell.CVTSize*2,1,fpointer);
+
+Even though the CVT is right after the CDI in the header, we cannot
+assume that after reading the CDI we are pointing at the CVT. We
+must fseek to the CVTOffset.
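+
+The same two reads can be wrapped with error checking. The sketch
+below is one possible way to do it, not the viewer's actual code. It
+assumes the file is positioned just past the cell header, that the
+CVT holds 16-bit entries in the host's byte order (as the fragment
+above does), and that the sizes and offset have been taken from the
+cell header. CellData and read_cell_data are names made up for the
+example.
+
```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for one cell's data. */
typedef struct {
    uint8_t  *cdi;        /* cdi_size bytes                  */
    size_t    cdi_size;
    uint16_t *cvt;        /* cvt_count 16-bit entries        */
    size_t    cvt_count;
} CellData;

/* Read the CDI from the current file position, then seek to
   cvt_offset and read cvt_count entries. Returns 0 on success and
   -1 on failure; on failure nothing is left allocated. */
int read_cell_data(FILE *fp, size_t cdi_size, long cvt_offset,
                   size_t cvt_count, CellData *out)
{
    assert(fp != NULL && out != NULL);
    /* Allocate at least one byte so a zero size is not an error. */
    out->cdi = malloc(cdi_size ? cdi_size : 1);
    out->cvt = malloc(cvt_count ? cvt_count * sizeof *out->cvt : 1);
    if (out->cdi != NULL && out->cvt != NULL &&
        fread(out->cdi, 1, cdi_size, fp) == cdi_size &&
        fseek(fp, cvt_offset, SEEK_SET) == 0 &&
        fread(out->cvt, sizeof *out->cvt, cvt_count, fp) == cvt_count) {
        out->cdi_size = cdi_size;
        out->cvt_count = cvt_count;
        return 0;
    }
    free(out->cdi);
    free(out->cvt);
    out->cdi = NULL;
    out->cvt = NULL;
    return -1;
}
```
+
+On success the caller owns both buffers and should free them once the
+cell has been displayed.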
+
+Now we want to read the CDI byte by byte and process each item.
+Each CDI value is either a special code in the range FA through FF
+(hexadecimal) or a number which indexes into the CVT, which in turn
+indexes into the vocabulary. Whenever the CDI value is FF, we need to
+read additional info. This additional info is formatting information,
+font changes, links, etc. The FF escape code values are documented in
+appendix A.
+The following code fragment does some very basic formatting of a cell.
+Figure 5 provides the values for each of the BYTE_* values.
+
+INT m;
+INT l;
+INT k;
+CHAR String[255];
+BOOL Together = FALSE;
+
+for (m=0;m<Cell.CDISize;m++ ) {
+ switch (CDI[m]) {
+ /* The value indicates a new paragraph */
+ case BYTE_PARA :
+ printf("\n ");
+ break;
+ /* The value indicates that the text following should be centered. */
+ /* Center all text until a BYTE_NEWLINE is encountered */
+ case BYTE_CENTRE :
+ printf("Center\n");
+ break;
+ /* Autoblanks are used to indicate when there should be no spaces */
+ /* between words. When you encounter an autoblank, all words */
+ /* following should be printed without spaces until another autoblank */
+ /* is encountered. */
+ case BYTE_AUTOBLANK :
+ if (Together)
+ Together = FALSE;
+ else
+ Together = TRUE;
+ break;
+ /* The value indicates to print a new line */
+ case BYTE_NEWLINE :
+ printf("\n");
+ break;
+ /* The value indicates a space */
+ case BYTE_BLANK :
+ printf(" ");
+ break;
+ /* The value is an escape code. We should do some processing within */
+ /* This statement to handle the various cases documented in */
+ /* Appendix A */
+ case BYTE_ESC :
+ break;
+ /* Not sure */
+ case WILD_CHAR :
+ break;
+ /* Not sure */
+ case ESC_CHAR :
+ break;
+ /* It is a word. Index into the CVT to get the vocabulary offset */
+ /* And print the word */
+ default:
+ l = pulVocabIndex[CVT[CDI[m]]]+1;
+ for (k=0;k<(pchVocab[pulVocabIndex[CVT[CDI[m]]]]-1) ;k++,l++ )
+ String[k] = pchVocab[l];
+ /* Don't forget to null terminate the word. */
+ String[k] = '\0';
+ if (Together)
+ printf("%s",String);
+ else
+ printf("%s ",String);
+
+ break;
+ } /* endswitch */
+} /* endfor */
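+
+The word lookup in the default case can be pulled out into a helper
+that makes the buffer bound explicit. This is a sketch under the
+layout implied by the loop above: each vocabulary entry is one length
+byte that counts itself, followed by length-1 characters with no
+terminator. The name copy_vocab_word is hypothetical.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the vocabulary word stored at vocab[offset] into out and
   null terminate it. Returns the word length, or 0 if the entry is
   empty or does not fit in out (out is then an empty string). */
size_t copy_vocab_word(const unsigned char *vocab, size_t offset,
                       char *out, size_t out_size)
{
    size_t len;

    assert(vocab != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    if (vocab[offset] == 0)
        return 0;
    /* The length byte includes itself, so subtract one. */
    len = (size_t)vocab[offset] - 1;
    if (len >= out_size)
        return 0;
    memcpy(out, vocab + offset + 1, len);
    out[len] = '\0';
    return len;
}
```
+
+In the switch above, the offset argument would be
+pulVocabIndex[CVT[CDI[m]]] and out would be String.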
+
+Those are the basics. You should now be able to display a cell.
+
+Accessing other information within the INF file
+-----------------------------------------------
+Index table (index) - synonym table
+Index command table (icmd)
+Panel table (context sensitive help)
+Panel name table (link)
+
+Database names
+Fonts
+Country / Grammar
+Bitmaps (should be interesting due to different compression)
+Strings
+Control
+Super Search Table (aka FTS or Full Text Search)
+Child pages
+
+
+Searching
+=========
+If you just want to search an INF file, you will still have to read in
+the vocabulary and the table of contents. You search the vocabulary
+for a word, and then get the index of that word. This index can be
+used to index into the Super Search Table which allows you to
+determine which panels contain that word. What a pain!
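+
+The first step, finding a word's index in the vocabulary, could be
+sketched as a simple linear scan. It assumes the same entry layout as
+the display code (a self-counting length byte followed by the
+characters) and an in-memory table of 32-bit offsets into the
+vocabulary. The function name and parameters are made up for this
+example, and the match is exact and case sensitive. Using the
+resulting index with the Super Search Table is not shown here.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the index of `word` in the vocabulary, or -1 if it is not
   present. vocab_index[i] is the offset of entry i within vocab. */
long find_vocab_word(const unsigned char *vocab,
                     const uint32_t *vocab_index, size_t num_words,
                     const char *word)
{
    size_t want;
    size_t i;

    assert(vocab != NULL && vocab_index != NULL && word != NULL);
    want = strlen(word);
    for (i = 0; i < num_words; i++) {
        const unsigned char *entry = vocab + vocab_index[i];

        /* entry[0] counts itself, so the text is entry[0]-1 long. */
        if (entry[0] != 0 && (size_t)entry[0] - 1 == want &&
            memcmp(entry + 1, word, want) == 0)
            return (long)i;
    }
    return -1;
}
```
+
+A real viewer would probably want a case-insensitive comparison and,
+for large vocabularies, something faster than a linear scan.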
+
+ ---------o0O0o---------
+