Doxygen Internals {#mainpage}
=================

Introduction
------------

This page provides a high-level overview of the internals of doxygen, with
links to the relevant parts of the code. This document is intended for
developers who want to work on doxygen. Users of doxygen are referred to the
[User Manual](http://www.doxygen.org/manual.html).

The generic starting point of the application is of course the main() function.

Configuration options
---------------------

Configuration file data is stored in the singleton class Config and can be
accessed using the wrapper macros
Config_getString(), Config_getInt(), Config_getList(),
Config_getEnum(), and Config_getBool(), depending on the type of the
option.
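
The access pattern described above can be sketched as follows. This is a
minimal illustration and not doxygen's actual implementation: the class
`SketchConfig`, its members, and the `Sketch_get*()` macros are hypothetical
names. Only the idea of a singleton with one typed wrapper macro per option
type is taken from the text.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a configuration singleton. The names SketchConfig
// and Sketch_get*() are illustrative and are not doxygen's real API.
class SketchConfig
{
  public:
    // Returns the one shared instance, created on first use.
    static SketchConfig &instance()
    {
      static SketchConfig theInstance;
      return theInstance;
    }

    // Setters that a configuration file parser would call.
    void setString(const std::string &name, const std::string &value) { m_strings[name] = value; }
    void setInt(const std::string &name, int value)                   { m_ints[name] = value; }
    void setBool(const std::string &name, bool value)                 { m_bools[name] = value; }

    // Typed getters; std::map::at() throws std::out_of_range for unknown options.
    const std::string &getString(const std::string &name) const { return m_strings.at(name); }
    int getInt(const std::string &name) const                   { return m_ints.at(name); }
    bool getBool(const std::string &name) const                 { return m_bools.at(name); }

  private:
    SketchConfig() = default;
    SketchConfig(const SketchConfig &) = delete;
    SketchConfig &operator=(const SketchConfig &) = delete;

    std::map<std::string, std::string> m_strings;
    std::map<std::string, int>         m_ints;
    std::map<std::string, bool>        m_bools;
};

// Wrapper macros, one per option type, that hide the singleton lookup.
#define Sketch_getString(name) (SketchConfig::instance().getString(name))
#define Sketch_getInt(name)    (SketchConfig::instance().getInt(name))
#define Sketch_getBool(name)   (SketchConfig::instance().getBool(name))
```

With such macros, code anywhere in the program can read an option with a
single expression, for example `Sketch_getBool("EXTRACT_ALL")`, without
passing a configuration object around.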

The format of the configuration file (options and types) is defined
by the file `config.xml`. As part of the build process,
the Python script `configgen.py` creates the file `configoptions.cpp`
from this, which serves as the input for the configuration file parser
that is invoked using Config::parse(). The script `configgen.py` also
creates the documentation for the configuration items in the file
`config.doc`.

Gathering Input files
---------------------

After the configuration is known, the input files are searched using
searchInputFiles() and any tag files are read using readTagFile().

Parsing Input files
-------------------

The function parseFiles() takes care of parsing all files.
It uses the ParserManager singleton factory to create a suitable parser object
for each file. Each parser implements the abstract interface ParserInterface.

If the parser indicates it needs preprocessing
via ParserInterface::needsPreprocessing(), doxygen will call preprocessFile()
on the file. 

A second step is to convert multiline C++-style comments into C-style comments
for easier processing later on. As a side effect of this step,
aliases (ALIASES option) are also resolved. The function that performs these
two tasks is called convertCppComments().

*Note:* Alias resolution would be better done in a separate step, as it is
now coupled to C/C++ code and does not work automatically for other languages!
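
The comment conversion part of this step can be illustrated with a small
sketch. It is a deliberate simplification and not the code behind
convertCppComments(): the function name `sketchConvertCppComments()` is
hypothetical, only `///` comments are recognized, leading indentation is
dropped, and alias resolution is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns true if the line, ignoring leading blanks, starts with exactly three
// slashes. On success textStart is the index just past the "///".
bool isCppDocLine(const std::string &line, std::size_t &textStart)
{
  std::size_t i = line.find_first_not_of(" \t");
  if (i == std::string::npos) return false;
  if (line.compare(i, 3, "///") != 0) return false;
  if (i + 3 < line.size() && line[i + 3] == '/') return false; // "////" is left alone
  textStart = i + 3;
  return true;
}

// Hypothetical simplification: turns each run of consecutive "///" lines into
// one C-style "/** ... */" block. The block is closed on the last line of the
// run, so the number of lines in the output equals that of the input.
std::string sketchConvertCppComments(const std::string &input)
{
  std::istringstream in(input);
  std::vector<std::string> lines;
  std::string line;
  bool inBlock = false;
  while (std::getline(in, line))
  {
    std::size_t textStart = 0;
    if (isCppDocLine(line, textStart))
    {
      // First line of a run opens the block, later lines continue it.
      const std::string prefix = inBlock ? " *" : "/**";
      lines.push_back(prefix + line.substr(textStart));
      inBlock = true;
    }
    else
    {
      if (inBlock) lines.back() += " */"; // close the block that just ended
      inBlock = false;
      lines.push_back(line);
    }
  }
  if (inBlock) lines.back() += " */";     // input ended inside a block

  std::string output;
  for (const std::string &l : lines) output += l + "\n";
  return output;
}
```

Closing the block on the last comment line rather than on a new line is a
choice made in this sketch to keep line numbers stable for later stages.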

The third step is the actual language parsing and is done by calling 
ParserInterface::parseInput() on the parser interface returned by 
the ParserManager.
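
How a parser is selected and invoked can be sketched as follows. The sketch
covers only the preprocessing check and the call to the parser. All names
prefixed with `Sketch` or `sketch` are hypothetical, and the interface is
reduced to two methods; doxygen's real ParserInterface and ParserManager have
more members and different signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, reduced parser interface.
class SketchParser
{
  public:
    virtual ~SketchParser() = default;
    // Tells the caller whether the input must go through the preprocessor first.
    virtual bool needsPreprocessing() const = 0;
    // A real parser would build a tree of Entry objects; this one returns a label.
    virtual std::string parseInput(const std::string &text) = 0;
};

class SketchCParser : public SketchParser
{
  public:
    bool needsPreprocessing() const override { return true; }
    std::string parseInput(const std::string &text) override { return "c(" + text + ")"; }
};

class SketchPythonParser : public SketchParser
{
  public:
    bool needsPreprocessing() const override { return false; }
    std::string parseInput(const std::string &text) override { return "python(" + text + ")"; }
};

// Hypothetical manager that picks a parser based on the file extension.
class SketchParserManager
{
  public:
    static SketchParserManager &instance()
    {
      static SketchParserManager theInstance;
      return theInstance;
    }
    void registerParser(const std::string &extension, std::shared_ptr<SketchParser> parser)
    {
      m_parsers[extension] = std::move(parser);
    }
    // Returns the parser registered for the file's extension, or nullptr if none.
    SketchParser *getParser(const std::string &fileName) const
    {
      std::size_t dot = fileName.rfind('.');
      if (dot == std::string::npos) return nullptr;
      auto it = m_parsers.find(fileName.substr(dot));
      return it != m_parsers.end() ? it->second.get() : nullptr;
    }
  private:
    SketchParserManager() = default;
    std::map<std::string, std::shared_ptr<SketchParser>> m_parsers;
};

// Stand-in for the preprocessing step; it only marks the text as processed.
std::string sketchPreprocess(const std::string &text) { return "pp:" + text; }

// Handles a single file: optional preprocessing, then language parsing.
std::string sketchParseFile(const std::string &fileName, std::string text)
{
  SketchParser *parser = SketchParserManager::instance().getParser(fileName);
  if (parser == nullptr) return std::string();
  if (parser->needsPreprocessing()) text = sketchPreprocess(text);
  return parser->parseInput(text);
}
```

Because the caller only talks to the abstract interface, supporting another
language in this sketch means registering one more parser object.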

The result of parsing is a tree of Entry objects.
These Entry objects are wrapped in an EntryNav object and stored on disk using
Entry::createNavigationIndex() on the root node of the tree.

Each Entry object roughly contains the raw data for a symbol and is later
converted into a Definition object.

When a parser finds a special comment block in the input, it will do a first
pass parsing via parseCommentBlock(). During this pass the comment block
is split into multiple parts if needed. Some data that is needed later,
such as section labels, xref items, and formulas, is extracted.
Markdown markup is also processed using processMarkdown() during this pass.

Resolving relations
-------------------

The Entry objects created and filled during parsing are stored on disk 
(to keep memory needs low). The name, parent/child relation, and 
location on disk of each Entry is stored as a tree of EntryNav nodes, which is 
kept in memory.

Doxygen does a number of tree walks over the EntryNav nodes in the tree to
build up the data structures needed to produce the output. 
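
The split between heavy data on disk and a light navigation tree in memory can
be sketched as follows. Everything here is hypothetical and simplified:
`SketchStore` keeps its data in a vector where a real implementation would
write to a file, and `SketchEntry` and `SketchNav` carry only a fraction of
what Entry and EntryNav hold.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical payload; doxygen's real Entry carries far more data.
struct SketchEntry
{
  std::string name;
  std::string rawDoc;
};

// Stand-in for on-disk storage: entries are appended and addressed by offset.
class SketchStore
{
  public:
    std::size_t save(const SketchEntry &entry)
    {
      m_entries.push_back(entry);
      return m_entries.size() - 1;
    }
    SketchEntry load(std::size_t offset) const { return m_entries.at(offset); }
  private:
    std::vector<SketchEntry> m_entries;
};

// Lightweight node kept in memory: name, children, and where the payload lives.
struct SketchNav
{
  std::string name;
  std::size_t offset = 0;
  std::vector<std::unique_ptr<SketchNav>> children;
};

// Stores the entry and links a navigation node for it under the given parent.
SketchNav *addChild(SketchNav &parent, SketchStore &store, const SketchEntry &entry)
{
  auto nav = std::make_unique<SketchNav>();
  nav->name = entry.name;
  nav->offset = store.save(entry);
  parent.children.push_back(std::move(nav));
  return parent.children.back().get();
}

// One tree walk: visits the nodes depth-first and loads each payload only
// for the duration of the visit.
void collectNames(const SketchNav &node, const SketchStore &store,
                  const std::string &scope, std::vector<std::string> &out)
{
  SketchEntry entry = store.load(node.offset);
  std::string qualified = scope.empty() ? entry.name : scope + "::" + entry.name;
  out.push_back(qualified);
  for (const auto &child : node.children) collectNames(*child, store, qualified, out);
}
```

Each additional pass over the tree would be another function shaped like
`collectNames()`, filling a different data structure.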

The resulting data structures are all derived from the generic base class
Definition, which holds the data that is common to all symbol definitions.

Definition is an abstract base class. Concrete subclasses are:
- ClassDef: for storing class/struct/union related data
- NamespaceDef: for storing namespace related data
- FileDef: for storing file related data
- DirDef: for storing directory related data

For doxygen-specific concepts the following subclasses are available:
- GroupDef: for storing grouping related data
- PageDef: for storing page related data

Finally, the data for members of classes, namespaces, and files is stored in
the subclass MemberDef.
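
The shape of this class hierarchy can be sketched as follows. The classes
below are hypothetical and heavily reduced; they only illustrate an abstract
base class that holds the shared data, with concrete subclasses adding what is
specific to them. The members shown are not those of doxygen's real classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: data shared by every kind of symbol.
class SketchDefinition
{
  public:
    explicit SketchDefinition(std::string name) : m_name(std::move(name)) {}
    virtual ~SketchDefinition() = default;
    const std::string &name() const { return m_name; }
    void setBriefDescription(const std::string &brief) { m_brief = brief; }
    const std::string &briefDescription() const { return m_brief; }
    // Pure virtual, so the base class cannot be instantiated.
    virtual std::string kind() const = 0;
  private:
    std::string m_name;
    std::string m_brief;
};

// Hypothetical subclass for members of classes, namespaces, and files.
class SketchMemberDef : public SketchDefinition
{
  public:
    using SketchDefinition::SketchDefinition;
    std::string kind() const override { return "member"; }
};

// Hypothetical subclass for class/struct/union data; it owns its members.
class SketchClassDef : public SketchDefinition
{
  public:
    using SketchDefinition::SketchDefinition;
    std::string kind() const override { return "class"; }
    SketchMemberDef &addMember(const std::string &name)
    {
      m_members.push_back(std::make_unique<SketchMemberDef>(name));
      return *m_members.back();
    }
    const std::vector<std::unique_ptr<SketchMemberDef>> &members() const { return m_members; }
  private:
    std::vector<std::unique_ptr<SketchMemberDef>> m_members;
};

// Code that only needs the shared data can work on the base class.
std::string describe(const SketchDefinition &def)
{
  return def.kind() + " " + def.name() + ": " + def.briefDescription();
}
```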

Producing debug output
----------------------

Within doxygen there are a number of ways to obtain debug output. Besides the
invasive method of putting print statements in the code, there are a number of
easy ways to get debug information.

- Compilation of `.l` files<br>
  This is also an invasive method, but it is done automatically by the
  `flex / lex` command. As a result, for each input line the (lex) rule(s)
  applied to it are shown.
  - Windows
    - in the Visual C++ GUI
      - find the required `.l` file
      - select the `Properties` of this file
      - set the item `Write used lex rules` to `Yes`
      - see to it that the `.l` file is newer than the corresponding `.cpp` file
        or remove the corresponding `.cpp` file
  - unices
    - global change<br>
      The chapter "Doxygen's internals" gives a `perl` script to toggle the
      generation of the rules debug information.
    - command line change<br>
      It is possible to pass the option `LEX="flex -d"` to the `make` command on the
      command line. In this case the `.l` files that are converted to the corresponding
      `.cpp` files during this `make` get the rules debug information.<br>
      To undo the rules debug information output, just recompile the file with
      plain `make`.<br>
      Note that this method applies to all the `.l` files that are rebuilt to `.cpp`
      files, so be sure that only the `.l` file(s) for which you want to have the
      rules debug information is (are) newer than the corresponding `.cpp`
      file(s).
- Running doxygen<br>
  During a run of doxygen it is possible to specify the `-d` option with the
  following possibilities (each option has to be preceded by `-d`):
  - findmembers<br>
    Gives the scope, arguments, and other relevant information of global, class, and module members.
  - functions<br>
    Gives the scope, arguments, and other relevant information of functions.
  - variables<br>
    Gives the scope and other relevant information of variables.
  - classes<br>
    Gives the scope and other relevant information of classes and modules.
  - preprocessor<br>
    Shows the results of the preprocessing phase, i.e. the results of handling include files,
    <tt>\#define</tt> statements etc., and of definitions in the doxygen configuration file like
    `EXPAND_ONLY_PREDEF`, `PREDEFINED` and `MACRO_EXPANSION`.
  - commentcnv<br>
    Shows the results of the comment conversion. The comment conversion does the
    following:
     - It converts multi-line C++ style comment blocks (that are aligned)
       to C style comment blocks (if `MULTILINE_CPP_IS_BRIEF` is set to `NO`).
     - It replaces aliases with their definition (see `ALIASES`).
     - It handles conditional sections (<tt>\\cond ... \\endcond</tt> blocks).
  - commentscan<br>
    Will print each comment block before and after the comment is interpreted by
    the comment scanner.
  - printtree<br>
    Gives the results in a pretty-printed way, i.e. in an XML-like way with each
    level indented by a `"."` (dot).
  - time<br>
    Provides information on the different stages of the doxygen process.
  - extcmd<br>
    Shows which external commands are executed and which pipes are opened.
  - markdown<br>
    Will print each comment block before and after Markdown processing.
  - filteroutput<br>
    Gives the output produced by the filter command (when a filter
    command is specified).
  - validate<br>
    Currently not used.
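
One way to turn repeated `-d` options into a set of active debug categories is
a bit mask, sketched below. This is an assumption made for illustration and
not a description of doxygen's own option handling: the enum, its bit values,
and both functions are hypothetical, and labels are matched exactly as
written. Only the label names come from the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical bit values; only the label names are taken from the list above.
enum SketchDebugFlag : unsigned
{
  SketchFindMembers  = 1u << 0,
  SketchFunctions    = 1u << 1,
  SketchVariables    = 1u << 2,
  SketchClasses      = 1u << 3,
  SketchPreprocessor = 1u << 4,
  SketchCommentCnv   = 1u << 5,
  SketchCommentScan  = 1u << 6,
  SketchPrintTree    = 1u << 7,
  SketchTime         = 1u << 8,
  SketchExtCmd       = 1u << 9,
  SketchMarkdown     = 1u << 10,
  SketchFilterOutput = 1u << 11,
  SketchValidate     = 1u << 12
};

// Returns the flag for a label, or 0 if the label is unknown.
unsigned sketchLabelToFlag(const std::string &label)
{
  static const std::map<std::string, unsigned> table = {
    { "findmembers",  SketchFindMembers  }, { "functions",    SketchFunctions    },
    { "variables",    SketchVariables    }, { "classes",      SketchClasses      },
    { "preprocessor", SketchPreprocessor }, { "commentcnv",   SketchCommentCnv   },
    { "commentscan",  SketchCommentScan  }, { "printtree",    SketchPrintTree    },
    { "time",         SketchTime         }, { "extcmd",       SketchExtCmd       },
    { "markdown",     SketchMarkdown     }, { "filteroutput", SketchFilterOutput },
    { "validate",     SketchValidate     }
  };
  auto it = table.find(label);
  return it != table.end() ? it->second : 0u;
}

// Scans an argument list for "-d <label>" pairs and combines their flags.
unsigned sketchParseDebugArgs(const std::vector<std::string> &args)
{
  unsigned mask = 0;
  for (std::size_t i = 0; i + 1 < args.size(); ++i)
  {
    if (args[i] == "-d")
    {
      mask |= sketchLabelToFlag(args[i + 1]);
      ++i; // skip the label that was just consumed
    }
  }
  return mask;
}
```

Code that produces debug output can then test a single bit, for example
`if (mask & SketchPreprocessor)`, before printing.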

Producing output
----------------

TODO

Topics TODO
-----------
- Grouping of files in Model / Parser / Generator categories
- Index files based on IndexIntf
  - HTML navigation
  - HTML Help (chm)
  - Documentation Sets (Xcode)
  - Qt Help (qhp)
  - Eclipse Help
- Search index
  - JavaScript based
  - Server based
  - External
- Citations
  - via bibtex
- Various processing steps for a comment block
  - comment conversion
  - comment scanner
  - markdown processor
  - doc tokenizer
  - doc parser
  - doc visitors
- Diagrams and Images
  - builtin
  - via Graphviz dot
  - via mscgen
  - PNG generation
- Output formats: OutputGen, OutputList, and DocVisitor
  - Html:  HtmlGenerator and HtmlDocVisitor
  - Latex: LatexGenerator and LatexDocVisitor
  - RTF:   RTFGenerator and RTFDocVisitor
  - Man:   ManGenerator and ManDocVisitor
  - XML:   generateXML() and XmlDocVisitor
  - print: debugging via PrintDocVisitor
  - text:  TextDocVisitor for tooltips
  - perlmod
- i18n via Translator and language.cpp
- Customizing the layout via LayoutDocManager
- Parsers 
  - C Preprocessing 
    - const expression evaluation
  - C like languages
  - Python
  - Fortran
  - VHDL
  - TCL
  - Tag files
- Marshaling to/from disk
- Portability functions
- Utility functions