DocZilla



What is DocZilla?

Background

To avoid the costly and time-consuming conversion of SGML documents into a proprietary format, CiTEC developed the MultiDoc Pro product family. Unfortunately, licensing issues forced sales of it to end at the beginning of 2001. CiTEC has developed DocZilla as the replacement for MultiDoc Pro.

Introduction

DocZilla is a standard Mozilla extended with extra components that parse and display SGML/XML documents on the fly, without a precompilation step.

In addition, DocZilla provides its own GUI (Graphical User Interface), support for various link types (ID/IDREF, HyTime clinks with nameloc, treeloc, and queryloc locators, and XLinks), extended search capabilities, and support for document sets and tables of contents. Mozilla, as the basis of DocZilla, provides support for HTML and CSS, bookmarks, navigation history, SSL (Secure Sockets Layer), and more.
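
As a rough illustration of the simplest link type listed above, the following sketch resolves ID/IDREF links over a toy in-memory element tree. The tree shape and the names buildIdIndex and resolveIdref are hypothetical and are not DocZilla's link components. The sketch also assumes that the attributes are called id and idref, whereas in a real document the DTD decides which attributes carry IDs and IDREFs.

```javascript
// Toy element tree: { name, attrs, children }. Hypothetical shape, not DocZilla's DOM.
const sampleDoc = {
  name: "manual",
  attrs: {},
  children: [
    { name: "section", attrs: { id: "intro" }, children: [] },
    {
      name: "section",
      attrs: { id: "setup" },
      children: [{ name: "xref", attrs: { idref: "intro" }, children: [] }],
    },
  ],
};

// Walk the tree once and index every element that declares an ID.
function buildIdIndex(root, index = new Map()) {
  if (root.attrs.id !== undefined) {
    index.set(root.attrs.id, root);
  }
  for (const child of root.children) {
    buildIdIndex(child, index);
  }
  return index;
}

// Resolve an IDREF to its target element, or null if the link is dangling.
function resolveIdref(index, element) {
  const target = index.get(element.attrs.idref);
  return target === undefined ? null : target;
}

const idIndex = buildIdIndex(sampleDoc);
const xref = sampleDoc.children[1].children[0];
const linkTarget = resolveIdref(idIndex, xref);
```

Building the index in a single pass keeps each later lookup cheap, which matters when a document contains many cross-references.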

Handled Data Objects

DocZilla handles various types of data objects:

  • Standard Generalized Markup Language (SGML)
  • eXtensible Markup Language (XML)
  • HyperText Markup Language (HTML)
  • Cascading Style Sheets (CSS)
  • Document Type Definitions (DTD)
  • Document Sets / Table of Contents (XML/XSLT)
  • HyTime
  • CALS
  • Media files (notation data)
  • API (JavaScript)

Standard Generalized Markup Language (SGML) and eXtensible Markup Language (XML) documents are structured data in standard textual form. The documents carry no layout information, so additional information is needed to display and navigate them.

A HyperText Markup Language (HTML) document, on the other hand, contains the information needed to render its data and can be displayed without anything further.

A Cascading Style Sheet (CSS) supplies the additional information that DocZilla needs to display SGML and XML documents properly. It is a set of formatting rules: a mapping from document elements to formatting specifications. The style sheets support not only content formatting such as font specification, position, spacing, and justification, but also content hiding, autonumbering, leading, borders, and tables. For example, any element structure can be formatted as a table, with or without the table grid.

The style sheet is independent of the document: several documents can share a single style sheet, and one document can use several style sheets.

A Document Type Definition (DTD) ensures that information is correctly structured when it is written into the system. It acts rather like a template that determines what information is allowed and where it may appear. A DTD is normally an integral part of an SGML document. However, DocZilla can also preload DTDs and reuse them from one document to another to improve performance.
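
The idea of reusing a DTD across documents can be sketched as a small cache. This is only an illustration of the caching idea, not DocZilla's implementation: the DtdCache class, the fakeParseDtd stand-in parser, and the example public identifier are all hypothetical.

```javascript
// Hypothetical cache keyed by a DTD identifier (for example a public identifier).
class DtdCache {
  constructor(parseDtd) {
    this.parseDtd = parseDtd; // function that performs the expensive parse
    this.cache = new Map();
  }

  // Return the parsed DTD, parsing only on the first request for each identifier.
  get(identifier, source) {
    if (!this.cache.has(identifier)) {
      this.cache.set(identifier, this.parseDtd(source));
    }
    return this.cache.get(identifier);
  }
}

// Stand-in parser that only records how often it is called.
let parseCount = 0;
const fakeParseDtd = (source) => {
  parseCount += 1;
  return { elements: source.split(",") };
};

const dtdCache = new DtdCache(fakeParseDtd);
const dtdId = "-//EXAMPLE//DTD Manual//EN"; // made-up identifier
const firstDtd = dtdCache.get(dtdId, "manual,section,para");
const secondDtd = dtdCache.get(dtdId, "manual,section,para");
```

The second document that names the same DTD receives the already parsed object, so the parse cost is paid once per DTD rather than once per document.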

Media files are externally stored data such as graphics, audio, or video. DocZilla has built-in support for notation data; for instance, it supports a number of raster formats (such as JPG, GIF, PNG, and TIFF).

DocZilla can also launch external applications to view the data and generate callbacks to the main application, and it offers an interface for registering and using external graphics interpreters so that graphics can be displayed inline. For example, Flash content can be used in SGML and XML documents through the Macromedia Flash Player plug-in.

DocZilla allows JavaScript to be run with documents. For example, you can make a warning window pop up every time a user opens the document, or change document content on the fly. The only limit is your imagination.
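
A minimal sketch of such a script follows. It assumes that the standard browser DOM calls (addEventListener, alert, getElementById, textContent) are available to scripts attached to a document, which the text above does not confirm for SGML documents. The function name installOpenWarning and the element ID "status" are hypothetical.

```javascript
// Attach a handler that warns the reader and rewrites part of the document on load.
// `win` and `doc` are expected to behave like the browser's `window` and `document`.
function installOpenWarning(win, doc) {
  win.addEventListener("load", () => {
    win.alert("This document is a draft.");
    // "status" is a made-up element ID used only for this example.
    const status = doc.getElementById("status");
    if (status !== null) {
      status.textContent = "Draft: not for distribution";
    }
  });
}

// Install the handler only when a browser-like environment is present.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  installOpenWarning(window, document);
}
```

Passing the window and document in as parameters keeps the handler easy to exercise outside a browser.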

The Data Processing Model

All of these data objects are direct input to DocZilla through the Parser Component. Navigators are converted into their internal representations if needed (XSLT), and SGML is validated by the DTD Parser Component. All references to notation data are resolved, and the data is brought in without parsing. DocZilla then either handles the data itself, launches an external application, or passes it on to a plug-in. Any errors are reported by the Message Logger Component. Finally, Mozilla combines the document structure (DOM) with the style sheets to produce the document view.
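
The routing described above can be sketched as a simple dispatch function. This is an illustration under assumptions, not DocZilla's code: the function routeDataObject, the kind and notation fields, the order in which built-in support, plug-ins, and external applications are tried, and the handling of XML and HTML as plain parsing are all hypothetical. Only the list of built-in raster formats comes from the text.

```javascript
// Raster formats the text names as built in; everything else here is illustrative.
const BUILT_IN_RASTER = new Set(["jpg", "gif", "png", "tiff"]);

// Decide how a data object is handled.
// `plugins` maps a notation name to the name of a registered plug-in.
// `log` is an array standing in for the Message Logger Component.
function routeDataObject(obj, plugins, log) {
  switch (obj.kind) {
    case "sgml":
      return "parse and validate against DTD";
    case "xml":
    case "html":
      return "parse";
    case "notation":
      // Notation data is brought in but never parsed.
      if (BUILT_IN_RASTER.has(obj.notation)) return "display built-in";
      if (plugins.has(obj.notation)) return "pass to plug-in " + plugins.get(obj.notation);
      return "launch external application";
    default:
      log.push("unknown data object kind: " + obj.kind);
      return "rejected";
  }
}

const messageLog = [];
const plugins = new Map([["swf", "Flash Player"]]);
const routes = [
  { kind: "sgml" },
  { kind: "notation", notation: "png" },
  { kind: "notation", notation: "swf" },
  { kind: "notation", notation: "cgm" },
  { kind: "spreadsheet" },
].map((obj) => routeDataObject(obj, plugins, messageLog));
```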

Extensibility

Cross-platform COM (XPCOM) allows existing components to be reused on other platforms and operating systems. DocZilla and all of its components are dynamically loaded and self-registering, which means that a user can download a component needed for a special task without installing anything but that component. Once installed, the component registers itself automatically and extends the current application so that it can perform the task.
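
The self-registration idea can be sketched with a toy registry. This is not the XPCOM API: the ComponentRegistry class, the loadCgmViewerComponent function, and the identifier string are hypothetical and only illustrate how a newly loaded component might extend an application without any other installation step.

```javascript
// Toy registry, loosely modeled on the idea of components registering themselves.
class ComponentRegistry {
  constructor() {
    this.factories = new Map();
  }

  // Called by a component when it is loaded.
  register(componentId, factory) {
    this.factories.set(componentId, factory);
  }

  has(componentId) {
    return this.factories.has(componentId);
  }

  // Create an instance of a registered component.
  create(componentId) {
    const factory = this.factories.get(componentId);
    if (factory === undefined) {
      throw new Error("no component registered for " + componentId);
    }
    return factory();
  }
}

const CGM_VIEWER_ID = "@example.org/viewer;1?type=cgm"; // made-up identifier

// A downloaded component "self-registers" by calling register() when it is loaded.
function loadCgmViewerComponent(registry) {
  registry.register(CGM_VIEWER_ID, () => ({
    render: (data) => "rendered " + data.length + " bytes",
  }));
}

const registry = new ComponentRegistry();
const availableBefore = registry.has(CGM_VIEWER_ID);
loadCgmViewerComponent(registry);
const availableAfter = registry.has(CGM_VIEWER_ID);
const renderResult = registry.create(CGM_VIEWER_ID).render([1, 2, 3]);
```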

Summary

To summarize, DocZilla adds value to Mozilla by layering extra components on top of it that handle SGML, links, searching, and tables of contents.



Last modified: Mon Mar 14 13:12:55 EET 2005 by webmaster@citec.fi