The KOffice Project

Old Filters FAQ

$Date: 2004-05-29 12:10:59 +0200 (Sat, 29 May 2004) $

Please note that this FAQ is for KOffice 1.1 or 1.2. It is slightly outdated for KOffice 1.3 and completely outdated for KOffice 1.4.

Questions

  • Do we really need filters?
  • Which filters are there?
  • Which filters are most wanted?
  • How to use a filter?

Developing Filters for KOffice

  • Prepare the Environment
  • Behind the Scenes
  • How do I develop a filter?
  • Advanced Techniques
  • Remaining Questions?
  • File Formats - Doctype Definitions
  • Add Documentation

Answers

Do we really need filters?

In my opinion we definitely need filters, because the ability to import and export documents is an important factor in the success of an office suite. Of course this is not as critical as printing or a nice and straightforward user interface, but it's not just a "nice-to-have" feature, either.

Just imagine a user working in a heterogeneous environment using KOffice among other office suites. Sometimes it is necessary to exchange documents as we all know. Now the adventure begins:

  • Which format do I use (i.e. Which format is supported by both office suites)?
  • How much information is lost due to internal differences between the office suites (e.g. formatting, tables, pictures, columns,...)?
  • What about the character sets (i.e. the encoding of umlauts and so on)?
  • Can I use Unicode characters in the other office suite?

Another problem is that some vendors of proprietary office suites provide inaccurate and/or incomplete documentation of their file formats (or no information at all). This is one of the obstacles we face, because searching for information in a binary file is really time consuming, as you can imagine. (At this point I'd like to thank Espen Sand for his brilliant KHexEditor, which should already be on your hard disk if you installed the kdeutils package.)

[Up to Questions]

Which filters are there?

Please have a look at the status page for all available filters. Please note that this page reflects the state of current development. This means that some filters might not be in the latest release.

[Up to Questions]

Which filters are most wanted?

Please refer to our pending and most wanted filters.

[Up to Questions]

How to use a filter?

The KOffice Library developers have done a good job: you will not even notice when a filter is used to convert a file to the part's native format. OK, you can see it in the debug output, but otherwise there is no difference for you at all. Just select
File -> Open... for import or
File -> Save or File -> Save As... for export
and choose the filter which should be used. If you import files with a known mime type (that is, Konqueror tells you the correct type of the file) you don't even have to select the appropriate filter. Select (or name, in case of saving) the file and off you go.

[Up to Questions]

Developing Filters for KOffice

Prepare the Environment

As the development version of KOffice needs KDE 3 it's necessary to install at least parts of KDE 3 (Qt 3, kdesupport, arts, kdelibs, kdebase - in this order) and - of course - KOffice. I recommend looking for further information on how to install it. To get some help from real KOffice experts please join the KOffice or the KOffice-Devel mailing lists (koffice@kde.org and koffice-devel@kde.org). There is an archive of those lists and you can find them all at http://lists.kde.org.
One final hint: Add -debug to your ./configure options for Qt and --enable-debug to your ./configure options for the kdelibs and koffice packages. The resulting binaries are quite large and a little bit slower, but nonetheless this is an enormous help if you are developing and debugging something.

Oh, and use gdb-5.0 or later, because gdb-4.x always crashed on KOffice stuff (at least for me).

[Up to Questions]

Behind the Scenes

There are several ways to program a filter, depending on your needs. However, unless you really need a non-standard filtering method (e.g. because you'd like to import huge amounts of data and the performance is bad, or because you want to import embedded files, too) we recommend using the plain and easy standard method. All the following descriptions assume that you use the standard filtering method. The gory details about the optimized (read: hacky) methods of filtering are provided at the bottom of this page. (Note: I didn't add a link, because you should at least scroll past the standard description.)

KOffice uses a quite straightforward approach to convert files to the native format of the matching KOffice part. I'll try to explain this via a simple example (from the user's point of view):

  1. The user activates File -> Open...
  2. The file dialog pops up
  3. She/He selects a file type (also called mime type) (e.g. Rich Text Format)
  4. Now the file dialog shows only the matching files
  5. Depending on the filter, the user might see a preview of the file contents (optional).
  6. After selecting a file she/he presses OK
  7. The filter converts the file to the native format of the application. Note that some filters pop up a configuration dialog at this stage to query the user for e.g. a password or encoding hints.
  8. The KOffice application opens the native file

When saving documents, the filtering works nearly identically. The whole process of looking for available filters, choosing the best one, invoking it, and so on is done by the KOffice libraries, so you don't have to worry about that.
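
To give a feeling for what "looking for available filters and choosing one" can mean, here is a minimal sketch. It is not the code the KOffice libraries use; it simply assumes that each filter converts one mimetype to another and searches breadth-first for the shortest chain. FilterEdge and findChain are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// One filter converts exactly one source mimetype to one target mimetype.
// Hypothetical stand-in type, not part of KOffice.
struct FilterEdge {
    std::string from;
    std::string to;
};

// Breadth-first search for the shortest chain of mimetypes leading from
// `source` to `target`. Returns the mimetypes in order (source first),
// or an empty vector if no chain of filters exists.
std::vector<std::string> findChain(const std::vector<FilterEdge>& filters,
                                   const std::string& source,
                                   const std::string& target)
{
    std::map<std::string, std::string> cameFrom; // mimetype -> predecessor
    std::queue<std::string> pending;
    cameFrom[source] = source;
    pending.push(source);

    while (!pending.empty()) {
        const std::string current = pending.front();
        pending.pop();
        if (current == target)
            break;
        for (const FilterEdge& f : filters) {
            if (f.from == current && cameFrom.count(f.to) == 0) {
                cameFrom[f.to] = current;
                pending.push(f.to);
            }
        }
    }

    if (cameFrom.count(target) == 0)
        return std::vector<std::string>();

    // Walk backwards from the target to the source, then reverse.
    std::vector<std::string> chain;
    std::string mime = target;
    while (mime != source) {
        chain.push_back(mime);
        assert(cameFrom.count(mime) == 1); // every reached node has a predecessor
        mime = cameFrom[mime];
    }
    chain.push_back(source);
    return std::vector<std::string>(chain.rbegin(), chain.rend());
}
```

With one filter from RTF to the KWord format and another from the KWord format to HTML, findChain would return a three-step chain for an RTF to HTML conversion, and an empty result for the reverse direction.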

Filters are shared libs which are opened on demand (via KLibLoader, which is a wrapper for dlopen) and closed after a few minutes of inactivity (so that we don't waste too much memory). All the filters have to inherit KoFilter (koffice/lib/kofficecore/koFilter.h) and they have to override the pure virtual method convert(...). This method is called by the filter manager and the filter should start to convert the file (i.e. open the file, read it, convert the contents, write it back to the disk).
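
The shape of such a filter class can be sketched without any KOffice headers. The following is only an illustration under stated assumptions: FilterBase is a stand-in for the role KoFilter plays (it is not the real class, and the real convert() signature lives in koFilter.h), and PlainTextImport, escapeXml and the target mimetype application/x-example are invented for the example.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Stand-in for the role KoFilter plays; NOT the real KOffice class.
class FilterBase {
public:
    enum Status { OK, NotImplemented, FileNotFound, CreationError };
    virtual ~FilterBase() {}
    // Called by the (hypothetical) filter manager with the two mimetypes.
    virtual Status convert(const std::string& from, const std::string& to) = 0;
};

// Escape the characters that are special in XML text content.
std::string escapeXml(const std::string& text)
{
    std::string out;
    for (char c : text) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        default:  out += c;
        }
    }
    return out;
}

// A toy import filter: open the file, read it, convert it, write it back.
class PlainTextImport : public FilterBase {
public:
    PlainTextImport(const std::string& in, const std::string& out)
        : m_in(in), m_out(out)
    {
        assert(!in.empty() && !out.empty()); // both file names are required
    }

    Status convert(const std::string& from, const std::string& to) override
    {
        if (from != "text/plain" || to != "application/x-example")
            return NotImplemented;

        std::ifstream input(m_in.c_str());
        if (!input)
            return FileNotFound;
        std::ostringstream contents;
        contents << input.rdbuf();

        std::ofstream output(m_out.c_str());
        if (!output)
            return CreationError;
        output << "<doc>" << escapeXml(contents.str()) << "</doc>\n";
        return OK;
    }

private:
    std::string m_in;
    std::string m_out;
};
```

The point of the sketch is the division of labour: the caller only knows the abstract convert() entry point, and everything format-specific stays inside the subclass.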

[Up to Questions]

How do I develop a filter?

Please download the filter template first. Then follow these step-by-step guidelines to set up your own filter:

  1. Untar the template somewhere in the koffice/filters directory (For this example we assume koffice/filters/template).
  2. If your copy of KOffice is configured already you just have to call create_makefile filters/template from the toplevel koffice directory. This should create a Makefile from your Makefile.am in the template. If this fails due to a missing create_makefile script, please install the kdesdk package; it contains lots of useful stuff like that in the kdesdk/scripts directory.
  3. Believe it or not, but this is nearly all Makefile/build system hackery you have to do. cd back to the filters/template directory and call make. This should start compilation of that small, no-op template. If this step isn't successful, please write a mail to <koffice@kde.org>.
  4. Please rename all foo names to match the name of your filter. If you write an export filter it's of course a good idea to change import to export, too. Due to renaming the files you have to fix the Makefile.am now. Don't forget to adapt the name of the #includes (header and .moc) in the source file. Make sure it still compiles before going on to the next step.
  5. Now that the files are renamed we have to change the content of them. We'll address one file after the other:
    • Makefile.am: If you changed the library's name make sure to update all the lines in the Makefile.am, the X-KDE-Library field in the .desktop file and the library-name argument of K_EXPORT_COMPONENT_FACTORY in the .cpp file. Otherwise the filter manager won't be able to load your library.
      If you have to add more source files you only have to add their names to the libfooimport_la_SOURCES line.
      Don't forget to adapt the service_DATA line if you change the name of the .desktop file. Otherwise make install will fail.
    • fooimport.h: Please add a license header of your choice (e.g. GPL, LGPL, X License, BSD,...) and rename the class (and maybe the #include guards __FOOIMPORT_H__).
    • fooimport.cpp: Add a license header and change the classname to reflect the changed header file. Don't forget to adapt the name in the factory typedef.
    • kword_foo_import.desktop: You'll have to adapt the Name field (obvious) and the X-KDE-Export/X-KDE-Import fields. These fields state what mimetypes the filter exports and imports (from the filter's point of view!). If you want to have more than one mimetype in those lines you have to separate them by a plain comma (no space!), like "text/plain,text/english". The X-KDE-Library field has to contain the library name. Don't forget to update that when you change the name of the lib!
    • status.html: This file contains status information about your filter. Please check the koffice site for examples.
  6. If KDE doesn't "know" your mimetype yet, please add an x-*.desktop file (see kdelibs/mimetypes for inspiration :-). Make sure that it gets installed to the correct directory (most likely it will be application). If KDE knows your file type, you don't have to care about that.
  7. The framework should be working now and you can start implementing the real filter code. Looking at simple existing filters (like the ascii or the wml filter) might help to get started.
  8. To submit your new filter to the KDE CVS, please remove generated files from the directory (*.o, *.lo, *.moc, the .deps and .libs directories, and so on) and create a tarball. If you have some webspace please upload it there and send a mail to the koffice-devel mailing list containing a short description and a link to the sources. If you can't put it somewhere please send it to me as an email attachment (not to the list, please). I'll take care of uploading it then.
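
The "plain comma, no space" rule for the X-KDE-Import and X-KDE-Export fields in step 5 is easy to get wrong. I don't claim that KDE parses these fields exactly like this, but the following sketch shows why a stray space can matter if a list is split on commas and the entries are compared literally. splitMimeList and listContains are hypothetical helpers written only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split a mimetype list on plain commas only, without trimming whitespace.
std::vector<std::string> splitMimeList(const std::string& value)
{
    std::vector<std::string> result;
    std::string::size_type start = 0;
    while (true) {
        assert(start <= value.size()); // substr() below stays in range
        const std::string::size_type comma = value.find(',', start);
        if (comma == std::string::npos) {
            result.push_back(value.substr(start));
            break;
        }
        result.push_back(value.substr(start, comma - start));
        start = comma + 1;
    }
    return result;
}

// Exact, character-by-character lookup of a mimetype in the list.
bool listContains(const std::vector<std::string>& list, const std::string& mime)
{
    for (const std::string& entry : list) {
        if (entry == mime)
            return true;
    }
    return false;
}
```

Splitting "text/plain,text/english" yields two clean entries, whereas "text/plain, text/english" yields a second entry with a leading space, which no longer matches a lookup for text/english.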

 

[Up to Questions]

Advanced Techniques

One of the major strengths of an Office Suite is the ability to "embed" documents into other documents (e.g. to embed a spreadsheet or a chart into a text document or a presentation). There are two fundamentally different ways to handle this topic:

  1. The first method is to link the child documents.

  2. The second method, which is the way it is usually done, is to transfer ownership, i.e. the child document is owned by the parent document.

It's obvious that both concepts have advantages and disadvantages (e.g. by linking you save disk space and keep your documents small; the tradeoff is that you have to take care when moving documents).

In KOffice we decided to transfer the ownership and save the child documents along with the main (=parent) document in one file. To achieve this you again have several possibilities, but they only differ in the implementation.

The main concept is always the same: You have to emulate a kind of file-system-in-a-file:
  • Microsoft uses the OLE concept (OLE = Object Linking and Embedding) to generate so-called compound files. These are binary files containing raw document streams, which can be accessed and processed with the LAOLA tools (see the links section for that page). Alternatively, look at the import filters for OLE files in KOffice, which can be found in WebCVS (current) or in your local copy at koffice/filters/olefilters/lib/.

  • KOffice recently switched from the well-known tar/gzip utilities to an OpenOffice-compatible compound file format. It's basically a .zip file with some special files to allow things like mimetype detection. If you invoke the unzip utility on such a file you'll find some .xml files, where maindoc.xml holds the parent document's XML.
    For further details on the internals of the compound document format please read the specification either in WebCVS (current) or look it up in your local koffice copy in koffice/lib/store/SPEC. The external point-of-view specification can be found here.

What does all that mean for the filter programmer? In 99.9% of all cases you surely want to use the KOffice Storage Library (koffice/lib/store) which saves you from having to implement reading from or writing to a zip file. You can open files and operate directly on the internal streams.

If your foreign file format supports embedded images you can keep inheriting KoFilter and just use the KoFilterChain to access streams other than the "root" (=maindoc.xml) stream in the compound storage file. Be careful though, as the storage classes don't allow access to more than one stream at a time. Therefore you might have to temporarily store images in memory and/or write them all at once after converting the main document.
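
The "one stream at a time" restriction and the buffering workaround can be sketched as follows. SingleStreamStore is a stand-in written for this example and is not the KoStore or KoFilterChain API; Item, writeDocument and the <p>/<img> markup are likewise invented, so treat this as an illustration of the pattern only.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for a compound storage file; NOT the real KOffice storage API.
// Only one named stream may be open at any time.
class SingleStreamStore {
public:
    // Opens a named stream; fails if another stream is still open.
    bool open(const std::string& name)
    {
        if (m_isOpen)
            return false;
        m_isOpen = true;
        m_current = name;
        return true;
    }
    void write(const std::string& data)
    {
        assert(m_isOpen); // writing requires an open stream
        m_streams[m_current] += data;
    }
    void close() { m_isOpen = false; }
    const std::map<std::string, std::string>& streams() const { return m_streams; }

private:
    bool m_isOpen = false;
    std::string m_current;
    std::map<std::string, std::string> m_streams;
};

// A piece of the foreign document: either a paragraph or an embedded picture.
struct Item {
    bool isPicture;
    std::string name; // stream name, used for pictures only
    std::string data; // paragraph text or raw picture bytes
};

// Writes the main document into "root" and remembers pictures in memory;
// the pictures are written to their own streams after "root" is closed.
bool writeDocument(SingleStreamStore& store, const std::vector<Item>& items)
{
    std::vector<Item> pendingPictures;

    if (!store.open("root"))
        return false;
    for (const Item& item : items) {
        if (item.isPicture) {
            store.write("<img name=\"" + item.name + "\"/>");
            pendingPictures.push_back(item); // cannot open a second stream now
        } else {
            store.write("<p>" + item.data + "</p>");
        }
    }
    store.close();

    for (const Item& picture : pendingPictures) {
        if (!store.open(picture.name))
            return false;
        store.write(picture.data);
        store.close();
    }
    return true;
}
```

The price of this approach is memory: all pictures stay buffered until the main document has been written completely.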

In case the foreign format you want to import supports embedded documents (i.e. non-trivial objects which would have to be passed through a different KOffice filter) you have to inherit your filter class from KoEmbeddingFilter. I said import on purpose, as we don't support exporting of embedded KOffice documents right now. If you need that feature for your filter please let us know and we will try to implement it in the library. The most important method is embedPart(). It follows the template method pattern and will call back your filter and ask it to save the source file. Then it will convert it and insert it at the right place in your storage file. The returned integer should be used to refer to the converted part from within your filter's output file. If you obey the rules stated in the API documentation of that class, embedding is really straightforward. Unless you're trying to convert a seriously screwed format like OLE files you won't need start/endInternalEmbedding. As this code isn't used a lot in current KOffice filters I'd welcome feedback on the API. Just tell us if it's hard to use or buggy; maybe we can clean it up a bit.
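
For readers who haven't met the template method pattern, here is a small sketch of the control flow described above. It does not reproduce the KoEmbeddingFilter API: EmbeddingBase, embed(), saveSource(), convertPart() and SpreadsheetAwareImport are all hypothetical names, and the "conversion" is a dummy string operation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in base class showing the template method pattern;
// NOT the real KoEmbeddingFilter.
class EmbeddingBase {
public:
    virtual ~EmbeddingBase() {}

    // Template method: the sequence of steps is fixed here, but the first
    // step is supplied by the concrete filter.
    int embed(const std::string& key)
    {
        const std::string source = saveSource(key);        // call back the filter
        const std::string converted = convertPart(source); // convert the part
        m_parts.push_back(converted);                      // store the result
        return static_cast<int>(m_parts.size()) - 1;       // id for later reference
    }

    const std::string& part(int id) const
    {
        assert(id >= 0 && static_cast<std::size_t>(id) < m_parts.size());
        return m_parts[static_cast<std::size_t>(id)];
    }

protected:
    // The concrete filter extracts the embedded object identified by `key`.
    virtual std::string saveSource(const std::string& key) = 0;

private:
    // Dummy conversion; a real library would run another filter here.
    std::string convertPart(const std::string& source)
    {
        return "[converted]" + source;
    }
    std::vector<std::string> m_parts;
};

class SpreadsheetAwareImport : public EmbeddingBase {
public:
    // Embeds one child object and refers to it by the returned id.
    std::string run()
    {
        const int id = embed("sheet1");
        return "<object ref=\"" + std::to_string(id) + "\"/>";
    }

protected:
    std::string saveSource(const std::string& key) override
    {
        return "raw:" + key;
    }
};
```

The filter never calls the conversion step itself; it only answers the callback and uses the returned integer in its own output.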

One final piece of advice if you'd like to see the available filters for all KOffice applications and their relations: enter koffice/lib/kofficecore/tests and run make check. You should see a program called filter_graph, which generates an input file for the dot tool (part of the graphviz package). Invoke the test program and then call make dot, and a file called graph.png will be created (take care, it's a bit wide ;-).
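
If you are curious what an input file for dot looks like, the sketch below writes a tiny directed graph in the DOT language. It is not the source of filter_graph, just an illustration assuming that every filter is an edge from one mimetype to another; FilterEdge and toDot are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One filter, seen as an edge between two mimetypes (hypothetical type).
struct FilterEdge {
    std::string from;
    std::string to;
};

// Renders the filters as a directed graph in the DOT language.
std::string toDot(const std::vector<FilterEdge>& filters)
{
    std::ostringstream dot;
    dot << "digraph filters {\n";
    for (const FilterEdge& f : filters) {
        // This sketch does not escape quotes, so it refuses them outright.
        assert(f.from.find('"') == std::string::npos);
        assert(f.to.find('"') == std::string::npos);
        dot << "    \"" << f.from << "\" -> \"" << f.to << "\";\n";
    }
    dot << "}\n";
    return dot.str();
}
```

Saving the returned text to a file and running dot -Tpng on it should produce a picture similar in spirit to graph.png, only much smaller.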

[Up to Questions]

Remaining Questions?

Feel free to ask me if there are any remaining questions. BTW: It's generally a good idea to ask on koffice@kde.org whether anyone works on a filter before starting to implement it :)

[Up to Questions]

File Formats - Doctype Definitions

This section contains some useful documentation (I'll add more stuff here, soon):

  • KWord Doctype Definition
    Download:    WebCVS (current)   or    look in your local koffice copy in koffice/kword/dtd/kword.dtd

  • KSpread Doctype Definition
    Download:    WebCVS (current)   or    look in your local koffice copy in koffice/kspread/dtd/kspread.dtd

  • Karbon Doctype Definition
    Download:    WebCVS (current)   or    look in your local koffice copy in koffice/karbon/karbon.dtd

  • KPresenter Doctype Definition
    Download:    WebCVS (current)   or    look in your local koffice copy in koffice/kpresenter/dtd/kpresenter.dtd

  • Krita Doctype Definition
    Download:    WebCVS (current)   or    look in your local koffice copy in koffice/kimageshop/dtd/krita.dtd

Here is a little advice (from IBM) on how to read doctype descriptions. Once you have read that page, it shouldn't be a problem to understand the definitions above.
   Doctype Description

[Up to Questions]

Add Documentation

Once you have finished your filter, please add some documentation.
At the very least add a status file, status.html. That file should contain

  • a feature list,
  • a history list,
  • a todo list,
  • maybe some nice links to more information (fileformat description, ...),
  • the author(s) of the filter with email-addresses,
  • last page update.

Whenever you obtain any information about the filter, update the status.html file. The status file may contain the result of a finished investigation (e.g. whether code can be reused from another open source project: if yes, how; if no, why not), the experience gained from an unfinished investigation, and of course the features the filter currently provides.

There is a status file template you can look at. The status file template is here: temp

When your documentation is finished and the tables look right, mail it.

If you update your filter, don't forget to update your documentation too!

[Up to Questions]

Maintained by koffice.org Web Team
KDE® and the K Desktop Environment® logo are registered trademarks of KDE e.V. | Legal