datatree

A C library for streaming hierarchical data to and from text files and other formats.

Purpose

This library provides support for hierarchical data structures that can be manipulated in memory and streamed to and from a flat file.
It is a stripped-out version of a C++ library I have been using since 1990 in a number of projects, as a serialization format and for configuration files.

DataTree is actually very similar to JSON, a community effort to provide a lightweight alternative to XML. It was also designed with the same goals as JSON in mind (simplicity, better readability, etc).

If you are looking for a lightweight and readable text-based file format to use in your project, and are of the opinion that XML sucks, you probably want to consider using JSON (or you could add a JSON format reader/writer to this library!). You may also want to take a look at YAML, which sits somewhere between JSON/DataTree and full XML. Finally, the boost::serialization library also has a nice design and is worth reviewing (although it has a more specific focus).

The implementation posted here is one I had started to convert to C (from a C++ version I had been using) for increased portability (and as an exercise in writing pure C code, since I had started with C++). Unfortunately, I never got around to implementing more than the basic features. Maybe I'll post the C++ version one day, if I manage to remove its dependencies on other libraries I use (for memory management and streaming).

Overview

This library is built around two core concepts:

This design might be the most interesting aspect of this library.

Based on the data traversal infrastructure, it is very easy to add file readers or writers that support a number of file formats. It is just as easy to replicate data trees in memory by making a deep copy (by default, the nodes are shared using reference counting). The C++ library I use supports text and binary file formats, and even conversion to an XML subset. In this C version, only a "native" text-based format is supported.
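To make the ideas of shared nodes, deep copies, and a generic traversal more concrete, here is a minimal sketch in C. It is not the datatree API: every name in it (node, node_new, node_retain, node_release, node_add, node_walk, node_deep_copy, count_nodes) is hypothetical and was chosen for illustration only. It assumes a node is just a label plus a list of children.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type, NOT the datatree API: a label plus children. */
typedef struct node {
    int           refs;   /* reference count */
    char         *label;  /* owned copy of the node's name */
    struct node **kids;   /* children, each one retained by this node */
    size_t        nkids;
} node;

node *node_new(const char *label) {
    size_t len = strlen(label) + 1;
    node *n = calloc(1, sizeof *n);
    if (!n) return NULL;
    n->label = malloc(len);
    if (!n->label) { free(n); return NULL; }
    memcpy(n->label, label, len);
    n->refs = 1;          /* the caller owns the first reference */
    return n;
}

node *node_retain(node *n) { if (n) n->refs++; return n; }

/* Drops one reference; frees the node and releases its children at zero. */
void node_release(node *n) {
    if (!n || --n->refs > 0) return;
    for (size_t i = 0; i < n->nkids; i++) node_release(n->kids[i]);
    free(n->kids);
    free(n->label);
    free(n);
}

/* Appends a child by sharing it: the parent takes its own reference. */
int node_add(node *parent, node *child) {
    assert(parent && child);
    node **k = realloc(parent->kids, (parent->nkids + 1) * sizeof *k);
    if (!k) return -1;
    parent->kids = k;
    k[parent->nkids++] = node_retain(child);
    return 0;
}

/* Generic depth-first traversal: a writer or any other consumer is a callback. */
typedef void (*visit_fn)(const node *n, int depth, void *ctx);

void node_walk(const node *n, int depth, visit_fn fn, void *ctx) {
    fn(n, depth, ctx);
    for (size_t i = 0; i < n->nkids; i++)
        node_walk(n->kids[i], depth + 1, fn, ctx);
}

/* Deep copy: duplicates every node instead of sharing it. */
node *node_deep_copy(const node *src) {
    node *dst = node_new(src->label);
    if (!dst) return NULL;
    for (size_t i = 0; i < src->nkids; i++) {
        node *c = node_deep_copy(src->kids[i]);
        if (!c || node_add(dst, c) != 0) {
            node_release(c);
            node_release(dst);
            return NULL;
        }
        node_release(c);  /* dst now holds the only reference to the copy */
    }
    return dst;
}

/* Example consumer: counts the nodes it is shown. */
void count_nodes(const node *n, int depth, void *ctx) {
    (void)n; (void)depth;
    ++*(size_t *)ctx;
}
```

In this sketch, adding a node to a second parent only bumps its reference count, while node_deep_copy produces an independent tree. A file writer would be one more visit_fn passed to node_walk.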

Default file format

The native text file format supported by this library is very straightforward (see this page). It is even simpler and less verbose than the JSON format (which was designed to be directly parsable as JavaScript code). Anyway, new streamed representations of the data structures are easy to add as plug-ins.

The example data provided on JSON's website would be automatically formatted and represented as follows:

{
  glossary: {
    title: "example glossary"
    GlossDiv: {
      title: S
      GlossList: (
       {
        ID: SGML
        SortAs: SGML
        GlossTerm: "Standard Generalized Markup Language"
        Acronym: SGML
        Abbrev: "ISO 8879:1986"
        GlossDef: "A meta-markup language, used to create markup languages such as DocBook."
        GlossSeeAlso: ( GML, XML, markup )
       }
      )
    }
  }
}
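
The quoting seen in this example (plain words left bare, values containing spaces wrapped in double quotes) can be sketched as a small formatting helper. This is only a guess at the rule, inferred from the sample output above. The names needs_quotes and format_entry are hypothetical, the set of characters that triggers quoting is an assumption, and embedded double quotes are not escaped because the example does not show how the format handles them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed rule: quote empty values and values containing whitespace or
 * characters that look structural in the sample ("#{}(),:" and '"'). */
static int needs_quotes(const char *v) {
    return *v == '\0' || strpbrk(v, " \t\n#{}(),:\"") != NULL;
}

/* Formats one "key: value" line at the given depth (two spaces per level).
 * Returns what snprintf returns: the length the full line needs. */
int format_entry(char *buf, size_t size, int depth,
                 const char *key, const char *value) {
    assert(buf && key && value && depth >= 0);
    const char *q = needs_quotes(value) ? "\"" : "";
    return snprintf(buf, size, "%*s%s: %s%s%s",
                    depth * 2, "", key, q, value, q);
}
```

Called with depth 2, key "title" and value "example glossary", this helper produces the same text as the title line in the example above.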

Comments can be added at the end of any line, following a # character.
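
A reader could handle such comments with a small pre-processing step on each line, along the lines of the sketch below. The function name strip_comment is hypothetical and not part of the library. The sketch also assumes that a # inside a double-quoted string is ordinary text rather than the start of a comment, and it does not handle escaped quotes. Neither point is specified above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Truncates `line` in place at the first '#' found outside double quotes,
 * then trims trailing whitespace. Returns `line` for convenience. */
char *strip_comment(char *line) {
    assert(line != NULL);
    int in_quotes = 0;
    char *p;
    for (p = line; *p; p++) {
        if (*p == '"')
            in_quotes = !in_quotes;       /* assumed: no escaped quotes */
        else if (*p == '#' && !in_quotes)
            break;                        /* comment starts here */
    }
    /* p is at the '#' or at the terminating NUL: drop blanks before it */
    while (p > line && isspace((unsigned char)p[-1]))
        p--;
    *p = '\0';
    return line;
}
```

Because the line is modified in place, the caller has to pass a writable buffer (for example one filled by fgets), not a string literal.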

For more information, see the online library documentation, generated using doxygen.

Download

The C source code of this library is available here. This zip archive contains .h and .c files, which just need to be compiled together, and a small test program (dt_test.h). A configuration file for doxygen (named Doxyfile) is also included.

This library is functional and has undergone some testing, but it has never matured into a full-featured tool (unlike its C++ counterpart).
Things that are missing include:

If you take the time to explore or use this library, I would appreciate your feedback.