Managing complex map files using XML entities

Mapnik XML files can become quite complex. This tutorial introduces some techniques for keeping large map files maintainable. Specifically, it demonstrates how to avoid duplicating data in the XML file, such as:

  • color values
  • database connection parameters
  • icon directories

It also shows how to split a single, monolithic map file into reusable components.

Mapnik XML support

Mapnik currently supports three different XML parsers:

  • the boost spirit based parser
  • the tinyxml parser
  • libxml2

The three parsers differ in size, external dependencies, and the number of XML features they support. The libxml2 parser is the most comprehensive, and it is the only one that supports XML entities. Because libxml2 support is quite new, it is currently not the default parser.

Compiling mapnik with libxml2 support

The libxml2 parser is enabled by setting the XMLPARSER option at compile time:

$ scons XMLPARSER=libxml2 install

This requires the libxml2 library and, depending on the distribution, the corresponding devel package. If xml2-config is not in the PATH, its location can be set using the XML2_CONFIG option.

Internal Entities

All XML parsers have some built-in entities to escape otherwise illegal characters:

  • &gt; for >
  • &lt; for <
  • &amp; for &
  • &quot; for "
  • &apos; for '

The XML document type definition (DTD) provides a way to declare new, user-defined entities:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map[
    <!ENTITY water_color "#b5d0d0">
]>
<Map bgcolor="&water_color;"/>

This XML document declares an internal entity named water_color. This entity is referenced by the bgcolor attribute of the Map element. The parser replaces all occurrences of &water_color; with the string #b5d0d0.

Using entities for common values gives each value a single point of definition, which greatly improves maintainability: instead of searching for and replacing every occurrence of a value, there is one place to change it. Descriptive names such as water_color also make the XML more readable, which helps with future changes. Color values benefit in one more way: because all entities are declared at the top of the document, the color set of the map is immediately apparent. Color values are not the only application; any recurring value is a candidate for an entity.
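As a minimal sketch of this idea, the following map declares a small palette once and references it from several places. The entity land_color and the style names land and lakes are hypothetical, and the symbolizer markup assumes the same older Mapnik XML syntax as the rest of this tutorial:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map[
    <!-- palette: every color is declared exactly once, here -->
    <!ENTITY water_color "#b5d0d0">
    <!ENTITY land_color  "#f2efe9">
]>
<Map bgcolor="&water_color;">
    <Style name="land">
        <Rule>
            <PolygonSymbolizer>
                <CssParameter name="fill">&land_color;</CssParameter>
            </PolygonSymbolizer>
        </Rule>
    </Style>
    <Style name="lakes">
        <Rule>
            <PolygonSymbolizer>
                <!-- same entity as the map background -->
                <CssParameter name="fill">&water_color;</CssParameter>
            </PolygonSymbolizer>
        </Rule>
    </Style>
</Map>
```

Changing the water color for both the background and the lakes style now means editing a single declaration.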

Entity declarations may reference other entities:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map[
    <!ENTITY home_dir "/home/david">
    <!ENTITY icons    "&home_dir;/map/icons">
]>
<Map>
    <Style name="volcanos">
        <Rule>
            <PointSymbolizer file="&icons;/volcano.png"
                             type="png" width="16" height="16"/>
        </Rule>
    </Style>
</Map>

However, these internal entities are not suitable for larger blocks. They also do not help with sharing common styles and layers between different maps.

External Entities

External entities are declared by adding the keyword SYSTEM:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map[
    <!ENTITY db_settings SYSTEM "settings/db_settings">
]>
<Map>
    <Layer name="volcanos" status="on">
        <StyleName>volcanos</StyleName>
        <Datasource>

            <Parameter name="table">volcanos</Parameter>

            &db_settings;

        </Datasource>
    </Layer>
</Map>

The entity declaration assigns the content of the file settings/db_settings to the entity db_settings. When the document is parsed, the reference &db_settings; in the Datasource section is expanded to the content of that file. A relative filename is resolved relative to the location of the document. The file settings/db_settings could look like this:

            <Parameter name="type">postgis</Parameter>
            <Parameter name="host">www.example.org</Parameter>
            <Parameter name="port">5433</Parameter>
            <Parameter name="user">david</Parameter>
            <Parameter name="dbname">geo</Parameter>

Note that this is not a legal XML document on its own because it does not have a single root element; it is a list of elements. The tags must nevertheless be well balanced. Also note that references to external entities are illegal in attribute values. They are only allowed in element content.
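The difference between the two kinds of entity can be sketched as follows. This example only combines declarations already used in this tutorial; the comment at the end shows a reference that a conforming parser should reject:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map[
    <!-- internal entity: may be referenced inside attribute values -->
    <!ENTITY icons "/home/david/map/icons">
    <!-- external entity: may only be referenced in element content -->
    <!ENTITY db_settings SYSTEM "settings/db_settings">
]>
<Map>
    <Style name="volcanos">
        <Rule>
            <!-- allowed: internal entity inside an attribute value -->
            <PointSymbolizer file="&icons;/volcano.png"
                             type="png" width="16" height="16"/>
        </Rule>
    </Style>
    <Layer name="volcanos" status="on">
        <StyleName>volcanos</StyleName>
        <Datasource>
            <Parameter name="table">volcanos</Parameter>
            <!-- allowed: external entity in element content -->
            &db_settings;
        </Datasource>
    </Layer>
    <!-- not well-formed: <Layer name="&db_settings;"/> -->
</Map>
```

Values that are needed inside attributes therefore have to be declared as internal entities, even when the rest of the shared configuration lives in external files.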

It is possible to use entity references in external entities. This allows a limited form of parameterization. Consider the following example:

File earthquake.map:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map[
    <!ENTITY since_year "1970">
    <!ENTITY earthquakes_since SYSTEM "earthquakes_since.lay">

    <!ENTITY earthquakes_default_style SYSTEM "earthquakes.sty">
    <!ENTITY common_styles SYSTEM "common.sty">
    <!ENTITY common_layers SYSTEM "common.lay">
    <!ENTITY db_settings SYSTEM "db_settings">

    <!-- colors -->
]>
<Map>
    &earthquakes_default_style;

    &earthquakes_since;

    &common_styles;
    &common_layers;
</Map>

File earthquakes_since.lay:

    <Layer name="earthquakes_since" status="on">
        <StyleName>earthquakes</StyleName>
        <Datasource>

            <Parameter name="table">
                (SELECT * FROM earthquakes 
                 WHERE year &gt;= &since_year; ) AS earthquakes_since
            </Parameter>

            &db_settings;

        </Datasource>
    </Layer>
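To make the parameterization concrete, this is roughly what the layer looks like after the parser has expanded both references. It is a sketch that assumes since_year is declared as 1970, as in earthquake.map, and that the file db_settings holds the same parameters as the settings/db_settings example shown earlier:

```xml
    <Layer name="earthquakes_since" status="on">
        <StyleName>earthquakes</StyleName>
        <Datasource>

            <!-- &since_year; has been replaced by its value -->
            <Parameter name="table">
                (SELECT * FROM earthquakes
                 WHERE year &gt;= 1970 ) AS earthquakes_since
            </Parameter>

            <!-- &db_settings; has been replaced by the file content -->
            <Parameter name="type">postgis</Parameter>
            <Parameter name="host">www.example.org</Parameter>
            <Parameter name="port">5433</Parameter>
            <Parameter name="user">david</Parameter>
            <Parameter name="dbname">geo</Parameter>

        </Datasource>
    </Layer>
```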

This setup is quite flexible. Thematic overlays are easy to add and remove. Other overlays may use the same parameters by referencing the same entities. Styles can be changed by replacing the reference to &earthquakes_default_style; with a custom one. It is also possible to have many map files that all reference the same set of style and layer files but use different settings.
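For illustration, a second map file could reuse every component of earthquake.map while changing only the year and the style. The file name earthquake_recent.map, the entity earthquakes_custom_style and the file earthquakes_custom.sty are hypothetical:

File earthquake_recent.map:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map[
    <!-- the only setting that differs from earthquake.map -->
    <!ENTITY since_year "2000">
    <!ENTITY earthquakes_since SYSTEM "earthquakes_since.lay">

    <!-- hypothetical custom style replacing earthquakes.sty -->
    <!ENTITY earthquakes_custom_style SYSTEM "earthquakes_custom.sty">
    <!ENTITY common_styles SYSTEM "common.sty">
    <!ENTITY common_layers SYSTEM "common.lay">
    <!ENTITY db_settings SYSTEM "db_settings">
]>
<Map>
    &earthquakes_custom_style;

    &earthquakes_since;

    &common_styles;
    &common_layers;
</Map>
```

Because the layer in earthquakes_since.lay refers to its style by the name earthquakes, the custom style file would have to define a Style with that same name.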

Conclusion

Entities provide a way to use symbolic names in the map file. This improves readability and helps to build logical groups. Because each value has a single point of definition, map files adapt more easily to different environments and are generally more maintainable. External entities can store whole blocks of XML, which helps to build reusable collections of layers and styles. These reusable components can be parameterized using other entities as needed.

Further Reading