Conceptual Modeling of Spatial Data Warehouses
Conceptual modeling of spatial data warehouses extends traditional multidimensional models to incorporate geographic and spatial information. In this context, the MultiDim model is enhanced to support spatial data types while preserving the standard concepts of schemas, levels, hierarchies, cardinalities, facts, and measures. A typical example is the GeoNorthwind data warehouse, which extends the classic Northwind schema by adding spatial characteristics. In this model, pictograms are used to visually represent spatial elements, making it easier to distinguish between spatial and nonspatial components.
A key concept in this extension is the spatial level: a dimension level whose spatial characteristics are stored as a geometry. The geometry is described by a spatial data type such as point, line, or region. In the example model, levels such as Supplier, Customer, City, State, Region, Country, and Continent are spatial, while Product and Time are nonspatial. Each spatial level is drawn with a pictogram indicating its geometry type. Whether a level is spatial depends on application requirements and is independent of whether the level has spatial attributes.
Spatial attributes are attributes whose values belong to a spatial data type. For instance, an attribute representing a capital city’s geographic location may be of type point, while elevation could be modeled as a continuous field. Continuous fields are identified with a special pictogram. A level may have spatial attributes whether or not the level itself is spatial.
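The distinction between a spatial level and a spatial attribute can be sketched in a few lines of Python. This is only an illustration, not part of the MultiDim model: the `Level` class, its field names, the geometry type chosen for each level, and the `Capital` attribute on Country are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Level:
    """Hypothetical stand-in for a dimension level in a conceptual schema."""
    name: str
    # Geometry type of the level itself; None means the level is nonspatial.
    geometry_type: Optional[str] = None
    # Spatial attributes of the level: attribute name -> spatial data type.
    spatial_attributes: Dict[str, str] = field(default_factory=dict)

    @property
    def is_spatial(self) -> bool:
        # A level is spatial when it has its own geometry,
        # independently of whether it has spatial attributes.
        return self.geometry_type is not None


# Geometry types below are assumed for illustration only.
levels = [
    Level("Customer", "point"),
    Level("City", "point"),
    Level("State", "region"),
    Level("Country", "region", {"Capital": "point"}),
    Level("Product"),
    Level("Time"),
]

spatial_names = [lv.name for lv in levels if lv.is_spatial]
print(spatial_names)
```

Keeping the geometry of the level separate from its spatial attributes mirrors the point made above: the two properties are decided independently.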
Spatial hierarchies are hierarchies that include at least one spatial level. These hierarchies may combine spatial and nonspatial levels. For example, hierarchies that connect City, State, Region, Country, and Continent form spatial hierarchies. A dimension containing at least one spatial hierarchy is called a spatial dimension. Relationships between spatial levels in a hierarchy may include topological constraints, such as containment or overlap, defined using standard spatial relationships. For example, the geometry of a state may be covered by the geometry of its region or country. However, not all related spatial levels require such constraints; for instance, the location of a supplier obtained through geocoding may not have a topological relationship with the city’s central point.
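A topological constraint between two hierarchy levels can be checked mechanically. The sketch below is a simplification under stated assumptions: regions are approximated by axis-aligned rectangles, and the `Box` type, the `covered_by` and `violations` helpers, and the member names are all hypothetical. A real system would evaluate the predicate on actual geometries in a spatial DBMS or GIS library.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle used as a simplified region geometry."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def covered_by(inner: Box, outer: Box) -> bool:
    """True when no part of inner lies outside outer (boundaries may touch)."""
    return (outer.xmin <= inner.xmin and outer.ymin <= inner.ymin
            and inner.xmax <= outer.xmax and inner.ymax <= outer.ymax)


def violations(children: Dict[str, Box], parents: Dict[str, Box],
               parent_of: Dict[str, str]) -> List[str]:
    """Child members whose geometry is not covered by their parent's geometry."""
    return [child for child, parent in parent_of.items()
            if not covered_by(children[child], parents[parent])]


# Hypothetical members of a State -> Country hierarchy.
states = {"S1": Box(0, 0, 5, 5), "S2": Box(5, 0, 12, 5)}
countries = {"C1": Box(0, 0, 10, 10)}
state_country = {"S1": "C1", "S2": "C1"}

bad = violations(states, countries, state_country)
print(bad)  # S2 extends past the country's right edge
```

For a pair of levels without a topological constraint, such as a geocoded supplier and a city point, no such check would be declared.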
A spatial fact is a fact that links multiple levels, at least two of which are spatial. Spatial facts can include topological constraints specifying how related spatial objects must relate geometrically. For example, in a highway maintenance schema, a Maintenance fact might relate County and Highway Segment levels with an “overlaps” constraint, meaning that only segments overlapping a county are associated with it. In contrast, a Sales fact linking Supplier and Customer may not impose any spatial constraint.
Measures in spatial data warehouses can be either numeric or spatial. Numeric measures represent quantitative values such as costs or counts, while spatial measures are geometries, such as areas or sets of lines. A numeric measure may also be computed with a spatial operation such as distance or area. For instance, the length of a road segment within a county is a numeric measure derived from geometry, whereas a spatial measure might hold the geometry that two spatial objects have in common.
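As an illustration of a numeric measure derived from a spatial operation, the following sketch computes how much of a straight road segment lies inside a county. It assumes planar coordinates, a single straight segment, and a county approximated by an axis-aligned rectangle; the function name `clipped_length` is hypothetical, and the clipping uses the Liang-Barsky parametric method rather than any operation named in the source.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def clipped_length(p: Point, q: Point, box: Rect) -> float:
    """Length of the part of segment p-q inside the rectangle (Liang-Barsky)."""
    (x0, y0), (x1, y1) = p, q
    xmin, ymin, xmax, ymax = box
    dx, dy = x1 - x0, y1 - y0
    t0, t1 = 0.0, 1.0  # parameter range of the segment still inside the box
    for d, lo, hi, start in ((dx, xmin, xmax, x0), (dy, ymin, ymax, y0)):
        if d == 0:
            # Segment is parallel to this pair of edges: either fully
            # inside the slab or fully outside it.
            if start < lo or start > hi:
                return 0.0
        else:
            ta, tb = (lo - start) / d, (hi - start) / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return 0.0  # the segment misses the box
    return (t1 - t0) * math.hypot(dx, dy)


# Hypothetical county rectangle and a road crossing it from west to east.
county = (0.0, 0.0, 10.0, 10.0)
length_in_county = clipped_length((-5.0, 5.0), (15.0, 5.0), county)
print(length_in_county)
```

The geometry of the clipped part would be the corresponding spatial measure, while the value returned here is the derived numeric one.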
Aggregation of measures along hierarchies follows specific rules. Numeric measures are typically aggregated using summation, while spatial measures use spatial union operations. For example, when rolling up data from counties to states, numeric measures such as length, number of cars, or repair cost are summed, whereas a spatial measure representing road segments is aggregated by merging geometries into a unified set. This allows both quantitative and spatial aspects of data to be analyzed at different levels of detail.
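The roll-up described above can be sketched as follows. The fact rows, county and state names, and the `roll_up` helper are invented for the example, and the spatial measure is represented as a plain set of line segments so that set union can stand in for spatial union. A real implementation would call the union operation of a spatial DBMS or geometry library.

```python
from collections import defaultdict
from typing import Dict, List

# Each fact row: county, numeric measures, and a spatial measure held as a
# set of segments, where a segment is a pair of (x, y) end points.
facts = [
    {"county": "A", "length": 12.0, "cost": 300.0,
     "geom": {((0, 0), (1, 0)), ((1, 0), (1, 1))}},
    {"county": "B", "length": 8.0, "cost": 200.0,
     "geom": {((1, 0), (1, 1)), ((1, 1), (2, 1))}},  # shares one segment with A
    {"county": "C", "length": 5.0, "cost": 100.0,
     "geom": {((5, 5), (6, 5))}},
]
county_to_state = {"A": "X", "B": "X", "C": "Y"}


def roll_up(rows: List[dict], parent_of: Dict[str, str]) -> Dict[str, dict]:
    """Sum numeric measures and union spatial measures per parent member."""
    totals = defaultdict(lambda: {"length": 0.0, "cost": 0.0, "geom": set()})
    for row in rows:
        agg = totals[parent_of[row["county"]]]
        agg["length"] += row["length"]   # numeric measure: sum
        agg["cost"] += row["cost"]       # numeric measure: sum
        agg["geom"] |= row["geom"]       # spatial measure: union
    return dict(totals)


by_state = roll_up(facts, county_to_state)
print(by_state["X"]["length"], by_state["X"]["cost"], len(by_state["X"]["geom"]))
```

Note that the segment shared by counties A and B appears only once in the state-level geometry, whereas the numeric measures are simply added.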
Reference:
Vaisman, A., & Zimányi, E. (2014). Data warehouse systems: Design and implementation. Springer.