Description of the Model

Introduction

The Model-Based DDI Specification Class Description provides information on the individual classes used in the model, their relationships to each other, and their relationships to DDI Lifecycle 3.2 and other standards such as the Generic Statistical Information Model (GSIM). The model-based DDI specification consists of two parts: a Library of classes and Functional Views of the model. The Library of classes encompasses the entire DDI-Lifecycle (MD) model. The classes in the Library are the building blocks used to construct the Functional Views. These Functional Views are in essence profiles of the full specification oriented around specific user needs. The model is primarily being developed and surfaced to the user community at http://lion.ddialliance.org/

A Note on Terminology

During the development process, what are now called classes (to reflect the language used in UML) were referred to as ‘objects’.

Structure of the DDI-Lifecycle (MD) Model

The model contains a Library and Functional Views. The Library is composed of Library Packages, which contain classes and data types (primitive or complex). Each Functional View contains references to the Library classes needed to meet the needs of its use case or business application.

Figure 1. DDI-Lifecycle (MD) Model and its components

figure1
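The structure described above can be sketched in a few lines of Python. This is only an illustration, not part of the DDI specification: the names ClassRef, LibraryPackage, Library and FunctionalView, the version strings, and the placement of an Identifiable class in a "Core" package are all assumptions made for the example. The point it shows is that a Functional View holds references to versioned Library classes rather than the classes themselves.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClassRef:
    """Reference to one published version of a Library class (hypothetical)."""
    package: str
    name: str
    version: str


@dataclass
class LibraryPackage:
    """A named group of classes; maps class name -> list of published versions."""
    name: str
    classes: dict = field(default_factory=dict)

    def publish(self, class_name: str, version: str) -> ClassRef:
        self.classes.setdefault(class_name, []).append(version)
        return ClassRef(self.name, class_name, version)


@dataclass
class Library:
    """The Library is composed of Library Packages."""
    packages: dict = field(default_factory=dict)

    def add_package(self, package: LibraryPackage) -> LibraryPackage:
        self.packages[package.name] = package
        return package

    def resolves(self, ref: ClassRef) -> bool:
        package = self.packages.get(ref.package)
        return package is not None and ref.version in package.classes.get(ref.name, [])


@dataclass
class FunctionalView:
    """A Functional View is a set of references into the Library."""
    name: str
    refs: set = field(default_factory=set)

    def is_valid_for(self, library: Library) -> bool:
        # Every reference must point at a class version the Library has published.
        return all(library.resolves(ref) for ref in self.refs)


library = Library()
core = library.add_package(LibraryPackage("Core"))
identifiable = core.publish("Identifiable", "1.0")

view = FunctionalView("Discovery", {identifiable})
dangling = FunctionalView("Broken", {ClassRef("Core", "Missing", "1.0")})
print(view.is_valid_for(library), dangling.is_valid_for(library))
```

Keeping the view as a set of references means the organization of packages can change without touching the view, as long as each reference still resolves.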

Library of Classes

The Library of Classes encompasses the entire DDI-Lifecycle (MD) model, but without any specific schemas or vocabularies for Functional Views. The classes in the Library contain primitives and complex data types and are the building blocks used to construct the Functional Views. Classes are organized into Library Packages.

Functional Views

Functional Views are made up of a set of references to the classes in the Library. Functional Views are subsets of the model grouped to support a specific application (for example, the description of a questionnaire). The Functional Views are divided into sections. Each section loosely corresponds to a DDI lifecycle business area. Within each business area section there are separate subsections for Functional Views and compositions. Note that Functional Views may include placeholders, such as an abstract class, that need to be substituted before the Functional View can actually be used.

Functional Views are always a strict subset of the existing published or (for customization) extended classes. A Functional View identifies a set of classes that are needed to perform a specific task. It primarily consists of a set of references to specific versions of classes. Functional Views are the method used to restrict the portions of the model that are used, and as such they function very much like DDI profiles in DDI 3.*. The subsetting with Functional Views is done on the model and not on the instance level as in DDI Profiles. In a Functional View, one may:

  • restrict the use of non-mandatory properties on a class;
  • restrict the cardinality of a class’s relationships and properties;
  • restrict the use of non-mandatory relationships.

Restrictions may never be made that would violate the mandatory inclusion of a relationship or property. Functional Views may combine classes from any package or set of packages needed. The creation of Functional Views thus has no dependency on the organization of metadata classes within the Library Packaging structure.
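The restriction rule can be illustrated with a small check on cardinalities. The sketch below assumes that a property or relationship carries a (minimum, maximum) cardinality, with None standing for an unbounded maximum; the function name and this encoding are hypothetical and are not taken from the DDI model. A restriction is accepted only if it narrows the Library cardinality, so a mandatory item (minimum of 1 or more) can never be dropped.

```python
def is_valid_restriction(lib_min, lib_max, view_min, view_max):
    """Return True if (view_min, view_max) only narrows (lib_min, lib_max).

    A maximum of None means "unbounded". Hypothetical helper for illustration.
    """
    if view_min < lib_min:
        return False  # would relax a minimum, e.g. make a mandatory item optional
    if lib_max is not None and (view_max is None or view_max > lib_max):
        return False  # would allow more occurrences than the Library does
    if view_max is not None and view_min > view_max:
        return False  # empty range; for a mandatory item this means "removed"
    return True


# An optional, repeatable property (0..n) may be dropped entirely (0..0).
print(is_valid_restriction(0, None, 0, 0))
# A mandatory property (1..n) may be limited to exactly one occurrence (1..1).
print(is_valid_restriction(1, None, 1, 1))
# A mandatory property (1..n) may not be dropped (0..0).
print(is_valid_restriction(1, None, 0, 0))
```

The three calls print True, True and False, matching the three kinds of restriction listed above and the rule that mandatory inclusion must be preserved.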

Figure 2. Interoperability of Functional Views

figure2

As shown in Figure 2, each Functional View is a subset of the classes in the Library. Functional Views might be distinct, overlap in their function, or be a subset or superset of another Functional View. Two Functional Views are interoperable only with respect to the Library classes that are used in both.

A global Functional View could be created which comprises all classes in the Library and their relationships. It would represent all functionality of the classes in the Library. Each Functional View would be interoperable with this global Functional View.
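If each Functional View is treated as a set of class names, interoperability reduces to set intersection. The sketch below makes that assumption; the membership of each view is invented for illustration and does not describe any published Functional View.

```python
# Hypothetical Library and views, each represented as a set of class names.
library = {"Access", "Annotation", "Coverage", "Instrument", "Question"}
discovery_view = {"Access", "Annotation", "Coverage"}
questionnaire_view = {"Instrument", "Question", "Annotation"}
global_view = set(library)  # a view that references every Library class


def interoperable_classes(view_a, view_b):
    """Classes on which two views can interoperate: those used in both."""
    return view_a & view_b


# Every view is a subset of the Library.
print(discovery_view <= library and questionnaire_view <= library)
# Two overlapping views interoperate only on their shared classes.
print(sorted(interoperable_classes(discovery_view, questionnaire_view)))
# Against the global view, a view interoperates on all of its own classes.
print(interoperable_classes(discovery_view, global_view) == discovery_view)
```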

Model Constructs and Their Relationships

Figure 3 below shows the basic relationships between the types of constructs in the model. At the lowest level, we have the primitives. These are used directly by classes, and are also used to create complex data types. The complex data types are also used by classes. Classes themselves can relate to other classes, building increasingly complex structures. The classes – along with the primitives and complex data types – form the Class Library. Classes can relate to each other in two ways: a class may have a “direct” relationship (composition, aggregation) with another class, or it may have an inheritance relationship. In this latter case, the DDI model uses additive extension. One class may extend another by inheriting all of its properties and relationships, to which the new class may add additional properties and relationships. This mechanism is used to take more generic classes and alter them for a more specific purpose. Extension is explained more fully below.

Figure 3. DDI-Lifecycle (MD) Organization of the Model

figure3

Extension

Extension is the inheritance by one class of the properties and relationships of another class. It also carries a semantic relationship: an extending class provides a specialized use of the extended class.

Extensions are used within the DDI-published Library Packages to provide relationships between classes as they increase in complexity to meet increasingly complex functionality. Thus, a “simple” version of a questionnaire class might be extended into a more complex class, describing a more complex questionnaire. Some classes exist only for the purpose of extension, and are declared abstract. A Functional View may never include an abstract class. Non-abstract classes may never have direct relationships with abstract classes. Extension is illustrated in Figure 4 below.

Figure 4. Extensions in DDI-Lifecycle (MD)

figure4

Here, an abstract class – Instrument, which is any tool used to collect data – is extended by Simple Questionnaire, which is itself extended by Complex Questionnaire. As we proceed through this chain of extension, the classes have increasingly large numbers of properties and relationships.

For example, if an Instrument class has a name property, a description property, and an ID property, these would be inherited by Simple Questionnaire, which might add a relationship to one or more Question classes. The Complex Questionnaire in turn might add a relationship to a Questionnaire Flow class, to add conditional logic to the questionnaire.
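The chain of extension in this example maps naturally onto class inheritance. The following Python sketch is one possible rendering, not the DDI model itself: the field names, the Question and QuestionnaireFlow stand-ins, and the way abstractness is enforced are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Question:
    text: str  # hypothetical property


@dataclass
class QuestionnaireFlow:
    rules: list = field(default_factory=list)  # hypothetical property


@dataclass
class Instrument:
    """Abstract: exists only to be extended."""
    id: str
    name: str
    description: str

    def __post_init__(self):
        if type(self) is Instrument:
            raise TypeError("Instrument is abstract; use an extending class")


@dataclass
class SimpleQuestionnaire(Instrument):
    # Inherits id, name, description; adds a relationship to Question.
    questions: list = field(default_factory=list)


@dataclass
class ComplexQuestionnaire(SimpleQuestionnaire):
    # Inherits everything above; adds a relationship to QuestionnaireFlow.
    flow: Optional[QuestionnaireFlow] = None


survey = ComplexQuestionnaire(
    id="q1",
    name="Household survey",
    description="Demo instrument",
    questions=[Question("What is your age?")],
    flow=QuestionnaireFlow(["skip employment section if age < 16"]),
)
print([f.name for f in fields(ComplexQuestionnaire)])
```

Each class only ever adds to what it inherits, which is the additive behaviour the text describes: nothing from Instrument is removed or redefined further down the chain.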

The second use of extension in the DDI model is to allow users to add needed metadata fields for the purposes of customization. Thus, a specific user community may decide to have a standard set of additional properties, classes, and relationships and create their own model Library Package which contains classes extending the classes in the DDI-published Library Packages. The creator of the extensions is the owner and maintainer of the extended classes and Library Packages; the DDI Alliance is not responsible for them.

Extension in DDI is strictly defined: you may add new properties and new relationships to existing classes. Extension is always done by reference and inheritance: a new class is declared which inherits all the properties and relationships of an existing class, and new properties and relationships are then declared for it. Extension is always additive. There is no concept of refinement; that is handled using Functional Views.

Those creating their own custom Library Packages based on extensions to the DDI model may also declare entirely new classes which are not extensions of DDI classes.

Extensions made by those customizing the DDI model are expressed using the same modeling techniques and information that are used for the development of the model published by the DDI Alliance itself. As a result of this, the same tools for the creation of documentation and syntax artefacts (XML schemas, RDF vocabularies) could potentially be used.

Managing the Library

In order to manage the Library effectively, the classes, together with primitives and complex data types, are grouped into Library Packages. The organization of Library Packages is currently flat. As development continues and the number of Library Packages increases, the DDI model may be organized in a hierarchy of Library Packages arranged according to the types of constructs.

Library Packages are mutually exclusive and comprehensive. They are organic entities with a logical organization and are labeled in an accessible way so that developers and modelers can easily understand their content. They are stable and should not be changed often.
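"Mutually exclusive and comprehensive" means that every class belongs to exactly one Library Package. A check of that property might look like the sketch below; the function name and the sample package contents are hypothetical and chosen only to exercise the check.

```python
from collections import Counter


def check_partition(all_classes, packages):
    """Return (duplicated, unassigned) class names for a package layout.

    packages maps a package name to the set of class names it contains.
    Both lists are empty when the packages are mutually exclusive and
    comprehensive.
    """
    counts = Counter(name for members in packages.values() for name in members)
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    unassigned = sorted(set(all_classes) - set(counts))
    return duplicated, unassigned


all_classes = {"Identifiable", "Concept", "Universe", "Question"}
packages = {
    "Core": {"Identifiable"},
    "Conceptual": {"Concept", "Universe"},
}
print(check_partition(all_classes, packages))  # Question is not yet assigned

packages["Instrument and data source"] = {"Question"}
print(check_partition(all_classes, packages))  # both lists are now empty
```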

Provisional Library Organization

  • Core
    • Primitives
    • Complex Data Types
    • Identification and versioning
    • Grouping and Comparison
    • Utility classes
  • Conceptual
    • Universe, concept, category unit
    • Representations, code lists, classifications
    • Represented and conceptual variables
  • Study
    • Study info
    • Study inception
    • Collection
    • Archiving and preservation
    • Access and discovery
  • Data
    • Logical data structures
    • Physical data structures
    • Datasets
    • Instance variables (raw and derived variables)
  • Process
  • Geography
  • Instrument and data source
    • Questionnaires, routing
    • Access to administrative data
    • Questions, items
  • Methodology
    • Data transformations e.g. formulas

Versioning the Library

The classes within each Library Package as well as Functional Views are versioned. The whole model has a specific release date that acts as part of its identification. The Library Packages are versioned primarily for maintenance purposes.

The versioning rule is that if the contents of a versioned item change, it receives a new version. This means that versions “trickle up”: a new class is added to a Library Package, which versions the Library Package; the new version of the Library Package can drive a new release of the Model, and so on.

However, if a class does not change, its version does not change, even if the Library Package within which it lives is versioned. Once published, a class is always available for use within Functional Views, even if it is not the latest version of the class. (If the old version of a class is good enough, it is still available for use in a new version of a Functional View, etc.) Once published, classes are never removed from the Library. All published classes and Functional Views will be available in the model forever.
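The versioning behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the DDI versioning scheme: integer version numbers, the representation of a class's contents as a set of property names, and the property names themselves are all hypothetical.

```python
class LibraryPackage:
    """Tracks class versions and bumps its own version when contents change."""

    def __init__(self, name):
        self.name = name
        self.version = 0      # bumped whenever a class is added or changed
        self.published = {}   # class name -> list of (version, contents); never shrinks

    def publish(self, class_name, contents):
        """Publish a class definition and return the class version in effect."""
        history = self.published.setdefault(class_name, [])
        contents = frozenset(contents)
        if history and history[-1][1] == contents:
            return history[-1][0]  # unchanged class: no new version anywhere
        class_version = len(history) + 1
        history.append((class_version, contents))
        self.version += 1          # the change "trickles up" to the package
        return class_version

    def available_versions(self, class_name):
        """All versions ever published remain available."""
        return [version for version, _ in self.published.get(class_name, [])]


package = LibraryPackage("Conceptual")
package.publish("Concept", {"name"})                # Concept v1, package v1
package.publish("Universe", {"name"})               # Universe v1, package v2
package.publish("Concept", {"name", "definition"})  # Concept v2, package v3
package.publish("Universe", {"name"})               # unchanged, nothing is bumped

print(package.version)
print(package.available_versions("Concept"))
print(package.available_versions("Universe"))
```

Universe stays at its first version even though the package has moved on twice, and the first version of Concept remains listed alongside the second, so an existing Functional View that references it keeps working.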

This has the effect of de-coupling the dependencies created by the use of extensions to add new things to the model. Decisions about what release packages consist of are driven by the needs of users and marketing considerations, and not by the chain of dependencies between classes, Library Packages, etc.

It is foreseen that at least initially, the Library will be released alongside sets of useful Functional Views, but incremental releases are possible without causing problems – a new version of the Library is released, but it will always contain all classes already in use.

Example of a Functional View

Figure 5 shows a diagram of the initial Discovery View, which includes the Access, Annotation and Coverage classes. Access and Coverage inherit from the AnnotatedIdentifiable class, while Annotation inherits from the Identifiable class. Coverage has aggregation relationships to TemporalCoverage, TopicalCoverage and SpatialCoverage.

Figure 5. Example Functional View

figure5
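The relationships in the Discovery View can also be written down as a short Python sketch. Only the class names and the inheritance and aggregation relationships come from the description above; every field name is a hypothetical placeholder, and because the text does not say how AnnotatedIdentifiable relates to Identifiable, the two are kept independent here.

```python
from dataclasses import dataclass, field


@dataclass
class Identifiable:
    id: str  # hypothetical property


@dataclass
class AnnotatedIdentifiable:
    id: str  # hypothetical property; relation to Identifiable is not assumed


@dataclass
class TemporalCoverage:
    description: str = ""  # hypothetical property


@dataclass
class TopicalCoverage:
    description: str = ""  # hypothetical property


@dataclass
class SpatialCoverage:
    description: str = ""  # hypothetical property


@dataclass
class Access(AnnotatedIdentifiable):
    pass


@dataclass
class Annotation(Identifiable):
    pass


@dataclass
class Coverage(AnnotatedIdentifiable):
    # Aggregation: Coverage refers to coverage parts that can exist on their own.
    temporal: list = field(default_factory=list)
    topical: list = field(default_factory=list)
    spatial: list = field(default_factory=list)


# The view itself is just the set of classes it references.
DISCOVERY_VIEW = (Access, Annotation, Coverage)

coverage = Coverage("cov-1", temporal=[TemporalCoverage("2010 to 2015")])
print([cls.__name__ for cls in DISCOVERY_VIEW])
```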