OSLC proposal for data dictionary interfaces
Introduction
Data definition is one of the cornerstones of designing, building, implementing and maintaining business applications. Organizations often want to specify and maintain a data dictionary, which contains data definitions for the functional concepts that make up an IT system. The main benefit of a data dictionary is achieved when it can trace the usages of the concepts it defines. Keeping a data dictionary up to date and maintaining the traceability links to the various artifacts that refer to data definitions are a challenge in an IT environment that is made up of various tools and protocols. This often implies manual steps and tight processes.
This proposal is about defining an OSLC interface for data dictionary entries that facilitates the interchange of these artifacts amongst several tools and allows the setup and maintenance of traceability links.
Artifacts that could trace data definitions
The early phases of enterprise architecture and modeling involve representing processes or flows that manipulate data. A good part of the data manipulated in this way represents concepts of a functional area that a Subject Matter Expert of the domain can identify, define and describe in a data dictionary. Examples of such data definitions are a claim number or a social security number (for elementary data) or a customer (for aggregate data). Such high-level, functional data should be described once, in a data dictionary, and referenced through traceability links from the design and modeling artifacts.
Requirements, plans, tasks and work-items can also discuss functional improvements of an application. Such improvements can be related to functions that need to manipulate data in a certain manner. For example, a requirement can express a business rule about how a customer can reach gold status. This requirement will be translated into plans and tasks that will detail more precisely the nature of the rule, and the controls to perform on the customer and its status. Here again, traceability links could be used from the requirements, plans and work-items to the appropriate data definitions.
Glossaries define terms. Some of them may relate to the data definitions in the data dictionary. Entries in the glossary could be linked to entries in the data dictionary.
Several data dictionaries may also exist, for different purposes or domains. An entry in a given data dictionary could be linked to an entry in another data dictionary to indicate that the two entries deal with the same logical concept.
Source code obviously deals with data. Some variables in the code represent temporary data defined by a given program to perform a calculation; others are language-specific representations of the functional data that the application is supposed to manipulate. Again, traceability links could be used between language-specific data definitions and the high-level functional data in the data dictionary.
Code generation, if its inputs are directly or indirectly linked to entries of the data dictionary, can automatically establish traceability links between generated source code and data definitions.
Code scanners and analysis tools are able to relate fields in different source files and identify them as dealing with the same concept. Such tools can be used to automatically or semi-automatically establish traceability links between existing source code and data definitions. They could also create entries in the data dictionary.
Benefits of traceability to data definitions
Establishing the traceability links enables several functions:
- Identify all the source files that contain a representation of a functional concept captured in the data dictionary, whatever language they use and whatever the actual type of the item in the source file. This provides better analysis of the IT system and makes the impact of a change more predictable.
- Identify all the activities that relate to a functional concept
- Identify all the design elements that relate to a functional concept
- Produce a report that crosses disciplines (analysis, design, requirement, implementation) about where a given data definition is referenced, allowing the user to browse the linked artifacts.
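The cross-discipline report in the last item can be sketched as a simple grouping over link records. This is only an illustration: the `Link` record, its `discipline` tag and the `example.org` URIs used with it are assumptions made for the sketch, not part of this proposal or of any OSLC specification.

```python
from collections import defaultdict
from typing import NamedTuple


class Link(NamedTuple):
    subject: str     # URI of the artifact that references the data definition
    predicate: str   # link type, e.g. "manipulates" or "implements"
    obj: str         # URI of the data definition
    discipline: str  # hypothetical tag: "requirement", "design", "implementation", ...


def usage_report(links: list[Link], definition_uri: str) -> dict[str, list[str]]:
    """Group the artifacts that reference one data definition by discipline."""
    report: dict[str, list[str]] = defaultdict(list)
    for link in links:
        if link.obj == definition_uri:
            report[link.discipline].append(link.subject)
    # Sort disciplines and URIs so the report is stable across runs.
    return {name: sorted(uris) for name, uris in sorted(report.items())}
```

Calling `usage_report(links, "http://example.org/dd/customer")` would return one list of artifact URIs per discipline, which a user interface could render as browsable links.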
Scenarios
This set of scenarios outlines the basic functionality for creating and navigating links between entries of a data dictionary and other domain resources.
In these scenarios we continue to assume that the details (raw resource formats) of individual entries in the data dictionary are a black box as far as the specification is concerned; however, an implementing service can and should know how to manage them.
All scenarios assume that the user has proper permissions and has already been authenticated by all required systems/services. They also assume that all services have been configured properly (i.e. link types have been defined, and servers have been configured with references to the other servers implementing the expected OSLC interfaces (CM, RM, QM, ...)).
Create link from requirement to data definition
Goal: An analyst wants to indicate that a specific requirement involves an entry in the data dictionary.
- The analyst navigates to a requirement
- The analyst invokes the action to create a new link from this resource to a data dictionary entry.
- If the system is configured with more than one data dictionary project associated with the context that the requirement resource is in, then the user is prompted to select which data dictionary 'project' to find the entry in.
- The system delegates resource selection to the data dictionary service. The data dictionary service provides a resource picker to the analyst.
- The analyst uses the resource picker to select a data definition to link to.
- If the appropriate data definition is not found in the data dictionary, the user can create a new entry and is prompted with a creation form.
- The analyst selects the link type 'manipulates' from the list of available types, and confirms the link creation. The list contains the labels for each link type (not the internal id/uri). Systems may optionally display hover help or a short description for each.
- The system creates a new link resource that has the requirement resource URI as the subject, the data definition resource URI as the object, and the link type as the predicate. This link will appear in the links view for both the requirement resource and the data definition resource, and can be queried for.
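The link resource created in the last step is a subject/predicate/object triple. A minimal in-memory sketch, assuming a hypothetical `LinkStore` class and hypothetical link type URIs under `http://example.org/ns/dd#` (none of these names come from an OSLC specification), could look like this:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical link type URIs mapped to the labels shown to the user.
LINK_TYPE_LABELS = {
    "http://example.org/ns/dd#manipulates": "manipulates",
    "http://example.org/ns/dd#implements": "implements",
}


@dataclass(frozen=True)
class Link:
    subject: str    # URI of the linking resource (here, the requirement)
    predicate: str  # URI of the link type
    obj: str        # URI of the data definition


class LinkStore:
    """Illustrative store; a real service would persist links elsewhere."""

    def __init__(self) -> None:
        self._links: set[Link] = set()

    def create(self, subject: str, predicate: str, obj: str) -> Link:
        link = Link(subject, predicate, obj)
        self._links.add(link)
        return link

    def links_for(self, resource_uri: str) -> list[Link]:
        """Links in which the resource is either end, as a links view would list them."""
        found = [l for l in self._links if resource_uri in (l.subject, l.obj)]
        return sorted(found, key=lambda l: (l.subject, l.predicate, l.obj))

    def query(self, predicate: Optional[str] = None, obj: Optional[str] = None) -> list[Link]:
        """Filter links by link type and/or target data definition."""
        found = [
            l for l in self._links
            if (predicate is None or l.predicate == predicate)
            and (obj is None or l.obj == obj)
        ]
        return sorted(found, key=lambda l: (l.subject, l.predicate, l.obj))
```

Because `links_for` matches either end of the triple, the same link shows up when viewing the requirement and when viewing the data definition. The label dictionary keeps the internal URI separate from the text shown in the link type list.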
Create link from source to data definition
Goal: A developer wants to indicate that a specific source manipulates an existing entry in the data dictionary.
- The developer navigates to a source file
- The developer invokes the action to create a new link from this source file to a data dictionary entry.
- The system delegates resource selection to the data dictionary service. The data dictionary service provides a resource picker to the developer.
- The developer uses the resource picker to select a data definition to link to.
- If the appropriate data definition is not found in the data dictionary, the user can create a new entry and is prompted with a creation form.
- The developer selects the link type ‘implements’ from the list of available types, and confirms the link creation. The list contains the labels for each link type (not the internal id/uri). Systems may optionally display hover help or a short description for each.
- The system creates a new link resource that has the source file resource URI as the subject, the data definition resource URI as the object, and the link type as the predicate. This link will appear in the links view for both the source file and the data definition resource, and can be queried for.
Create new entry in the data dictionary and link this entry to the current resource
Goal: Populate the data dictionary from existing resources, for example source files.
- The developer navigates to a source file and recognizes that a piece of code manipulates a concept that is worth capturing in the data dictionary.
- The developer first checks that the data definition to be created does not already exist. Through a resource picker, the developer runs a query on the characteristics of the data definition. The query does not return a satisfactory data definition.
- The developer invokes the action to create a new entry in the data dictionary. A creation dialog opens for the new entry.
- A new entry is created in the data dictionary. A link is established automatically from the current resource to this new entry.
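The check-then-create-then-link flow above can be sketched as follows. The `DataDictionary` class, its title-based search and the sequential entry URIs are all hypothetical simplifications. Reusing the first match is also a shortcut taken for the sketch, since in the scenario the developer makes that choice in the resource picker.

```python
class DataDictionary:
    """Illustrative in-memory dictionary keyed by entry URI."""

    def __init__(self, base_uri: str) -> None:
        self.base_uri = base_uri.rstrip("/")
        self.entries: dict[str, str] = {}  # entry URI -> title

    def search(self, text: str) -> list[str]:
        """Return URIs of entries whose title contains the text (case-insensitive)."""
        needle = text.lower()
        return [uri for uri, title in self.entries.items() if needle in title.lower()]

    def create(self, title: str) -> str:
        uri = f"{self.base_uri}/{len(self.entries) + 1}"
        self.entries[uri] = title
        return uri


def capture_concept(dictionary, links, resource_uri, title, link_type="implements"):
    """Create the entry only if no match exists, then link the resource to it.

    Returns (entry_uri, created). Links are appended to `links` as
    (subject, predicate, object) tuples.
    """
    matches = dictionary.search(title)
    created = not matches
    entry_uri = dictionary.create(title) if created else matches[0]
    links.append((resource_uri, link_type, entry_uri))
    return entry_uri, created
```

The returned flag lets a caller tell the user whether a new entry was added or an existing one was reused.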
Create new entries in the data dictionary semi automatically by scanning resources
Goal: populate the data dictionary from a semi-automated analysis of existing resources.
This is a variant of the previous scenario that involves an analysis tool.
- The user selects a set of resources (e.g. a stream)
- The analysis tool scans this set of resources
- The analysis tool returns a list of possible new entries in the data dictionary.
- The user reviews the suggestions, amends them and finally accepts a subset of them
- The tool creates the new entries in the data dictionary and links the appropriate scanned files to these entries
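As a rough illustration of the steps above, the sketch below suggests a candidate entry for every identifier that appears, after normalization, in at least two scanned files, and then creates links only for the suggestions the user accepts. The matching heuristic, the function names and the entry URIs are assumptions made for this sketch; a real analysis tool would use much richer language-aware matching.

```python
import re
from collections import defaultdict

# Identifiers such as CLAIM-NUMBER (COBOL style) or claimNumber (Java style).
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def normalize(name: str) -> str:
    """Drop separators and case so that CLAIM-NUMBER and claimNumber compare equal."""
    return re.sub(r"[-_]", "", name).lower()


def suggest_entries(sources: dict[str, str], min_files: int = 2) -> dict[str, set[str]]:
    """Map each candidate name to the files it appears in, keeping names seen in several files."""
    seen: dict[str, set[str]] = defaultdict(set)
    for path, text in sources.items():
        for token in IDENTIFIER.findall(text):
            seen[normalize(token)].add(path)
    return {name: files for name, files in seen.items() if len(files) >= min_files}


def create_and_link(suggestions, accepted, base_uri):
    """Create one hypothetical entry URI per accepted name and link each scanned file to it."""
    links = []
    for name in sorted(accepted):
        entry_uri = f"{base_uri.rstrip('/')}/{name}"
        for path in sorted(suggestions[name]):
            links.append((path, "implements", entry_uri))
    return links
```

The review step sits between the two functions: the user inspects the result of `suggest_entries`, amends it, and passes only the accepted names to `create_and_link`.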
View and navigate to a linked resource from a data definition
Goal: A stakeholder viewing a data definition resource wants to see and navigate to related ALM resources.
- The stakeholder views a data definition resource.
- The system provides a way to see the list of links to external resources (other ALM resources such as requirements, work items, test cases or source files). The links are displayed with human-readable labels. If available, hover help displays a short human-readable description of the link resource itself. Additional rich hover information may be available (including previews of the target resource). The list of links may be organized by type.
- The stakeholder can select a link; the default behavior is to open a web browser (if one is not already in use) and request an HTML representation of the resource at the other end of the link.
- The other system (OSLC RM, or plain web server) will provide an HTML presentation of the resource.
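Requesting an HTML representation, as in the last two steps, is ordinary HTTP content negotiation. The sketch below only builds the request with Python's standard `urllib.request` and does not send it. The helper name is an assumption, and the target URI would be whatever the link points to.

```python
from urllib.request import Request


def html_request(resource_uri: str) -> Request:
    """Build a GET request that asks the target server for an HTML representation."""
    return Request(resource_uri, headers={"Accept": "text/html"})
```

Passing the returned object to `urllib.request.urlopen` would perform the actual request. A plain web server that ignores the `Accept` header would normally still answer with its default HTML page.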
Identify "same concept" entries in several data dictionaries
Goal: Establish traceability links between several data dictionaries to indicate that an entry in one dictionary is equivalent in concept to an entry in another.
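One possible way to record such links is sketched below with a small union-find structure. It assumes that "same concept" is symmetric and transitive, which this proposal does not state, and the class and method names are hypothetical.

```python
class ConceptGroups:
    """Groups entry URIs that have been declared to represent the same concept."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}

    def _find(self, uri: str) -> str:
        """Return the representative URI of the group that contains `uri`."""
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            # Path halving: point the node at its grandparent as we walk up.
            self._parent[uri] = self._parent[self._parent[uri]]
            uri = self._parent[uri]
        return uri

    def link_same_concept(self, a: str, b: str) -> None:
        """Record that entries a and b (possibly in different dictionaries) are equivalent."""
        root_a = self._find(a)
        root_b = self._find(b)
        self._parent[root_a] = root_b

    def same_concept(self, a: str, b: str) -> bool:
        return self._find(a) == self._find(b)
```

With this structure, linking an entry of a first dictionary to one in a second, and that one to an entry in a third, makes all three compare as the same concept. If equivalence should not propagate in this way, storing each link as a plain pair of URIs would be the simpler choice.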