Data Structures

One of the primary goals of the Prajna project is to collate data into more comprehensible data structures. Traditional database retrievals focus on retrieving single records and leave the application to construct any advanced data structures from that data. This requires the developer to have a strong understanding of the database structure, including how its tables are linked and the types of values available. However, the developer responsible for generating advanced data structures, such as graphs or trees, is often not the same person as the one maintaining or developing the database.

The Prajna project establishes the DataAccessor as a bridge between an underlying data source and complex data structures. A visualization developer can simply retrieve various graphs from the DataAccessor and use them to construct visualizations or provide analytical utilities. The DataAccessor is responsible for constructing the advanced data structures. Implementations of a DataAccessor will require significant knowledge of the underlying data source, such as its structure and fields. By encapsulating this functionality, the DataAccessor separates the generation of the data model from the visualization or analytical components of the software.
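
The separation described above can be illustrated with a minimal sketch. Every name in it (AccessorSketch, SimpleGraph, GraphAccessor, RowBackedAccessor, the "reporting" graph name, and the sample rows) is hypothetical and chosen only for illustration; none of it is taken from Prajna's actual API. The sketch assumes a data source that can be read as rows of (manager, employee) pairs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these types are illustrative and are NOT Prajna's actual API.
class AccessorSketch {

    /** A minimal directed graph: each node maps to the nodes it links to. */
    static final class SimpleGraph<N> {
        private final Map<N, List<N>> adjacency = new LinkedHashMap<>();

        void addEdge(N from, N to) {
            adjacency.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
            // Register the target too, so leaf nodes are counted as nodes.
            adjacency.computeIfAbsent(to, k -> new ArrayList<>());
        }

        List<N> neighbors(N node) {
            return adjacency.getOrDefault(node, List.of());
        }

        int nodeCount() {
            return adjacency.size();
        }
    }

    /** The bridge: callers ask for a structure by name and never see the source. */
    interface GraphAccessor<N> {
        SimpleGraph<N> getGraph(String graphName);
    }

    /** One implementation; only this class knows how the source rows are laid out. */
    static final class RowBackedAccessor implements GraphAccessor<String> {
        // Stand-in for a database table of (manager, employee) rows.
        private final String[][] reportsToRows = {
            {"alice", "bob"},
            {"alice", "carol"},
            {"bob", "dave"},
        };

        @Override
        public SimpleGraph<String> getGraph(String graphName) {
            if (!"reporting".equals(graphName)) {
                throw new IllegalArgumentException("Unknown graph: " + graphName);
            }
            SimpleGraph<String> graph = new SimpleGraph<>();
            for (String[] row : reportsToRows) {
                graph.addEdge(row[0], row[1]);
            }
            return graph;
        }
    }

    public static void main(String[] args) {
        // The visualization side depends only on the interface.
        GraphAccessor<String> accessor = new RowBackedAccessor();
        SimpleGraph<String> graph = accessor.getGraph("reporting");
        System.out.println(graph.neighbors("alice"));
    }
}
```

In this sketch the code in main never touches the row layout, so the backing source could be swapped for a real database without changing the calling code.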

Prajna identifies four basic data structures:

These data structures support a variety of analytical tasks and tools. A particular DataAccessor implementation may provide only a subset of the data structures, since not all data can be mapped into trees, graphs or grids. The data structures are designed to be extensible, and use Java generics to allow a developer to construct data structures of arbitrary objects, without tying them to any particular data structure implementation.
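
As a rough illustration of how Java generics let one structure hold arbitrary objects, the following sketch defines a small generic tree. The names GenericTreeSketch, TreeNode, and Department are hypothetical and are not Prajna's own classes; the point is only that the payload type is chosen by the caller rather than fixed by the structure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a generic tree; names are illustrative, not Prajna's API.
class GenericTreeSketch {

    /** A tree node whose payload type T is chosen by the caller. */
    static final class TreeNode<T> {
        private final T value;
        private final List<TreeNode<T>> children = new ArrayList<>();

        TreeNode(T value) {
            this.value = value;
        }

        /** Creates a child holding childValue, attaches it, and returns it. */
        TreeNode<T> addChild(T childValue) {
            TreeNode<T> child = new TreeNode<>(childValue);
            children.add(child);
            return child;
        }

        T getValue() {
            return value;
        }

        /** Counts this node plus all of its descendants. */
        int size() {
            int count = 1;
            for (TreeNode<T> child : children) {
                count += child.size();
            }
            return count;
        }
    }

    /** An arbitrary application type used as a payload. */
    static final class Department {
        final String name;
        final int headcount;

        Department(String name, int headcount) {
            this.name = name;
            this.headcount = headcount;
        }
    }

    public static void main(String[] args) {
        // The same node class holds plain strings...
        TreeNode<String> names = new TreeNode<>("Engineering");
        names.addChild("Platform");
        names.addChild("Apps").addChild("Mobile");
        System.out.println(names.size());

        // ...or an application-specific object, with no change to TreeNode.
        TreeNode<Department> org = new TreeNode<>(new Department("Engineering", 40));
        org.addChild(new Department("Platform", 15));
        System.out.println(org.getValue().name + " has " + org.size() + " nodes");
    }
}
```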

These basic data structures are based in part on the original work by Dr. Ben Shneiderman. Dr. Shneiderman originally proposed a taxonomy of information visualization environments in his paper, The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. In this paper, visualization tasks are broken down into several categories: Temporal, 1-D, 2-D, 3-D, Multi-D, Tree, Network, and Workspace. While originally focused on visualization, this taxonomy provides a useful breakdown according to analytic tasks.

Because the various dimensional visualizations are inherently similar, the Prajna project groups 1-D, 2-D, 3-D, and Multi-D visualizations into the generalized Grid structure, which can be of any dimension. Tree and Network are represented by Tree and Graph structures respectively. The Temporal visualization type is instead identified as an atomic data type rather than a structure, since it is certainly possible to create temporal trees or graphs. Workspace visualizations are less concerned with data visualization and analysis, and so are excluded.
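
One way to see why 1-D, 2-D, 3-D, and Multi-D data can share a single structure is a grid whose number of dimensions is fixed at construction time. The sketch below is an assumption about how such a structure could look, not a description of Prajna's Grid; the names GridSketch and Grid, and the choice of row-major flat storage, are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a grid of any dimension; not Prajna's actual Grid class.
class GridSketch {

    static final class Grid<T> {
        private final int[] sizes;   // extent of each dimension
        private final List<T> cells; // flat row-major storage, initially all null

        Grid(int... sizes) {
            if (sizes.length == 0) {
                throw new IllegalArgumentException("At least one dimension is required");
            }
            int total = 1;
            for (int size : sizes) {
                if (size <= 0) {
                    throw new IllegalArgumentException("Dimension sizes must be positive");
                }
                total *= size;
            }
            this.sizes = sizes.clone();
            this.cells = new ArrayList<>(Collections.<T>nCopies(total, null));
        }

        int dimensions() {
            return sizes.length;
        }

        /** Converts one index per dimension into a position in the flat list. */
        private int flatIndex(int... index) {
            if (index.length != sizes.length) {
                throw new IllegalArgumentException(
                    "Expected " + sizes.length + " indices but got " + index.length);
            }
            int flat = 0;
            for (int d = 0; d < sizes.length; d++) {
                if (index[d] < 0 || index[d] >= sizes[d]) {
                    throw new IndexOutOfBoundsException("Index out of range in dimension " + d);
                }
                flat = flat * sizes[d] + index[d];
            }
            return flat;
        }

        void set(T value, int... index) {
            cells.set(flatIndex(index), value);
        }

        T get(int... index) {
            return cells.get(flatIndex(index));
        }
    }

    public static void main(String[] args) {
        Grid<String> line = new Grid<>(5);       // 1-D
        Grid<Double> cube = new Grid<>(2, 3, 4); // 3-D
        line.set("start", 0);
        cube.set(7.5, 1, 2, 3);
        System.out.println(line.dimensions() + "-D: " + line.get(0));
        System.out.println(cube.dimensions() + "-D: " + cube.get(1, 2, 3));
    }
}
```

The same class serves as a line, a plane, or a cube, which mirrors the grouping of the dimensional categories into one structure.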