The DataRecord

The DataRecord is used as a data object throughout Prajna, providing a common representation for arbitrary data. It is used as the basic data object in the semantic packages, and also as the reference object on other elements, such as geographic shapes.

The DataRecord supports six different data element types, each of which represents a basic unit of knowledge. Each of these data types serves a different purpose in analysis. The data types are:

The DataRecord is designed as a mapping of property keys to values. A single data record may have multiple values for any of its properties. For instance, a DataRecord representing a person might have a name, a date of birth (date field), a birthplace (location field), and siblings (multi-valued text field).
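The key-to-values mapping described above can be pictured with a minimal Java sketch. The class name RecordSketch and its methods addValue and getValues are hypothetical names chosen for illustration; they are not the actual DataRecord API, and plain Java objects stand in for the typed fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT the Prajna DataRecord class.
final class RecordSketch {
    // Each property key maps to an ordered list of values,
    // so any property may hold more than one value.
    private final Map<String, List<Object>> fields = new LinkedHashMap<>();

    /** Adds a value under the key, keeping any values already stored there. */
    void addValue(String key, Object value) {
        fields.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    /** Returns a read-only view of the key's values, or an empty list if absent. */
    List<Object> getValues(String key) {
        List<Object> values = fields.get(key);
        return values == null
                ? Collections.emptyList()
                : Collections.unmodifiableList(values);
    }
}

class RecordSketchDemo {
    public static void main(String[] args) {
        // Placeholder data for a person record, mirroring the example in the text.
        RecordSketch person = new RecordSketch();
        person.addValue("name", "Jane Doe");
        person.addValue("dateOfBirth", java.time.LocalDate.of(1980, 1, 15));
        person.addValue("birthplace", "Springfield");
        person.addValue("siblings", "John");
        person.addValue("siblings", "Mary");

        // The multi-valued property returns both entries in insertion order.
        System.out.println(person.getValues("siblings"));
    }
}
```

Storing a list under every key, even for single-valued properties, keeps the lookup code uniform: callers always receive a list and never need to check whether a property is single-valued or multi-valued.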

Why no Floating Point data type?

Floating point values typically measure something. A floating point value is meaningless without some context: if a data element has a value of 1.8, what does that really mean? Although data is often stored in databases or files as floating point values, the tables typically have units of measure that are either implied or explicitly defined. Similarly, most human-readable data that contains floating point values also carries enough context for a reader to intuitively grasp the meaning of a particular value.

Since the goal of the Prajna project is to provide understanding beyond simple data representation, floating point values are not used as an atomic data type. This decision requires developers to take the time to understand the data, rather than lazily pushing floating point values at users and leaving the interpretation to them.

The Measure data type uses floating point values to represent the value of a measurement. Prajna does include a Unitless measure, which can be used for arbitrary floating point representations if necessary, but its use is discouraged. By identifying and using actual measurement types, the Prajna project enables unit conversion and comparison. Using a measurement also enhances the end user's understanding of the data.
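To illustrate why attaching a unit to a floating point value enables conversion and comparison, here is a minimal Java sketch. The names LengthUnit and LengthSketch, and the method in, are hypothetical and chosen only for this example; they are not the Prajna Measure API. The sketch assumes every unit is defined by a conversion factor to a common base unit (meters).

```java
// Hypothetical illustration only; these types are NOT the Prajna Measure API.
enum LengthUnit {
    METER(1.0),
    KILOMETER(1000.0),
    MILE(1609.344); // international mile, defined as exactly 1609.344 m

    // Conversion factor from this unit to the base unit (meters).
    final double metersPerUnit;

    LengthUnit(double metersPerUnit) {
        this.metersPerUnit = metersPerUnit;
    }
}

final class LengthSketch implements Comparable<LengthSketch> {
    private final double amount;
    private final LengthUnit unit;

    LengthSketch(double amount, LengthUnit unit) {
        this.amount = amount;
        this.unit = unit;
    }

    /** Converts this measurement to the target unit by way of meters. */
    double in(LengthUnit target) {
        return amount * unit.metersPerUnit / target.metersPerUnit;
    }

    /** Compares two measurements in the base unit, whatever units they were given in. */
    @Override
    public int compareTo(LengthSketch other) {
        return Double.compare(in(LengthUnit.METER), other.in(LengthUnit.METER));
    }
}

class LengthSketchDemo {
    public static void main(String[] args) {
        LengthSketch a = new LengthSketch(1.8, LengthUnit.KILOMETER);
        LengthSketch b = new LengthSketch(1.0, LengthUnit.MILE);

        // A bare 1.8 says nothing; 1.8 km can be converted and compared.
        System.out.println(a.in(LengthUnit.METER));
        System.out.println(a.compareTo(b) > 0 ? "a is longer" : "a is not longer");
    }
}
```

A bare value of 1.8 could not be compared with 1.0 in any meaningful way; once each value carries its unit, both can be reduced to the same base unit and ordered correctly.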