Data Structures: Types, Storage & Classification


Data Structures: Types, Storage & Classification

Data structures basically represent data organization, management and storage, including values, relationships and operations between them. There are some synonyms or related terms:

Data Type

Tipically the value or set of values of an entity (variable, constant, etc.). Basic types are atomic ones, that can’t be taken appart to more basic ones (char, integer, float, etc.). They’re usually directly supported by the hardware. Operators are also tightly related (an abstraction), but we’re not going to go into more details here.
Abstract data type

A mathematical model with set of operations (defined by the user). E.g. set, or a graph (nodes + branches).

Similarly to functions where we have a generalized set of primitive opertations, abstract types are generalization of primitive types (integer, chars, float, etc).

Based on the above, data structures are basically set of interconnected elements/types.

Data Structures: Type and storage [Examples]

Arrays (homogeneous) and Records (heterogeneous) are two basic examples of element aggregation. With a combination of those two and/or with elements within those structures we can create structures of arbitrarary complexity.

Aside from a logical relations between elements within a record, they’re also physically sequentially placed in memory. Elements can be placed in a continous memory locations or they can be tied to a begining of a memory words.

If memory word size if 4 bytes, and if we have an object with the following fields: “Name” (10 bytes), “x” (2 bytes) and “y” (2 bytes).

Storing that record in a continuous locations, starting from the memory location 200:

200 – 210: name
210 – 212: x
212 – 214: y

Since we have a 4 byte memory word (they’re situated on: 200, 204, 208, 212, 216), X and Y are placed across two memory words. That would require 2 memory access operations . It’s more efficient to place them on the beginning of a memory word.

200 – 210: name
210 – 212: <EMPTY>
212 – 216: x
216 – 220: y

Altghough we’re ending up with some fragmentation, performant wise, it’s more efficient alocation.

Classification & Representation

One classification method is based on element relations within the record/object. So, if we have an element that’s connected with a successor and a precursor, it’s a linear structure (array, linked list, stack,). If we have more complex relations where one element is related to multiple other elements, it’s a nonlinear structure (graph, tree).

Another classification is related to structure size, static (defined while compiling, maximum expected size) and dynamic (changed on runtime).

Based on a physical and logical order of elements in memory, we have a sequential (continous space) and linked representation (non-continous, random placement).

For a linked ones, pointers are the main way of navigation. Pointer is a basic type, containing memory address value of an element/object.

As always, both have some benefits and disadvantages (faster access vs more flexibility).

Conclusion

What type of structure you should implement depends on many factors, including your requirements: frequency of certain operations, memory size, speed, etc. We’re going to continue with the individual data structures in the following posts.