Tree Structures

Overview


A tree structure is a very common data structure. A tree is a data structure composed of nodes, each of which may have multiple children nodes. A common example of a tree strucutre is the folder structure on a computer hard drive. Every folder is a node. All the files and folders within that folder are the children nodes.

tree api


The davinci platform hosts a set of libraries for dealing with tree data.

  • tree-map.js

Reading from Table Data


Often times a tree structure is enconded as tabular data. For instance, you could save tree data in a relational database. The tree api provides methods for reading tabular data into a tree structure.


//creates a tree map from an array of objects
let treeMap = $list(data).treeMap(p=>p.id, p=>p.parentid).tree;
					
Try it

Dimensional Data


Trees are often used in data that is described as being dimensional. That is, you have data that is tabular in nature, but which has columns that are items in a tree structure. For instance, you may have a table of sales data, and each transaction is attached to a salesperson. That sales officer exists within a hierarchy of managers in their organization. One may want to filter that sales data by a manager in the management tree.

Other columns in the sales data may refer to other tree strucutures. For example, the product id may exist within a product classification hierarchy.

Each of these tree structures attached to the data is commonly referred to as a dimension.


let tree = await import('/lib/tree/v1.0.0/tree-map.js');

//creates a tree map from an array of objects
let productMap = tree.fromTable(products, p=>p.id, p=>p.parentid);
let officers = tree.fromTable(salesofficers, p=>p.name, p=>p.manager);

let williamsSales = sales.filter(p=>"williams" in officers[p.officer].ancestors)
					
Try it

Large Dimensional Data - Data Blocks


The code above works when the data set can be loaded entirely in memory. For large datasets (or Big Data), this may not be possible. In some cases the entire dataset cannot be loaded into memory, but the data you want to analyze can, that is you only want a subset of the data. In other cases, it may be possible to load the entire dataset into memory but you still only want a subset of the data, for speed reasons, you dont want to have to load the entire set just to filter out a set of records.

One way to tackle these situations is to break the data into separate files and then to create a folder structure that reflects properties of the data in that folder.

For example, using the data from above, you could have a set of folders such as:


/jones/2020-01/
/jones/2020-02/
/jones/2020-03/
/smith/2020-01/
/smith/2020-02/
/smith/2020-03/
/lee/2020-01/
/lee/2020-02/
/lee/2020-03/
						 


This folder structure has the sales officer name as the top folder, the month as the second folder, and the data files would be placed in the date subdirectory. This folder structure allows one to filter the data loaded by officer name and date. Once loaded, the data can be further filtered.

To accomplish this, you need a way to list all the data files at a give location. For instance, the server library will list the files at a given webserver, assuming that webserver complies with the davinci file server specification.

The following code demonstrates filtering the data prior to loading. The data is loaded using the $ajax server. It can be further filtered after loading.


let sv = await import('/lib/server/v1.0.0/server.js');
let url = 'http://www.server.com/';
let urls = (await sv.server(url).list('path')).map(p=>"`+url+`"+p.path);

urls = urls.filter(p=>{
  let split = p.split('/');						
  return split[0] == 'jones';
})

let data = [];
for(let url of urls){
  data = [...data, ...(await $ajax(url))];					
}
						 


Supposing the data is still to large to load at once, meaning we want to further filter each file, you can load each file separately as follows:


let data = [];
for(let url of urls){
  data = [...data, ...((await $ajax(url)).filter(p=>p.size > 1000))]					
}
						 


Contents