From a4673571570816a06d4188169fc00dada79ec0a3 Mon Sep 17 00:00:00 2001
From: Terry Truong
Date: Sun, 15 May 2022 00:46:39 +1000
Subject: Converted nodes+r_nodes tables into nodes+edges+r_nodes+r_edges

Conversion avoids encoding node children as JSON strings, and allows
for easier querying of edge data. Adjusted server to use new format.
Also added some table indexes for common operations.
---
 backend/data/README.md | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

(limited to 'backend/data/README.md')

diff --git a/backend/data/README.md b/backend/data/README.md
index e2b5db7..cb9cd42 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -3,8 +3,8 @@ File Generation Process
 
 1   Tree Structure Data
     1   Obtain data in otol/, as specified in it's README.
-    2   Run genOtolData.py, which creates data.db, and adds a 'nodes'
-        table using data in otol/*.
+    2   Run genOtolData.py, which creates data.db, and adds
+        'nodes' and 'edges' tables using data in otol/*.
 2   Name Data for Search
     1   Obtain data in eol/, as specified in it's README.
     2   Run genEolNameData.py, which adds 'names' and 'eol\_ids' tables to data.db,
@@ -26,17 +26,19 @@ File Generation Process
     2   Run genEnwikiData.py, which adds to the 'descs' table, using data in
         enwiki/enwikiData.db, reducedTol/names.txt, and the 'nodes' table.
 5   Reduced Tree Structure Data
-    1   Run genReducedTreeData.py, which adds a 'reduced_nodes' table to data.db,
-        using reducedTol/names.txt, and the 'nodes' and 'names' tables.
+    1   Run genReducedTreeData.py, which adds 'r_nodes' and 'r_edges' tables to
+        data.db, using reducedTol/names.txt, and the 'nodes' and 'names' tables.
 
 data.db Tables
 ==============
-- nodes: name TEXT PRIMARY KEY, children TEXT, parent TEXT, tips INT, p\_support INT
-- names: name TEXT, alt\_name TEXT, pref\_alt INT, PRIMARY KEY(name, alt\_name)
-- eol\_ids: id INT PRIMARY KEY, name TEXT
-- images: eol\_id INT PRIMARY KEY, source\_url TEXT, license TEXT, copyright\_owner TEXT
-- descs: name TEXT PRIMARY KEY, desc TEXT, redirected INT
-- reduced\_nodes: name TEXT PRIMARY KEY, children TEXT, parent TEXT, tips INT, p\_support INT
+- nodes: name TEXT PRIMARY KEY, tips INT
+- edges: node TEXT, child TEXT, p\_support INT, PRIMARY KEY (node, child)
+- names: name TEXT, alt\_name TEXT, pref\_alt INT, PRIMARY KEY(name, alt\_name)
+- eol\_ids: id INT PRIMARY KEY, name TEXT
+- images: eol\_id INT PRIMARY KEY, source\_url TEXT, license TEXT, copyright\_owner TEXT
+- descs: name TEXT PRIMARY KEY, desc TEXT, redirected INT
+- r\_nodes: name TEXT PRIMARY KEY, tips INT
+- r\_edges: node TEXT, child TEXT, p\_support INT, PRIMARY KEY (node, child)
 
 Other Files
 ===========
-- 
cgit v1.2.3
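
The move from a JSON-encoded `children` column to a separate 'edges' table can be illustrated with a small Python sketch against an in-memory SQLite database. This is not the code from genOtolData.py or the server; it only uses the column lists shown in the README diff above. The helper names (`create_schema`, `convert_old_rows`, `get_children`, `get_parent`) and the index name `edges_child_idx` are hypothetical, and the sketch assumes that the old `children` column held a JSON array of node names and that a row's `p_support` described the edge from its parent.

```python
import json
import sqlite3


def create_schema(con):
    """Create 'nodes' and 'edges' with the columns listed in the README."""
    con.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, tips INT)")
    con.execute(
        "CREATE TABLE edges (node TEXT, child TEXT, p_support INT,"
        " PRIMARY KEY (node, child))"
    )
    # Hypothetical index: the commit message mentions added indexes,
    # but their names and columns are not shown in this patch.
    con.execute("CREATE INDEX edges_child_idx ON edges(child)")


def convert_old_rows(con, old_rows):
    """Load rows in the old layout: (name, children, parent, tips, p_support).

    Assumes 'children' is a JSON array of names, and that p_support on a
    row belongs to the edge leading to that row from its parent.
    """
    support = {row[0]: row[4] for row in old_rows}
    for name, children_json, _parent, tips, _p_support in old_rows:
        con.execute("INSERT INTO nodes VALUES (?, ?)", (name, tips))
        for child in json.loads(children_json):
            con.execute(
                "INSERT INTO edges VALUES (?, ?, ?)",
                (name, child, support.get(child)),
            )


def get_children(con, name):
    """Return (child, p_support) pairs for a node, without JSON decoding."""
    cur = con.execute(
        "SELECT child, p_support FROM edges WHERE node = ? ORDER BY child",
        (name,),
    )
    return cur.fetchall()


def get_parent(con, name):
    """Return the parent's name, or None for a node with no incoming edge."""
    row = con.execute(
        "SELECT node FROM edges WHERE child = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

With this layout, child and parent lookups become plain SQL queries, which matches the "easier querying of edge data" motivation given in the commit message. The same functions would apply to 'r_nodes' and 'r_edges' if the table names were parameterized.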