author Terry Truong <terry06890@gmail.com> 2022-05-26 01:06:16 +1000
committer Terry Truong <terry06890@gmail.com> 2022-05-26 01:06:16 +1000
commit eb72584af8f5a598740a87ee024d0d899fdffc8d
tree 64e418086157831f17233621ea5cf5e4320baebc
parent 67cf7da46692f86120a5e066a2f696074413d3f6
Trim otol tree to avoid certain slowdowns
Some nodes had multiple ancestors with over 10k children, and jump-searching to them could take almost a minute for vue to load.
Diffstat (limited to 'backend/data/README.md')
-rw-r--r-- backend/data/README.md 12
1 files changed, 11 insertions, 1 deletions
diff --git a/backend/data/README.md b/backend/data/README.md
index a1bc287..8cfa960 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -4,7 +4,8 @@ File Generation Process
1 Tree Structure Data
1 Obtain data in otol/, as specified in it's README.
2 Run genOtolData.py, which creates data.db, and adds
- 'nodes' and 'edges' tables using data in otol/*.
+ 'nodes' and 'edges' tables using data in otol/*, as well as
+ namesToKeep.txt, if present.
2 Name Data for Search
1 Obtain data in eol/, as specified in it's README.
2 Run genEolNameData.py, which adds 'names' and 'eol\_ids' tables to data.db,
@@ -57,3 +58,12 @@ Other Files
tries to associate tree-of-life node names wth DBpedia node labels. It
writes data about them to conflicts.txt, which can be manually edited
to resolve them.
+- namesToKeep.txt <br>
+ Contains names to avoid trimming off the tree data generated by
+ genOtolData.py. Usage is optional, but, without it, a large amount
+ of possibly-significant nodes are removed, using a short-sighted
+ heuristic. <br>
+ One way to generate this list is to generate the files as usual,
+ then get node names that have an associated image, linked-image,
+ description, or presence in r_nodes. Then run the genOtolData.py
+ and genEolNameData.py scripts again.
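
Note on the namesToKeep.txt entry above: the "one way to generate this list" step could be sketched roughly as follows. This is only an illustration, not the script used in this repository. The table names `node_imgs`, `linked_imgs`, and `descs`, the `name` column, and the one-name-per-line output format are all assumptions, since the diff names only `r_nodes` and does not show the schema of data.db. Tables missing from the database are skipped.

```python
import sqlite3

# Hypothetical (table, column) pairs. Only 'r_nodes' is named in the README;
# the other table names and the 'name' column are assumptions to adjust.
CANDIDATE_SOURCES = [
    ("node_imgs", "name"),
    ("linked_imgs", "name"),
    ("descs", "name"),
    ("r_nodes", "name"),
]

def collect_names_to_keep(db_path, sources=CANDIDATE_SOURCES):
    """Return a sorted list of distinct node names found in the given tables."""
    names = set()
    con = sqlite3.connect(db_path)
    try:
        # Look up which tables exist so that absent ones can be skipped.
        existing = {row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")}
        for table, column in sources:
            if table not in existing:
                continue
            query = f'SELECT DISTINCT "{column}" FROM "{table}"'
            for (name,) in con.execute(query):
                if name is not None:
                    names.add(name)
    finally:
        con.close()
    return sorted(names)

def write_names_to_keep(db_path, out_path="namesToKeep.txt"):
    """Write one name per line (assumed format) and return the name count."""
    names = collect_names_to_keep(db_path)
    with open(out_path, "w", encoding="utf-8") as f:
        for name in names:
            f.write(name + "\n")
    return len(names)
```

After writing the file, genOtolData.py and genEolNameData.py would be run again, as the README text describes.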