author Terry Truong <terry06890@gmail.com> 2022-07-08 14:19:49 +1000
committer Terry Truong <terry06890@gmail.com> 2022-07-08 14:31:46 +1000
commit 834dab545931a3f224ef336530a890a7349b100a
tree 1ed5e2a2059bcabd3f8266fd7d52138cc00f026a /backend/data/README.md
parent d84a2dab11aa23d56c3213008424872e1a011279
Add ancestors_* tables, for faster 'toroot' lookup [ancestors-tables]
The speedup seemed minor, and applied only to a narrow range of situations. It also roughly quadrupled the database size.
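
The trade-off described above can be illustrated with a minimal sketch. It assumes a hypothetical schema: the column names `parent`, `child`, `name`, and `ancestor`, the index, and the sample data are all invented for illustration and may not match the real `edges_t` and `ancestors_t` tables. It also assumes that a 'toroot' lookup needs a node's chain of ancestors up to the root. Walking the edges costs one query per tree level, while a precomputed ancestors table answers in a single query at the cost of storing one row per (node, ancestor) pair.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database using a hypothetical schema (not the project's)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE edges_t (parent TEXT, child TEXT, PRIMARY KEY (parent, child))")
    con.execute("CREATE INDEX edges_t_child_idx ON edges_t(child)")
    con.execute("CREATE TABLE ancestors_t (name TEXT, ancestor TEXT, PRIMARY KEY (name, ancestor))")

    # Invented sample data: a single chain of four nodes.
    edges = [("life", "animals"), ("animals", "mammals"), ("mammals", "cats")]
    con.executemany("INSERT INTO edges_t VALUES (?, ?)", edges)

    # Precompute every (node, ancestor) pair by walking up from each node.
    parent_of = {child: parent for parent, child in edges}
    for node in parent_of:
        current = node
        while current in parent_of:
            current = parent_of[current]
            con.execute("INSERT INTO ancestors_t VALUES (?, ?)", (node, current))
    return con


def ancestors_by_walking(con, name):
    """Follow parent links upward: one query per tree level."""
    result = []
    while True:
        row = con.execute("SELECT parent FROM edges_t WHERE child = ?", (name,)).fetchone()
        if row is None:
            return result
        name = row[0]
        result.append(name)


def ancestors_by_table(con, name):
    """Read all ancestors in a single query from the precomputed table."""
    rows = con.execute("SELECT ancestor FROM ancestors_t WHERE name = ?", (name,)).fetchall()
    return {row[0] for row in rows}


def row_count(con, table):
    """Count rows in one of the two demo tables (table name is checked, not user input)."""
    if table not in ("edges_t", "ancestors_t"):
        raise ValueError("unknown table")
    return con.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

In this toy chain the ancestors table already holds twice as many rows as the edges table, and the gap widens with depth, which is consistent in spirit with the database growth noted above.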
Diffstat (limited to 'backend/data/README.md')
-rw-r--r-- backend/data/README.md | 9
1 files changed, 5 insertions, 4 deletions
diff --git a/backend/data/README.md b/backend/data/README.md
index 13aeb89..b4b0745 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -43,6 +43,8 @@ This directory holds files used to generate data.db, which contains tree-of-life
These are like `nodes`, but describe the nodes for various reduced trees.
- `edges_t`, `edges_i`, `edges_p` <br>
Like `edges` but for reduced trees.
+- `ancestors_t`, `ancestors_i`, `ancestors_p` <br>
+ Maps nodes to their ancestors. Used for faster lookup.
# Generating the Database
@@ -146,7 +148,6 @@ Some of the python scripts require third-party packages:
These correspond to entries in the `names` table. `prefAlt` should be 1 or 0.
A line like `name1|name1|1` causes a node to have no preferred alt-name.
3. Run genReducedTrees.py, which generates multiple reduced versions of the tree,
- adding the `nodes_*` and `edges_*` tables, using `nodes` and `names`. Reads from
- pickedNodes.txt, which lists names of nodes that must be included (1 per line).
- The original tree isn't used for web-queries, as some nodes would have over
- 10k children, which can take a while to render (took over a minute in testing).
+ adding the `nodes_*`, `edges_*`, and `ancestors_*` tables, using `nodes` and
+ `names`. Reads from pickedNodes.txt, which lists names of nodes that must be
+ included (1 per line). The original tree isn't used for web-queries.