From 834dab545931a3f224ef336530a890a7349b100a Mon Sep 17 00:00:00 2001
From: Terry Truong
Date: Fri, 8 Jul 2022 14:19:49 +1000
Subject: Add ancestors_* tables, for faster 'toroot' lookup

Speedup seemed minor, and for a non-wide range of situations.
It also roughly quadrupled the database size.
---
 backend/data/README.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/backend/data/README.md b/backend/data/README.md
index 13aeb89..b4b0745 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -43,6 +43,8 @@ This directory holds files used to generate data.db, which contains tree-of-life
  These are like `nodes`, but describe the nodes for various reduced trees.
 - `edges_t`, `edges_i`, `edges_p`
  Like `edges` but for reduced trees.
+- `ancestors_t`, `ancestors_i`, `ancestors_p`
+ Maps nodes to their ancestors. Used for faster lookup.

 # Generating the Database

@@ -146,7 +148,6 @@ Some of the python scripts require third-party packages:
  These correspond to entries in the `names` table. `prefAlt` should be 1 or 0.
  A line like `name1|name1|1` causes a node to have no preferred alt-name.
 3. Run genReducedTrees.py, which generates multiple reduced versions of the tree,
- adding the `nodes_*` and `edges_*` tables, using `nodes` and `names`. Reads from
- pickedNodes.txt, which lists names of nodes that must be included (1 per line).
- The original tree isn't used for web-queries, as some nodes would have over
- 10k children, which can take a while to render (took over a minute in testing).
+ adding the `nodes_*`, `edges_*`, and `ancestors_*` tables, using `nodes` and
+ `names`. Reads from pickedNodes.txt, which lists names of nodes that must be
+ included (1 per line). The original tree isn't used for web-queries.
--
cgit v1.2.3