| author | Terry Truong <terry06890@gmail.com> | 2022-05-26 01:06:16 +1000 |
|---|---|---|
| committer | Terry Truong <terry06890@gmail.com> | 2022-05-26 01:06:16 +1000 |
| commit | eb72584af8f5a598740a87ee024d0d899fdffc8d | |
| tree | 64e418086157831f17233621ea5cf5e4320baebc /backend/data/README.md | |
| parent | 67cf7da46692f86120a5e066a2f696074413d3f6 | |
Trim otol tree to avoid certain slowdowns
Some nodes had multiple ancestors with over 10k children, and
jump-searching to them could take almost a minute for Vue to load.
Diffstat (limited to 'backend/data/README.md')
| -rw-r--r-- | backend/data/README.md | 12 |
1 files changed, 11 insertions, 1 deletions
diff --git a/backend/data/README.md b/backend/data/README.md
index a1bc287..8cfa960 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -4,7 +4,8 @@ File Generation Process
 1 Tree Structure Data
 1 Obtain data in otol/, as specified in it's README.
 2 Run genOtolData.py, which creates data.db, and adds
- 'nodes' and 'edges' tables using data in otol/*.
+ 'nodes' and 'edges' tables using data in otol/*, as well as
+ namesToKeep.txt, if present.
 2 Name Data for Search
 1 Obtain data in eol/, as specified in it's README.
 2 Run genEolNameData.py, which adds 'names' and 'eol_ids' tables to data.db,
@@ -57,3 +58,12 @@ Other Files
 tries to associate tree-of-life node names wth DBpedia node labels.
 It writes data about them to conflicts.txt, which can be manually edited
 to resolve them.
+- namesToKeep.txt <br>
+ Contains names to avoid trimming off the tree data generated by
+ genOtolData.py. Usage is optional, but, without it, a large amount
+ of possibly-significant nodes are removed, using a short-sighted
+ heuristic. <br>
+ One way to generate this list is to generate the files as usual,
+ then get node names that have an associated image, linked-image,
+ description, or presence in r_nodes. Then run the genOtolData.py
+ and genEolNameData.py scripts again.
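The README text added in this commit describes building namesToKeep.txt by collecting node names that have an image, linked image, description, or an entry in r_nodes. A minimal sketch of that collection step might look like the following. It is not the repository's own script: only `data.db` and `r_nodes` are named in the README, so the other table names (`node_imgs`, `linked_imgs`, `descs`), the `name` column, and the one-name-per-line file format are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical (table, column) pairs. Only 'r_nodes' appears in the README;
# the other table names and every column name here are assumptions.
ASSUMED_SOURCES = [
    ("node_imgs", "name"),
    ("linked_imgs", "name"),
    ("descs", "name"),
    ("r_nodes", "name"),
]


def table_exists(conn, table):
    """Return True if the SQLite database has a table with this name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


def collect_names_to_keep(conn, sources=ASSUMED_SOURCES):
    """Union the node names found in each source table, sorted."""
    names = set()
    for table, column in sources:
        if not table_exists(conn, table):
            continue  # tolerate databases that lack some of the tables
        # Identifiers cannot be bound as parameters, so they are taken
        # from the trusted list above rather than from user input.
        query = f'SELECT DISTINCT "{column}" FROM "{table}"'
        for (name,) in conn.execute(query):
            if name:  # skip NULL and empty names
                names.add(name)
    return sorted(names)


def write_names_to_keep(names, path="namesToKeep.txt"):
    """Write one name per line (the file format is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        for name in names:
            f.write(name + "\n")


# Example usage, after generating the files as usual:
#   conn = sqlite3.connect("data.db")
#   write_names_to_keep(collect_names_to_keep(conn))
#   conn.close()
```

After writing the file, the README's procedure is to run genOtolData.py and genEolNameData.py again so the tree is regenerated with those names kept.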
