From 90a5e15bb824b84e5bb60337d6a57a1394090dc6 Mon Sep 17 00:00:00 2001
From: Terry Truong
Date: Wed, 4 May 2022 01:17:06 +1000
Subject: Add scripts for obtaining/sending/displaying wikipedia descriptions

Add backend/data/enwiki/ directory containing scripts and instructive
READMEs. Adjust some other scripts to generate 'eol_ids' sqlite table
separate from 'names'. Make server respond to /data/desc requests, and
have client TileInfo component display response data. Also adjust
.gitignore entries to be root-relative.
---
 backend/data/README.md | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

(limited to 'backend/data/README.md')

diff --git a/backend/data/README.md b/backend/data/README.md
index e639cb6..8791fb4 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -1,21 +1,31 @@
 File Generation Process
 =======================
-1 Obtain data in otol/ and eol/, as specified in their README files.
-2 Run genOtolData.py, which creates data.db, and adds a 'nodes'
-  table using data in otol/*.
-3 Run genEolNameData.py, which adds a 'names' table to data.db,
-  using data in eol/vernacularNames.csv and the 'nodes' table.
-4 Run genSpellfixNameData.py, which adds a 'spellfix\_alt\_names'
-  table to data.db, using data in the 'names' table.
-5 Use downloadImgsForReview.py to download EOL images into imgsForReview/.
-  It uses data in eol/imagesList.db, and the 'names' table.
-6 Use reviewImgs.py to filter images in imgsForReview/ into EOL-id-unique
-  images in imgsReviewed/ (uses 'names' to display common names).
-7 Use genImgsForWeb.py to create cropped/resized images in img/, using
-  images in imgsReviewed, and also to add an 'images' table to data.db.
+
+1 Tree Structure Data
+    1 Obtain data in otol/, as specified in it's README.
+    2 Run genOtolData.py, which creates data.db, and adds a 'nodes'
+      table using data in otol/*.
+2 Name Data for Search
+    1 Obtain data in eol/, as specified in it's README.
+    2 Run genEolNameData.py, which adds 'names' and 'eol\_ids' tables to data.db,
+      using data in eol/vernacularNames.csv and the 'nodes' table.
+    3 Run genSpellfixNameData.py, which adds a 'spellfix\_alt\_names'
+      table to data.db, using data in the 'names' table.
+3 Image Data
+    1 Use downloadImgsForReview.py to download EOL images into imgsForReview/.
+      It uses data in eol/imagesList.db, and the 'eol_ids' table.
+    2 Use reviewImgs.py to filter images in imgsForReview/ into EOL-id-unique
+      images in imgsReviewed/ (uses 'names' and 'eol_ids' to display extra info).
+    3 Use genImgsForWeb.py to create cropped/resized images in img/, using
+      images in imgsReviewed, and also to add an 'images' table to data.db.
+4 Node Description Data
+    1 Obtain data in enwiki/, as specified in it's README.
+    2 Run genEnwikiData.py, which adds a 'descs' table to data.db,
+      using data in enwiki/enwikiData.db, and the 'nodes' table.
 
 spellfix.so
 ===========
+
 This file provides the spellfix1 extension for Sqlite, and
 is used for responding to fuzzy-search requests.
-- 
cgit v1.2.3
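
The revised README ties each generation script to the tables it adds to data.db. As a rough aid for checking how far through those steps a given data.db has progressed, here is a minimal sketch. It is not part of the commit. The script-to-table mapping is taken from the README text above; the helper name `missing_tables` and the assumption that data.db sits in the current directory (backend/data/) are hypothetical.

```python
import sqlite3

# Tables each generation script is described as adding to data.db,
# according to the README text in this patch.
EXPECTED_TABLES = {
    "genOtolData.py": ["nodes"],
    "genEolNameData.py": ["names", "eol_ids"],
    "genSpellfixNameData.py": ["spellfix_alt_names"],
    "genImgsForWeb.py": ["images"],
    "genEnwikiData.py": ["descs"],
}


def missing_tables(conn):
    """Map each script to the expected tables not yet present in the database.

    Hypothetical helper: it checks table names only, not their contents.
    """
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    present = {name for (name,) in rows}
    missing = {}
    for script, tables in EXPECTED_TABLES.items():
        absent = [t for t in tables if t not in present]
        if absent:
            missing[script] = absent
    return missing


if __name__ == "__main__":
    # Assumption: run from backend/data/, where data.db is created.
    # Opening read-only avoids creating an empty data.db by accident.
    conn = sqlite3.connect("file:data.db?mode=ro", uri=True)
    for script, tables in missing_tables(conn).items():
        print(f"{script}: missing {', '.join(tables)}")
    conn.close()
```

A script that is not listed in the output has already added its tables; any script that is listed still needs to be run, in the order given in the README.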