From 29940d51eb8b6b220d53940ecbc212cea78159ae Mon Sep 17 00:00:00 2001
From: Terry Truong
Date: Tue, 17 May 2022 10:41:12 +1000
Subject: Improve enwiki description extraction

Adjust enwiki code to handle single dump file, and add scripts for 'convenient' page-content lookup.
---
 backend/data/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'backend/data/README.md')

diff --git a/backend/data/README.md b/backend/data/README.md
index cb9cd42..576c70e 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -24,7 +24,7 @@ File Generation Process
 - Supplementing with Wikipedia dump
 1 Obtain data in enwiki/, as specified in it's README.
 2 Run genEnwikiData.py, which adds to the 'descs' table, using data in
- enwiki/enwikiData.db, reducedTol/names.txt, and the 'nodes' table.
+ enwiki/enwikiData.db, and the 'nodes' table.
 5 Reduced Tree Structure Data
 1 Run genReducedTreeData.py, which adds 'r_nodes' and 'r_edges' tables to
 data.db, using reducedTol/names.txt, and the 'nodes' and 'names' tables.
--
cgit v1.2.3