author Terry Truong <terry06890@gmail.com> 2022-05-14 20:13:35 +1000
committer Terry Truong <terry06890@gmail.com> 2022-05-14 20:13:38 +1000
commit 36d3d535b646e65f4b38e8097ce0b74b784520ba
tree 813f1e0b57ad049ff02a213dd8d3a2faf0156cb9 /backend/data/README.md
parent c97acf8852e2017fd4776d65069f707121405f43
Supplement DBpedia descs with enwiki's
Diffstat (limited to 'backend/data/README.md')
-rw-r--r-- backend/data/README.md | 6
1 files changed, 3 insertions, 3 deletions
diff --git a/backend/data/README.md b/backend/data/README.md
index b568f90..e2b5db7 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -21,10 +21,10 @@ File Generation Process
1 Obtain data in dbpedia/, as specified in it's README.
2 Run genDbpData.py, which adds a 'descs' table to data.db, using
data in dbpedia/dbpData.db, dbpPickedLabels.txt, and the 'nodes' table.
- - Using wikipedia dump (old method)
+ - Supplementing with Wikipedia dump
1 Obtain data in enwiki/, as specified in it's README.
- 2 Run genEnwikiData.py, which adds a 'descs' table to data.db,
- using data in enwiki/enwikiData.db, and the 'nodes' table.
+ 2 Run genEnwikiData.py, which adds to the 'descs' table, using data in
+ enwiki/enwikiData.db, reducedTol/names.txt, and the 'nodes' table.
5 Reduced Tree Structure Data
1 Run genReducedTreeData.py, which adds a 'reduced_nodes' table to data.db,
using reducedTol/names.txt, and the 'nodes' and 'names' tables.
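
The change above turns the enwiki step from creating the 'descs' table into supplementing it. A minimal sketch of what "supplementing" could look like follows. This is not the actual genEnwikiData.py; the column names (`name`, `description`), the `supplement_descs` helper, and the `extra_descs` mapping are all assumptions made for illustration.

```python
import sqlite3

def supplement_descs(conn, extra_descs):
    """Add descriptions only for tree nodes that lack one.

    Hypothetical schema: nodes(name), descs(name, description).
    Returns the number of rows added.
    """
    cur = conn.cursor()
    added = 0
    for name, text in extra_descs.items():
        # Skip names that are not in the tree at all
        in_tree = cur.execute(
            "SELECT 1 FROM nodes WHERE name = ?", (name,)).fetchone()
        if in_tree is None:
            continue
        # Keep any description that an earlier step already stored
        has_desc = cur.execute(
            "SELECT 1 FROM descs WHERE name = ?", (name,)).fetchone()
        if has_desc is not None:
            continue
        cur.execute(
            "INSERT INTO descs (name, description) VALUES (?, ?)",
            (name, text))
        added += 1
    conn.commit()
    return added
```

Under these assumptions, descriptions already written by the DBpedia step are left untouched, and the enwiki data only fills the gaps.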