path: root/backend/data/README.md
author Terry Truong <terry06890@gmail.com> 2022-06-11 02:05:08 +1000
committer Terry Truong <terry06890@gmail.com> 2022-06-11 02:05:08 +1000
commit 6cb2cccad2fae70ce7e857e3aab232a6e7eeb358
tree 9a4ca11359abf79c8daa3c74bad83228fabaa3b4 /backend/data/README.md
parent 5220d744dc3f7b2629d0ad8bd4bb4634d21e0d96
Add yet more manual-correction for node-desc generation
Diffstat (limited to 'backend/data/README.md')
-rw-r--r-- backend/data/README.md 6
1 files changed, 3 insertions, 3 deletions
diff --git a/backend/data/README.md b/backend/data/README.md
index 6ec629a..0845450 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -25,12 +25,12 @@ File Generation Process
1 Obtain data in dbpedia/, as specified in it's README.
2 Run genDbpData.py, which adds a 'descs' table to data.db, using
data in dbpedia/dbpData.db, the 'nodes' table, and possibly
- dbpNamesToSkip.txt and dbpPickedLabels.txt.
+ genDescNamesToSkip.txt and dbpPickedLabels.txt.
5 Supplementary Name/Description/Image Data
1 Obtain data in enwiki/, as specified in it's README.
2 Run genEnwikiDescData.py, which adds to the 'descs' table, using data in
- enwiki/enwikiData.db, and the 'nodes' table. Also uses genEnwikiDesc*.txt
- files for skipping/resolving some name-page associations.
+ enwiki/enwikiData.db, and the 'nodes' table. Also uses genDescNamesToSkip.txt and
+ genEnwikiDescTitlesToUse.txt for skipping/resolving some name-page associations.
3 Optionally run genEnwikiNameData.py, which adds to the 'names' table,
using data in enwiki/enwikiData.db, and the 'names' and 'descs' tables.
4 In enwiki/, run getEnwikiImgData.py, which generates a list of