| author | Terry Truong <terry06890@gmail.com> | 2022-05-14 20:13:35 +1000 |
|---|---|---|
| committer | Terry Truong <terry06890@gmail.com> | 2022-05-14 20:13:38 +1000 |
| commit | 36d3d535b646e65f4b38e8097ce0b74b784520ba (patch) | |
| tree | 813f1e0b57ad049ff02a213dd8d3a2faf0156cb9 /backend/data/README.md | |
| parent | c97acf8852e2017fd4776d65069f707121405f43 (diff) | |
Supplement DBpedia descs with enwiki's
Diffstat (limited to 'backend/data/README.md')
| -rw-r--r-- | backend/data/README.md | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/backend/data/README.md b/backend/data/README.md
index b568f90..e2b5db7 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -21,10 +21,10 @@ File Generation Process
 1 Obtain data in dbpedia/, as specified in it's README.
 2 Run genDbpData.py, which adds a 'descs' table to data.db,
 using data in dbpedia/dbpData.db, dbpPickedLabels.txt, and the 'nodes' table.
- - Using wikipedia dump (old method)
+ - Supplementing with Wikipedia dump
 1 Obtain data in enwiki/, as specified in it's README.
- 2 Run genEnwikiData.py, which adds a 'descs' table to data.db,
- using data in enwiki/enwikiData.db, and the 'nodes' table.
+ 2 Run genEnwikiData.py, which adds to the 'descs' table, using data in
+ enwiki/enwikiData.db, reducedTol/names.txt, and the 'nodes' table.
 5 Reduced Tree Structure Data
 1 Run genReducedTreeData.py, which adds a 'reduced_nodes' table to data.db,
 using reducedTol/names.txt, and the 'nodes' and 'names' tables.
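The change above turns genEnwikiData.py from a step that creates the 'descs' table into one that adds to it. A minimal sketch of that "supplement" idea follows, using Python's sqlite3 module. It is not the script's actual code: the column names (`name`, `description`, `source`), the `supplement_descs` helper, and the in-memory sample data are all assumptions made for illustration, and the role of reducedTol/names.txt is not modeled.

```python
import sqlite3


def supplement_descs(con, enwiki_descs):
    """Add enwiki descriptions for nodes that have no row in 'descs' yet.

    Hypothetical helper: the schema here is assumed, not taken from the repo.
    Returns the number of rows added.
    """
    cur = con.cursor()
    # Find nodes that still lack a description after the DBpedia pass
    cur.execute(
        "SELECT name FROM nodes "
        "WHERE name NOT IN (SELECT name FROM descs) ORDER BY name"
    )
    missing = [row[0] for row in cur.fetchall()]
    added = 0
    for name in missing:
        desc = enwiki_descs.get(name)
        if desc is not None:
            # Existing DBpedia rows are never touched; only gaps are filled
            cur.execute(
                "INSERT INTO descs (name, description, source) "
                "VALUES (?, ?, 'enwiki')",
                (name, desc),
            )
            added += 1
    con.commit()
    return added


def demo():
    """Build a tiny in-memory database and run the supplement step once."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY)")
    con.execute(
        "CREATE TABLE descs "
        "(name TEXT PRIMARY KEY, description TEXT, source TEXT)"
    )
    con.executemany(
        "INSERT INTO nodes VALUES (?)", [("cat",), ("dog",), ("emu",)]
    )
    con.execute(
        "INSERT INTO descs VALUES ('cat', 'A small felid.', 'dbpedia')"
    )
    # Stand-in for descriptions read from enwiki/enwikiData.db
    enwiki = {"cat": "Enwiki text for cat.", "dog": "A domesticated canid."}
    added = supplement_descs(con, enwiki)
    rows = con.execute(
        "SELECT name, source FROM descs ORDER BY name"
    ).fetchall()
    con.close()
    return added, rows
```

Calling `demo()` leaves the existing DBpedia row for `cat` untouched, adds an enwiki row for `dog`, and skips `emu`, for which neither source has a description.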
