Changeset 38789


Timestamp:
2024-02-26T00:26:21+13:00
Author:
jc550
Message:

add more to readme and change ingest bash to work with changed file structure

Location:
other-projects/metadata-encoding
Files:
1 added
1 deleted
3 edited

  • other-projects/metadata-encoding/INGEST-INTO-MONGODB.sh

    r38636  r38789
    1               ./py/xRefToMongo-v3.py $* crossref-data/April\ 2023\ Public\ Data\ File\ from\ Crossref/ mongodb://localhost:27017
            1       ./py/using-api/xRefToMongo-v3.py $* crossref-data/April\ 2023\ Public\ Data\ File\ from\ Crossref mongodb://localhost:27017
  • other-projects/metadata-encoding/README.txt

    r38628  r38789
    77      77      pip install -r requirements.txt
    78      78
            79      #----
            80      4. Organisation of Python files
            81      #----
            82
            83      Python code is seperated in the py folder chronologically. Anything created before MongoDB was used is in "using-api" (and contains files that exclusively use the api of Crossref and Openalex). Most other code is in the "using-mongodb" folder, and use MongoDB (and occasionally the api as well).
            84
            85      "comparisonTest" contains experimental code that tests for equivalence in two titles (taken from OpenAlex and CrossRef in these tests).
            86
            87      #----
            88      5. Other files
            89      #----
            90
            91      INGEST-INTO-MONGODB.sh
            92
            93      Quick and easy way to ingest crossref data into MongoDB. Assumes the existence of a crossref-data folder with subfolder that contains the extracted contents of the crossref download from 2023 (April\ 2023\ Public\ Data\ File\ from\ Crossref)
            94
            95      interesting-DOIs.txt
            96
            97      Contains a (currently short) list of "interesting" DOIs. DOIs are usually accompanied by a reason why they were put into the list, in future this could be expanded and put into a nice JSON file.
            98
            99      notes.txt
            100
            101     Barely used notes that contain a very tiny amount of information on Crossref API usage.
    79      102
    80      103
            104     #----
            105     6. Citations
            106     #----
    81      107
            108     get-unicode-blocks.py
            109
            110     Created by Chris Adams, source (https://gist.github.com/acdha/49a610089c2798db6fe2)
    82      111
    83      112
    84      113     Developed by Joel Crombie (jc550) as a Summer Research Project
    85              (ALPSS373-23C)
    86
    87
    88
    89
            114     (HECSS373-23C)
    90      115
    91      116     --------------------------------------------------
  • other-projects/metadata-encoding/note

    r38774  r38789
    1               Notes..........
            1       Notes..
    2       2
    3       3       API CrossRef
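The updated INGEST-INTO-MONGODB.sh in this changeset assumes the extracted Crossref folder already exists at a fixed relative path. As a rough sketch only (not part of the changeset), a guarded variant could check for the folder before calling the ingest script. The script path, data folder, and MongoDB URI below are copied from the diff; the `ingest` function name and the `DATA_DIR`, `MONGO_URI`, and `INGEST_SCRIPT` environment overrides are hypothetical additions for illustration.

```shell
#!/usr/bin/env bash
# Sketch of a guarded ingest wrapper. Defaults come from the changeset;
# the function name and the environment overrides are assumptions.

ingest() {
    # Fall back to the values used in INGEST-INTO-MONGODB.sh (r38789).
    local data_dir="${DATA_DIR:-crossref-data/April 2023 Public Data File from Crossref}"
    local mongo_uri="${MONGO_URI:-mongodb://localhost:27017}"
    local script="${INGEST_SCRIPT:-./py/using-api/xRefToMongo-v3.py}"

    # Stop early with a readable message if the data folder is missing.
    if [ ! -d "$data_dir" ]; then
        echo "error: data folder not found: $data_dir" >&2
        return 1
    fi

    # Forward any extra arguments first, as the original script does.
    "$script" "$@" "$data_dir" "$mongo_uri"
}
```

To use it as a drop-in script, one would end the file with `ingest "$@"`. Unlike the unquoted `$*` in the committed script, `"$@"` keeps arguments containing spaces intact; whether that matters depends on which options xRefToMongo-v3.py accepts, which this changeset does not show.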