For the full explanation of my reasoning behind this project, see "Open Access, Academia.edu, and why I'm all-in on Zenodo.org" on the American Numismatic Society blog.
This application extracts publication metadata from an Academia.edu user profile page and imports that metadata into Zenodo, along with the associated document files. Only publications that have associated document files are imported. Despite Academia.edu's (poor) terms of service, Google's recent victory over academic publishers has demonstrated that metadata are not copyrightable and can be freely harvested from the web. The document files, however, cannot be harvested automatically, because only authenticated users can download them. An intermediate step is therefore required: Academia.edu users re-upload their files into this system, which then posts them to Zenodo via its API. Once the migration is complete, the uploaded files are deleted from this server.
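As a rough illustration of the import step, the sketch below maps one scraped record onto the JSON body that Zenodo's deposition API accepts. The shape of the input record (`title`, `authors`, `abstract`) and the function name are hypothetical and are not taken from this application's PHP script. The `metadata` keys (`title`, `upload_type`, `publication_type`, `description`, `creators`) are those described in Zenodo's REST API documentation.

```python
import json


def to_zenodo_metadata(record, upload_type="publication"):
    """Map a scraped record (hypothetical shape) to a Zenodo deposition body."""
    metadata = {
        "title": record["title"],
        "upload_type": upload_type,
        # Zenodo requires a description; fall back to the title if none was scraped.
        "description": record.get("abstract") or record["title"],
        "creators": [{"name": name} for name in record["authors"]],
    }
    # Zenodo expects a publication_type when upload_type is "publication".
    if upload_type == "publication":
        metadata["publication_type"] = record.get("publication_type", "article")
    return {"metadata": metadata}


# Illustrative record only; not real harvested data.
example = {"title": "A Sample Paper", "authors": ["Doe, Jane"], "abstract": ""}
body = to_zenodo_metadata(example)
print(json.dumps(body, indent=2))
```

A body like this would be sent as JSON when creating a deposition, with the user's access token supplied separately.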
Note that an Academia.edu profile page appears to display at most seven publications of a given type (papers, talks, etc.). As a result, this framework can only migrate the documents embedded in the user profile page. Users with many publications may need to run the migration several times: import seven papers, delete them from Academia.edu, and then re-run the migration on the next seven documents in the profile. I apologize for the inconvenience, but Academia.edu is purposely designed to thwart harvesting.
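To estimate how many passes a large profile might need, here is a small sketch. It assumes the limit is exactly seven items per publication type and that each pass clears the visible items of every type at once. Both are assumptions drawn from the observation above, not confirmed behavior of Academia.edu, and the function name is hypothetical.

```python
import math

PER_TYPE_LIMIT = 7  # observed limit per publication type (assumption)


def passes_needed(counts_by_type, limit=PER_TYPE_LIMIT):
    """Return the number of migrate-and-delete passes for a profile."""
    if not counts_by_type:
        return 0
    # Each pass handles up to `limit` items of every type, so the largest
    # category determines the total number of passes.
    return max(math.ceil(n / limit) for n in counts_by_type.values())


# Illustrative counts only: 23 papers need four passes, 5 talks need one.
print(passes_needed({"papers": 23, "talks": 5}))
```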
The code is open source at https://github.com/ewg118/academia-migrate. It relies on a PHP script for scraping metadata from Academia.edu and posting files as multipart/form-data. The remaining interactions are handled in Orbeon, an XForms processor. Please submit issues on GitHub. The HTML on Academia.edu may change from time to time, so the scraping script may occasionally require revision.
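For readers unfamiliar with multipart/form-data, the sketch below builds such a request body by hand using only Python's standard library. It is an illustration, not a port of the project's PHP script. The form field names `name` and `file` follow Zenodo's documented file-upload endpoint, while the function name and the sample file are hypothetical.

```python
import uuid


def build_multipart(filename, content, boundary=None):
    """Encode one file as multipart/form-data; returns (content_type, body)."""
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    file_header = 'Content-Disposition: form-data; name="file"; filename="%s"' % filename
    parts = [
        # First part: a plain form field carrying the target file name.
        delimiter,
        b'Content-Disposition: form-data; name="name"',
        b"",
        filename.encode(),
        # Second part: the file bytes themselves.
        delimiter,
        file_header.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        # Closing delimiter, followed by a final CRLF.
        delimiter + b"--",
        b"",
    ]
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(parts)


# Sample bytes stand in for a real PDF.
content_type, body = build_multipart("paper.pdf", b"%PDF-1.4 sample bytes")
print(content_type)
print(len(body), "bytes")
```

The returned content type would go in the request's `Content-Type` header, and the body would be sent as the POST payload. In practice an HTTP client library normally handles this encoding.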
- An Academia.edu profile.
- A Zenodo.org account and activated access token (see next page).