Documentation of the cognitive sciences events web scraping project at the RISC


At the RISC, one of our services consists of publishing events in our field, cognitive sciences (seminars, colloquia, conferences, etc.), for our community members. The underlying database is filled in by hand by RISC staff, who browse several hundred websites all week long.

So, one of my first tasks in the unit was to try to automate this process, in whole or in part, using web scraping techniques (extracting semi-structured data from websites) with the Scrapy framework.
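To give an idea of what extracting semi-structured data means in practice, here is a minimal sketch. It is not taken from the project: it uses Python's standard-library `html.parser` as a stand-in for Scrapy so that it runs without any installation, and the sample HTML, the class names (`event`, `title`, `date`) and the helper `extract_events` are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; every real website uses its own markup.
SAMPLE_HTML = """
<ul>
  <li class="event"><span class="title">Seminar on memory</span><span class="date">2018-05-14</span></li>
  <li class="event"><span class="title">Colloquium on language</span><span class="date">2018-06-02</span></li>
</ul>
"""


class EventParser(HTMLParser):
    """Collects one dict per <li class="event"> element."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "event":
            # A new event starts: open an empty record for it.
            self.events.append({})
        elif tag == "span" and css_class in ("title", "date") and self.events:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text found inside a recognised field.
        if self._field is not None:
            record = self.events[-1]
            record[self._field] = record.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def extract_events(html):
    """Return the list of event records found in an HTML string."""
    parser = EventParser()
    parser.feed(html)
    return parser.events


if __name__ == "__main__":
    for event in extract_events(SAMPLE_HTML):
        print(event["date"], event["title"])
```

With Scrapy, the same idea would be written as a spider using CSS or XPath selectors, and the framework would also take care of downloading the pages; the sketch above only illustrates the extraction step on a fixed string.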

The project's documentation, exported as a PDF and current as of May 2018, is available below:

PDF:

Events-webscraping-RISC.pdf