Jazz transcriptions - follow links

scrapy for a single webpage

Scrapy is a Python package for web scraping. As a simple example, we can retrieve data from a single webpage by defining the spider

import scrapy

class JazzSpider(scrapy.Spider):
    name = 'jazzspider'
    start_urls = ['https://blueblackjazz.com/en/books']

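    def parse(self, response):
        # Each <a> matched by the selector is one link in the page's list.
        for link in response.css('div.row > div > ul > li > a'):
            # Yield one item per link: its text and its href attribute.
            yield {
                'title': link.css('::text').get(),
                'url': link.attrib['href'],
            }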

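and then running it with the command scrapy runspider scr.py -o my_data.csv, where scr.py is the file containing the spider and my_data.csv is the file that receives the scraped items.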
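
Before following the scraped links, it can help to load the CSV back and turn each href into an absolute URL. The sketch below is only an illustration: it assumes the export has a header row with the columns title and url (the keys yielded by the spider) and that some hrefs may be relative to the start URL. The helper name load_links and the sample rows are made up for the example.

```python
import csv
import io
from urllib.parse import urljoin

# The start URL used by the spider; relative hrefs are resolved against it.
BASE_URL = "https://blueblackjazz.com/en/books"


def load_links(csv_file, base_url=BASE_URL):
    """Read rows with 'title' and 'url' columns and return absolute URLs."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append({
            "title": (row["title"] or "").strip(),
            # urljoin leaves absolute URLs unchanged and resolves relative ones.
            "url": urljoin(base_url, row["url"]),
        })
    return rows


# Made-up sample standing in for the contents of my_data.csv.
sample = io.StringIO(
    "title,url\n"
    "Example Book,/en/books/example\n"
    "Other Book,https://example.com/other\n"
)
links = load_links(sample)
for link in links:
    print(link["title"], "->", link["url"])
```

With a real export, the same function can be called on the file itself, for example by opening my_data.csv with open("my_data.csv", newline="", encoding="utf-8") and passing the file object to load_links.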