I'm trying to scrape a webpage (https://a-z-animals.com/animals/) to get all the animal names listed there.

I installed Scrapy in my PyCharm project. Then, using the terminal in PyCharm, I created a project folder with scrapy startproject AnimalNames. I navigated into that folder and created a spider with scrapy genspider animals https://a-z-animals.com/animals/

Then I added the following code to animals.py, which is meant to retrieve the animal names from the site:

import scrapy


class AnimalsSpider(scrapy.Spider):
    name = "animals"
    allowed_domains = ["a-z-animals.com"]
    start_urls = ["https://a-z-animals.com/animals/"]

    def parse(self, response):
        for container in response.css('div.container'):
            yield {
                container.css('a::text').get()
            }

But PyCharm underlines the parse method's parameters, (self, response), and tells me:

Signature of method AnimalsSpider.parse() does not match signature of the base method in class Spider

When I run the spider with scrapy crawl animals -O names_of_animals.json, it just gives me an empty JSON file.

How do I fix this so that it produces a JSON file of all the animal names on the site?

Note that I had to change USER_AGENT and DOWNLOAD_DELAY in settings.py so that the site doesn't block me.

asked Nov 22, 2024 at 10:49 by Blatant Leisure; edited Nov 25, 2024 at 7:33 by VLAZ
  • To get rid of the PyCharm warning, try adding **kwargs to the signature of the parse method. (That won't fix the main issue, though.) – ekhumoro Commented Nov 22, 2024 at 12:44

1 Answer

A function signature specifies the parameters a function accepts. When you override a method inherited from a parent class, the override should keep the same signature as the parent's method.

The parse method is inherited from scrapy.Spider, where it may be defined as
def parse(self, response, **kwargs)
or
def parse(self, response, *args, **kwargs),
depending on the version of Scrapy you are using.
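To see what the warning is comparing, here is a minimal sketch that uses only the standard library. BaseSpider is a hypothetical stand-in for scrapy.Spider (it is not the real class), and NarrowSpider, MatchingSpider and parameter_names are names made up for this illustration.

```python
import inspect


class BaseSpider:
    """Hypothetical stand-in for scrapy.Spider; not the real class."""

    def parse(self, response, **kwargs):
        raise NotImplementedError


class NarrowSpider(BaseSpider):
    # Drops **kwargs, so the override accepts fewer arguments than the base method.
    def parse(self, response):
        return response


class MatchingSpider(BaseSpider):
    # Same parameters as the base method, so the two signatures agree.
    def parse(self, response, **kwargs):
        return response


def parameter_names(method):
    """Return the parameter names of a function, in declaration order."""
    return list(inspect.signature(method).parameters)


base_params = parameter_names(BaseSpider.parse)
narrow_params = parameter_names(NarrowSpider.parse)
matching_params = parameter_names(MatchingSpider.parse)

print(base_params)      # ['self', 'response', 'kwargs']
print(narrow_params)    # ['self', 'response']
print(matching_params)  # ['self', 'response', 'kwargs']
```

The mismatch between the first two lists is the kind of difference PyCharm's inspection reports. Running inspect.signature on the real scrapy.Spider.parse in your own environment should show which form your installed version uses.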

Usually, you can fix this warning by changing
def parse(self, response)
to
def parse(self, response, **kwargs).
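As the comment on the question notes, changing the signature only silences the warning. One likely cause of the empty file is that the braces in the question's yield contain a bare value with no key, which builds a Python set rather than a dict. Scrapy expects callbacks to yield items (such as dicts) or requests, and a set is neither. The sketch below uses plain generator functions and made-up sample names instead of a real response, so it runs without Scrapy. parse_as_set and parse_as_dict are hypothetical names, and whether the div.container selector matches the page is a separate question I have not checked.

```python
def parse_as_set(names):
    # Mirrors the question's code: braces around a bare value build a set.
    for name in names:
        yield {name}


def parse_as_dict(names):
    # A key/value pair makes each yielded object a dict, which Scrapy treats as an item.
    # In the spider this would read: yield {"name": container.css("a::text").get()}
    for name in names:
        yield {"name": name}


sample_names = ["Aardvark", "Zebra"]  # made-up stand-ins for scraped link text

set_items = list(parse_as_set(sample_names))
dict_items = list(parse_as_dict(sample_names))

print(type(set_items[0]).__name__)   # set
print(type(dict_items[0]).__name__)  # dict
```

If the file is still empty after yielding dicts, check the crawl log for errors and test the selector against the page, for example in scrapy shell.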
