Monthly Archives: November 2014

Handling HTTP status code 100 in Scrapy

You might have some problems handling the 100 response code in Scrapy.  Scrapy uses Twisted on the backend, which itself does not handle status code 100 properly yet: http://twistedmatrix.com/trac/ticket/5192

The remote server will first send a response with the 100 status code, then a response with 200 status code.  In order to get the 200 response code, sent the following header in your spider:

‘Connection’: ‘close’

If your 200 response is also gzipped, Scrapy might not gunzip, in which case you need to set the following header as well:

‘Accept-Encoding’: ”

And if Scrapy will not do anything with the responses at all, you might need to set the following Spider attribute:

handle_httpstatus_list = [100]