Sounds like the server might be throttling or temporarily banning your IP
because you're sending too many requests in a short window.
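One quick way to confirm that's what's happening is to look at the status code
and headers of the failing responses. A rough sketch with urllib3 (the URL is a
placeholder):

    from urllib3 import PoolManager

    http = PoolManager()
    r = http.request("GET", "https://example.com/some/page")  # placeholder URL

    # 429 ("Too Many Requests") is the usual signal; some servers use 403 or 503.
    # A Retry-After header, if present, says how long the server wants you to wait.
    if r.status in (429, 503):
        print("Throttled; Retry-After:", r.headers.get("Retry-After"))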
First, I'd suggest checking the domain's robots.txt to see if it gives any
guidance on automated request frequency (e.g. a Crawl-delay directive). If it
doesn't, you could ask the owner of the website how best to crawl the site.
Otherwise, you may need to determine an acceptable rate experimentally.
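In case it helps, Python's standard library can read robots.txt for you (the
crawl_delay/request_rate helpers need Python 3.6+). A quick sketch, with
example.com standing in for the site you're crawling:

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # placeholder domain
    rp.read()

    # Crawl-delay / Request-rate are optional; both return None when absent.
    print("Crawl-delay:", rp.crawl_delay("*"))
    rate = rp.request_rate("*")
    if rate:
        print("Request-rate:", rate.requests, "requests per", rate.seconds, "seconds")
    print("May fetch /some/page:", rp.can_fetch("*", "https://example.com/some/page"))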
To throttle your requests, you can use something like apiclient.RateLimiter* (source). It would look roughly like this:
    from apiclient import RateLimiter
    from urllib3 import PoolManager

    lock = RateLimiter(max_messages=30, every_seconds=60)
    http = PoolManager(...)

    for url in crawl_list:
        lock.acquire()  # block until we're back within 30 requests per 60 seconds
        r = http.request(...)
Another thing you could do is crawl a cached version of the site, if one
is available through Google or archive.org.
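For archive.org specifically, the Wayback Machine has an "availability" endpoint
that tells you whether a snapshot of a page exists, so something along these
lines should work (the target URL is a placeholder):

    import json
    from urllib3 import PoolManager

    http = PoolManager()
    # Wayback Machine availability API; "url" is the page you want a cached copy of.
    r = http.request("GET", "https://archive.org/wayback/available",
                     fields={"url": "https://example.com/some/page"})  # placeholder
    snapshot = json.loads(r.data.decode("utf-8")).get("archived_snapshots", {}).get("closest")
    if snapshot and snapshot.get("available"):
        print("Cached copy:", snapshot["url"])
    else:
        print("No snapshot available")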
[*] Disclaimer: I also wrote apiclient a long time ago. It's not
super-well documented. I suspect there are other similar modules that you
can use if you find it lacking, but the source should be reasonably easy to
understand and extend.
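Or, if you'd rather not pull in a dependency at all, a basic throttle is only a
few lines to write yourself. A rough sketch (SimpleRateLimiter below is
illustrative, not part of apiclient):

    import time

    class SimpleRateLimiter:
        """Allow at most max_calls calls per period seconds (sliding window)."""
        def __init__(self, max_calls, period):
            self.max_calls = max_calls
            self.period = period
            self.calls = []  # timestamps of recent calls

        def acquire(self):
            now = time.monotonic()
            # Drop timestamps that have fallen outside the window.
            self.calls = [t for t in self.calls if now - t < self.period]
            if len(self.calls) >= self.max_calls:
                # Sleep until the oldest call leaves the window.
                time.sleep(self.period - (now - self.calls[0]))
            self.calls.append(time.monotonic())

Usage mirrors the example above: create SimpleRateLimiter(30, 60) once, then
call acquire() before each request in the crawl loop.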