Crawl feeds.
(numbers.Integral) The default timeout for connection attempts: 10 seconds.
New in version 0.3.0.
Error raised when crawling the given URL fails.
New in version 0.3.0: Added feed_uri parameter and corresponding feed_uri attribute.
The result of each crawl of a feed.
It mimics a triple of (url, feed, hints) for backward compatibility with versions below 0.3.0, so you can still obtain these values by tuple unpacking, though this is no longer the recommended way to get them.
New in version 0.3.0.
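The tuple-compatible behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: the class name `CrawlResultSketch` and the sample values are hypothetical, and only the (url, feed, hints) ordering is taken from the description.

```python
class CrawlResultSketch(object):
    """Hypothetical stand-in for a crawl result that still unpacks
    like the legacy (url, feed, hints) triple."""

    def __init__(self, url, feed, hints=None):
        self.url = url
        self.feed = feed
        self.hints = hints  # may be None, as the hints attribute allows

    def _as_tuple(self):
        # The legacy ordering: (url, feed, hints).
        return (self.url, self.feed, self.hints)

    def __iter__(self):
        # Enables ``url, feed, hints = result``.
        return iter(self._as_tuple())

    def __len__(self):
        return 3

    def __getitem__(self, index):
        return self._as_tuple()[index]


result = CrawlResultSketch(
    'http://example.com/feed.xml',   # placeholder URL
    '<parsed feed placeholder>',     # stands in for a parsed feed object
    {'skipHours': [0, 1]},           # placeholder hints mapping
)

# Legacy style: tuple unpacking still works.
url, feed, hints = result

# Preferred style: read the attributes directly.
same_url = result.url
```

Attribute access keeps working if more fields are added later, whereas unpacking code depends on the triple having exactly three items.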
Add it as a subscription to the given subscription_set.
Parameters: | subscription_set (SubscriptionSet) – a subscription list or category to add the new subscription to |
---|---|
Returns: | the created subscription object |
Return type: | Subscription |
(collections.Mapping) The extra hints for the crawler, e.g. skipHours, skipMinutes, skipDays. It might be None.
Crawl the feeds in the feed list using threads.
Parameters: |
|
---|---|
Returns: | a set of CrawlResult objects |
Return type: | collections.Iterable |
Changed in version 0.3.0: It now returns a set of CrawlResult objects instead of tuples.
Changed in version 0.3.0: The parameter feeds was renamed to feed_urls.
New in version 0.3.0: Added optional timeout parameter.
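As a rough illustration of threaded crawling with a timeout, here is a self-contained sketch. It does not use the library itself: `crawl_sketch`, `CrawlErrorSketch`, the `pool_size` and `fetch` parameters, and `fake_fetch` are all hypothetical names introduced for this example. The fetcher is injected so the sketch runs without network access, and plain tuples stand in for CrawlResult objects.

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_TIMEOUT = 10  # seconds, mirroring the documented default


class CrawlErrorSketch(Exception):
    """Hypothetical error carrying the URL that failed to crawl."""

    def __init__(self, feed_uri, message):
        super(CrawlErrorSketch, self).__init__(message)
        self.feed_uri = feed_uri


def crawl_sketch(feed_urls, pool_size, fetch, timeout=DEFAULT_TIMEOUT):
    """Fetch every URL in ``feed_urls`` on a pool of threads.

    ``fetch(url, timeout)`` is an injected callable (an assumption of
    this sketch) that returns the fetched document for one URL.
    Returns a set of (url, document, hints) tuples.
    """
    def crawl_one(url):
        try:
            document = fetch(url, timeout)
        except Exception as e:
            # Wrap any failure so the caller learns which URL failed.
            raise CrawlErrorSketch(
                url, 'crawling {0} failed: {1}'.format(url, e)
            )
        return (url, document, None)  # no hints in this sketch

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # Consuming the map re-raises the first worker exception, if any.
        return set(pool.map(crawl_one, feed_urls))


def fake_fetch(url, timeout):
    # Stand-in for a real HTTP request honoring ``timeout``.
    return 'document for ' + url


results = crawl_sketch(
    ['http://a.example/feed.xml', 'http://b.example/feed.xml'],
    pool_size=2,
    fetch=fake_fetch,
)
```

A real fetcher would pass `timeout` on to its HTTP client so that a slow server cannot block a worker thread indefinitely.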