Scanner
Class responsible for scanning the URLs and extracting prices and shipping costs.
Attributes:
| Name | Type | Description |
|---|---|---|
| `level` | `str` | The level of the scanner. |
| `urls` | `list` | The list of URLs to scan. |
| `quantities` | `list` | The list of quantities for each URL. |
| `wait` | `int` | The number of seconds to wait before scraping the next URL. |
| `headless` | `bool` | The headless mode of the browser. |
| `console_out` | `bool` | The console output flag. |
| `excel_out` | `bool` | The Excel output flag. |
| `individual_deals` | `dict` | The dictionary of individual deals. |
| `best_individual_deals` | `list` | The list of best individual deals. |
| `best_cumulative_deals` | `dict` | The dictionary of best cumulative deals. |
| `formatted_datetime` | `str` | The formatted datetime string. |
Methods:
| Name | Description |
|---|---|
| `scan` | Scans the URLs and extracts the prices and shipping costs. |
| `remove_unavailable_items` | Removes the unavailable items from the individual deals. |
| `find_best_individual_deals` | Finds the best individual deals. |
| `find_best_cumulative_deals` | Finds the best cumulative deals. |
__init__(level, urls, quantities, wait, headless, console_out, excel_out)
Initialize the Scanner object with the specified parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `level` | `str` | The level of the scanner. | required |
| `urls` | `list` | The list of URLs to scan. | required |
| `quantities` | `list` | The list of quantities for each URL. | required |
| `wait` | `int` | The number of seconds to wait before scraping the next URL. | required |
| `headless` | `bool` | The headless mode of the browser. | required |
| `console_out` | `bool` | The console output flag. | required |
| `excel_out` | `bool` | The Excel output flag. | required |
find_best_cumulative_deals()
Find the best cumulative deals.
This method iterates over the items in the individual_deals dictionary and checks whether the seller indicates a free delivery
threshold and whether the cumulative price is greater than or equal to that threshold. If both conditions are met, the item is
considered one of the best individual deals and is added to the best_individual_deals list.
find_best_individual_deals()
Find the best individual deals.
This method iterates over the items in the individual_deals dictionary and checks whether the seller indicates a free delivery
threshold and whether the cumulative price is greater than or equal to that threshold. If both conditions are met, the item is
considered one of the best individual deals and is added to the best_individual_deals list.
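The two conditions described above can be illustrated with a minimal sketch. It is not the Scanner implementation: the `Offer` record, its field names, and the helper functions are hypothetical, and the cumulative price is assumed to be the unit price multiplied by the requested quantity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Offer:
    """Hypothetical offer record; the real data structure is not documented here."""
    seller: str
    price: float                                     # unit price
    shipping: float                                  # shipping cost for the offer
    free_delivery_threshold: Optional[float] = None  # None: no threshold indicated


def qualifies_for_free_delivery(offer: Offer, quantity: int) -> bool:
    """True if a threshold is indicated and the cumulative price reaches it."""
    if offer.free_delivery_threshold is None:
        return False
    # Assumption: cumulative price = unit price * quantity.
    cumulative_price = offer.price * quantity
    return cumulative_price >= offer.free_delivery_threshold


def select_qualifying(
    individual_deals: Dict[str, List[Offer]],
    quantities: Dict[str, int],
) -> List[Tuple[str, Offer]]:
    """Collect (item name, offer) pairs that meet both conditions."""
    selected = []
    for item_name, offers in individual_deals.items():
        for offer in offers:
            if qualifies_for_free_delivery(offer, quantities[item_name]):
                selected.append((item_name, offer))
    return selected
```

In this sketch an offer without a threshold never qualifies, which mirrors the requirement that both conditions hold.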
remove_unavailable_items()
Remove the unavailable items from the individual deals.
Returns:
| Name | Type | Description |
|---|---|---|
| `int` | `int` | The count of removed items. |
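A rough sketch of the remove-and-count pattern follows. How the Scanner decides that an item is unavailable is not stated here, so the example assumes, purely for illustration, that an item is unavailable when its list of offers is empty. The function name `remove_unavailable` is likewise hypothetical.

```python
from typing import Dict, List


def remove_unavailable(individual_deals: Dict[str, List[dict]]) -> int:
    """Delete items with no offers in place and return how many were removed.

    Assumption: "unavailable" means the item's offer list is empty.
    """
    # Collect the keys first; deleting while iterating over a dict raises RuntimeError.
    unavailable = [name for name, offers in individual_deals.items() if not offers]
    for name in unavailable:
        del individual_deals[name]
    return len(unavailable)
```

The dictionary is modified in place and only the count is returned, which matches the documented `int` return value.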
scan()
Scan the URLs and extract the prices and shipping costs.
This method iterates over the list of URLs and performs the following steps for each URL:

1. Creates an instance of the Scraper class with the specified wait time and headless mode.
2. Downloads the HTML content for the URL, including prices plus shipping costs and best prices with shipping costs included.
3. Extracts the item name and a list of items with their respective prices and shipping costs.
4. Extracts the best price with shipping costs included.
5. If the best price is not already in the list of items, it is added.
6. Sorts the list of items by price.
7. Stores the list of items in the individual_deals dictionary with the item name as the key.
8. Logs the number of deals found for the item.
9. Waits for a specified amount of time before processing the next URL.
Note: This method uses the Progress class from the rich.progress module to display a progress bar during the scanning process.
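Steps 5 to 7 of the scan loop can be sketched as follows. This is an illustration under stated assumptions, not the actual method: each deal is represented as a hypothetical `(price, shipping)` tuple, and `merge_and_sort` is a name invented for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (price, shipping cost).
Deal = Tuple[float, float]


def merge_and_sort(items: List[Deal], best: Deal) -> List[Deal]:
    """Add the best-price deal if it is missing, then sort the deals by price."""
    merged = list(items)          # copy so the caller's list is left untouched
    if best not in merged:
        merged.append(best)
    return sorted(merged, key=lambda deal: deal[0])


# Store the sorted deals under the item name, as in step 7.
individual_deals: Dict[str, List[Deal]] = {}
individual_deals["example item"] = merge_and_sort(
    [(12.5, 3.0), (10.0, 4.5)],
    best=(9.99, 0.0),  # assumption: shipping is already included, so it is recorded as 0.0
)
```

Sorting after the merge keeps the cheapest deal at index 0 regardless of whether the best price was already in the list.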