Scraper
Scraper class for scraping the Trovaprezzi website.
__init__(wait, headless)
Initialize the Scraper object with the specified wait time and headless mode.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| wait | int | How long the WebDriver waits for an element to become clickable. | required |
| headless | bool | Whether to run the WebDriver in headless mode. | required |
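
The constructor body is not shown here, so the following is only a minimal sketch of how these two parameters are commonly wired into Selenium. The choice of Chrome, the `--headless=new` flag, and the helper names `chrome_arguments` and `make_driver` are assumptions made for illustration, not part of the Scraper API.

```python
def chrome_arguments(headless: bool) -> list[str]:
    """Return the Chrome command-line arguments implied by the headless flag."""
    return ["--headless=new"] if headless else []


def make_driver(wait: int, headless: bool):
    """Build a Chrome WebDriver plus a WebDriverWait with the given timeout.

    Selenium is imported lazily so that chrome_arguments() stays usable
    on machines where Selenium is not installed.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    for argument in chrome_arguments(headless):
        options.add_argument(argument)
    driver = webdriver.Chrome(options=options)
    # WebDriverWait takes its timeout in seconds.
    return driver, WebDriverWait(driver, wait)
```

In a setup like this, the returned wait object would typically be combined with Selenium's `expected_conditions.element_to_be_clickable` before interacting with the page.
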
download_html(url)
Download the HTML content of the specified URL.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| url | str | The URL to download the HTML content from. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| tuple | tuple | A tuple containing the HTML content of the page. |
extract_best_price_shipping_included(html_content, quantity)
Extract the best shipping-included price for the item from the HTML content.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| html_content | str | The HTML content to extract the prices from. | required |
| quantity | int | The quantity of items to buy. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| tuple | tuple | A tuple containing the item name and the best price item. |
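
One plausible reading of "best price shipping included" is the lowest total of unit price times quantity plus shipping. The sketch below illustrates that selection only. The `Offer` fields, the flat per-order shipping fee, and the function names are hypothetical and are not taken from the Scraper source.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    seller: str
    price: float     # unit price
    shipping: float  # flat shipping fee per order (assumption)


def total_cost(offer: Offer, quantity: int) -> float:
    """Total for the requested quantity, shipping included."""
    return offer.price * quantity + offer.shipping


def best_offer(offers: list[Offer], quantity: int) -> Offer:
    """Return the offer with the lowest shipping-included total."""
    return min(offers, key=lambda offer: total_cost(offer, quantity))


offers = [Offer("A", 10.0, 5.0), Offer("B", 11.0, 0.0)]
print(best_offer(offers, 1).seller)   # B: 11.0 beats 15.0
print(best_offer(offers, 10).seller)  # A: 105.0 beats 110.0
```

The two calls pick different sellers, which shows why `quantity` matters: a flat shipping fee weighs less as the order grows.
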
extract_prices_plus_shipping(html_content, quantity)
Extract the prices of the items plus shipping cost from the HTML content.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| html_content | str | The HTML content to extract the prices from. | required |
| quantity | int | The quantity of items to buy. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| tuple | tuple | A tuple containing the item name and a list of items. |
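
To make the extraction step concrete, here is a minimal sketch using only the standard library. It assumes a made-up markup in which each offer carries a `<span class="price">` and a `<span class="shipping">` with Italian-formatted amounts. The real Trovaprezzi markup, the class names, and the helpers `parse_euro`, `OfferParser`, and `prices_plus_shipping` are all hypothetical. The sketch covers only the list of totals, not the item name.

```python
from html.parser import HTMLParser


def parse_euro(text: str) -> float:
    """Convert an Italian-formatted amount such as '1.234,56 €' to a float."""
    cleaned = text.replace("€", "").replace(".", "").replace(",", ".").strip()
    return float(cleaned)


class OfferParser(HTMLParser):
    """Collect (price, shipping) pairs from the hypothetical spans."""

    def __init__(self):
        super().__init__()
        self.field = None   # name of the span currently being read, if any
        self.current = {}   # fields gathered so far for the current offer
        self.offers = []    # completed (price, shipping) pairs

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("price", "shipping"):
            self.field = css_class

    def handle_data(self, data):
        if self.field and data.strip():
            self.current[self.field] = parse_euro(data)
            self.field = None
            if len(self.current) == 2:
                self.offers.append(
                    (self.current["price"], self.current["shipping"])
                )
                self.current = {}


def prices_plus_shipping(html_content: str, quantity: int) -> list[float]:
    """Return price * quantity + shipping for every offer found."""
    parser = OfferParser()
    parser.feed(html_content)
    parser.close()
    return [price * quantity + shipping for price, shipping in parser.offers]
```

A production scraper would more likely rely on a dedicated HTML library and the site's actual selectors. The point here is only the shape of the data flow from HTML string to a list of totals.
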