Scraper

Scraper class for scraping the Trovaprezzi website.

__init__(wait, headless)

Initialize the Scraper object with the specified wait time and headless mode.

Parameters:

Name | Type | Description | Default
wait | int | The maximum time the WebDriver waits for an element to become clickable. | required
headless | bool | Whether to run the WebDriver in headless mode. | required

download_html(url)

Download the HTML content of the specified URL.

Parameters:

Name | Type | Description | Default
url | str | The URL to download the HTML content from. | required

Returns:

Name | Type | Description
tuple | tuple | A tuple containing the HTML content of the page.
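
The return shape described above can be illustrated with a small stand-in. This is only a sketch, not the Scraper's method: it uses the standard library's `urllib` instead of a WebDriver, the name `download_html_sketch` is hypothetical, and the one-element tuple is an assumption about what "a tuple containing the HTML content" means.

```python
from urllib.parse import quote
from urllib.request import urlopen


def download_html_sketch(url: str, timeout: float = 10.0) -> tuple:
    """Hypothetical stand-in: fetch a page and return its HTML in a tuple."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the response does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    # Assumption: the HTML is the only element of the returned tuple.
    return (html,)


# A data: URL keeps the example offline; an http(s) URL is fetched the same way.
demo_url = "data:text/html," + quote("<p>hello</p>")
(html_content,) = download_html_sketch(demo_url)
print(html_content)
```

Unpacking with `(html_content,) = ...` fails loudly if the tuple ever holds a different number of elements, which is useful while the exact return shape is uncertain.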

extract_best_price_shipping_included(html_content, quantity)

Extract the item's best price, shipping included, from the HTML content.

Parameters:

Name | Type | Description | Default
html_content | str | The HTML content to extract the prices from. | required
quantity | int | The quantity of items to buy. | required

Returns:

Name | Type | Description
tuple | tuple | A tuple containing the item name and the best-priced item.
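
To make the role of `quantity` concrete, here is a minimal sketch of picking the cheapest offer once shipping is included. It is not the Scraper's implementation: the `Offer` type, the `total_cost` and `best_offer` helpers, and the assumption that shipping is charged once per order are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Hypothetical offer record; not part of the documented API."""
    seller: str
    unit_price: float
    shipping: float


def total_cost(offer: Offer, quantity: int) -> float:
    # Assumption: shipping is charged once per order, not once per unit.
    return round(offer.unit_price * quantity + offer.shipping, 2)


def best_offer(offers: list[Offer], quantity: int) -> Offer:
    """Return the offer with the lowest total cost for the given quantity."""
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    if not offers:
        raise ValueError("no offers to compare")
    return min(offers, key=lambda offer: total_cost(offer, quantity))


offers = [
    Offer("shop-a", unit_price=10.00, shipping=5.00),
    Offer("shop-b", unit_price=11.00, shipping=0.00),
]
print(best_offer(offers, quantity=1).seller)  # shop-b (11.00 vs 15.00)
print(best_offer(offers, quantity=6).seller)  # shop-a (65.00 vs 66.00)
```

Under this assumption the best offer depends on the quantity: free shipping wins for a single unit, while the lower unit price wins for larger orders.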

extract_prices_plus_shipping(html_content, quantity)

Extract the prices of the items plus shipping cost from the HTML content.

Parameters:

Name | Type | Description | Default
html_content | str | The HTML content to extract the prices from. | required
quantity | int | The quantity of items to buy. | required

Returns:

Name | Type | Description
tuple | tuple | A tuple containing the item name and a list of items.
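
The following sketch shows one way an `(item name, list)` result could be built from HTML. Everything specific in it is an assumption made for illustration: the sample markup, its class and attribute names, the `OfferParser` and `prices_plus_shipping` names, and the rule that each total is unit price times quantity plus a single shipping charge. The real page structure and the Scraper's parsing logic are not described in this reference.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for this example only.
SAMPLE_HTML = """
<h1 class="item-name">USB cable</h1>
<li class="offer" data-price="4.50" data-shipping="2.00"></li>
<li class="offer" data-price="5.00" data-shipping="0.00"></li>
"""


class OfferParser(HTMLParser):
    """Collect the item name and (unit_price, shipping) pairs."""

    def __init__(self):
        super().__init__()
        self.item_name = ""
        self.offers = []
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1" and attrs.get("class") == "item-name":
            self._in_name = True
        elif tag == "li" and attrs.get("class") == "offer":
            price = float(attrs["data-price"])
            shipping = float(attrs["data-shipping"])
            self.offers.append((price, shipping))

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_name = False

    def handle_data(self, data):
        if self._in_name:
            self.item_name += data.strip()


def prices_plus_shipping(html_content, quantity):
    parser = OfferParser()
    parser.feed(html_content)
    # Assumption: shipping is added once per order, then totals are sorted ascending.
    totals = [round(price * quantity + shipping, 2) for price, shipping in parser.offers]
    return parser.item_name, sorted(totals)


print(prices_plus_shipping(SAMPLE_HTML, quantity=2))  # ('USB cable', [10.0, 11.0])
```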