What are the ways to use an HTTP proxy? How do you use one with a web crawler?

Proxy IPs are now part of everyday online work: web crawling, website testing, advertising testing and similar businesses all depend on them. Three types of proxy are in common use today: HTTP proxies, HTTPS proxies and SOCKS proxies. Of these, the HTTP proxy is the most common.


What are the usage scenarios for HTTP proxies? (http proxy)


An HTTP proxy is a common network application that passes HTTP requests and responses between a client and a server through a middleman. Here are some common usage scenarios for HTTP proxies:

1. Access control: HTTP proxies can be used to restrict access to certain websites or content. For example, a school or company can control the websites that employees or students can access through a proxy server.

2. Caching: HTTP proxies can cache requests and responses to improve performance. When the proxy server receives a request, it can first check the cache, and if there is a response to the request in the cache, it can immediately return the response without having to make a request to the server.

3. Geolocation camouflage: HTTP proxies can be used to disguise the geographic location of clients. The proxy server forwards each request from its own IP address, so the target server sees the proxy's location rather than the client's and treats the request as coming from that location.

These are only a few examples; HTTP proxies are used in many other scenarios as well.
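As a rough illustration of the first two scenarios, the sketch below models the decision a proxy makes for each request: refuse blocked hosts, answer from the cache when possible, and otherwise forward the request to the origin server. The names `handle_request`, `BLOCKED_HOSTS` and `fetch_from_origin` are hypothetical and chosen for this example only. A real proxy also has to respect cache headers and expiry, which are left out here.

```python
from urllib.parse import urlparse

# Hypothetical blocklist for the access-control scenario.
BLOCKED_HOSTS = {"blocked.example.com"}


def handle_request(url, cache, fetch_from_origin):
    """Return (status, body) the way a very simple caching proxy might."""
    host = urlparse(url).hostname
    # 1. Access control: refuse hosts on the blocklist.
    if host in BLOCKED_HOSTS:
        return 403, "Blocked by proxy policy"
    # 2. Caching: reuse a stored response if there is one.
    if url in cache:
        return 200, cache[url]
    # 3. Otherwise forward the request and remember the response.
    body = fetch_from_origin(url)
    cache[url] = body
    return 200, body


if __name__ == "__main__":
    origin_calls = []

    def fake_origin(url):
        # Stand-in for a real upstream request, so the sketch runs offline.
        origin_calls.append(url)
        return f"content of {url}"

    cache = {}
    print(handle_request("http://example.com/a", cache, fake_origin))
    print(handle_request("http://example.com/a", cache, fake_origin))  # served from cache
    print(handle_request("http://blocked.example.com/", cache, fake_origin))
    print("origin requests made:", len(origin_calls))
```

The second call for the same URL never reaches `fake_origin`, which is the performance benefit described in the caching scenario.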


Proxy4Free


So how do you use a purchased IP proxy? What are the methods? (python proxy)


1. Use in a web crawler


Crawler developers are the largest group of dynamic IP proxy users, because a crawler has to keep changing its IP address to avoid being blocked. Let's look at how a crawler program can obtain proxy IPs. The example fetches a page that lists proxies and extracts the details of each entry:

The code is as follows:

import requests
from bs4 import BeautifulSoup

# URL of the page that lists the proxy IPs
url = 'https://www.SmartProxy.cn//nn/'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
}

# Get the source code of the page
html = requests.get(url, headers=headers, timeout=10).text

# Parse the source code of the web page using BeautifulSoup (the lxml parser must be installed)
soup = BeautifulSoup(html, 'lxml')

# Find all table rows; each row after the header describes one IP proxy
ips = soup.find_all('tr')

# Loop over the rows, skipping the header row, to get the details of each IP proxy
for ip_info in ips[1:]:
    tds = ip_info.find_all('td')
    if len(tds) < 6:
        continue  # skip rows that do not have the expected columns
    ip = tds[1].text.strip()
    port = tds[2].text.strip()
    address = tds[3].text.replace('\n', '').replace(' ', '')
    proxy_type = tds[5].text.replace('\n', '').replace(' ', '')

    # Display the detailed IP proxy information
    print(f'IP: {ip} port: {port} address: {address} proxy type: {proxy_type}')
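
Once an address and port are available, whether taken from a list page or supplied by a provider, they still have to be handed to the HTTP client. The following is a minimal sketch, assuming a plain HTTP proxy with optional username and password authentication. `build_proxies` and `fetch_via_proxy` are hypothetical helpers written for this example, and credentials containing special characters would need URL-encoding, which is not handled here.

```python
def build_proxies(ip, port, username=None, password=None):
    """Build the mapping that requests accepts through its `proxies` argument."""
    auth = f"{username}:{password}@" if username and password else ""
    proxy_url = f"http://{auth}{str(ip).strip()}:{str(port).strip()}"
    # The same HTTP proxy is used for both http:// and https:// targets.
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_proxy(url, ip, port, timeout=10):
    """Fetch a URL through the given proxy and return the response text."""
    import requests  # imported here so build_proxies can be used without it

    response = requests.get(url, proxies=build_proxies(ip, port), timeout=timeout)
    response.raise_for_status()
    return response.text
```

For a quick check, calling `fetch_via_proxy('https://httpbin.org/ip', ip, port)` with a working proxy should return the proxy's address instead of your own.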


1.2 How to use a crawler program to change the proxy IP address automatically (rotate proxy python)


Since different sites may require different crawlers, here is a sample program that uses the requests library and a proxy pool in Python, which you can modify to suit your needs.

import random

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


# Build a requests session that retries failed requests and optionally uses a proxy
def requests_retry_session(retries=3, backoff_factor=0.3, status_forcelist=(500, 502, 504), proxy=None):
    session = requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )

    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)

    if proxy:
        session.proxies = {
            'http': proxy,
            'https': proxy
        }

    return session


# Define the proxy pool list (replace the placeholders with real host:port values)
proxies_list = [
    'http://proxy1:port',
    'http://proxy2:port',
    'http://proxy3:port',
    # More proxy addresses can be added
]

# Placeholder target; replace with the page you want to crawl
url = 'https://example.com'

# Select a proxy address at random
proxy = random.choice(proxies_list)

# Send the request using the requests_retry_session function
try:
    response = requests_retry_session(proxy=proxy).get(url, timeout=10)
    # Handle the response content here
except requests.exceptions.RequestException as e:
    # requests wraps urllib3's MaxRetryError and NewConnectionError in its own exceptions
    print(f"request error: {e}")

In this example program, the custom requests_retry_session function sets up a retry mechanism and assigns the proxy address to the session through session.proxies.

A proxy address is then selected at random from the proxy pool list with random.choice. To rotate proxies, repeat this selection before each request.

If the request still fails after the retries, requests raises an exception derived from requests.exceptions.RequestException (such as ConnectionError or ProxyError, which wrap urllib3's MaxRetryError and NewConnectionError), and this can be handled as needed.
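
The sample program picks one proxy for a single request. To change the proxy automatically, the selection has to happen for every request, and a proxy that fails should be skipped in favour of another one. The following is a minimal sketch of that idea, not a production proxy pool. `fetch_with_rotation` and `send_via_requests` are hypothetical names, and the `send` parameter exists only so that the rotation logic can be exercised without a network.

```python
import random


def send_via_requests(url, proxy, timeout=10):
    """Default sender: one GET through the given proxy using requests."""
    import requests  # imported here so the rotation logic has no hard dependency

    response = requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=timeout)
    response.raise_for_status()
    return response.text


def fetch_with_rotation(url, proxies_list, send=send_via_requests, max_attempts=3):
    """Try up to max_attempts different proxies, chosen in random order.

    Returns (proxy_used, body) for the first proxy that succeeds.
    """
    candidates = random.sample(proxies_list, k=min(max_attempts, len(proxies_list)))
    last_error = None
    for proxy in candidates:
        try:
            return proxy, send(url, proxy)
        except Exception as exc:  # broad on purpose: any failure means "try the next proxy"
            last_error = exc
    raise RuntimeError(f"all {len(candidates)} proxies failed") from last_error
```

Each call to `fetch_with_rotation` draws a fresh random order, so consecutive requests will usually leave from different addresses.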




2. Use it directly through the operating system


Setting up the IP proxy in the operating system


To set a proxy at the system level on a Windows computer, you can follow these steps:

1. Open the Control Panel: Click the "Start" button, type "Control Panel" in the search box, and press the "Enter" key.

2. Open "Internet Options": In the Control Panel, click "Network and Internet", then click "Internet Options".

3. Open the "Connections" tab: In the Internet Properties window, select the "Connections" tab.

4. Open "LAN settings": Click the "LAN settings" button at the bottom of the tab.

5. Enable the proxy: Under "Proxy server", check "Use a proxy server for your LAN".

6. Configure the proxy server: Enter the proxy server address in the "Address" field and the port number in the "Port" field.

7. Click "OK" and save the changes: After completing the above settings, click "OK" in both windows to save the changes.

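Please note that this is a basic guide to the steps, which may vary slightly for different operating system versions or network configurations. On Windows 10 and 11, a proxy can also be set under Settings > Network & Internet > Proxy > Manual proxy setup.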
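
To check from Python which system-level proxy a program will pick up, the standard library function `urllib.request.getproxies()` can be used. It reads proxy environment variables such as `http_proxy` and `https_proxy` first and, on Windows, falls back to the proxy configured in the system settings. A small sketch follows; the helper name `describe_system_proxies` is made up for this example.

```python
import urllib.request


def describe_system_proxies():
    """Return a readable summary of the proxies Python detects on this machine."""
    proxies = urllib.request.getproxies()  # mapping of scheme to proxy URL
    if not proxies:
        return "no system proxy detected"
    return ", ".join(f"{scheme} -> {address}" for scheme, address in sorted(proxies.items()))


if __name__ == "__main__":
    print(describe_system_proxies())
```

If the output does not show the address you entered, the setting was probably not saved.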
