openproxyherder/scripts
Em a82231ca70
Reset num_failures to 0 when processing removal requests
2021-11-09 07:27:25 -05:00
checkers create requirements.txt for each type of script 2021-11-02 07:47:39 -04:00
dronebl Reset num_failures to 0 when processing removal requests 2021-11-09 07:27:25 -05:00
gatherers create requirements.txt for each type of script 2021-11-02 07:47:39 -04:00
README.md add scripts and fix config 2021-06-23 14:44:52 -04:00

README.md

Some of the proxy scrapers and checkers that can be used with openproxyherder (OPH).

Works with recent Python 3. Make a venv, install the requirements.txt for the type of script you want to run, copy config.example.yaml to config.yaml, and change the values to match your setup.

These are not shining examples of code; they're quick scripts that work (unless they don't work), meant to show how to interact with OPH.

No, I'm not going to give you my list of proxy sites. Do your own research.

Also be a good citizen of the internet.

Checkers

Scripts that check IPs listed in OPH.

http_socks.py

Checks HTTP/SOCKS(4|5) proxies. It attempts to connect through the proxy to a webpage that consists solely of [connecting ip address] [string defined in scrapers/config.yaml]. The one in the example config is a public one you are welcome to use. If you want to host your own, nginx SSI is the easiest way: serve a page containing <!--#echo var="REMOTE_ADDR" --> yourproxystring.
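
For illustration, here is a rough sketch of how a response from that kind of check page could be validated. It is not taken from http_socks.py. The function name parse_check_page and the marker value yourproxystring are made up for the example, and it assumes the page body is exactly one IP address, whitespace, then the configured string.

```python
import ipaddress
from typing import Optional


def parse_check_page(body: str, expected_string: str) -> Optional[str]:
    """Return the IP the check page saw, or None if the body doesn't match.

    Hypothetical helper: assumes the body is "<ip> <expected_string>".
    """
    # Split into the leading token (the IP) and everything after it.
    parts = body.strip().split(maxsplit=1)
    if len(parts) != 2:
        return None
    seen_ip, marker = parts

    # Anything other than the configured string means we got some other page
    # (captive portal, error page, login form, and so on).
    if marker.strip() != expected_string:
        return None

    # Make sure the first token really is an IPv4 or IPv6 address.
    try:
        ipaddress.ip_address(seen_ip)
    except ValueError:
        return None
    return seen_ip


if __name__ == "__main__":
    print(parse_check_page("203.0.113.7 yourproxystring\n", "yourproxystring"))
    print(parse_check_page("<html>Login required</html>", "yourproxystring"))
```

Checking for the marker string as well as the IP is what separates a working open proxy from one that just hands back some unrelated page. The returned IP is the address the page saw, which can differ from the address the proxy listens on.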

vpngate.py

Checks VPNGate servers. Not much else to say here. VPNGate is technically a VPN, not a proxy, but locking the door and leaving the key in the lock is essentially leaving it open, and that's what VPNGate does.

Gatherers

Scripts that get proxies from various places.

Note that as of now, some of these dump directly into the database rather than use the OPH API. This is for historical reasons: OPH didn't originally use asyncpg, and it would lock up when adding hundreds of proxies. This will probably change eventually.