Journal 1

Lots of cool stuff, so here we go!


Originally, my plan was simply to learn to web scrape, auto-fill the information on specific websites, call it a day, make a million dollars, and then, bam, winner of the coolest guy ever and a Nobel Peace Prize. Problem is, because of the way I had my code set up, these websites pretty much instantly detected that I was a bot and rejected my POST requests. To clarify, an HTTP POST request sends data to a designated server. It works a lot like everyday conversation: if I said something to someone else, that would be a POST request in daily life. I was planning to use Selenium, a browser automation tool often used for web scraping, and regex, a pattern-matching tool for picking out unique identifiers, to make it work (see screenshot below), but I ultimately scrapped that idea.
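
To make the POST idea concrete, here is a minimal sketch in Python of what such a request looks like before it goes anywhere. The URL and form fields are made-up placeholders, not the ones from my project, and the request is only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, fields):
    """Build (but do not send) an HTTP POST request carrying form fields."""
    # Form data travels in the request body, encoded as key=value pairs.
    body = urlencode(fields).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical endpoint and fields, for illustration only.
request = build_form_post(
    "https://example.com/form",
    {"name": "Ada", "email": "ada@example.com"},
)

print(request.get_method())  # the HTTP method of the request
print(request.data)          # the encoded body the server would receive
```

The point is that a POST is just a URL plus a body of data, which is why a server can inspect that data and decide whether the sender looks like a real browser.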

What I ended up doing instead was reading into the POST data that gets sent every time you submit a form. For example, Google issues a specific token every time you complete a CAPTCHA. The token can be verified for about two minutes, and that is how Google CAPTCHA verification works.
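
The time-limited part can be sketched like this. It is only an illustration of the idea of a token that expires, assuming a two-minute lifetime; the class and names are hypothetical and are not Google's actual implementation.

```python
import time
from dataclasses import dataclass

# Assumed lifetime for illustration: two minutes.
TOKEN_LIFETIME_SECONDS = 120


@dataclass
class TimedToken:
    """A token that is only accepted for a short window after it is issued."""
    value: str
    issued_at: float  # seconds since the epoch

    def is_valid(self, now=None):
        # Use the current time unless a specific time is passed in.
        now = time.time() if now is None else now
        age = now - self.issued_at
        return 0 <= age < TOKEN_LIFETIME_SECONDS


# Hypothetical token issued at t = 1000 seconds.
token = TimedToken(value="example-token", issued_at=1000.0)

print(token.is_valid(now=1060.0))  # one minute old
print(token.is_valid(now=1200.0))  # more than three minutes old
```

Passing `now` explicitly makes the check easy to try out without waiting for a real clock.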


I read the website's terms of service, and slowly figured out that an ID value that should be unique for every user was hardcoded, which made it possible to send fake POST requests by using something called a harvester. The harvester was initially made for botting sneaker websites in order to get some sick new kicks. What it does is prompt you to fill in a CAPTCHA, which generates the Google CAPTCHA token. For two minutes, that valid token can be used to send separate POST requests, which tricks the system into thinking I am a real person. That bypasses the rest of the removal process, so the user only has to solve one CAPTCHA and verify an email in their inbox.


The first removal and bypass is in the works for Intelius, BeenVerified, and Fast People Search, using cloudscraper, and I am proud of the progress I have made. Still more to do though!


Accomplishments


Reflection

I have made a lot of great progress recently, but I still have a lot to accomplish. Considering I wasn’t even supposed to finish one until the end of the semester, and I now have ~6 finished, I would like to make one big program that does it all. I still have a few more websites I would like to finish, but I am making incredible progress.