Everyone loves cartoons and comics, eh? When it comes to political satire, most of you won’t say no. Like this one; you won’t be laughing again in life. Fortunately, I had a nice night scraping the Cartoonscape section of The Hindu, India’s century-old newspaper, where renowned cartoonists like R. K. Laxman and K. Shankar Pillai once ruled the pen.
The script scrapes all visible Cartoonscape images (in .jpg format) from The Hindu’s English website and stores them in a folder, naming each file after its publication date.
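To make the naming convention concrete, here is a minimal sketch of that step, not the code from the gist. The helper names (`is_jpg`, `cartoon_path`, `save_cartoon`), the `YYYY-MM-DD.jpg` file name layout and the example URLs are all my assumptions for illustration.

```python
import datetime
from pathlib import Path
from urllib.parse import urlparse


def is_jpg(url):
    """Return True when the URL path ends in .jpg (case-insensitive)."""
    return urlparse(url).path.lower().endswith(".jpg")


def cartoon_path(folder, published):
    """Build <folder>/<YYYY-MM-DD>.jpg for a cartoon published on a given date.

    The ISO date format is an assumption; any date format that sorts well works.
    """
    return Path(folder) / f"{published.isoformat()}.jpg"


def save_cartoon(folder, published, image_bytes):
    """Write already-downloaded image bytes to the date-named file."""
    target = cartoon_path(folder, published)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(image_bytes)
    return target


# Hypothetical URLs, used only to show the filter.
print(is_jpg("http://example.com/images/cartoon.JPG?width=600"))
print(is_jpg("http://example.com/images/logo.png"))
print(cartoon_path("cartoons", datetime.date(2016, 3, 14)).name)
```

Keeping the download itself out of these helpers means the naming logic can be checked without touching the network.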
It’s fun to catch up on the cartoons you missed: a small image column tucked into a corner of the fresh newspaper that still divulges the nation’s hot topic, or lands a small kick in the backside of whatever or whoever caused the unconvincing situation of the previous day. Playing with the knowledge you are gaining is an ecstasy and a chance to test your skills.
Below is the gist containing the code. It’s Python, so there is nothing more to explain about a pizza in plain sight.
Interestingly, I had never used the mechanize module much, as I stayed curled up under the invisibility cloak of requests/bs4. But mechanize has some much-needed, minimal features, such as listing the current page’s URLs with regex filtering options like url_regex and text_regex. It got the job done in a few lines of code (as far as I know).
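To show what that kind of filtering does, here is a standard-library-only sketch of the same idea. It is a stand-in, not mechanize itself and not the code from the gist; with mechanize the equivalent would be roughly `br.links(url_regex=...)` on an opened `Browser`. The `LinkCollector` class, the `filter_links` function and the sample HTML with its paths are made up for illustration.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (url, text) pairs for every <a href=...> in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def filter_links(html, url_regex=None, text_regex=None):
    """Keep links whose URL and/or link text match the given patterns."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    kept = []
    for url, text in parser.links:
        if url_regex and not re.search(url_regex, url):
            continue
        if text_regex and not re.search(text_regex, text):
            continue
        kept.append((url, text))
    return kept


# Made-up page fragment; the paths are hypothetical.
SAMPLE = (
    '<a href="/opinion/cartoon/article1.ece">Cartoonscape</a>'
    '<a href="/sport/article2.ece">Cricket</a>'
    "<a>no href here</a>"
)

print(filter_links(SAMPLE, url_regex=r"/cartoon/"))
```

Filtering on the URL pattern first keeps the list of pages to visit short, which is where most of the saved lines come from.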
- Interestingly, when searching Google for earlier projects or code for scraping The Hindu, I came across this one.
- @2020saurav’s random script repo has one such script, but it is outdated.