Between school assignments, client work and personal projects, I don’t have much time, so I’ll be concise.
First, let’s recap the context:
During a Growth Hacking course we had to launch a product and get as much feedback as possible in 1 week… The rather vague instructions boiled down to validating an MVP (minimum viable product) while respecting the AARRR framework.
The group’s idea:
We play the role of an importer of Brazilian products looking for potential new customers (restaurants).
A dinner event, a landing page, social networks… not bad for finding qualified leads, but not enough.
So how do we expand this base?
By scraping Trip Advisor… finally we get to the heart of the matter.
Let’s start with a short definition of scraping from Adrien Lachaize:
“Web Scraping is a technique for extracting information from websites. This technique focuses mainly on the transformation of unstructured data (HTML format) on the web into structured data (database or spreadsheet).”
In our case, a restaurant’s page on Trip Advisor always follows the same architecture, which makes it much easier to extract data.
We often read that you need to write scripts in Python, JavaScript, etc., which makes scraping look like a complicated practice… but this is not true (in most cases ;)
Here we only use a macro in Google Sheets and a little HTML knowledge.
This macro gave us the information we needed: name, address, phone number, email, and that’s all.
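The macro itself lives in the Google Sheet, so the snippet below is not it: just a minimal Python sketch of the same idea, for those who like to see the principle in code. Since every page follows the same structure, each field can be mapped to a tag and a class name, and you read the text inside. The class names, the `SELECTORS` mapping, the `extract_fields` helper and the sample page are all made up for illustration; they are not Trip Advisor’s real markup.

```python
from html.parser import HTMLParser

# Hypothetical mapping: field name -> (tag, class name).
# These class names are invented for the example, not Trip Advisor's real ones.
SELECTORS = {
    "name": ("h1", "restaurant-name"),
    "address": ("span", "street-address"),
    "tel": ("span", "phone"),
    "mail": ("span", "email"),
}

# Made-up page fragment standing in for a restaurant page.
SAMPLE_PAGE = """
<div class="header"><h1 class="restaurant-name">Chez Exemple</h1></div>
<div class="contact">
  <span class="street-address">1 rue de l'Exemple, Paris</span>
  <span class="phone">+33 1 00 00 00 00</span>
  <span class="email">contact@example.com</span>
</div>
"""


class FieldExtractor(HTMLParser):
    """Collect the text of the first element matching each selector.

    Deliberately simple: it does not handle a matching tag nested
    inside another element of the same tag name.
    """

    def __init__(self, selectors):
        super().__init__()
        self.selectors = selectors
        self.fields = {}
        self._current = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for field, (want_tag, want_class) in self.selectors.items():
            if tag == want_tag and want_class in classes and field not in self.fields:
                self._current = field
                self.fields[field] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self.selectors[self._current][0]:
            self.fields[self._current] = self.fields[self._current].strip()
            self._current = None


def extract_fields(html, selectors=SELECTORS):
    """Return a dict of the fields found in the page (missing ones are absent)."""
    parser = FieldExtractor(selectors)
    parser.feed(html)
    parser.close()
    return parser.fields
```

Keeping the tag and class names in one mapping means a redesign of the page only requires editing `SELECTORS`, not the extraction logic.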
The most complicated part is retrieving the links. An example of a query is in the header -> use it with the MozBar extension to collect all the links from a Google search 😉
Here is the link to look at and understand the macro… If you ever want to use it, create a copy in your Drive.
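Once the links are collected, the list usually needs a bit of cleaning before it goes into the sheet. Here is a small Python sketch of that step, assuming the links have already been exported as plain URLs. The `keep_restaurant_links` helper is hypothetical, and the two filters (a host containing `tripadvisor.` and a path containing `Restaurant_Review`) are assumptions about what a restaurant page URL looks like; adjust them to what you actually see in your export.

```python
from urllib.parse import urlsplit


def keep_restaurant_links(urls, host_part="tripadvisor.", path_part="Restaurant_Review"):
    """Return unique restaurant-page URLs, in their original order.

    host_part and path_part are assumed patterns, not an official URL scheme.
    """
    seen = set()
    kept = []
    for url in urls:
        parts = urlsplit(url.strip())
        host = (parts.hostname or "").lower()
        if host_part not in host or path_part not in parts.path:
            continue  # not a restaurant page, or not a usable URL
        # Drop the query string and fragment so variants collapse to one URL.
        clean = f"{parts.scheme}://{parts.netloc}{parts.path}"
        if clean not in seen:
            seen.add(clean)
            kept.append(clean)
    return kept
```

The result can then be pasted into the sheet as one link per row.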
link : Trip Advisor Scrapper
If the Trip Advisor interface is redesigned, chances are it won’t work anymore. If that happens, I’ll let you tweak the macro by changing the div and span names.
And be careful not to make too many requests: past a certain number, they slow us down.
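One simple way to stay under the radar is to pause between requests. Below is a minimal Python sketch of that idea, not something taken from the macro: `fetch_all` is a hypothetical helper, `fetch` stands for whatever function you use to download a page, and the 2-second default is an arbitrary guess, not a limit documented by Trip Advisor.

```python
import time


def fetch_all(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `sleep` can be swapped out (for example in tests) to avoid real waiting.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        results[url] = fetch(url)
    return results
```

If the slowdowns still show up, increasing `delay_seconds` is the first thing to try.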
See you in the next one 😉