
epassaro/alternativateatral

alternativateatral

Web scraper for user reviews on Alternativa Teatral

Usage

Binary for Linux

Download the binary from the Releases section and run:

$ ./scrape <URL> -o <OUTPUT_FILE>

Example:

$ ./scrape https://www.alternativateatral.com/opiniones65140-sex-vivi-tu-experiencia

The results are saved in JSONL format, with one review per line:

{"date": "25/04/2025 17:08", "author": "Patricia", "rating": "5", "text": "Excelente! Súper recomendable, un espectáculo diferente!"}
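To work with the output in Python, something like the following sketch may help. It assumes every record has the four fields shown above, that dates are always day/month/year with a 24-hour time, and that the rating is an integer stored as a string. The names `parse_review` and `load_reviews` are made up for this example and are not part of the scraper.

```python
import json
from datetime import datetime


def parse_review(line):
    """Turn one JSONL line into a dict with typed date and rating fields."""
    record = json.loads(line)
    # Assumed format, based on the sample record: DD/MM/YYYY HH:MM
    record["date"] = datetime.strptime(record["date"], "%d/%m/%Y %H:%M")
    # The scraper stores the rating as a string holding an integer
    record["rating"] = int(record["rating"])
    return record


def load_reviews(path):
    """Read every non-empty line of a JSONL file produced by the scraper."""
    with open(path, encoding="utf-8") as handle:
        return [parse_review(line) for line in handle if line.strip()]
```

Calling `load_reviews("<OUTPUT_FILE>")` returns a list of dicts, which can then be sorted by date or averaged by rating.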

Python Script

Install the dependencies from requirements.txt and run:

$ python src/scrape.py <URL> -o <OUTPUT_FILE>

Development

For development/packaging, create the Conda environment:

$ conda env create -f environment.yml
$ conda activate alternativa

Limitations

  • I suspect the Alternativa Teatral website has a limit of 999 pages per play. At 7 comments per page, this would represent a maximum of 6,993 reviews.
  • Although the site allows rating a play in half-star increments, the script only captures the integer part of the rating.
  • To enable packaging into a binary, SSL certificate verification was disabled. This means the binary cannot detect a man-in-the-middle attack on the connection. An alternative would be to bundle cacert.pem alongside the binary.
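
The cacert.pem alternative mentioned in the last point could look roughly like this. It is only a sketch using the standard library: it assumes the certificate file sits in the same directory as the executable, and the helper names `ca_bundle_path` and `make_ssl_context` are hypothetical. How the script actually makes its HTTP requests is not covered here, so the resulting context would still need to be wired into whatever client it uses.

```python
import os
import ssl
import sys


def ca_bundle_path(executable=None, filename="cacert.pem"):
    """Return the expected path of the CA bundle next to the executable."""
    # Assumption: when packaged, sys.executable points at the binary itself
    executable = executable or sys.executable
    base_dir = os.path.dirname(os.path.abspath(executable))
    return os.path.join(base_dir, filename)


def make_ssl_context(cafile):
    """Build a verifying SSL context that trusts only the given CA bundle."""
    # Raises FileNotFoundError if the bundle is missing, which is safer
    # than silently falling back to unverified connections
    return ssl.create_default_context(cafile=cafile)
```

With this approach verification stays enabled, and a missing bundle fails loudly instead of weakening security.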
