HTML scraper in Python for SpellTastic.
Nicolas FRANCO d43d67c8b7


Spell Scrapper 📜 🐍

About this repository

This repository was built to provide an up-to-date spell database for SpellTastic, a cross-platform spell manager for Pathfinder.

Data source

All data is retrieved from d20pfsrd, the #1 Pathfinder Roleplaying Game rules reference site. All spells can be found in its spells section.

The latest data extracted is available as a YAML file.

Getting Started

Prerequisites

  • Python 3.6+

Python libraries

  • BeautifulSoup4
  • Requests
  • lxml
  • PyYAML
  • sqlite3 (part of the Python standard library, no installation needed)
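
To confirm that the third-party libraries are available before running the scripts, a small check like the one below can help. It is only a sketch and not part of this repository: the function name `missing_libraries` and the `REQUIRED` mapping are made up for illustration. The mapping pairs each pip package name with the module name it is imported under.

```python
from importlib.util import find_spec

# pip package name -> import name (illustrative mapping, not shipped with the repo)
REQUIRED = {
    "beautifulsoup4": "bs4",
    "requests": "requests",
    "lxml": "lxml",
    "pyyaml": "yaml",
}


def missing_libraries(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing libraries, install with: pip install " + " ".join(missing))
    else:
        print("All required libraries are installed.")
```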

Installing

  1. Clone the repository
git clone https://github.com/your_username/pathfinder-spell-scraper.git
  2. Install the required libraries
pip install requests beautifulsoup4 lxml pyyaml

Usage

Scraping

  1. Run scrap-spells.py to scrape the spell information from the website:
python3 scrap-spells.py

The script should take a few minutes to scrape all spells.

  2. The script generates a spells.yaml file containing all spells and their attributes.
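
For readers curious about what turning a spell page into attributes can look like, here is a minimal sketch using BeautifulSoup. It is not the code from scrap-spells.py: the sample HTML, the spell in it, and the `parse_spell` function are all invented for illustration, and the real d20pfsrd markup may be laid out differently.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real page layout may differ.
SAMPLE_HTML = """
<html><body>
  <h1>Example Spell</h1>
  <p><b>School</b> evocation; <b>Level</b> wizard 1</p>
  <p><b>Casting Time</b> 1 standard action</p>
</body></html>
"""


def parse_spell(html):
    """Turn one spell page into a dict of attributes (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    spell = {"name": soup.h1.get_text(strip=True)}
    # Treat each bold label as a key and the text right after it as the value.
    for label in soup.find_all("b"):
        value = label.next_sibling
        if isinstance(value, str):
            key = label.get_text(strip=True).lower()
            spell[key] = value.strip(" ;\n")
    return spell


if __name__ == "__main__":
    print(parse_spell(SAMPLE_HTML))
```

A list of such dicts could then be written to a YAML file with PyYAML's `yaml.safe_dump`.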

Database

  1. Build a .db sqlite3 database file by running the spell-db.py script:
python3 spell-db.py
  2. The script creates a spells.db file with a spell table containing all the spell information.
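
As a rough idea of how a spell table can be filled from a list of spells, consider the sketch below. It is not the code from the repository: the function `build_spell_db` and the three columns (name, school, level) are assumptions chosen for illustration, and the real table most likely has more columns.

```python
import sqlite3


def build_spell_db(spells, db_path="spells.db"):
    """Create a `spell` table and insert one row per spell dict (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    # Fill in missing attributes with None so every row has the same keys.
    rows = [
        {"name": s.get("name"), "school": s.get("school"), "level": s.get("level")}
        for s in spells
    ]
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS spell")
        conn.execute(
            "CREATE TABLE spell (name TEXT PRIMARY KEY, school TEXT, level TEXT)"
        )
        conn.executemany(
            "INSERT INTO spell (name, school, level) "
            "VALUES (:name, :school, :level)",
            rows,
        )
    return conn


if __name__ == "__main__":
    # Made-up sample data; in practice the list would come from spells.yaml.
    sample = [{"name": "Example Spell", "school": "evocation", "level": "wizard 1"}]
    conn = build_spell_db(sample, ":memory:")
    print(conn.execute("SELECT * FROM spell").fetchall())
```

In practice the list could be loaded from spells.yaml with `yaml.safe_load`, assuming the file holds a list of mappings.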