Spell Scrapper 📜 🐍
About this repository
This repository was built to maintain an up-to-date spell database for SpellTastic, a cross-platform spell manager for Pathfinder.
Data source
All data is retrieved from d20pfsrd, the #1 Pathfinder Roleplaying Game rules reference site. All spells are listed on its spells page.
The latest extracted data is available as a YAML file in the outputs directory.
Getting Started
Prerequisites
- Python 3.6+
Python libraries
- BeautifulSoup4
- Requests
- lxml
- PyYAML
- sqlite3 (part of the Python standard library, no separate install needed)
Installing
- Clone the repository
git clone https://github.com/your_username/pathfinder-spell-scraper.git
- Install the required libraries
pip install requests beautifulsoup4 lxml pyyaml
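After installing, a quick check that every prerequisite can be imported may save a failed run later. The sketch below is not part of the repository: the helper name missing_modules is hypothetical, and it assumes the usual import names for the libraries listed above (bs4 for BeautifulSoup4, yaml for PyYAML).

```python
import importlib.util

# Import names for the libraries listed under "Python libraries".
# Two of them differ from their pip package names (bs4, yaml).
REQUIRED_MODULES = ["requests", "bs4", "lxml", "yaml", "sqlite3"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

An empty result means the scraping and database scripts should at least be able to import what they need.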
Usage
Scraping
- You can run scrap-spells.py to scrape the spell information from the website:
python scrapping/scrap-spells.py
- This command will generate a file spells.yaml with all spells and their attributes. The file should be found in the outputs directory.
A progress bar should be displayed in your terminal while scraping, showing the time left and the number of spells scraped. The script should take about 20 minutes to scrape all spells.
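Once spells.yaml exists, it can be loaded back into Python with PyYAML. The following is a minimal sketch, not code from the repository: the function name load_spells is hypothetical, and the path assumes the script is run from the repository root. The layout of the file (a list of spells or a mapping keyed by name) is not documented here, so the example only counts the top-level entries.

```python
from pathlib import Path

import yaml  # provided by the PyYAML package


def load_spells(path):
    """Parse the YAML file at `path` and return its top-level object."""
    with Path(path).open(encoding="utf-8") as handle:
        # safe_load builds only plain Python types (dict, list, str, ...).
        return yaml.safe_load(handle)


if __name__ == "__main__":
    # Assumed location, based on the outputs directory mentioned above.
    spells_file = Path("outputs/spells.yaml")
    if spells_file.exists():
        spells = load_spells(spells_file)
        print(f"Loaded {len(spells)} top-level entries ({type(spells).__name__})")
    else:
        print(f"{spells_file} not found; run the scraper first.")
```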
Database
- You can build a .db sqlite3 database file by running the spell-db.py file:
python database/spell-db.py
- The script will generate a spells.db file with a spell table containing all the spell information. This file should also be found in the outputs directory.
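To check the generated database, the standard-library sqlite3 module is enough. This sketch is an illustration rather than code from the repository: the helper describe_spell_table is hypothetical, the path assumes the repository root as the working directory, and the table name spell is taken from the description above. Column names are not documented here, so they are read from the database instead of being assumed.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def describe_spell_table(db_path, table="spell"):
    """Return (column_names, row_count) for `table` in the SQLite file."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        # PRAGMA table_info yields one row per column; index 1 is the name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    return columns, count


if __name__ == "__main__":
    # Assumed location, based on the outputs directory mentioned above.
    db_file = Path("outputs/spells.db")
    if db_file.exists():
        columns, count = describe_spell_table(db_file)
        print(f"{count} rows; columns: {', '.join(columns)}")
    else:
        print(f"{db_file} not found; build the database first.")
```

Checking that the file exists before connecting matters because sqlite3.connect would otherwise create an empty database at that path.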