Spell Scrapper 📜 🐍

About this repository

This repository was built to maintain an up-to-date spell database for SpellTastic, a cross-platform spell manager for Pathfinder.

Data source

All data is retrieved from d20pfsrd, the #1 Pathfinder Roleplaying Game rules reference site. All spells are listed on its spells page.

The latest data extracted is available as a YAML file.

Getting Started

Prerequisites

  • Python 3.6+

Python libraries

  • BeautifulSoup4
  • Requests
  • lxml
  • PyYAML
  • sqlite3 (part of the Python standard library, no installation needed)

Installing

  1. Clone the repository
git clone https://github.com/your_username/pathfinder-spell-scraper.git
  2. Install the required libraries
pip install requests beautifulsoup4 lxml pyyaml
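
After installing, it can be useful to confirm that every library is importable before starting a long scraping run. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the list uses import names, which differ from the pip package names for two libraries (`beautifulsoup4` is imported as `bs4`, `pyyaml` as `yaml`).

```python
import importlib.util

# Import names, not pip package names:
# beautifulsoup4 -> bs4, pyyaml -> yaml. sqlite3 ships with Python.
REQUIRED_MODULES = ["requests", "bs4", "lxml", "yaml", "sqlite3"]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If any module is reported as missing, rerunning the pip command above in the same Python environment is the usual fix.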

Usage

Scraping

  1. You can run scrap-spells.py to scrape the spell information from the website:
python scrapping/scrap-spells.py

A progress bar in your terminal shows the estimated time left and the number of spells scraped. The script takes about 20 minutes to scrape all spells.

  2. This command generates a spells.yaml file containing all spells and their attributes.
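
Once spells.yaml exists, it can be read back with PyYAML. The sketch below is an illustration only: the function names `load_spells` and `summarize` are hypothetical, and because the file's schema is not documented here, the code reports just the type and size of the top-level structure rather than assuming any field names.

```python
import yaml  # provided by the PyYAML package


def load_spells(path="spells.yaml"):
    """Parse the YAML file and return its top-level structure."""
    with open(path, encoding="utf-8") as handle:
        # safe_load only builds plain Python objects (dict, list, str, ...).
        return yaml.safe_load(handle)


def summarize(data):
    """Describe the top-level container without assuming its schema."""
    if isinstance(data, (list, dict)):
        return f"{type(data).__name__} with {len(data)} entries"
    return f"unexpected top-level value of type {type(data).__name__}"
```

Calling `print(summarize(load_spells()))` from the directory that contains spells.yaml gives a quick sanity check that the scrape produced a non-empty file.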

Database

  1. You can build a .db SQLite database file by running the spell-db.py script:
python database/spell-db.py
  2. The script creates a spells.db file with a spell table containing all the spell information.
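
To check the result, the generated database can be inspected with Python's built-in sqlite3 module. This is a minimal sketch rather than code from the repository: the helper `describe_table` is hypothetical, and since the column names of the spell table are not listed here, it discovers them at runtime instead of assuming them.

```python
import sqlite3


def describe_table(connection, table="spell"):
    """Return (column_names, row_count) for `table`."""
    # Quote the identifier so unusual table names cannot break the SQL.
    quoted = '"' + table.replace('"', '""') + '"'
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in connection.execute(f"PRAGMA table_info({quoted})")]
    if not columns:
        raise ValueError(f"table {table!r} not found")
    row_count = connection.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return columns, row_count
```

For example, `describe_table(sqlite3.connect("spells.db"))` returns the column names and the number of stored spells. Note that `sqlite3.connect` creates an empty file if spells.db does not exist yet, in which case the helper raises `ValueError`.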