html2epub
A simple book scraper written in Python that converts HTML pages to EPUB.
Install
pip install bs4 inquirer html2epub tqdm
Usage
wget https://raw.githubusercontent.com/ixiaopan/DataScience/master/Utilities/Scraper/download.py
python download.py
After running the commands above, select a source when prompted.
Note that this script can only convert HTML to EPUB, so only books available in HTML format can be downloaded.
Books with chapter list
Books may or may not have a chapter list. For a book with a chapter list, just enter its URL, for example:
https://booksvooks.com/selfish-shallow-and-self-absorbed-sixteen-writers-on-the-decision-not-to-have-kids-pdf.html
Books without chapter list
For books without a chapter list, append the total number of pages to the page URL as the page query parameter. For example, the book This Is Going to Hurt does not have a chapter list, so the URL is
https://booksvooks.com/fullbook/this-is-going-to-hurt-pdf-adam-kay.html?page=32
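To make the role of the page parameter concrete, here is a minimal sketch of how a URL of this shape could be expanded into one URL per page. It is not taken from download.py: the function name page_urls is hypothetical, and it assumes pages are numbered from 1 up to the total given in the URL.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode


def page_urls(book_url):
    """Expand a book URL into one URL per page.

    Assumption (not confirmed by download.py): pages are numbered
    1..total, where total is the value of the `page` query parameter.
    """
    parts = urlsplit(book_url)
    query = parse_qs(parts.query)
    if "page" not in query:
        # No page count given: treat the URL as a single page.
        return [book_url]
    total = int(query["page"][0])
    return [
        urlunsplit(parts._replace(query=urlencode({"page": n})))
        for n in range(1, total + 1)
    ]


urls = page_urls(
    "https://booksvooks.com/fullbook/this-is-going-to-hurt-pdf-adam-kay.html?page=32"
)
print(len(urls), urls[0])
```

A URL without a page parameter is returned unchanged, which matches the case of books that expose a chapter list instead.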
The book will be downloaded to the folder booksVooks/ or jinjiang/, depending on your source.
- booksVooks
  - thisisgoingtohurtpdfadamkaypage32.epub
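The file name above looks like the last part of the URL with the .html extension and all punctuation removed. The sketch below reproduces that pattern for the example URL. It is an inference from this one example, not the naming code in download.py, and the helper name epub_name is hypothetical.

```python
import re
from urllib.parse import urlsplit


def epub_name(book_url):
    """Guess the output file name for a book URL.

    Assumption: the name is the last path segment (minus ".html")
    followed by the query string, with every non-alphanumeric
    character stripped. This is inferred from a single example.
    """
    parts = urlsplit(book_url)
    slug = parts.path.rsplit("/", 1)[-1]
    if slug.endswith(".html"):
        slug = slug[: -len(".html")]
    return re.sub(r"[^A-Za-z0-9]", "", slug + parts.query) + ".epub"


print(epub_name(
    "https://booksvooks.com/fullbook/this-is-going-to-hurt-pdf-adam-kay.html?page=32"
))
```

For the example URL this yields thisisgoingtohurtpdfadamkaypage32.epub, the same name as in the listing; other sources such as jinjiang may follow a different scheme.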