bookscrape
A scraper for Standard Ebooks, Project Gutenberg, and Global Grey Ebooks. It produces JSON output listing books in a format compatible with libman. The goal is a searchable but modular ebook manager (like a package manager, but for ebooks and their sources). That said, the JSON documents are flexible enough to be ingested by any other system that wants to use these book catalogs.
Building
go build
or
go install
Running
bookscrape -se # fetch standard ebooks
bookscrape -gg # fetch global grey
bookscrape -pg # fetch project gutenberg
# There is also a convenient `-all` flag to do all of the above in one command
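As a rough illustration of how the four flags could map to sources, here is a minimal sketch using Go's standard `flag` package. Only the flag names come from the commands above. The `selectSources` helper and the source labels are hypothetical and are not taken from the bookscrape code.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// selectSources is a hypothetical helper: it parses the command-line
// arguments and returns labels for the sources that should be fetched.
// The labels are illustrative only.
func selectSources(args []string) ([]string, error) {
	fs := flag.NewFlagSet("bookscrape", flag.ContinueOnError)
	se := fs.Bool("se", false, "fetch Standard Ebooks")
	gg := fs.Bool("gg", false, "fetch Global Grey")
	pg := fs.Bool("pg", false, "fetch Project Gutenberg")
	all := fs.Bool("all", false, "fetch all three sources")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	// -all simply switches every individual source on.
	var sources []string
	if *se || *all {
		sources = append(sources, "standard-ebooks")
	}
	if *gg || *all {
		sources = append(sources, "global-grey")
	}
	if *pg || *all {
		sources = append(sources, "project-gutenberg")
	}
	return sources, nil
}

func main() {
	sources, err := selectSources(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	for _, s := range sources {
		fmt.Println("would fetch:", s)
	}
}
```

Keeping the sources as a list makes it natural to write one output file per source, whichever flags were given.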
Each command produces its own JSON file (even when `-all` is used). The sizes vary. The Gutenberg file is the largest, since that catalog is many times larger than the other two combined. However, Gutenberg is also the fastest to build, since the website does not need to be crawled and scraped: it provides a CSV file, which this program ingests and converts into the much larger JSON file.