[IIAB] Gutenberg epubs and html
Braddock
braddock at braddock.com
Sun Jun 30 20:55:26 PDT 2013
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 06/30/2013 05:32 AM, Emmanuel Engelhart wrote:
> Dear Braddock
>
> I'm more interested in gutenberg-htmlz-images than in
> gutenberg-epub. The reason is that ZIM files should be directly
> usable without using an additional EPUB reader.
I'm having some trouble getting qbittorrent to build the full 23GB
gutenberg-htmlz-images torrent. Do you have a recommended (Linux)
tool? I have never created a torrent before.
Also, is there a public tracker you can recommend? I was surprised
that legaltorrents actually charges for posting.
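
For context on what a torrent-creation tool has to produce: a .torrent file is a bencoded dictionary holding the tracker URL and the SHA-1 hash of each fixed-size piece of the payload. The following is only a minimal sketch for a single-file torrent, not a substitute for a real client. The function names, the 1 MiB piece size, and the tracker URL in the usage note are assumptions made for illustration. A multi-file collection such as gutenberg-htmlz-images would need a "files" list in the info dictionary instead of "length".

```python
import hashlib


def bencode(value):
    """Encode ints, bytes/str, lists and dicts using BitTorrent bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must be byte strings and appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k), v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def piece_hashes(stream, piece_length):
    """Return the concatenated 20-byte SHA-1 digests of each piece."""
    digests = []
    while True:
        piece = stream.read(piece_length)
        if not piece:
            break
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)


def make_metainfo(name, stream, length, tracker_url, piece_length=2 ** 20):
    """Build the bytes of a single-file .torrent (hypothetical helper)."""
    info = {
        "name": name,
        "length": length,
        "piece length": piece_length,
        "pieces": piece_hashes(stream, piece_length),
    }
    return bencode({"announce": tracker_url, "info": info})
```

Usage would look something like `make_metainfo("books.zip", open("books.zip", "rb"), os.path.getsize("books.zip"), "http://tracker.example.org/announce")`, where the file name and tracker URL are placeholders.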
- -braddock
>
> The best case would be to have both in one ZIM file (I have no big
> problem with the redundancy) to give the choice to the reader.
>
> Do you think you could put this file online instead of the epub one?
> If you need some place to put that online, I can probably help you.
>
> Kind regards Emmanuel
>
> Thank you for your feedback. I'm mostly interested in the
>
> On 29/06/2013 23:55, Braddock wrote:
>> Hi Emmanuel, I am seeding a torrent of all the gutenberg epubs
>> without images.
>>
>> http://braddock.com/~braddock/expire/gutenberg-epub-201306.torrent
>>
>>
>>
>> I have very limited upload bandwidth. Let's start with the epubs
>> without images and see how it goes uploading across my DSL. I've
>> also got epubs with images, html with and without images.
>>
>> torrent size is about 7GB.
>>
>> -Braddock
>>
>>
>> On 06/21/2013 02:26 PM, Emmanuel Engelhart wrote:
>>> Hi Braddock
>>
>>> On 21/06/2013 22:04, Braddock wrote:
>>>> I finally got around to distilling the Gutenberg collection.
>>>> I've extracted all 40,000 books in epub format, and also
>>>> converted them to zipped html ("htmlz"). I've made
>>>> collections both with and without images.
>>>>
>>>> The result is four collections (sizes in gigabytes):
>>>>
>>>>   6.9G  gutenberg-htmlz
>>>>   23G   gutenberg-htmlz-images
>>>>   7.0G  gutenberg-epub
>>>>   23G   gutenberg-epub-images
>>>>
>>>> There is no metadata in these collections, just the books.
>>>> We could generate some meta using our database. I'm not sure
>>>> what you would need to make a usable zim.
>>>>
>>>> We are probably just going to keep them in this format
>>>> (instead of zimmifying them all) for the Internet-in-a-Box.
>>>>
>>>> I can make these available via torrents.
>>
>>> This is really great news! I cannot wait to see what it
>>> looks like. So, yes, a torrent URL would be perfect for the
>>> download.
>>
>>>> Other news, I wrote a from-scratch pure python ZIM file
>>>> reader I'm calling "zimpy".
>>>> https://github.com/braddockcg/internet-in-a-box/blob/master/iiab/zimpy.py
>>>>
>>>>
>>>>
>>
>>>>
>>>> I'm now using the zimpy code for reading zims for Internet-in-a-Box, and
>>>> if it gets a bit more mature I'll release it as a separate
>>>> project. It doesn't currently do anything more than I need.
>>>> The existing openzim bindings did not support any read
>>>> capability.
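
To illustrate what a pure-Python ZIM reader has to do first, here is a minimal sketch of parsing the fixed 80-byte ZIM file header with the standard struct module. It is not taken from zimpy. The field layout follows the openZIM file format description as I understand it, and the names ZimHeader and parse_zim_header are hypothetical.

```python
import struct
from collections import namedtuple

# Magic number at the start of every ZIM file (assumed from the openZIM
# format description): 72173914 == 0x044D495A.
ZIM_MAGIC = 72173914

# Little-endian, no padding: magic, major/minor version, 16-byte UUID,
# two 32-bit counts, four 64-bit offsets, two 32-bit page indexes,
# and a 64-bit checksum offset.
HEADER_FORMAT = "<IHH16sIIQQQQIIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 80 bytes

ZimHeader = namedtuple("ZimHeader", [
    "magic", "major_version", "minor_version", "uuid",
    "article_count", "cluster_count",
    "url_ptr_pos", "title_ptr_pos", "cluster_ptr_pos", "mime_list_pos",
    "main_page", "layout_page", "checksum_pos",
])


def parse_zim_header(data):
    """Parse the leading header bytes of a ZIM file into a ZimHeader."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least %d bytes" % HEADER_SIZE)
    header = ZimHeader._make(struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE]))
    if header.magic != ZIM_MAGIC:
        raise ValueError("not a ZIM file (bad magic number)")
    return header
```

A reader would call this on the first HEADER_SIZE bytes of the file, then seek to url_ptr_pos and cluster_ptr_pos to locate the directory entries and compressed clusters.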
>>
>>> It is great to see you were able to make your own ZIM reader.
>>> I have added it to our list:
>>> http://www.openzim.org/wiki/Readers#Without_user_interface_.2F_Console
>>
>>>
>>>
>>> Regards, Emmanuel
>>
>>
>>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQEcBAEBAgAGBQJR0P2uAAoJEHWLR/DQzlZuid0IALcX24oQzSP7ilvndTGJvm57
LxEERjtwTeQKsdV70CR6/FZvp/V7QRo/xPvPMTjbIGm/mtqePWGrgZQOk/ZAhnqf
yUclAfHMMEBJ5YhSWzb2p/qjU+ws0USYyyihwDajuh+UdJ3NExyV4sPc0U8UgSUG
DMNAhwal8NNPEMS9QhogmEZvs9+MYtxmDeG23GDR5REATc7183u0gwi9WOCENGAT
4vQV9hkpeNPguo+qBVVgQ3gHB3kzz+FJgro+yXLC6/X+M2OiCk60JLoEu4noCFXV
h8KLLHH3kc1zUh0hVEWY9Bzm4jgF0CCQ2dJ9bK9NCUeG1ScQhyYvT25C/sNTOgk=
=GqMG
-----END PGP SIGNATURE-----