[IIAB] Gutenberg epubs and html

Braddock braddock at braddock.com
Sun Jun 30 20:55:26 PDT 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 06/30/2013 05:32 AM, Emmanuel Engelhart wrote:
> Dear Braddock
> 
> I'm more interested in gutenberg-htmlz-images than in
> gutenberg-epub. The reason is that ZIM files should be directly
> usable without using an additional EPUB reader.

I'm having some trouble getting qbittorrent to build the full 23GB
gutenberg-htmlz-images torrent.  Do you have a recommended (Linux)
tool?  I have never created a torrent before.

Also, is there a public tracker you can recommend?  I was surprised
that legaltorrents actually charges for posting.
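
For context on what any such tool has to produce: a .torrent file is a
bencoded dictionary holding a tracker URL and the SHA-1 hash of each
fixed-size piece of the payload. Below is a minimal sketch, assuming a
single-file payload (for example one archive of the collection). The
function names, the 1 MiB piece size, and the tracker URL in the usage
note are illustrative assumptions, not a recommendation of a particular
tool or tracker. A directory torrent would need a "files" list instead
of "length", which this sketch does not handle.

```python
import hashlib
import os


def bencode(value):
    """Encode ints, strings, lists and dicts in BitTorrent's bencode format."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def make_torrent(path, announce_url, piece_length=2 ** 20):
    """Return bencoded metainfo for the single file at `path`."""
    pieces = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(piece_length)
            if not chunk:
                break
            # One 20-byte SHA-1 digest per piece, concatenated below.
            pieces.append(hashlib.sha1(chunk).digest())
    info = {
        "name": os.path.basename(path),
        "length": os.path.getsize(path),
        "piece length": piece_length,
        "pieces": b"".join(pieces),
    }
    return bencode({"announce": announce_url, "info": info})
```

Usage would look something like
make_torrent("collection.tar", "http://tracker.example/announce"), with
the returned bytes written to a .torrent file; both names here are
placeholders.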

- -braddock


> 
> The best case would be to have both in one ZIM file (I have no big 
> problem with the redundancy) to give the choice to the reader.
> 
> Do you think you could put this file online instead of the epub one?
> If you need somewhere to host it, I can probably help you.
> 
> Kind regards Emmanuel
> 
> Thank you for your feedback. I'm mostly interested in the
> 
> On 29/06/2013 23:55, Braddock wrote:
>> Hi Emmanuel, I am seeding a torrent of all the gutenberg epubs
>> without images.
>> 
>> http://braddock.com/~braddock/expire/gutenberg-epub-201306.torrent
>> 
>> I have very limited upload bandwidth.  Let's start with the epubs
>> without images and see how it goes uploading across my DSL.  I've
>> also got epubs with images, html with and without images.
>> 
>> The torrent size is about 7GB.
>> 
>> -Braddock
>> 
>> 
>> On 06/21/2013 02:26 PM, Emmanuel Engelhart wrote:
>>> Hi Braddock
>> 
>>> On 21/06/2013 22:04, Braddock wrote:
>>>> I finally got around to distilling the Gutenberg collection. 
>>>> I've extracted all 40,000 books in epub format, and also 
>>>> converted them to zipped html ("htmlz").  I've made
>>>> collections both with and without images.
>>>> 
>>>> The result is four collections (sizes in gigabytes):
>>>> 
>>>>   6.9G  gutenberg-htmlz
>>>>   23G   gutenberg-htmlz-images
>>>>   7.0G  gutenberg-epub
>>>>   23G   gutenberg-epub-images
>>>> 
>>>> There is no metadata in these collections, just the books.
>>>> We could generate some metadata using our database.  I'm not sure
>>>> what you would need to make a usable zim.
>>>> 
>>>> We are probably just going to keep them in this format
>>>> (instead of zimmifying them all) for the Internet-in-a-Box.
>>>> 
>>>> I can make these available via torrents.
>> 
>>> This is really great news! I cannot wait to see what it
>>> looks like. So, yes, a torrent URL would be perfect for the
>>> download.
>> 
>>>> In other news, I wrote a from-scratch, pure-Python ZIM file
>>>> reader that I'm calling "zimpy". 
>>>> https://github.com/braddockcg/internet-in-a-box/blob/master/iiab/zimpy.py
>>>> 
>>>> I'm now using the zimpy code for reading zims for Internet-in-a-Box, and
>>>> if it gets a bit more mature I'll release it as a separate 
>>>> project.  It doesn't currently do anything more than I need.
>>>> The existing openzim bindings did not support any read
>>>> capability.
>> 
>>> It is great to see you were able to make your own ZIM reader.
>>> I have added it to our list: 
>>> http://www.openzim.org/wiki/Readers#Without_user_interface_.2F_Console
>>> 
>>> Regards Emmanuel
>> 
>> 
>> 
> 
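
On the pure-Python ZIM reader discussed in the quoted thread: the first
thing any such reader has to parse is the fixed-size file header. The
sketch below is not taken from zimpy. It assumes the 80-byte
little-endian header layout described in the openZIM file format
documentation, with the version stored as two 16-bit fields (files
written with a single 32-bit version field of 5 decode as major 5,
minor 0). The names ZimHeader and read_zim_header are illustrative.

```python
import struct
from collections import namedtuple

ZIM_MAGIC = 72173914  # 0x044D495A, i.e. the bytes "ZIM\x04" on disk

# Assumed layout: magic, major/minor version, UUID, article and cluster
# counts, four 64-bit table offsets, main page, layout page, checksum offset.
HEADER = struct.Struct("<IHH16sIIQQQQIIQ")

ZimHeader = namedtuple(
    "ZimHeader",
    "magic major_version minor_version uuid article_count cluster_count "
    "url_ptr_pos title_ptr_pos cluster_ptr_pos mime_list_pos "
    "main_page layout_page checksum_pos",
)


def read_zim_header(f):
    """Read and validate the header from a binary file object."""
    data = f.read(HEADER.size)
    if len(data) != HEADER.size:
        raise ValueError("file too short to contain a ZIM header")
    header = ZimHeader(*HEADER.unpack(data))
    if header.magic != ZIM_MAGIC:
        raise ValueError("not a ZIM file (bad magic number)")
    return header
```

Called on a file opened with open(path, "rb"), this returns the article
count and the offsets of the URL, title and cluster pointer tables,
which is what a reader needs before it can look up any article.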

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJR0P2uAAoJEHWLR/DQzlZuid0IALcX24oQzSP7ilvndTGJvm57
LxEERjtwTeQKsdV70CR6/FZvp/V7QRo/xPvPMTjbIGm/mtqePWGrgZQOk/ZAhnqf
yUclAfHMMEBJ5YhSWzb2p/qjU+ws0USYyyihwDajuh+UdJ3NExyV4sPc0U8UgSUG
DMNAhwal8NNPEMS9QhogmEZvs9+MYtxmDeG23GDR5REATc7183u0gwi9WOCENGAT
4vQV9hkpeNPguo+qBVVgQ3gHB3kzz+FJgro+yXLC6/X+M2OiCk60JLoEu4noCFXV
h8KLLHH3kc1zUh0hVEWY9Bzm4jgF0CCQ2dJ9bK9NCUeG1ScQhyYvT25C/sNTOgk=
=GqMG
-----END PGP SIGNATURE-----


