[IIAB] [KIWIX][INTERNET-IN-A-BOX] Nice project!
Emmanuel Engelhart
kelson at kiwix.org
Tue Mar 26 14:50:26 PDT 2013
Hi Braddock
On 03/26/2013 09:56 PM, Braddock wrote:
> I'm preparing to do a dump of gutenberg to epub and converted html files
> that you could use to build a ZIM. (may or may not get it done this week
> however)
This is really not urgent on my side! I don't want to put you under
pressure :)
> How should I handle metadata for you to create a zim? Title, author,
> illustrator, year, language, etc? Is it possible to search on those fields?
I'm not sure exactly how to understand this question. Maybe we have a
misunderstanding. My wish is to have one ZIM file with all the
books of Gutenberg, so I don't see a big problem with the ZIM file
metadata. My proposal would be: publisher=yourprojectname,
creator=Project Gutenberg. Language is a big challenge because you have
books in many languages. Maybe it's better to make one ZIM file per
language? With the Kiwix fulltext search engine, it should be pretty
easy to find stuff. Good index pages should also help... cf.
my next comment.
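As a rough illustration of the per-language idea, the sketch below groups
book records by language and builds the proposed metadata for each ZIM file.
The record layout, the function names, and the sample entries are all
assumptions made for the example; this is not the actual ZIM writer API, only
a way to picture the split.

```python
from collections import defaultdict


def group_by_language(books):
    """Group book records (dicts with a 'language' key) by language code."""
    groups = defaultdict(list)
    for book in books:
        groups[book["language"]].append(book)
    return dict(groups)


def zim_metadata(language, publisher="yourprojectname"):
    """Build the metadata proposed above for one per-language ZIM file."""
    return {
        "publisher": publisher,          # placeholder name from the proposal
        "creator": "Project Gutenberg",
        "language": language,
    }


# Hypothetical sample records; ids, titles and languages are made up.
books = [
    {"id": 1, "title": "Sample Title One", "language": "en"},
    {"id": 2, "title": "Titre Exemple", "language": "fr"},
    {"id": 3, "title": "Sample Title Two", "language": "en"},
]

for language, group in sorted(group_by_language(books).items()):
    print(zim_metadata(language), len(group), "book(s)")
```

Each language group would then be fed to whatever tool builds the ZIM file,
with the matching metadata attached.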
> By default the gutenberg files do not have interesting file names
> (pg30532-image.epub for example). The meta data is stored in the obtuse
> catalog.rdf XML file, indexed by the id number of the text. We have all
> this parsed and broken out into a SQLite database for our own use.
This is great. The most important thing is to have a good HTML title tag
and, if possible, a meta tag with good keywords. Do you plan to create custom
HTML indexes with this data (like lists of books per author, language,
title)?
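To make the title tag, keywords, and index suggestions concrete, here is a
minimal sketch that reads book records from a SQLite database and produces an
HTML head plus a per-author index page. The table name `books`, its columns,
the `<id>.html` link scheme, and the sample rows are assumptions for
illustration; the real database schema is not described in this thread.

```python
import html
import sqlite3


def render_head(title, author, language):
    """Return an HTML <head> with a descriptive title and a keywords meta tag."""
    keywords = ", ".join([title, author, language])
    return (
        "<head>\n"
        f"<title>{html.escape(title)} by {html.escape(author)}</title>\n"
        f'<meta name="keywords" content="{html.escape(keywords)}">\n'
        "</head>"
    )


def render_author_index(conn):
    """Return a nested HTML list: one entry per author, listing their books."""
    rows = conn.execute(
        "SELECT author, title, id FROM books ORDER BY author, title"
    ).fetchall()
    lines = ["<ul>"]
    current = None
    for author, title, book_id in rows:
        if author != current:
            if current is not None:
                lines.append("</ul></li>")  # close the previous author's list
            lines.append(f"<li>{html.escape(author)}<ul>")
            current = author
        # Assumed link scheme: each book page is named after its id.
        lines.append(f'<li><a href="{book_id}.html">{html.escape(title)}</a></li>')
    if current is not None:
        lines.append("</ul></li>")
    lines.append("</ul>")
    return "\n".join(lines)


def demo_connection():
    """Build an in-memory database with a hypothetical schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, "
        "author TEXT, language TEXT)"
    )
    conn.executemany(
        "INSERT INTO books VALUES (?, ?, ?, ?)",
        [
            (1, "Sample Title One", "Author A", "en"),
            (2, "Sample Title Two", "Author A", "en"),
            (3, "Titre Exemple", "Author B", "fr"),
        ],
    )
    return conn


conn = demo_connection()
print(render_head("Sample Title One", "Author A", "en"))
print(render_author_index(conn))
```

The same query with a different ORDER BY column would give the per-language
and per-title indexes.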
Kind regards
Emmanuel