How do you organise your ebooks on disk?


So I am revisiting how I organize eBooks on disk.

 

My current method is:

 

  • Book is passed to interim calibre database
  • Calibre is used to tag the book
  • Tagged books are moved to final calibre database

 

However, I am no longer interested in this approach, since it is a time sink and calibre scales worse than any app I have ever seen.

 

So whilst I will still use calibre for finessing when required, most of the time I just want to script something that takes a bunch of books and organizes them.

 

The script should be easy enough but I am curious how people do their structures and why.

 

I am thinking:

 

/ebooks/author/title/ebook.epub

 

With author being First then Last name (no commas). Characters that are not allowed in filenames, such as :, would need stripping, and I am unsure of the best way to rename duplicates.
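
A minimal sketch of that layout in Python, in case it helps the discussion. The set of stripped characters is an assumption (the ones Windows and SMB shares reject, plus control characters), the function names are made up for illustration, and naming the file after the title is just one option:

```python
import re
from pathlib import PurePosixPath

# Assumption: strip what Windows/SMB reject, plus ASCII control characters.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def clean_component(text: str, fallback: str = "Unknown") -> str:
    """Make one path component safe: drop invalid characters, tidy whitespace."""
    text = _INVALID.sub("", text)
    # Collapse runs of whitespace and trim leading/trailing spaces and dots.
    text = re.sub(r"\s+", " ", text).strip(" .")
    return text or fallback

def target_path(root: str, author: str, title: str, ext: str) -> PurePosixPath:
    """Build root/author/title/title.ext from raw metadata strings."""
    a = clean_component(author)
    t = clean_component(title)
    return PurePosixPath(root) / a / t / f"{t}.{ext.lstrip('.').lower()}"

print(target_path("/ebooks", "Frank Herbert", "Dune: Messiah", ".EPUB"))
```

Empty or fully stripped names fall back to "Unknown" so the path never ends up with an empty component.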

 

Thoughts? (Does anyone apart from me care about these things? :)

 

 

Link to comment

Much the same as you, though I don't have an interim database. Why not just put it directly into your final one in calibre and tag it?

 

I've not had any issues with calibre, but it will depend on the size of your library. I have around 4000 entries calibre is tracking.

 

I wouldn't use anything other than calibre otherwise I'd feel I was reinventing the wheel. What I would love would be for calibre to start using mysql as a backend. In my eyes this would speed things up hugely and allow for a distributed client model.

Link to comment

How many ebooks do you have, and what makes you say it scales poorly? My collection is at 18,410 so far, and I haven't noticed any issues.

Quite a few more than that. I find that after a certain size you have to run multiple calibre databases to get decent performance, i.e. if you add 10 books to a new database it takes seconds. Adding the same number to a large database can take minutes, and it scales exponentially worse the larger the database.

 

Even as a trial I built a dedicated Windows 7 machine with 32GB of RAM and an SSD, and it still sucked.

 

Much the same as you, though I don't have an interim database. Why not just put it directly into your final one in calibre and tag it?

 

I am no longer interested in manually tagging books until they make their way from my general collection to my imminent reading list.

 

So certainly letting calibre do its automagic on import would be fine, but that brings back the problem above: I would need 10+ separate databases for super slick performance, and that's too much of a kludge for the first-step general collection (especially since by all signs Ubooquity can handle the same collection with lightning performance).

 

I wouldn't use anything other than calibre otherwise I'd feel I was reinventing the wheel. What I would love would be for calibre to start using mysql as a backend. In my eyes this would speed things up hugely and allow for a distributed client model.

I will join your mysql club; I think that would fundamentally fix my problem.

 

I ended up just splitting the collection to different genres. With the smaller size, quick switching to different libraries is near instantaneous. Not perfect, but it works as I need Calibre to transfer and manage books to my mobile devices.

I split as well now (as above). The problem is that doing it by genre requires a metadata scrape from the internet, and that's too unreliable to let calibre do it all by itself. I know people are going to say the automagic scrape works for them, and certainly it works almost perfectly for me as well, but I have now come to prefer unscraped over incorrectly scraped.

 

I also don't like calibre's use of OPF and cover art files. On the face of it this is a great idea, until you realise that a single folder with 1000 book files becomes 3000 files. This quickly becomes an unnecessary inode strain. Using the filesystem like a database is great on small sample sets but doesn't scale.

 

 

So after writing all of this, I think what I am steering towards is two collections:

 

1. The big collection that's just sorted, best effort, into folders on disk (the subject of this thread). This is where most books live, and the effort expended here is low, with a focus on using Ubooquity.

2. The reading list. This is where the stuff I have read and am about to read lives, and it will have seen some calibre love. The on-disk format here is obviously calibre's native one.

 

Sorry to be so rambling; I am coming up with thoughts as I go along.

Link to comment
  • 3 weeks later...

I am thinking:

 

/ebooks/author/title/ebook.epub

 

With author being First then Last name (no commas). Characters that are not allowed in filenames, such as :, would need stripping, and I am unsure of the best way to rename duplicates.

 

My experience with library management would lead me to suggest using LastName_FirstName instead. There's much less variability in last names than first names (e.g. Robert, Rob, R), so it will be easier to find items when the name is slightly different. Of course, that isn't true with anglicized names, especially Russian and Slavic ones! And aliases are another headache.
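
As a rough illustration of the LastName_FirstName idea, here is a small Python sketch. The function name is hypothetical, and it assumes the last whitespace-separated word is the surname, which is wrong for multi-word surnames such as "Le Guin" and for many non-English naming orders, so treat it as a starting point only:

```python
def last_first(author: str) -> str:
    """Turn 'First [Middle] Last' or 'Last, First' into 'Last_First'."""
    author = " ".join(author.split())  # normalise whitespace
    if "," in author:
        # Already in 'Last, First' form.
        last, _, first = author.partition(",")
        last, first = last.strip(), first.strip()
    else:
        # Assumption: the final word is the surname.
        first, _, last = author.rpartition(" ")
    return f"{last}_{first}" if first else last

print(last_first("Robert Heinlein"))      # Heinlein_Robert
print(last_first("Heinlein, Robert A."))  # Heinlein_Robert A.
print(last_first("Plato"))                # Plato
```

Aliases and spelling variants would still need a hand-maintained mapping on top of this.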

 

Establish your own rules but follow them strictly. For example, always use the most common form of the author's name or alias. This will keep organization and searching under control.

 

Organizing them in folders by author and title is as good a system as any, I suppose.  But books and videos have so many access points (search attributes), that you still need an external tool to search them, which makes storage by author or title less important.

Link to comment

Sorry I was away working on this to get some more experience with actual data.

 

Originally I was using exiftool to pull the metadata, but there are subtle variations in the field spec that make it a bit more complex than expected. However, I have now found ebook-meta, which is part of the calibre tool set. So far I have to install the whole of calibre to get this one tool, but the plus side is that it is far more capable, i.e. it can pretty reliably tell you the author even if there are variations in the spec used.
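
For anyone wanting to script this, a sketch of calling ebook-meta from Python and picking out the author. It assumes ebook-meta is on the PATH and prints "Field : value" lines with the author field labelled "Author(s)" and an optional "[sort name]" suffix; that matches what I have seen, but check it against your calibre version. The function names are made up for illustration:

```python
import re
import subprocess

def parse_ebook_meta(output: str) -> dict:
    """Parse 'Field : value' lines into a dict keyed by lower-cased field name."""
    meta = {}
    for line in output.splitlines():
        # Split on the first colon only, so titles containing ':' survive.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip().lower()] = value.strip()
    return meta

def first_author(meta: dict) -> str:
    """Return the first author, dropping a trailing '[sort name]' if present."""
    raw = meta.get("author(s)", "")
    raw = re.sub(r"\s*\[[^\]]*\]\s*$", "", raw)
    # Assumption: multiple authors are joined with ' & '.
    return raw.split(" & ")[0].strip() or "Unknown"

def read_metadata(path: str) -> dict:
    """Run ebook-meta (must be on PATH) and parse what it prints."""
    result = subprocess.run(
        ["ebook-meta", path], capture_output=True, text=True, check=True
    )
    return parse_ebook_meta(result.stdout)
```

Multi-line fields such as comments can produce junk keys with this simple parser, which is harmless if you only read the title and author.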

 

LastName_FirstName actually makes a lot of sense. I got caught up in the moment trying to make it all super neat but have since taken a step back and come at it again. I am now looking at only /ebooks/Author/files.

 

I have to deal with duplicate names and then I can pull a load of data and see if this is sensible.
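
One possible way to handle name clashes, sketched in Python: skip the file if an identical copy is already there, otherwise add a numeric suffix. The " (2)" suffix style and the function names are my own choices, not anything calibre or Ubooquity prescribes:

```python
import hashlib
from pathlib import Path
from typing import Optional

def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use low."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def resolve_destination(src: Path, dest: Path) -> Optional[Path]:
    """Return a free destination path, or None if an identical file already exists."""
    src_digest = file_digest(src)
    candidate, n = dest, 1
    while candidate.exists():
        if file_digest(candidate) == src_digest:
            return None  # true duplicate, nothing to copy
        n += 1
        candidate = dest.with_name(f"{dest.stem} ({n}){dest.suffix}")
    return candidate
```

So a second, different "Dune.epub" would land as "Dune (2).epub", while a byte-identical copy is simply skipped.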

 

Link to comment