All the books

September 6th, 2010

Here at Sinu, we often discuss the value of terms like 'new', 'old' and 'everything'. We hear them often when talking to customers about their goals.
In a recent update from the Google Books project, we saw a glimpse of how difficult it is to try to quantify such a term.
Google's first-pass count came to about 210 million books in existence.

“Is that a final number of books in the world? Not quite. We still have to exclude non-books such as microforms (8 million), audio recordings (4.5 million), videos (2 million), maps (another 2 million), t-shirts with ISBNs (about one thousand), turkey probes (1, added to a library catalog as an April Fools joke), and other items for which we receive catalog entries.

Counting only things that are printed and bound, we arrive at about 146 million. This is our best answer today. It will change as we get more data and become more adept at interpreting what we already have.”

I wonder if they will ever try to recreate lost books or track stories passed down orally. Recreating the contents of the Library of Alexandria’s 200K scrolls would be pretty impressive.

There must be copies of the works in other places, since that is how most of the scrolls were collected.

“By decree of Ptolemy III of Egypt, all visitors to the city were required to surrender all books and scrolls, as well as any form of written media in any language in their possession which, according to Galen, were listed under the heading "books of the ships". Official scribes then swiftly copied these writings, some copies proving so precise that the originals were put into the library, and the copies delivered to the unsuspecting owners. This process also helped to create a reservoir of books in the relatively new city.” - Wikipedia

More and more, we find ourselves working with our customers to bring a sense of time and order to their data and to preserve it "forever" (another elusive term). Business people now seem to understand that data has to be thought of as a timeline, not a bottomless file cabinet.

- Larry, CTO