
Thread: Mass Storage Question


Jan 6th, 2011, 09:25 PM #18
Code Doc

Thread Starter, PowerPoster
Join Date: Mar 2007
Location: Omaha, Nebraska
Posts: 2,354
Re: Mass Storage Question

Originally Posted by FireXtol

So just words? Not phrases, sentences, paragraphs.... Only unique words regardless of written language?

According to Google, they have a nice collection of over 13 million unigrams (words). Assuming 5.1 characters per word at one byte each, that is about 66 megabytes (more like 80 MB with metadata). I believe it's limited to English, though. Assuming around 5,000 written languages, perhaps 323 GB (391 GB with metadata) is a good upper-limit figure. This is using zero compression. Compression would be interesting on such a unique dataset. There's also the matter of delimiters, and potentially the character sets used (metadata).

I tend to agree. You could likely store every unique word ever written in all of human history in half a terabyte. Further advances in compression could shrink that somewhat, but I am not sure there is any more payoff in that. Mass storage expansion and communication speeds have trumped that development, the same way that the Internet has all but crushed the compact disc and the floppy disk.
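
For anyone who wants to check the quoted arithmetic, here is a rough sketch in Python. It assumes one byte per character and no delimiters, and it assumes the quoted totals came from multiplying decimal megabytes by the language count and then dividing by 1024. That last step is my inference from the numbers, not something FireXtol stated. The function and constant names are made up for illustration.

```python
# Back-of-envelope check of the quoted storage estimate.
# Assumptions (not confirmed by the original poster):
#   - one byte per character, no delimiters
#   - totals are decimal MB times the language count, divided by 1024

UNIGRAMS = 13_000_000      # quoted English unigram count
AVG_CHARS = 5.1            # quoted average characters per word
LANGUAGES = 5000           # quoted rough count of written languages
MB_WITH_METADATA = 80      # quoted per-language size including metadata


def per_language_mb(words=UNIGRAMS, avg_chars=AVG_CHARS):
    """Raw size of one language's word list in decimal megabytes."""
    return words * avg_chars / 1_000_000


def all_languages_gb(mb_per_language, languages=LANGUAGES):
    """Total for all languages, dividing by 1024 as the quoted figures appear to."""
    return mb_per_language * languages / 1024


raw_mb = per_language_mb()
raw_total_gb = all_languages_gb(raw_mb)
metadata_total_gb = all_languages_gb(MB_WITH_METADATA)

print(f"Per language, raw:      {raw_mb:.1f} MB")
print(f"All languages, raw:     {raw_total_gb:.1f} GB")
print(f"All languages, w/ meta: {metadata_total_gb:.1f} GB")
```

Under those assumptions the sketch gives roughly 66.3 MB per language and about 323.7 GB and 390.6 GB overall, which lines up with the quoted 323 (391) GB and stays under the half-terabyte mark.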

Doctor Ed
All times are GMT -5.