Rather than a person knowing discrete facts, the database allows your data to be carefully analyzed as part of the aggregate. And when you analyze such a huge pot of data, you start finding odd correlations.
Showing posts with label databases. Show all posts
Sunday, May 23, 2010
Stop Thinking of Internet Privacy in Human Terms
David Hurley of the Intellectual Freedom Roundtable (IFRT) writes about how liking Star Trek could hurt your chances of getting a job in childcare. Sort of. Hypothetically!
Wednesday, May 12, 2010
Stephen Wolfram - Making All Knowledge Computational
Stephen Wolfram talks about his ideas for making all knowledge computational. I'll be honest, a lot of it goes right over my head, but it's worth watching. Stephen Wolfram designed Wolfram Alpha, a curious little search engine that provides very up-to-date computational facts.
Friday, April 30, 2010
How Much Information is There?
How much information is there? Not counting books, just digitally stored information.
For most of us, “a crapload” is a sufficiently accurate answer. But for a few obsessive data analysts, more precision is necessary. According to a recent study by market-research company IDC, and sponsored by storage company EMC, the size of the information universe is currently 800,000 petabytes. Each petabyte is a million gigabytes, or the equivalent of 1,000 one-terabyte hard drives.
If you stored all of this data on DVDs, the study’s authors say, the stack would reach from the Earth to the moon and back.
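For a sense of scale, the figures quoted above can be converted between units. This is a minimal sketch that assumes decimal (SI) units, which is what the "million gigabytes" figure implies; the constant and function names are my own, not the study's.

```python
# Decimal (SI) storage units are assumed here; the study may define them differently.
GB_PER_PB = 1_000_000   # "Each petabyte is a million gigabytes"
TB_PER_PB = 1_000       # "...or the equivalent of 1,000 one-terabyte hard drives"
PB_PER_ZB = 1_000_000   # 1 zettabyte = 10**21 bytes = 10**6 petabytes

UNIVERSE_PB = 800_000   # size of the "information universe" quoted from the IDC study


def convert(petabytes):
    """Return the same quantity expressed in gigabytes, terabyte drives, and zettabytes."""
    return {
        "gigabytes": petabytes * GB_PER_PB,
        "terabyte_drives": petabytes * TB_PER_PB,
        "zettabytes": petabytes / PB_PER_ZB,
    }


sizes = convert(UNIVERSE_PB)
print(sizes)
```

Under those assumptions, 800,000 petabytes works out to 800 million one-terabyte drives, or 0.8 zettabytes.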
Wednesday, April 14, 2010
Library of Congress and Twitter
Every Tweet Will Be Preserved for History. Seriously?
I wonder what category tweets go under in the Library of Congress Classification system.
Saturday, April 3, 2010
FAQ: Google, China, and Censorship
WIRED.com has made an FAQ (a list of Frequently Asked Questions) regarding Google, China, and Censorship. It is useful!
Saturday, March 13, 2010
Wikipedia Collaborators and Their Roles
A paper written by a University of Arizona professor and a graduate student found that quality Wikipedia articles are the result of the work of different kinds of collaborators.
Starters, for example, create sentences but seldom engage in other actions. Content justifiers create sentences and justify them with resources and links. Copy editors contribute primarily through modifying existing sentences. Some users – the all-round contributors – perform many different functions.
So basically, really good Wikipedia articles come about by embracing the core tenet of the project: many people working together provide better information.
The paper is also available for download.
Tuesday, February 23, 2010
Google's Algorithm
It's been far too long since the Google tag has come up. Here, learn about their search algorithms.
Take, for instance, the way Google’s engine learns which words are synonyms. “We discovered a nifty thing very early on,” Singhal says. “People change words in their queries. So someone would say, ‘pictures of dogs,’ and then they’d say, ‘pictures of puppies.’ So that told us that maybe ‘dogs’ and ‘puppies’ were interchangeable. We also learned that when you boil water, it’s hot water. We were relearning semantics from humans, and that was a great advance.”
But there were obstacles. Google’s synonym system understood that a dog was similar to a puppy and that boiling water was hot. But it also concluded that a hot dog was the same as a boiling puppy. The problem was fixed in late 2002 by a breakthrough based on philosopher Ludwig Wittgenstein’s theories about how words are defined by context. As Google crawled and archived billions of documents and Web pages, it analyzed what words were close to each other. “Hot dog” would be found in searches that also contained “bread” and “mustard” and “baseball games” — not poached pooches. That helped the algorithm understand what “hot dog” — and millions of other terms — meant. “Today, if you type ‘Gandhi bio,’ we know that bio means biography,” Singhal says. “And if you type ‘bio warfare,’ it means biological.”
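The idea of defining a phrase by the words around it can be illustrated with a toy example. This is only a sketch of the general technique, not Google's actual system: the corpus, the stopword list, and the `context` and `overlap` functions are all made up for illustration.

```python
# Hypothetical toy corpus; a real search engine works over billions of documents.
DOCS = [
    "hot dog with mustard and bread at the baseball game",
    "grilled sausage with mustard on bread",
    "the puppy chased the ball in the park",
    "boiling water is hot water",
    "a dog and a puppy played in the park",
]

# Common filler words that say little about meaning (an illustrative list).
STOPWORDS = {"a", "and", "at", "in", "is", "on", "the", "with"}


def context(phrase, docs=DOCS):
    """Collect the non-filler words found in documents that contain `phrase`."""
    phrase_words = set(phrase.split())
    words = set()
    for doc in docs:
        if phrase in doc:  # crude substring match, good enough for a toy
            for word in doc.split():
                if word not in phrase_words and word not in STOPWORDS:
                    words.add(word)
    return words


def overlap(a, b):
    """Jaccard overlap between the context words of two phrases (0.0 to 1.0)."""
    ca, cb = context(a), context(b)
    if not ca or not cb:
        return 0.0
    return len(ca & cb) / len(ca | cb)


print(overlap("hot dog", "sausage"))
print(overlap("hot dog", "puppy"))
```

In this tiny corpus, "hot dog" shares context words like "mustard" and "bread" with "sausage" and shares none with "puppy", which is the kind of signal the quoted passage describes.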
Saturday, January 30, 2010
123 Hack Me
A New York Times article reveals that many people are still using simple, easily guessed passwords.
Back at the dawn of the Web, the most popular account password was “12345.” Today, it’s one digit longer but hardly safer: “123456.”
This list comes from a set of 32 million passwords that a hacker took from a company that makes software used on social networking sites like Facebook and MySpace, and then posted online. It was only briefly posted, but it was downloaded and examined by hackers and security specialists alike. What a great resource!
According to the article, here are the top 32 passwords:
- 123456
- 12345
- 123456789
- password
- iloveyou
- princess
- rockyou
- 1234567
- 12345678
- abc123
- nicole
- daniel
- babygirl
- monkey
- jessica
- lovely
- michael
- ashley
- 654321
- qwerty
- iloveu
- michelle
- 111111
- 0
- tigger
- password1
- sunshine
- chocolate
- anthony
- angel
- FRIENDS
- soccer
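If you want to check whether a password is on this list, a few lines of code will do it. This is a minimal sketch: the `is_common` function is my own invention, it compares case-insensitively (an assumption on my part), and passing this check does not make a password strong.

```python
# The 32 passwords from the list above, lowercased for case-insensitive comparison.
COMMON_PASSWORDS = {
    "123456", "12345", "123456789", "password", "iloveyou", "princess",
    "rockyou", "1234567", "12345678", "abc123", "nicole", "daniel",
    "babygirl", "monkey", "jessica", "lovely", "michael", "ashley",
    "654321", "qwerty", "iloveu", "michelle", "111111", "0", "tigger",
    "password1", "sunshine", "chocolate", "anthony", "angel", "friends",
    "soccer",
}


def is_common(password):
    """Return True if the password appears on the list, ignoring letter case."""
    return password.lower() in COMMON_PASSWORDS


for candidate in ["123456", "Friends", "correct horse battery staple"]:
    verdict = "on the list" if is_common(candidate) else "not on the list"
    print(f"{candidate!r}: {verdict}")
```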
Monday, January 25, 2010
Haiti People Finder
Alright, it may not have anything to do with libraries, but this is the Haiti people finder project. It does feature working with databases, which is a librarian thing, and also helping people - also a librarian thing. If you have time and can follow instructions, you can help.
Librarian fail: I neglected to notice where I got this link from.