Tag Archives: Google

corporatizing copyright

Ursula Le Guin has some stones.  This whole Google digital books settlement is a bit complicated, but it boils down to something more than opting in and out for the authors.  It’s about signing away your authorship, and forcing companies like the great and powerful Goog to negotiate with you before you do so and not after they’ve been caught.  Le Guin says it better:

The “opt-out” clause in the Settlement is most disturbing:

First, it seems unfair that, by the terms of the class-action settlement, authors can officially present objections to the Court only by being “opted in” to the settlement and thereby subjecting themselves to its terms.

Second, while the “opt-out” clause appears to offer authors an easy way to defend their copyright, in fact it disguises an assault on authors’ rights. Google, like any other publisher or entity, should be required to obtain permission from the owner to purchase or use copyrighted material, item by item.

The free and open dissemination of information and of literature, as it exists in our Public Libraries, can and should exist in the electronic media. All authors hope for that. But we cannot have free and open dissemination of information and literature unless the use of written material continues to be controlled by those who write it or own legitimate right in it. We urge our government and our courts to allow no corporation to circumvent copyright law or dictate the terms of that control.

Google has some stones as well, dictating the terms of its own settlement to authors of works it digitized without consent. Perhaps Google is trying to claim some perverted sense of fair use by chumming with libraries to assist in their digitization, without bothering to negotiate with authors or to fork out the dough to buy the item it wants to scan from Amazon or AbeBooks.

resisting google: not so futile

Not too long ago I mused on how some search engine companies are trying to provide more human interaction when one has an online reference question, either by doing the searching or by suggesting how to perform the search. This quasi-virtual reference seems to be catching on, and librarians are suddenly becoming more recognized for the credibility they bring to their reference work.

This sentiment is the impetus for a new project that aims to compete with the likes of the great Goog: Reference Extract. The project, an ever-growing collaboration of libraries, aims to differ from Google through the credibility drawn from the shrewd linkages that librarians provide when applying sound information literacy principles. Said better than I could:

Users will enter a search term and get results weighted towards sites most often referred to by librarians at institutions such as the Library of Congress, the University of Washington, the State of Maryland, and over 1,400 libraries worldwide.

The issue of credibility is interesting when compared to the measures of relevancy and popularity Google bases its index on. The project explains it more fully:

In essence linkages between web pages by anyone is replaced by citations to web pages by highly trained librarians in their daily work of answering the questions of scholars, policy makers and the general population. Instead of page rank, the team refers to this as “reference weighting.”

That is to say, it is no great leap to believe that working one-on-one with a librarian would yield highly credible results, but it also appears that gathering the sites librarians point to across these one-on-one interactions and making them searchable continues to yield highly credible results. Further since the librarians answer question on very wide range of topics, their answers can be applied to a general purpose search engine.
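To make the idea of “reference weighting” concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how Reference Extract is actually built: the function names, the URLs, and the rule of counting each site once per answer are all hypothetical.

```python
from collections import Counter


def reference_weights(answers):
    """Count how many librarian answers cite each URL (hypothetical weighting rule)."""
    weights = Counter()
    for cited_urls in answers:
        # Count a URL once per answer so a single long answer cannot dominate.
        weights.update(set(cited_urls))
    return weights


def rank(candidates, weights):
    """Keep only librarian-cited URLs, most cited first (ties broken alphabetically)."""
    cited = [url for url in candidates if weights[url] > 0]
    return sorted(cited, key=lambda url: (-weights[url], url))


# Made-up data: each inner list is the set of sites one librarian answer pointed to.
answers = [
    ["https://example.gov/a", "https://example.edu/b"],
    ["https://example.gov/a"],
    ["https://example.org/c", "https://example.gov/a", "https://example.edu/b"],
]

weights = reference_weights(answers)
ranked = rank(
    [
        "https://example.org/c",
        "https://example.com/z",  # never cited by a librarian, so it is dropped
        "https://example.edu/b",
        "https://example.gov/a",
    ],
    weights,
)
print(ranked)
```

In this toy version a page nobody cited simply disappears from the results, which mirrors the “limited to only those pointed to by reference librarians” setup, while the citation count stands in for the link count an open-web ranking would use.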

I find it clever that the organizers of RefEx measured their index by using the custom search engine provided by Google…beating it at its own game perhaps.

It is important to note that by using the Google Custom Search Engine service the exact same technology was used to search and rank the results, the only thing that varied was that one was an open web search, and one was limited to only those pointed to by reference librarians. So, even outside of the library website context the credibility of librarians is retained.

We may index fewer pages, but the ones we point to are more informationally literate. One question to walk away with: does less material indexed = more reliable? Philosophically speaking, words like popular, relevant, and useful will cause debate; academically speaking, this justifies the librarian’s attempt to wean those frothing, zombie-like patrons away from The Google and toward our subscribed databases, online resources and guides. And with RefEx, Google’s helping us do it.

life in googlevision

The great and powerful Goog has now acquired the archived photos from LIFE magazine, and they’re publicly available on each of your interwebs:

The collection includes the entire works of Life photographers Alfred Eisenstaedt, Gjon Mili and Nina Leen. Also available are: the Zapruder film of the Kennedy assassination; Dahlstrom glass plates of New York from the 1880’s; and Hugo Jaeger Nazi-era Germany 1937-1944.

Dawn Bridges, a spokeswoman for Time Inc, said the archives in their entirety would be available in the first quarter of next year. She said it would not just be historical. “We will be adding new things. There will be thousands of new pictures from DC for the inauguration on January 20,” she said.

What’s cool is that, according to the article, 97% of the photos (10 million) have never been seen before. Here’s Google’s portal for accessing the photos. A prominent issue to consider now is whether the photos are in the public domain. Obviously, the older ones might just be, but what about the ones too recent for their copyright to have expired? Pretty groovy for browsing, though.

search engine overload…or overlord?

Seems like search engines have been springing up all over the place. Soon enough we’ll need search engines to search search engines (oh wait…we already have those). In any case, the emergence of a new breed of mechasearchers has me wondering whether Google might be spreading itself a bit too thin with all its gizmos in development. I’m curious about the avenues these particular developers are taking in the hope that they just might be the ones to slay the great Goog. Three current avenues are particularly intriguing.

Preserve what little humanity we have left with ChaCha

ChaCha is a company that is building on the idea that it is not so much the technology that is delivering your indexed content as it is the humanoids manipulating the technology.

Thus Spake Zara-chacha:

ChaCha is conversational, fun, and easy to use. Simply ask your question like you are talking to a smart friend and ChaCha’s advanced technology instantly routes it to the most knowledgeable person on that topic in our guide community. Your answer is then returned to your phone as a text message within a few minutes.

Not that it’s necessary to use a live guide, as their search engine works perfectly fine, but hooking a live one can be helpful, especially if you’re not near a pulsing box of pixellation and you have your phone with you. Texting your searches seems to be all the rage, but mind you, standard rates may apply.

Make it sound as human as possible with Powerset

Taming the beast is the aim of Powerset, the beast being search technology that cannot understand our queries. So, as with ChaCha, there is nothing wrong with us; the problem is that blasted speech syntax that computers simply can’t understand. Powerset writes it out for us:

Powerset’s goal is to change the way people interact with technology by enabling computers to understand our language. While this is a difficult challenge, we believe that now is the right time to begin the journey. Powerset is first applying its natural language processing to search, aiming to improve the way we find information by unlocking the meaning encoded in ordinary human language.

So, to spare us from resorting to technical, complicated search strings, Powerset wants our search results to follow directly from the flow of our informal speech patterns. Still in its infancy, Powerset currently indexes only Wikipedia articles, though it offers the several viewing options, references, and citations one would expect from a typical Wikipedia entry.

Index early, index often with Cuil

And then there’s Cuil. Apparently two defectors from the great Goog have started their own search engine, and though it’s like Shaquille O’Neal running a not-so-fast break, it’s definitely gaining momentum. So much so that it boasts of possessing the world’s biggest index:

The Internet has grown exponentially in the last fifteen years but search engines have not kept up—until now. Cuil searches more pages on the Web than anyone else—three times as many as Google and ten times as many as Microsoft.

Rather than rely on superficial popularity metrics, Cuil searches for and ranks pages based on their content and relevance. When we find a page with your keywords, we stay on that page and analyze the rest of its content, its concepts, their inter-relationships and the page’s coherency.

Then we offer you helpful choices and suggestions until you find the page you want and that you know is out there. We believe that analyzing the Web rather than our users is a more useful approach, so we don’t collect data about you and your habits, lest we are tempted to peek. With Cuil, your search history is always private.

It’s also a very interesting claim that Cuil has no interest whatsoever in collecting user data or habits, or in ranking by popularity. In any case, Cuil certainly intends to raise the stakes.

Three different philosophies, three different search engines.