04 2, 2008

Hadoop Summit

Yahoo! is hosting the first Apache Hadoop Summit on March 25th, 2008 at the Network Meeting Center in Santa Clara, California.

James's notes:

Hadoop: A Brief History

·         Doug Cutting

·         Started with Nutch in 2002 to 2004

o   Initial goal was web-scale, crawler-based search

o   Distributed by necessity

o   Sort/merge based processing

o   Demonstrated on 4 nodes over 100M web pages. 

o   Was operationally onerous. "Real" web scale was still a ways away

·         2004 through 2006: Gestation period

o   GFS & MapReduce papers published (addressed the scale problems we were having)

o   Add DFS and MapReduce to Nutch

o   Two part-time developers over two years

o   Ran on 20 nodes at Internet Archive (IA) and UW

o   Much easier to program and run

o   Scaled to several hundred million web pages

·         2006 to 2008: Childhood

o   Y! hired Doug Cutting and a dedicated team to work on it, reporting to E14 (Eric Baldeschwieler)

o   Hadoop project split out of Nutch

o   Hit web scale in 2008



03 28, 2007

Yahoo Mail Announces Unlimited Storage

Yahoo is announcing that all Yahoo Mail users will have free unlimited email storage starting in May 2007. The current storage limit is 1 GB per account (2 GB for $20/year premium users). With this change, Yahoo leapfrogs Gmail (2.8 GB and growing) and Live.com Mail (2 GB). Yahoo Mail currently has 250 million global users, more than any other online service (Live.com has 228 million and Gmail has 51 million users).

08 25, 2006

Windows Live Drive with 2GB of storage

CNET writes that details about Windows Live Drive were given out by a Microsoft representative in Sydney.

Microsoft Australia technical specialist John Hodgson said that the basic Live Drive was likely to include around 2 gigabytes of storage for free. Additional storage capacity will be available for purchase, he said, though pricing and final release dates haven't been announced.

While there have been rumors about Live Drive service in the blogosphere, to date Microsoft has been cagey about officially confirming those plans. What is known is that the service can be mapped directly from PCs running the upcoming Windows Vista operating system.


More:
talk
AOL XDrive
GDriver

05 26, 2006

Human Computation

Luis von Ahn (Carnegie Mellon University)


Abstract
Tasks like image recognition are trivial for humans, but continue to challenge even the most sophisticated computer programs. This talk introduces a paradigm for utilizing human processing power to solve problems that computers cannot yet solve. Traditional approaches to solving such problems focus on improving software. I advocate a novel approach: constructively channel human brainpower using computer games. For example, the ESP Game, described in this talk, is an enjoyable online game -- many people play over 40 hours a week -- and when people play, they help label images on the Web with descriptive keywords. These keywords can be used to significantly improve the accuracy of image search. People play the game not because they want to help, but because they enjoy it. The ESP Game has been licensed by a major Internet company and will soon become the basis of their image search engine.

I describe other examples of "games with a purpose": Peekaboom, which helps determine the location of objects in images, and Verbosity, which collects common-sense knowledge. I also explain a general approach for constructing games with a purpose.

In addition, I describe my work on CAPTCHAs, automated tests that humans can pass but computer programs cannot. CAPTCHAs take advantage of human processing power in order to differentiate humans from computers, an ability that has important applications in practice.

The results of this work are currently in use by hundreds of Web sites and companies around the world, and over 100,000 people have played some of the games presented here. Practical applications include improvements in areas such as: image search, adult-content filtering, spam prevention, common-sense reasoning, computer vision, accessibility, and security in general.

Bio:
Luis von Ahn is a Post-Doctoral Fellow in the Computer Science Department at Carnegie Mellon University, where he also received his Ph.D. in 2005. Previously, Luis obtained a B.S. in mathematics from Duke University in 2000. He is the recipient of a Microsoft Research Fellowship. His research interests include encouraging people to do work for free, as well as catching and thwarting cheaters in online environments. His work has appeared in over 100 news outlets including The New York Times, CNN, USA Today, The BBC, and The Discovery Channel. Luis holds 4 patent applications, and has licensed technology to major Internet companies.

via:http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=484

About author:http://www.cs.cmu.edu/~biglou/research.html

Changes on Del.icio.us

The del.icio.us popular page is very... well, popular. Many of you read it frequently and use it to find interesting new links. While it is a great source of links, I thought we could do a better job of showing off what the community is thinking about and looking at. So we're trying something new: You'll notice that the del.icio.us homepage now features a hotlist which is updated every hour to show you the top three most popular links as of that moment. These links are taken directly from the popular page, and we never show the same link twice. This guarantees that every time you visit the homepage, you'll see something new (well, at least every hour).

via:http://blog.del.icio.us/blog/2006/05/breaking_news.html
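
The hotlist rule described in the post above (every hour, take the top three links from the popular page and never show a link twice) can be sketched in a few lines of Python. This is only an illustration of the stated rule, not del.icio.us's actual code: the function name `pick_hotlist`, the `already_shown` set, and the sample link names are all hypothetical.

```python
from typing import Iterable, List, Set


def pick_hotlist(popular: Iterable[str], already_shown: Set[str], size: int = 3) -> List[str]:
    """Return the first `size` links from `popular` that have never been shown.

    `popular` is assumed to be ordered from most to least popular.
    `already_shown` is updated in place so later refreshes skip these links.
    """
    hotlist: List[str] = []
    for url in popular:
        # Skip links shown in an earlier refresh, and duplicates within this one.
        if url in already_shown or url in hotlist:
            continue
        hotlist.append(url)
        if len(hotlist) == size:
            break
    already_shown.update(hotlist)
    return hotlist


# Two simulated hourly refreshes (link names are made up for illustration).
shown: Set[str] = set()
hour_1 = pick_hotlist(["link-a", "link-b", "link-c", "link-d"], shown)
hour_2 = pick_hotlist(["link-a", "link-b", "link-d", "link-e", "link-f"], shown)
print(hour_1)
print(hour_2)
```

In the second refresh, "link-a" and "link-b" are still at the top of the popular list, but they are skipped because they were already shown, so the hotlist moves on to the next unseen links.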