This article, found via digg, highlights an inherent ‘by-design’ flaw of automatic news aggregators, including Google News: they need a significant amount of press coverage before promoting news to their front page. As a result, automatic news aggregators are often hours late in covering breaking news.
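To make the delay concrete, here is a minimal sketch, assuming a hypothetical aggregator that promotes a story only once it has seen a fixed number of reports about it. The function name, the threshold, and the timestamps are all invented for illustration; this is not how any real aggregator is documented to work.

```python
from typing import Optional


def promotion_time(report_times_min: list[float], threshold: int) -> Optional[float]:
    """Return the minute at which the threshold-th report arrives.

    Hypothetical rule: a story reaches the front page only when `threshold`
    reports have been published. Returns None if that never happens.
    """
    ordered = sorted(report_times_min)
    if len(ordered) < threshold:
        return None
    return ordered[threshold - 1]


# Made-up arrival times (minutes after the first report) for one story.
reports = [0, 12, 25, 40, 58, 75, 90]
print(promotion_time(reports, threshold=5))
```

Under this assumed rule, the story cannot be promoted before the fifth report exists, so the lag is built into the design: raising the threshold reduces noise but pushes promotion later.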
The solution to the problem of “finding the most important news right now” cannot rely on an hour or so of news history. After an hour, the story is no longer breaking news; the coverage is late and repetitive.
Let’s formulate a challenging research problem from that: “Given a novel and unique news item, can you predict that it will be followed by thousands of repetitions and reformulations?”