<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments for YooName - named entity recognition</title>
	<atom:link href="http://yooname.wordpress.com/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://yooname.wordpress.com</link>
	<description>Semi-supervised Named Entity Recognition software.</description>
	<lastBuildDate>Sun, 07 Dec 2008 22:54:27 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>Comment on Interesting new research problem by Brian Bean</title>
		<link>http://yooname.wordpress.com/2008/06/25/interesting-new-research-problem/#comment-84</link>
		<dc:creator>Brian Bean</dc:creator>
		<pubDate>Sun, 07 Dec 2008 22:54:27 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=60#comment-84</guid>
		<description>How about this:

Filter news for temporal immediacy and against &quot;reformulation&quot; (if possible).

Evaluate a reasonably large sample of original news stories that resulted in repetitions/reformulations for attribute commonality, e.g., natural disasters, significant loss of life, etc., and characterize these attributes by frequency of occurrence, impact (some attributes, for example, may result in a bigger storm of follow-on articles than other attributes) AND &quot;consistency&quot; of predictability (some attributes may be associated not only with &quot;important&quot; news but also with news that did not result in significant repetition/reformulation). Cross-correlate these attributes.

Construct and apply an algorithm based on the attribute evaluation of the prior paragraph to the news stream that meets the criteria of the first paragraph.

Ascertain the efficacy of the algorithm by observing how flagged news resulted or did not result in repetition/reformulation over the evaluator&#039;s time frame of interest.

Modify the algorithm to increase accuracy by repeating the attribute evaluation and cross-correlation of the second paragraph at some selected interval.  This will address &quot;attribute drift&quot;.  For example, I suspect the attribute &quot;terrorist act&quot; exhibited a different predictive profile on September 12, 2001 than it did on September 10 of that year.

You have posed an interesting problem.  Good luck!

Brian</description>
		<content:encoded><![CDATA[<p>How about this:</p>
<p>Filter news for temporal immediacy and against &#8220;reformulation&#8221; (if possible).</p>
<p>Evaluate a reasonably large sample of original news stories that resulted in repetitions/reformulations for attribute commonality, e.g., natural disasters, significant loss of life, etc., and characterize these attributes by frequency of occurrence, impact (some attributes, for example, may result in a bigger storm of follow-on articles than other attributes) AND &#8220;consistency&#8221; of predictability (some attributes may be associated not only with &#8220;important&#8221; news but also with news that did not result in significant repetition/reformulation). Cross-correlate these attributes.</p>
<p>Construct and apply an algorithm based on the attribute evaluation of the prior paragraph to the news stream that meets the criteria of the first paragraph.</p>
<p>Ascertain the efficacy of the algorithm by observing how flagged news resulted or did not result in repetition/reformulation over the evaluator&#8217;s time frame of interest.</p>
<p>Modify the algorithm to increase accuracy by repeating the attribute evaluation and cross-correlation of the second paragraph at some selected interval.  This will address &#8220;attribute drift&#8221;.  For example, I suspect the attribute &#8220;terrorist act&#8221; exhibited a different predictive profile on September 12, 2001 than it did on September 10 of that year.</p>
<p>You have posed an interesting problem.  Good luck!</p>
<p>Brian</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on The New York Times Annotated Corpus by Derek Gottfrid</title>
		<link>http://yooname.wordpress.com/2008/11/01/the-new-york-times-annotated-corpus/#comment-83</link>
		<dc:creator>Derek Gottfrid</dc:creator>
		<pubDate>Mon, 17 Nov 2008 02:28:04 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=71#comment-83</guid>
		<description>wait til you can play w/ the search api that leverages all that data.</description>
		<content:encoded><![CDATA[<p>wait til you can play w/ the search api that leverages all that data.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on The New York Times Annotated Corpus by Michael</title>
		<link>http://yooname.wordpress.com/2008/11/01/the-new-york-times-annotated-corpus/#comment-81</link>
		<dc:creator>Michael</dc:creator>
		<pubDate>Mon, 03 Nov 2008 18:35:32 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=71#comment-81</guid>
		<description>This could be a boon for training common-sense AI systems. The correlations could be easily mined from this kind of structure.

Exciting!</description>
		<content:encoded><![CDATA[<p>This could be a boon for training common-sense AI systems. The correlations could be easily mined from this kind of structure.</p>
<p>Exciting!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on The New York Times Annotated Corpus by Peter Turney</title>
		<link>http://yooname.wordpress.com/2008/11/01/the-new-york-times-annotated-corpus/#comment-80</link>
		<dc:creator>Peter Turney</dc:creator>
		<pubDate>Sat, 01 Nov 2008 16:19:12 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=71#comment-80</guid>
		<description>I got the news from Daniel Lemire:

http://www.daniel-lemire.com/blog/</description>
		<content:encoded><![CDATA[<p>I got the news from Daniel Lemire:</p>
<p><a href="http://www.daniel-lemire.com/blog/" rel="nofollow">http://www.daniel-lemire.com/blog/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Difficult to Pwn IM Language iykwimaityd by yooname</title>
		<link>http://yooname.wordpress.com/2008/06/01/difficult-to-pwn-im-language-iykwimaityd/#comment-64</link>
		<dc:creator>yooname</dc:creator>
		<pubDate>Sun, 01 Jun 2008 21:10:13 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=55#comment-64</guid>
		<description>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;YooName&gt;
&lt;ENAMEX TYPE=&quot;misc:internet_slang&quot;&gt;kewl&lt;/ENAMEX&gt;!
&lt;/YooName&gt;


funny :)</description>
		<content:encoded><![CDATA[<p>&lt;?xml version=&quot;1.0&quot;?&gt;<br />
&lt;YooName&gt;<br />
&lt;ENAMEX TYPE=&quot;misc:internet_slang&quot;&gt;kewl&lt;/ENAMEX&gt;!<br />
&lt;/YooName&gt;</p>
<p>funny :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Difficult to Pwn IM Language iykwimaityd by Peter Turney</title>
		<link>http://yooname.wordpress.com/2008/06/01/difficult-to-pwn-im-language-iykwimaityd/#comment-63</link>
		<dc:creator>Peter Turney</dc:creator>
		<pubDate>Sun, 01 Jun 2008 19:32:33 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=55#comment-63</guid>
		<description>Kewl!</description>
		<content:encoded><![CDATA[<p>Kewl!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on What is a Named Entity? by Bob Carpenter</title>
		<link>http://yooname.wordpress.com/2008/02/12/what-is-a-named-entity/#comment-57</link>
		<dc:creator>Bob Carpenter</dc:creator>
		<pubDate>Mon, 05 May 2008 20:49:33 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=47#comment-57</guid>
		<description>Any plans to share your corpus?

Though I like philosophy of language more than most (I taught it when I was a professor at Carnegie Mellon), I&#039;m not sure there&#039;s a place for Kripke&#039;s possible-worlds semantics notion of rigid designation in an engineering discipline.  I find its dependence on possible worlds rather circular.  And in that theory, &quot;water&quot; is typically taken to be non-rigid, whereas H2O is of debatable rigidity depending on your beliefs about the behavior of physics in other possible worlds.

Even names like &quot;Ronald Reagan&quot; and &quot;George W. Bush&quot; are not unique, even in 1985 or 2008.  So any notion of specificity is difficult to quantify.  Check out Russell&#039;s original theory and Strawson&#039;s replies about contextual dependence (not to mention Kaplan&#039;s cool early work on demonstrative pronouns and other indexicals).

Pascal makes a good point about brands: they&#039;re specific (abstract) entities.  How things get names and what names mean is another long detour through the philosophy of language that isn&#039;t particularly relevant for engineering.  It has an epistemological component concerned with how you learn what the names of things are.  And an ontological component because you can give the same thing (an ontological notion) different names (e.g. &quot;Morning Star&quot; and &quot;Evening Star&quot;, both of which refer to the planet Venus).</description>
		<content:encoded><![CDATA[<p>Any plans to share your corpus?</p>
<p>Though I like philosophy of language more than most (I taught it when I was a professor at Carnegie Mellon), I&#8217;m not sure there&#8217;s a place for Kripke&#8217;s possible-worlds semantics notion of rigid designation in an engineering discipline.  I find its dependence on possible worlds rather circular.  And in that theory, &#8220;water&#8221; is typically taken to be non-rigid, whereas H2O is of debatable rigidity depending on your beliefs about the behavior of physics in other possible worlds.</p>
<p>Even names like &#8220;Ronald Reagan&#8221; and &#8220;George W. Bush&#8221; are not unique, even in 1985 or 2008.  So any notion of specificity is difficult to quantify.  Check out Russell&#8217;s original theory and Strawson&#8217;s replies about contextual dependence (not to mention Kaplan&#8217;s cool early work on demonstrative pronouns and other indexicals).</p>
<p>Pascal makes a good point about brands: they&#8217;re specific (abstract) entities.  How things get names and what names mean is another long detour through the philosophy of language that isn&#8217;t particularly relevant for engineering.  It has an epistemological component concerned with how you learn what the names of things are.  And an ontological component because you can give the same thing (an ontological notion) different names (e.g. &#8220;Morning Star&#8221; and &#8220;Evening Star&#8221;, both of which refer to the planet Venus).</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on YooName&#8217;s creator honored at 2008 OCRI awards by Mauricio Nuñez</title>
		<link>http://yooname.wordpress.com/2008/04/08/yoonames-creator-honored-at-2008-ocri-awards/#comment-56</link>
		<dc:creator>Mauricio Nuñez</dc:creator>
		<pubDate>Fri, 11 Apr 2008 01:10:25 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=54#comment-56</guid>
		<description>Great news! Congratulations. I&#039;m a follower of your work and am trying to use it for my own business.
Regards

Mauricio</description>
		<content:encoded><![CDATA[<p>Great news! Congratulations. I&#8217;m a follower of your work and am trying to use it for my own business.<br />
Regards</p>
<p>Mauricio</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on What is a Named Entity? by Pascal</title>
		<link>http://yooname.wordpress.com/2008/02/12/what-is-a-named-entity/#comment-53</link>
		<dc:creator>Pascal</dc:creator>
		<pubDate>Wed, 13 Feb 2008 14:20:08 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/?p=47#comment-53</guid>
		<description>I would tend to disagree with some examples you provided. For instance:

iPhone: not a single individual according to you.

If iPhone were in the dictionary, I would expect it to be defined as something like &quot;Popular model of mobile phone marketed by Apple&quot;, which would be different from the definition of a mobile phone: &quot;Electronic device that allows the transmission of voice using wave frequencies&quot;, or something like that.

Thus, while people tend to refer to their iPhone as any device of this brand (my iPhone, your iPhone), I still see it as a unique entity: iPhone is a brand name (not the actual devices), just like Apple is a company name (and not the actual computers). iPhone is thus a unique individual, which is: THIS specific and unique brand. Yet I think this is debatable, so I&#039;d put an * next to the examples that contain brands in your list :)

In the case of Miss America, I also believe it is a single individual in a given context (time frame). Miss America refers to a single individual at a specific time, just like &quot;the President of the United States&quot;, which could be replaced by &quot;George W. Bush&quot; in a sentence in 2008, but that should be replaced by &quot;Ronald Reagan&quot; in a sentence written in 1985. On the other hand, I&#039;m not sure that it is a *named* entity, since it is not actually named at all.

I think that the generic/specific criterion is the best one to define a named entity, but it lacks an additional criterion: the &quot;named&quot; part. In the sentence: &quot;John Smith said that [...] while *he* was in Toronto&quot;, the word *he* refers to an individual, but *he* is not a named entity. For this reason &quot;The president of the United States&quot; would not qualify as a real *named* entity, at least linguistically. 

A geekier way to define a Named Entity: something that would need a GUID. Using this analogy, a C++ pointer to an individual identified with a GUID would be akin to an anaphoric expression:

&quot;Fido is my dog and he is happy&quot; 

Dog Fido(&quot;7855E60A-D97A-11DC-A110-85C856D89593&quot;);
Dog* he = &amp;Fido;
he-&gt;State = Dog::Mood::HAPPY;

:)</description>
		<content:encoded><![CDATA[<p>I would tend to disagree with some examples you provided. For instance:</p>
<p>iPhone: not a single individual according to you. </p>
<p>If iPhone were in the dictionary, I would expect it to be defined as something like &#8220;Popular model of mobile phone marketed by Apple&#8221;, which would be different from the definition of a mobile phone: &#8220;Electronic device that allows the transmission of voice using wave frequencies&#8221;, or something like that. </p>
<p>Thus, while people tend to refer to their iPhone as any device of this brand (my iPhone, your iPhone), I still see it as a unique entity: iPhone is a brand name (not the actual devices), just like Apple is a company name (and not the actual computers). iPhone is thus a unique individual, which is: THIS specific and unique brand. Yet I think this is debatable, so I&#8217;d put an * next to the examples that contain brands in your list :)</p>
<p>In the case of Miss America, I also believe it is a single individual in a given context (time frame). Miss America refers to a single individual at a specific time, just like &#8220;the President of the United States&#8221;, which could be replaced by &#8220;George W. Bush&#8221; in a sentence in 2008, but that should be replaced by &#8220;Ronald Reagan&#8221; in a sentence written in 1985. On the other hand, I&#8217;m not sure that it is a *named* entity, since it is not actually named at all. </p>
<p>I think that the generic/specific criterion is the best one to define a named entity, but it lacks an additional criterion: the &#8220;named&#8221; part. In the sentence: &#8220;John Smith said that [...] while *he* was in Toronto&#8221;, the word *he* refers to an individual, but *he* is not a named entity. For this reason &#8220;The president of the United States&#8221; would not qualify as a real *named* entity, at least linguistically. </p>
<p>A geekier way to define a Named Entity: something that would need a GUID. Using this analogy, a C++ pointer to an individual identified with a GUID would be akin to an anaphoric expression:</p>
<p>&#8220;Fido is my dog and he is happy&#8221; </p>
<p>Dog Fido(&quot;7855E60A-D97A-11DC-A110-85C856D89593&quot;);<br />
Dog* he = &amp;Fido;<br />
he-&gt;State = Dog::Mood::HAPPY;</p>
<p>:)</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on GeoNames&#8217; Inside by What is a Named Entity? &#171; YooName - named entity recognition</title>
		<link>http://yooname.wordpress.com/2007/10/01/geonames-inside/#comment-49</link>
		<dc:creator>What is a Named Entity? &#171; YooName - named entity recognition</dc:creator>
		<pubDate>Tue, 12 Feb 2008 13:37:07 +0000</pubDate>
		<guid isPermaLink="false">http://yooname.wordpress.com/2007/10/01/geonames-inside/#comment-49</guid>
		<description>[...] expanded the number of types to 100, as guided by our definition. We calculated that less than 1% of the millions of entities we have are ambiguous with sets of words that are not handled so far. The problem is that this 1% [...]</description>
		<content:encoded><![CDATA[<p>[...] expanded the number of types to 100, as guided by our definition. We calculated that less than 1% of the millions of entities we have are ambiguous with sets of words that are not handled so far. The problem is that this 1% [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
