<?xml version="1.0"?>
<News hasArchived="true" page="8" pageCount="9" pageSize="10" timestamp="Sun, 10 May 2026 08:24:58 -0400" url="https://beta.my.umbc.edu/groups/ebiquity/posts.xml?page=8&amp;tag=machine-learning">
<NewsItem contentIssues="true" id="8563" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/8563">
<Title>Detecting fake Google+ profiles with image search</Title>
<Body>
<![CDATA[
    <div class="html-content">
    <p><img src="http://ebiquity.umbc.edu/blogger/wp-content/uploads/2011/09/janetSmith.jpg" alt="" width="500" height="330" style="max-width: 100%; height: auto;"></p>
    <p>Many Google+ users have been reporting frequent notices about new followers whom they don’t know and who appear to be attractive young women. The suspicious followers have minimal profiles and no posts. These are obviously false accounts being created for some as yet unknown purpose, but how can one prove it?</p>
    <p>I just got a notice, for example, that <a href="https://plus.google.com/103336248599628782640/" rel="nofollow external" class="bo">Janet Smith</a> of Philadelphia is following me. Now Janet Smith is a common name and Philadelphia is a big place — there are probably hundreds of people who live in the Philadelphia area with that name. The 990 other people she’s following seem like a pretty random bunch, though I do know many and have more than a few in my own circles. Most seem to have a fair number of followers.</p>
    <p>So there is not much to go on other than her profile image. This is a great use for <a href="http://bit.ly/qayRhT" rel="nofollow external" class="bo">Google’s new image search</a>. I dragged the picture into the image search query field and Google identified its best guess for the image as Indian actress <a href="http://bit.ly/oDiyhf" rel="nofollow external" class="bo">Koyel Mullick</a>. Sure enough, if you <a href="http://bit.ly/oT3GcT" rel="nofollow external" class="bo">search</a> for images with her name, the precise Janet Smith image is result number 15.</p>
    <p>Of course, there are still some subtle issues.  This is just one kind of false profile — one created for one identity but using an image from a different one.  It’s common on most social media systems, including G+, for some people to use a picture of someone or something other than themselves.  But it’s obvious to a human viewer that using a picture of a rabbit, Marilyn Monroe or the mighty Thor on your profile is not meant to deceive.  It will be challenging to automate the process of discriminating the intent to deceive from modesty, homage or an ironic gesture.</p>
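    <p>As a rough illustration of how the image comparison step could be automated, here is a minimal Python sketch of near-duplicate matching with an average hash. This is a swapped-in, textbook technique and not a description of how Google's image search works internally. The function names, the tiny 4x4 grayscale grids and the <code>max_distance</code> threshold are all assumptions made for the example, and the sketch assumes both images have already been reduced to the same small grayscale grid.</p>

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values, same size for both images


def average_hash(pixels: Gray) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def likely_same_image(a: Gray, b: Gray, max_distance: int = 2) -> bool:
    # max_distance is an arbitrary illustrative threshold, not a tuned value.
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical data: a profile picture, a uniformly brightened copy of it,
# and an unrelated picture with the opposite light/dark pattern.
profile = [[10, 200, 30, 220], [15, 210, 25, 230],
           [12, 205, 28, 225], [11, 199, 31, 221]]
brightened = [[v + 20 for v in row] for row in profile]
unrelated = [[255 - v for v in row] for row in profile]

print(likely_same_image(profile, brightened))
print(likely_same_image(profile, unrelated))
```

    <p>A match of this kind only suggests that two pictures are probably the same image. It says nothing about intent, which is the harder problem described above.</p>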
    </div>
]]>
</Body>
<Summary>Many Google+ users have been reporting frequent notices about new followers that they don’t know and appear to be attractive young women. The suspicious followers have minimal profiles...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/09/11/detecting-fake-google-profiles-with-image-search/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/8563/guest@my.umbc.edu/e532f54bb90c06ab7bdaa7e2d2bddc17/api/pixel</TrackingUrl>
<Tag>machine-learning</Tag>
<Tag>semantic-web</Tag>
<Tag>social-media</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>8</PawCount>
<CommentCount>1</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Sun, 11 Sep 2011 17:00:42 -0400</PostedAt>
<EditAt>Sun, 11 Sep 2011 17:00:42 -0400</EditAt>
</NewsItem>

<NewsItem contentIssues="true" id="8407" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/8407">
<Title>Mid-Atlantic student colloquium on speech, language...</Title>
<Body>
<![CDATA[
    <div class="html-content">Full Title: Mid-Atlantic student colloquium on speech, language and learning
    <img src="http://ebiquity.umbc.edu/blogger/wp-content/uploads/2011/09/stieff.png" alt="" width="505" height="223" style="max-width: 100%; height: auto;">
    <p>The First <a href="http://sites.google.com/site/studentcolloquiumsll/" rel="nofollow external" class="bo">Mid-Atlantic Student Colloquium on Speech, Language and Learning</a> is a one-day event to be held at the Johns Hopkins University in Baltimore on Friday, 23 September 2011.  Its goal is to bring together students taking computational approaches to speech, language, and learning, so that they can introduce their research to the local student community, give and receive feedback, and engage each other in collaborative discussion.  Attendance is open to all and free but space is limited, so online <a href="https://sites.google.com/site/studentcolloquiumsll/registration" rel="nofollow external" class="bo">registration</a> is requested by September 16.  The <a href="https://sites.google.com/site/studentcolloquiumsll/program" rel="nofollow external" class="bo">program</a> runs from 10:00am to 5:00pm and will  include oral presentations, poster sessions, and breakout sessions.</p>
    </div>
]]>
</Body>
<Summary>Full Title: Mid-Atlantic student colloquium on speech, language and learning The First Mid-Atlantic Student Colloquium on Speech, Language and Learning is a one-day event to be held at the...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/09/02/mid-atlantic-student-colloquium-on-speech-language-and-learning/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/8407/guest@my.umbc.edu/afed2d18f40a33fa94d36cb1bd32e518/api/pixel</TrackingUrl>
<Tag>ai</Tag>
<Tag>conferences</Tag>
<Tag>kr</Tag>
<Tag>machine-learning</Tag>
<Tag>nlp</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>1</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Fri, 02 Sep 2011 22:31:15 -0400</PostedAt>
<EditAt>Fri, 02 Sep 2011 22:31:15 -0400</EditAt>
</NewsItem>

<NewsItem contentIssues="false" id="7947" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/7947">
<Title>Free online courses on AI, databases and machine learning</Title>
<Body>
<![CDATA[
    <div class="html-content">
    <p>Stanford is experimenting with an interesting <a href="http://www.nytimes.com/2011/08/16/science/16stanford.html" rel="nofollow external" class="bo">idea</a> — offering some of their most popular undergraduate computer science courses online for free and simultaneously with their regular offerings.  An AI course was <a href="http://my.umbc.edu/groups/csee/media/1200" rel="nofollow external" class="bo">announced</a> several weeks ago and now there are similar offerings for databases and machine learning.  These are taught by first rate instructors (who are also top researchers!) and are the same courses that Stanford students take.</p>
    <ul>
    <li>“A bold experiment in distributed education, <a href="http://www.ai-class.com/" rel="nofollow external" class="bo">“Introduction to Artificial Intelligence”</a> will be offered free and online to students worldwide during the fall of 2011. The course will include feedback on progress and a statement of accomplishment. Taught by Sebastian Thrun and Peter Norvig, the curriculum draws from that used in Stanford’s introductory Artificial Intelligence course. The instructors will offer similar materials, assignments, and exams.”</li>
    <li>“A bold experiment in distributed education, <a href="http://www.db-class.org/" rel="nofollow external" class="bo">“Introduction to Databases”</a> will be offered free and online to students worldwide during the fall of 2011. Students will have access to lecture videos, receive regular feedback on progress, and receive answers to questions. When you successfully complete this class, you will also receive a statement of accomplishment. Taught by Professor Jennifer Widom, the curriculum draws from Stanford’s popular Introduction to Databases course.”</li>
    <li>“A bold experiment in distributed education, <a href="http://www.ml-class.org/" rel="nofollow external" class="bo">“Machine Learning”</a> will be offered free and online to students worldwide during the fall of 2011. Students will have access to lecture videos, lecture notes, receive regular feedback on progress, and receive answers to questions. When you successfully complete the class, you will also receive a statement of accomplishment. Taught by Professor Andrew Ng, the curriculum draws from Stanford’s popular Machine Learning course.”</li>
    </ul>
    <p>If successful, this might be a game changer.  Two weeks after the online AI course was announced, 56,000 students had signed up!  The approach might work for many disciplines, not just CS. The <a href="http://www.khanacademy.org/" rel="nofollow external" class="bo">Khan Academy</a> is a related effort.</p>
    <p>Universities should keep an eye on these courses and think about how to adapt if they are successful.  Most of our students will probably benefit from taking our traditional courses.  If so, we should be able to explain the benefits of taking them (and make sure we deliver those benefits).  At the same time, we may want to leverage the online material from these courses in a synergistic way.</p>
    </div>
]]>
</Body>
<Summary>Stanford is experimenting with an interesting idea: offering some of their most popular undergraduate computer science courses online for free and simultaneously with their regular...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/08/16/free-online-courses-on-ai-databases-and-machine-learning/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/7947/guest@my.umbc.edu/49834dbf0d3e86997046097d238df835/api/pixel</TrackingUrl>
<Tag>ai</Tag>
<Tag>cs</Tag>
<Tag>database</Tag>
<Tag>machine-learning</Tag>
<Tag>social-media</Tag>
<Tag>web</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>3</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Tue, 16 Aug 2011 01:32:13 -0400</PostedAt>
</NewsItem>

<NewsItem contentIssues="false" id="7543" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/7543">
<Title>Mid-Atlantic Student Colloquium on Speech, Language...</Title>
<Body>
<![CDATA[
    <div class="html-content">Full Title: Mid-Atlantic Student Colloquium on Speech, Language and Learning, 23 Sept 2011
    <p>The <a href="https://sites.google.com/site/studentcolloquiumsll/" rel="nofollow external" class="bo">Mid-Atlantic Student Colloquium on Speech, Language and Learning</a> is a free, one-day event bringing together faculty, researchers and students from universities in the Mid-Atlantic area working in Speech/Language/ML. The colloquium is an opportunity for students to present preliminary or completed work and to network with other students, faculty and researchers working in related fields. The event will be held in Baltimore, MD at the Johns Hopkins University on Friday, 23 September 2011.</p>
    <p>Students are encouraged to submit one-page abstracts by Monday, August 15 describing ongoing, planned, or completed research projects, including previously published results and negative results. Student research in any field applying computational methods to any aspect of human language, including speech and learning, from all areas of computer science, linguistics, engineering, neuroscience, information science, and related fields, is welcome. Submissions and presentations must be made by students or postdocs.  See the <a href="https://sites.google.com/site/studentcolloquiumsll/call-for-papers" rel="nofollow external" class="bo">call for papers</a> for more information.</p>
    <p>Accepted submissions will be presented as posters and each will also be given a one-minute presentation during a poster spotlight session. A small number of submissions will be selected to be presented as talks, on the basis of diversity and general interest.</p>
    <p>Student-led breakout sessions of one hour will also be held to discuss papers on topics of interest and stimulate interaction and discussion. Topics and suggested papers for breakout sessions should be submitted by students alongside abstracts.</p>
    <p>The event is sponsored by the <a href="http://web.jhu.edu/hltcoe" rel="nofollow external" class="bo">Human Language Technology Center of Excellence</a> and the <a href="http://www.clsp.jhu.edu/" rel="nofollow external" class="bo">Center for Language and Speech Processing</a> at the Johns Hopkins University.</p>
    </div>
]]>
</Body>
<Summary>Full Title: Mid-Atlantic Student Colloquium on Speech, Language and Learning, 23 Sept 2011 The Mid-Atlantic Student Colloquium on Speech, Language and Learning is a one day, free event...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/07/13/mid-atlantic-student-colloquium-on-speech-language-and-learning-23-sept-2011/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/7543/guest@my.umbc.edu/5ac72c887aa8a00f3aad4d79035c9f64/api/pixel</TrackingUrl>
<Tag>ai</Tag>
<Tag>machine-learning</Tag>
<Tag>nlp</Tag>
<Tag>semantic-web</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>1</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Wed, 13 Jul 2011 23:24:47 -0400</PostedAt>
<EditAt>Wed, 13 Jul 2011 23:24:47 -0400</EditAt>
</NewsItem>

<NewsItem contentIssues="true" id="6233" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/6233">
<Title>New frontiers in spam: the Kindle Swindle</Title>
<Body>
<![CDATA[
    <div class="html-content">
    <p>Publishing Trends has a good <a href="http://www.publishingtrends.com/2011/03/the-kindle-swindle/" rel="nofollow external" class="bo">post</a> describing a new variation on spam: creating low-quality ebooks from plagiarized or public-domain content and selling them in ebook markets like Amazon’s Kindle store.  If you want to MAKE.MONEY.FAST there are <a href="http://www.warriorforum.com/warrior-special-offers-forum/354604-no-way-no-work-just-income-brand-new-hands-free-passive-income-special-wednesday-price.html" rel="nofollow external" class="bo">people</a> willing to help:</p>
    <img src="http://ebiquity.umbc.edu/blogger/wp-content/uploads/2011/04/kindleCash.png" alt="" width="450" height="345" style="max-width: 100%; height: auto;">
    <p>Automatically detecting these spam ebooks might be a good machine learning project.  One problem is that using features of the ebook itself (e.g., poor formatting) might require purchasing it.  But the ebook store is sure to provide many useful features of its own that might support an effective classifier.</p>
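    <p>To make the idea concrete, here is a minimal Python sketch of a classifier trained only on listing metadata, so that nothing has to be purchased. Everything specific in it is an assumption for illustration: the three features (publishing rate of the author, price, number of reviews, each hand-scaled to the range 0 to 1), the six toy listings and their labels are invented, and plain logistic regression is just one simple choice of model. It does not use any real store API or real data.</p>

```python
import math
from typing import List, Sequence


def predict_proba(weights: Sequence[float], x: Sequence[float]) -> float:
    """Probability that a listing is spam; the last weight is the bias term."""
    z = sum(w * xj for w, xj in zip(weights, x)) + weights[-1]
    return 1.0 / (1.0 + math.exp(-z))


def train(rows: Sequence[Sequence[float]], labels: Sequence[int],
          lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Fit logistic regression with plain batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err  # gradient for the bias
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Hypothetical, hand-made listings: [publishing_rate, price, review_count],
# each already scaled to 0..1. Label 1 = spam, 0 = legitimate.
ROWS = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [1.0, 0.1, 0.0],
    [0.1, 0.6, 0.7], [0.0, 0.8, 0.9], [0.2, 0.5, 0.6],
]
LABELS = [1, 1, 1, 0, 0, 0]

weights = train(ROWS, LABELS)
print(round(predict_proba(weights, [0.95, 0.05, 0.0]), 3))  # prolific, cheap, unreviewed
print(round(predict_proba(weights, [0.05, 0.7, 0.8]), 3))   # occasional, pricier, reviewed
```

    <p>With real listings, the interesting work would be choosing and validating the features; the toy data here is separable by construction and says nothing about how well such a classifier would work in practice.</p>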
    <p>(h/t <a href="http://www.schneier.com/blog/archives/2011/04/ebook_fraud.html" rel="nofollow external" class="bo">Bruce Schneier</a>)</p>
    </div>
]]>
</Body>
<Summary>Publishing Trends has a good post describing a new variation on spam: creating low-quality ebooks from plagiarized or public-domain content and selling them in ebook markets like Amazon’s...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/04/06/new-frontiers-in-spam-the-kindle-swindle/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/6233/guest@my.umbc.edu/4e7420bbbbae6915378a76f668011647/api/pixel</TrackingUrl>
<Tag>machine-learning</Tag>
<Tag>spam</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>3</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Wed, 06 Apr 2011 09:10:29 -0400</PostedAt>
<EditAt>Wed, 06 Apr 2011 09:10:29 -0400</EditAt>
</NewsItem>

<NewsItem contentIssues="false" id="6211" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/6211">
<Title>DARPA uses computer game to learn anti-submarine...</Title>
<Body>
<![CDATA[
    <div class="html-content">Full Title: DARPA uses computer game to learn anti-submarine warfare tactics
    <p>DARPA is developing a new component to track “quiet submarines” as part of the Navy’s Anti-Submarine Warfare toolkit and is using a <a href="http://www.darpa.mil/NewsEvents/Releases/2011/2011/04/04_DARPA%E2%80%99s_Anti-Submarine_Warfare_game_goes_live.aspx" rel="nofollow external" class="bo">software game</a> to collect effective strategies for its use.</p>
    <blockquote>
    <p> “Before autonomous software is developed for ACTUV’s computers, DARPA needs to determine what approaches and methods are most effective. To gather information from a broad spectrum of users, ACTUV has been integrated into the Dangerous Waters™ game. DARPA is offering this new ACTUV Tactics Simulator for free public <a href="https://actuv.darpa.mil" rel="nofollow external" class="bo">download</a>.</p>
    <p>This software has been written to simulate actual evasion techniques used by submarines, challenging each player to track them successfully. Your tracking vessel is not the only ship at sea, so you’ll need to safely navigate among commercial shipping traffic as you attempt to track the submarine, whose driver has some tricks up his sleeve. You will earn points as you complete mission objectives, and will have the opportunity to see how you rank against the competition on DARPA’s <a href="https://actuv.darpa.mil/LeaderBoard.aspx" rel="nofollow external" class="bo">leaderboard page</a>.  You can also share your experiences and insights from playing the simulator with others.”</p>
    </blockquote>
    <p>This is a kind of crowdsourcing — leveraging the experiences of a large number of people playing a game.  Applying various kinds of machine learning algorithms to the simulator data could be an effective way to train an autonomous tool for this task.</p>
    </div>
]]>
</Body>
<Summary>Full Title: DARPA uses computer game to learn anti-submarine warfare tactics DARPA is developing a new component to track “quiet submarines” to be part of the Navy’s Anti Submarine Warfare...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/04/05/darpa-uses-computer-game-to-learn-anti-submarine-warfare-tactics/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/6211/guest@my.umbc.edu/6ffe9e714972fc38c2468194f1da6f95/api/pixel</TrackingUrl>
<Tag>ai</Tag>
<Tag>gaim</Tag>
<Tag>machine-learning</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>3</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Tue, 05 Apr 2011 10:16:36 -0400</PostedAt>
<EditAt>Tue, 05 Apr 2011 10:16:36 -0400</EditAt>
</NewsItem>

<NewsItem contentIssues="false" id="5379" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/5379">
<Title>Did Watson enjoy a head start on Jeopardy?</Title>
<Body>
<![CDATA[
    <div class="html-content">
    <p><img src="http://ebiquity.umbc.edu/blogger/wp-content/uploads/2011/02/watson_the_computer_beats_ken_jennings_and_brad_rutter_at_jeopardy_full.jpg" alt="IBM's Watson on Jeopardy!" width="510" height="224" style="max-width: 100%; height: auto;"></p>
    <p>IBM’s Watson’s performance in last week’s Jeopardy Challenge was an amazing accomplishment and a demonstration of how our computer systems are becoming more intelligent and capable of solving difficult tasks.</p>
    <p>But I wonder whether the way the questions were given to the human players and to Watson gave Watson a short but significant head start.  According to the <a href="http://www.nytimes.com/2010/06/20/magazine/20Computer-t.html" rel="nofollow external" class="bo">New York Times</a>:</p>
    <blockquote><p>
    “During the sparring matches, Watson received the questions as electronic texts at the same moment they were made visible to the human players;”
    </p></blockquote>
    <p>Once Watson received a query, it could process it immediately.  While the human contestants got to see the query as written text at the same time, Alex Trebek also started reading the question aloud.  When I was watching Jeopardy, I found it almost impossible to read and understand the question more quickly than it was being spoken, and I suspect that Ken Jennings and Brad Rutter might have as well.  It’s often observed that people find it very difficult to process two language streams simultaneously.  While it took Trebek only a second or two to read the short Jeopardy queries, that could have given Watson a significant head start, enabling it to determine that it had a good answer and press its buzzer before the competition.</p>
    <p>If this is the case, I am not sure that it is an unfair advantage.  People and computers each have native advantages and disadvantages.  If Jennings and Rutter had received the questions as text without them being simultaneously read aloud, Watson might still have had the advantage of a quicker start.</p>
    </div>
]]>
</Body>
<Summary>IBM’s Watson’s performance in last week’s Jeopardy Challenge was an amazing accomplishment and a demonstration of how our computer systems are becoming more intelligent and capable of...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/02/22/did-watson-enjoy-a-head-start-on-jeopardy/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/5379/guest@my.umbc.edu/a8c34d71b6bc1c4b1696d7c44947935d/api/pixel</TrackingUrl>
<Tag>ai</Tag>
<Tag>machine-learning</Tag>
<Tag>semantic-web</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>9</PawCount>
<CommentCount>2</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Tue, 22 Feb 2011 09:12:14 -0500</PostedAt>
<EditAt>Tue, 22 Feb 2011 09:12:14 -0500</EditAt>
</NewsItem>

<NewsItem contentIssues="false" id="5170" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/5170">
<Title>Six lessons for the age of machines</Title>
<Body>
<![CDATA[
    <div class="html-content">
    <p>On the eve of the big Jeopardy! match, Peter Norvig’s opinion piece in the New York Post (!) today, <a href="http://www.nypost.com/p/news/opinion/opedcolumnists/the_machine_age_tM7xPAv4pI4JslK0M1JtxI" rel="nofollow external" class="bo">The Machine Age</a>, looks at AI’s progress over the past sixty years and lays out six surprising lessons we’ve learned.</p>
    <ul>
    <li>The things we thought were hard turned out to be easier.</li>
    <li>Dealing with uncertainty turned out to be more important than thinking with logical precision.</li>
    <li>Learning turned out to be more important than knowing.</li>
    <li>Current systems are more likely to be built from examples than from logical rules.</li>
    <li>The focus shifted from replacing humans to augmenting them.</li>
    <li>The partnership between human and machine is stronger than either one alone.</li>
    </ul>
    <p>When I took Pat Winston’s undergraduate AI class in 1970, only the first of those ideas was current. It’s a good essay.</p>
    <p>Of course, after we’ve exploited the new data-driven, statistical paradigm for the next decade or so, we’ll probably have to go back to figuring out how to get logic back into the framework.</p>
    </div>
]]>
</Body>
<Summary>On the eve of the big Jeopardy! match, Peter Norvig’s opinion piece in the New York Post (!) today, The Machine Age looks at AI’s progress over the past sixty years and lays out six...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/02/13/six-lessons-for-the-age-of-machines/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/5170/guest@my.umbc.edu/a9298d503cf2b8d7e7cab5df8102125e/api/pixel</TrackingUrl>
<Tag>ai</Tag>
<Tag>datamining</Tag>
<Tag>machine-learning</Tag>
<Tag>nlp</Tag>
<Tag>semantic-web</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>3</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Sun, 13 Feb 2011 23:19:35 -0500</PostedAt>
<EditAt>Sun, 13 Feb 2011 23:19:35 -0500</EditAt>
</NewsItem>

<NewsItem contentIssues="false" id="5156" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/5156">
<Title>Science on Dealing with Data</Title>
<Body>
<![CDATA[
    <div class="html-content">
    <p>The current (11 February 2011) issue of Science is a special issue on <a href="http://www.sciencemag.org/site/special/data/" rel="nofollow external" class="bo">Dealing with Data</a>.  It includes a collection of free, online articles that “highlights both the challenges posed by the data deluge and the opportunities that can be realized if we can better organize and access the data.”  Some of the articles are drawn from three sister publications: <em>Science Signaling</em>, <em>Science Translational Medicine</em> and <em>Science Careers</em>.</p>
    <p>From the issue’s <a href="http://www.sciencemag.org/content/331/6018/692.short" rel="nofollow external" class="bo">introduction</a>:</p>
    <p><a href="http://www.sciencemag.org/site/special/data/ScienceData-hi.pdf" rel="nofollow external" class="bo"><img src="http://ebiquity.umbc.edu/blogger/wp-content/uploads/2011/02/data-cover.jpg" alt="Special issue of Science on Dealing with Data" width="97" height="124" style="max-width: 100%; height: auto;"></a></p>
    <blockquote><p> “Scientific innovation has been called on to spur economic recovery; science and technology are essential to improving public health and welfare and to inform sustainability; and the scientific community has been criticized for not being sufficiently accountable and transparent. Data collection, curation, and access are central to all of these issues.<br>
    …<br>
    As you will discover, two themes appear repeatedly: Most scientific disciplines are finding the data deluge to be extremely challenging, and tremendous opportunities can be realized if we can better organize and access the data.”  </p></blockquote>
    <p>One of the great things about the “data deluge” is that there is something in it for almost all computer science researchers, including those working in areas like machine learning, data mining, NLP, visualization, semantic web, security and privacy, social media, high performance computing, HCI, etc. Here are some of the articles that caught our eye:</p>
    <ul>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/649.short" rel="nofollow external" class="bo">Editorial: Making Data Maximally Available</a>, B. Hanson, A. Sugden, B. Alberts</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/705.short" rel="nofollow external" class="bo">Changing the Equation on Scientific Data Visualization</a>,  P. Fox and J. Hendler</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/719.short" rel="nofollow external" class="bo">Ensuring the Data-Rich Future of the Social Sciences</a>, G. King</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/721.short" rel="nofollow external" class="bo">Metaknowledge</a>,  J. A. Evans and J. G. Foster</li>
    <li>
    <a href="http://stm.sciencemag.org/content/3/69/69cm4.abstract" rel="nofollow external" class="bo">Electronic Consent Channels: Preserving Patient Privacy Without Handcuffing Researchers</a>, R. H. Shelton</li>
    <li>
    <a href="http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/2011_02_11/caredit.a1100012" rel="nofollow external" class="bo">More than Words: Biomedical Ontologies Provide New Scientific Opportunities</a>, C. Wald</li>
    </ul>
    <p>and still more that look very interesting:</p>
    <ul>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/700.short" rel="nofollow external" class="bo">Climate Data Challenges in the 21st Century</a>,  J. T. Overpeck <em>et al</em>.</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/703.short" rel="nofollow external" class="bo">Challenges and Opportunities of Open Data in Ecology</a>,  O. J. Reichman <em>et al</em>.</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/708.short" rel="nofollow external" class="bo">Challenges and Opportunities in Mining Neuroscience Data</a>,  H. Akil <em>et al</em>.</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/712.short" rel="nofollow external" class="bo">The Disappearing Third Dimension</a>,  T. Rowe and L. R. Frank</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/714.short" rel="nofollow external" class="bo">Advancing Global Health Research Through Digital Technology and Sharing Data</a>,  T. Lang</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/717.short" rel="nofollow external" class="bo">More Is Less: Signal Processing and the Data Deluge</a>,  R. G. Baraniuk</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/725.short" rel="nofollow external" class="bo">Access to Stem Cells and Data: Persons, Property Rights, and Scientific Progress</a>,  D. J. H. Mathews <em>et al</em>.</li>
    <li>
    <a href="http://www.sciencemag.org/content/331/6018/728.short" rel="nofollow external" class="bo">On the Future of Genomic Data</a>,  S. D. Kahn</li>
    <li>
    <a href="http://stke.sciencemag.org/cgi/content/abstract/scisignal.2001871" rel="nofollow external" class="bo">Conquering the Data Mountain</a>, N. R. Gough and M. B. Yaffe</li>
    <li>
    <a href="http://stm.sciencemag.org/content/3/69/69cm3.abstract" rel="nofollow external" class="bo">Power to the People: Participant Ownership of Clinical Trial Data</a>, S. F. Terry and P. F. Terry</li>
    <li>
    <a href="http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/2011_02_11/caredit.a1100013" rel="nofollow external" class="bo">Surfing the Tsunami</a>, E. Pain</li>
    <li>
    <a href="http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/2011_02_11/caredit.a1100014" rel="nofollow external" class="bo">Sharing Data in Biomedical and Clinical Research</a>, K. Travis</li>
    </ul>
    </div>
]]>
</Body>
<Summary>The current (11 February 2011) issue of Science is a special issue on Dealing with Data.  It includes a collection of free, online articles that “highlights both the challenges posed by the...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2011/02/12/science-on-dealing-with-data/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/5156/guest@my.umbc.edu/ef37de7c7e5917cb73b9d8f302326611/api/pixel</TrackingUrl>
<Tag>machine-learning</Tag>
<Tag>semantic-web</Tag>
<Tag>social-media</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>11</PawCount>
<CommentCount>1</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Sat, 12 Feb 2011 18:52:36 -0500</PostedAt>
<EditAt>Sat, 12 Feb 2011 18:52:36 -0500</EditAt>
</NewsItem>

<NewsItem contentIssues="true" id="3935" important="false" status="posted" url="https://beta.my.umbc.edu/groups/ebiquity/posts/3935">
<Title>Naive Bayes classifier in 50 lines</Title>
<Body>
<![CDATA[
    <div class="html-content">
    <p>The <a href="http://en.wikipedia.org/wiki/Naive_Bayes_classifier" rel="nofollow external" class="bo">Naive Bayes classifier</a> is one of the most versatile machine learning algorithms that I have seen around during my meager experience as a graduate student, and I wanted to do a toy implementation for fun. At its core, the implementation is reduced to a form of counting, and the entire Python module, including a test harness, took only 50 lines of code. I haven’t really evaluated the performance, so I welcome any comments. I am a Python amateur, and am sure that experienced Python hackers can trim a few rough edges off this code.</p>
    <h2>Intuition and Design</h2>
    <p>Here is a definition of the classifier (from Wikipedia):</p>
    <p><img src="http://upload.wikimedia.org/math/c/2/e/c2e227dfe0979e43cf06bfa318652dd3.png" alt="" width="504" height="55" style="max-width: 100%; height: auto;"></p>
    <p>This means that, for each possible class label, we multiply together the conditional probabilities of each feature given that class label. So to implement the classifier, all we need to do is compute these individual conditional probabilities p(Fi | Cj) for each label and each feature, and multiply them together with the prior probability for that label, p(Cj). The label for which we get the largest product is the label returned by the classifier.</p>
    <p>In order to compute these individual conditional probabilities, we use the <a href="http://en.wikipedia.org/wiki/Maximum_likelihood" rel="nofollow external" class="bo">Maximum Likelihood Estimation</a> method. In short, we approximate these probabilities using the counts from the input/training vectors.</p>
    <p>Hence we have: p(Fi | Cj) = count( Fi ^ Cj) / count(Cj)</p>
    <p>That is, from the training corpus we compute the ratio of the number of times the feature Fi and the label Cj occur together to the total number of occurrences of the label Cj.</p>
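    <p>As a rough illustration of this counting, here is a minimal Python sketch. The toy (outlook, play) pairs and the helper name <em>mle</em> are made up for this example and are not taken from the module described in this post.</p>

```python
from collections import Counter

# Hypothetical toy training set: (feature value, class label) pairs.
training = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
    ("rainy", "yes"), ("rainy", "no"), ("sunny", "yes"),
]

pair_counts = Counter(training)                          # count(Fi ^ Cj)
label_counts = Counter(label for _, label in training)   # count(Cj)

def mle(feature_value, label):
    """Unsmoothed estimate p(Fi | Cj) = count(Fi ^ Cj) / count(Cj)."""
    return pair_counts[(feature_value, label)] / label_counts[label]

print(mle("sunny", "no"))     # 2 of the 3 "no" examples are sunny
print(mle("overcast", "no"))  # never seen together in the toy data
```

    <p>Note that a pair that never appears together in the training data gets an estimate of exactly zero.</p>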
    <h2>Zero Probability Problem</h2>
    <p>What if we have never seen a particular feature Fa and a particular label Cb together in the training dataset? Whenever they occur in the test data, p(Fa | Cb) will be zero, and hence the overall product will also be zero. This is a problem with maximum likelihood estimates: just because a particular observation was not made during training does not mean that it will never occur in the test data. To remedy this, we use what is known as smoothing. The simplest kind of smoothing, and the one used in this code, is called “add-one smoothing”. Essentially, the probability of an unseen event should be greater than zero. We achieve this by adding one to every count, so that no count is zero. The net effect is that we redistribute some of the probability mass from the non-zero count observations to the zero-count observations. Hence, we also need to increase the total count for each label by the number of possible observations, in order to keep the total probability mass at 1.</p>
    <p>For example, if we have two classes C = 0 and C = 1, then after smoothing, the smoothed MLE probabilities can be written as:</p>
    <p>p-smoothed(Fi | Cj) = [count(Fi ^ Cj) + 1]/[count(Cj) + N], where N is the total number of possible observations across all features in the training corpus.</p>
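    <p>To make the effect of smoothing concrete, here is a small sketch with a single feature. The data and the helper name <em>smoothed</em> are hypothetical. With only one feature, N is simply the number of values that feature can take, which is the assumption made here.</p>

```python
from collections import Counter

# Hypothetical toy data with one feature (outlook) and a class label (play).
training = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
    ("rainy", "yes"), ("rainy", "no"), ("sunny", "yes"),
]
outlook_values = ["sunny", "overcast", "rainy"]   # all possible observations

pair_counts = Counter(training)
label_counts = Counter(label for _, label in training)

def smoothed(feature_value, label, n_possible):
    """Add-one estimate: [count(Fi ^ Cj) + 1] / [count(Cj) + N]."""
    return (pair_counts[(feature_value, label)] + 1) / (label_counts[label] + n_possible)

n = len(outlook_values)
for value in outlook_values:
    print(value, smoothed(value, "no", n))

# The smoothed estimates for one label still form a probability distribution.
print(sum(smoothed(value, "no", n) for value in outlook_values))
```

    <p>The unseen pair (overcast, no) now gets a small non-zero probability, and the estimates for the label still sum to one.</p>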
    <h2>Code</h2>
    <p>For simplicity, we will use Weka’s <a href="http://www.cs.waikato.ac.nz/~ml/weka/arff.html" rel="nofollow external" class="bo">ARFF</a> file format as input. We have a single class called Model which has a few dictionaries and lists to store the counts and feature vector details. In this implementation, we only deal with discrete valued features.</p>
    <p></p>
    <p>The dictionary ‘<em>features</em>’ stores all possible values for each feature. ‘<em>featureNameList</em>’ is simply a list that contains the names of the features in the same order that they appear in the ARFF file. This is because our features dictionary does not have any intrinsic order, and we need to maintain feature order explicitly. ‘<em>featureCounts</em>’ contains the actual counts for co-occurrence of each feature value with each label value. The keys for this dictionary are tuples of the form (class_label, feature_name, feature_value). Hence, if we have observed the feature F1 with the value ‘x’ for the label ‘yes’ fifteen times, then we will have the entry {(‘yes’, ‘F1’, ‘x’): 15} in the dictionary. <strong>Note</strong> how the default value for counts in this dictionary is ‘1’ instead of ‘0’. This is because we are smoothing the counts. The ‘<em>featureVectors</em>’ list contains all the input feature vectors from the ARFF file. The last feature in each vector is the class label itself, as is the convention with Weka ARFF files. Finally, ‘<em>labelCounts</em>’ stores the counts of the class labels themselves, i.e. how many times we saw the label Ci during training.</p>
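    <p>A minimal sketch of a class holding these data structures might look like the following. The attribute names come from the description above; the constructor argument and the use of <em>defaultdict</em> are assumptions made for illustration.</p>

```python
from collections import defaultdict

class Model:
    def __init__(self, arff_file):
        self.trainingFile = arff_file        # path to the ARFF file (assumed argument)
        self.features = {}                   # feature name -> list of possible values
        self.featureNameList = []            # feature names in ARFF order
        # Keys are (class_label, feature_name, feature_value) tuples.
        # Counts default to 1 rather than 0, which gives add-one smoothing.
        self.featureCounts = defaultdict(lambda: 1)
        self.featureVectors = []             # rows from the ARFF file; label is last
        self.labelCounts = defaultdict(int)  # class label -> count, defaults to 0

model = Model("tennis.arff")
print(model.featureCounts[("yes", "F1", "x")])   # an unseen key starts at the default
```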
    <p>We also have the following member functions in the Model class:<br>
    </p>
    <p>The above method simply reads the feature names (including class labels), their possible values, and the feature vectors themselves, and populates the appropriate data structures defined above.<br>
    </p>
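    <p>A reader of this kind could be sketched as below. The function name <em>read_arff</em> is hypothetical, and the parser only handles a small subset of ARFF: nominal attributes declared with braces, comma-separated data rows, and no quoting or missing values.</p>

```python
def read_arff(lines):
    """Parse a minimal ARFF subset: nominal @attribute lines and @data rows."""
    features = {}          # feature name -> list of possible values
    feature_names = []     # names in file order (the last one is the class label)
    feature_vectors = []   # one list of values per data row
    in_data = False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue                          # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            feature_vectors.append([v.strip() for v in line.split(",")])
        elif lower.startswith("@attribute"):
            # e.g. "@attribute outlook {sunny, overcast, rainy}"
            name = line.split()[1]
            values = line[line.index("{") + 1 : line.index("}")]
            features[name] = [v.strip() for v in values.split(",")]
            feature_names.append(name)
        elif lower.startswith("@data"):
            in_data = True
    return features, feature_names, feature_vectors

# Hypothetical two-attribute sample in ARFF form.
sample = """@relation tennis
@attribute outlook {sunny, overcast, rainy}
@attribute play {yes, no}
@data
sunny,no
overcast,yes
""".splitlines()

features, feature_names, feature_vectors = read_arff(sample)
print(feature_names)
print(feature_vectors)
```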
    <p>The TrainClassifier method simply counts the number of co-occurrences of each feature value with each class label, and stores them keyed by 3-tuples. These counts are automatically smoothed with add-one smoothing, because the default value of a count in this dictionary is ‘1’. The counts of the labels are also adjusted by incrementing them by the total number of possible observations.</p>
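    <p>The training step described here can be sketched as a standalone function. The function name <em>train</em> and the toy data are hypothetical, and the label adjustment assumes that the number added to each label count is the number of possible feature values across all non-label features.</p>

```python
from collections import defaultdict

def train(features, feature_names, feature_vectors):
    """Count co-occurrences with add-one smoothing (label is the last column)."""
    feature_counts = defaultdict(lambda: 1)   # default of 1 = add-one smoothing
    label_counts = defaultdict(int)
    for fv in feature_vectors:
        label = fv[-1]
        label_counts[label] += 1
        for name, value in zip(feature_names[:-1], fv[:-1]):
            feature_counts[(label, name, value)] += 1
    # Assumption: grow each label count by the number of possible feature
    # values, so the smoothed estimates stay normalized.
    n_possible = sum(len(features[name]) for name in feature_names[:-1])
    for label in label_counts:
        label_counts[label] += n_possible
    return feature_counts, label_counts

# Hypothetical toy input in the shape produced by an ARFF reader.
features = {"outlook": ["sunny", "overcast", "rainy"], "play": ["yes", "no"]}
feature_names = ["outlook", "play"]
feature_vectors = [["sunny", "no"], ["sunny", "no"], ["overcast", "yes"]]

feature_counts, label_counts = train(features, feature_names, feature_vectors)
print(feature_counts[("no", "outlook", "sunny")], label_counts["no"])
```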
    <p></p>
    <p>Finally, we have the Classify method, which accepts a single feature vector (as a list) as its argument and computes the product of individual conditional probabilities (smoothed MLE) for each label. The final computed probabilities for each label are stored in the ‘<em>probabilityPerLabel</em>’ dictionary. In the last line, we return the entry from <em>probabilityPerLabel</em> which has the highest probability. Note that the multiplication is actually done as addition in the log domain, as the numbers involved are extremely small. Also, one of the factors used in this multiplication is the prior probability of the class label.<br>
    Here is the complete code, including a test method:</p>
    <p></p>
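    <p>As a stand-in illustration of the Classify step described above, here is a minimal sketch. The function signature and the hand-built smoothed counts are assumptions for this example; for simplicity the prior is computed from the already-adjusted label counts.</p>

```python
import math

def classify(feature_vector, feature_names, feature_counts, label_counts):
    """Return the label with the highest log-probability for one feature vector."""
    total = sum(label_counts.values())
    probabilityPerLabel = {}
    for label, label_count in label_counts.items():
        # Start from the log of the prior, then add the log of each smoothed
        # conditional probability; adding logs avoids underflow from
        # multiplying many very small numbers.
        log_prob = math.log(label_count / total)
        for name, value in zip(feature_names, feature_vector):
            count = feature_counts.get((label, name, value), 1)  # unseen pairs count as 1
            log_prob += math.log(count / label_count)
        probabilityPerLabel[label] = log_prob
    return max(probabilityPerLabel, key=probabilityPerLabel.get)

# Hypothetical smoothed counts for a single feature.
feature_names = ["outlook"]
feature_counts = {("no", "outlook", "sunny"): 3, ("yes", "outlook", "overcast"): 2}
label_counts = {"no": 5, "yes": 4}

print(classify(["sunny"], feature_names, feature_counts, label_counts))
print(classify(["overcast"], feature_names, feature_counts, label_counts))
```

    <p>Because the logarithm is monotonic, the label with the largest sum of logs is also the label with the largest product of probabilities.</p>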
    <p>Download the <a href="http://cs.umbc.edu/~krishna3/linked-files/tennis.arff" rel="nofollow external" class="bo">sample ARFF file</a> to try it out.</p>
    </div>
]]>
</Body>
<Summary>The Naive Bayes classifier is one of the most versatile machine learning algorithms that I have seen around during my meager experience as a graduate student, and I wanted to do a toy...</Summary>
<Website>http://ebiquity.umbc.edu/blogger/2010/12/07/naive-bayes-classifier-in-50-lines/</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/3935/guest@my.umbc.edu/9303130906169c8fa1a596ad82a806d9/api/pixel</TrackingUrl>
<Tag>machine-learning</Tag>
<Group token="ebiquity">Ebiquity Research Group</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/ebiquity</GroupUrl>
<AvatarUrl>https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="original">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/original.gif?1282159680</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="xlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xlarge.png?1282159680</AvatarUrl>
<AvatarUrl size="large">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/large.png?1282159680</AvatarUrl>
<AvatarUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/medium.png?1282159680</AvatarUrl>
<AvatarUrl size="small">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/small.png?1282159680</AvatarUrl>
<AvatarUrl size="xsmall">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xsmall.png?1282159680</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/003/36ac8e558ac7690b6f44e2cb5ef93322/xxsmall.png?1282159680</AvatarUrl>
<Sponsor>ebiquity research group</Sponsor>
<PawCount>7</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Tue, 07 Dec 2010 00:39:46 -0500</PostedAt>
<EditAt>Tue, 07 Dec 2010 00:39:46 -0500</EditAt>
</NewsItem>

</News>
