Talha Ahsan ’18
Eagan, Minnesota
Computer Science, Neuroscience
Semantic relatedness examines relationships between concepts, and much of that involves analyzing human writing. I worked with computer science professor Shilad Sen and a group of eight other student researchers to explore semantic relatedness using Wikipedia as a prime resource.
Our work was part of a multi-year research effort. In previous years, Professor Sen’s group built two large projects, Wikibrain and Cartograph, which we used and developed further.
Wikibrain lets us process an entire language edition of Wikipedia (English Wikipedia or Spanish Wikipedia, for example). Cartograph then lets us visualize the processed data, creating a “map” with the Wikipedia articles acting as “cities.”
Cartograph can also create “countries,” each composed of articles that are in some way similar. For example, in the Simple English Wikipedia dataset, the “United States” article is very close to the “Barack Obama” article, and both are part of the country associated primarily with the United States government.
Our research group focused on cleaning up and expanding Cartograph’s functionality. Initially, Cartograph would show us clusters or countries of articles, but not the relationships between them. We worked on implementing “roads”—direct paths between articles indicating when one article referenced another. Providing roads showing relationships added another layer of information to Cartograph.
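As a rough illustration of the idea, and not Cartograph’s actual code, roads can be pictured as pairs of cities joined whenever one article references another. The sketch below assumes a hypothetical `City` record and a hypothetical `build_roads` helper; the coordinates, the “Photosynthesis” city, and the link list are made-up toy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class City:
    """A Wikipedia article placed on the map (hypothetical structure)."""
    title: str
    x: float
    y: float
    country: str


def build_roads(cities, links):
    """Return one (source, target) road per link whose two articles are both on the map."""
    by_title = {city.title: city for city in cities}
    roads = []
    for source, target in links:
        # Skip self-links and links to articles that have no city on the map.
        if source != target and source in by_title and target in by_title:
            roads.append((by_title[source], by_title[target]))
    return roads


# Toy data: positions and links are invented for illustration only.
cities = [
    City("United States", 0.0, 0.0, "US government"),
    City("Barack Obama", 0.5, 0.2, "US government"),
    City("Photosynthesis", 8.0, 6.0, "Biology"),
]
links = [
    ("Barack Obama", "United States"),
    ("Photosynthesis", "Chlorophyll"),  # target is not on the map, so no road
]

roads = build_roads(cities, links)
for source, target in roads:
    print(f"{source.title} -> {target.title}")
```

In this toy version every qualifying link becomes a road; deciding which roads are major enough to draw is a separate question that the sketch does not attempt.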
At first, Shilad had us work in two separate groups, each trying out different methods of showing roads. After exploring the options, we decided to work together on a method for showing major roads on the screen, and we hope to have the roads fully ready for Cartograph sometime this year.
Talha’s research was supported by a grant from the National Science Foundation.
November 15, 2017