Fourth IEEE International Conference on eScience
Mendeley – A Last.fm for Research?
Victor Henning
Bauhaus-University of Weimar, Germany
Mendeley Ltd., London, UK
victor.henning@mendeley.com

Jan Reichelt
University of Cologne, Germany
Mendeley Ltd., London, UK
jan.reichelt@mendeley.com
DOI 10.1109/eScience.2008.128

Abstract

This paper aims to explore how the principles of a well-known Web 2.0 service, the world’s largest social music service “Last.fm” (www.last.fm), can be applied to research, what potential it could have in the world of research (e.g. an open and interdisciplinary database, usage-based reputation metrics, and collaborative filtering), and which challenges such a model would face in academia. A real-world application of these principles, “Mendeley” (www.mendeley.com), will be demoed at the IEEE e-Science Conference 2008.

1. Introduction

Ways to turn Web 2.0 applications into productive social research tools are currently being discussed at major academic conferences (e.g. European Science Open Forum 2008, Science Blogging Conference 2008, Science in the 21st Century Conference). Both multi-purpose social software, such as wikis, blogs, and social networks, and more specific services, such as Twitter, Friendfeed, or CiteULike, are currently being used and evaluated by a number of researchers and academics. This paper aims to explore how the principles of a well-known Web 2.0 service, “Last.fm”, can be applied to the domain of academic research.

2. How Last.fm works

Last.fm (www.last.fm), which bills itself as a “social music service”, has managed to create the largest ontological classification (and the largest open database) of music in the world by aggregating the musical tastes of its 20 million users and then data-mining them for similar musical genres, artists, and songs. The users form a social network that is not based on pre-existing real-world relationships; instead, Last.fm’s network emerges around data that describes its users’ listening behavior and musical preferences. The data is gathered as follows:

Last.fm’s “Audioscrobbler” desktop software, once installed on a user’s PC, starts tracking the user’s music listening behavior. The listening data is sent to the Last.fm website, where a profile of the user’s musical tastes is created. Listening statistics for each song, album, artist, and genre are aggregated and made available online. In this way, Last.fm has created the world’s largest open music database, comprising over 80 million songs, accessible by everyone. The user-generated data also lays the foundation for personalization, collaborative filtering, and ontological classifications:

· Users can view timelines and statistics about their own listening behavior,
· view the most popular tracks for each of their favorite artists, and the most popular artists for their favorite genre,
· receive music recommendations based on the song library already existing on their PC, and
· discover similar tracks/artists for every track/artist in the Last.fm database.

3. The model of Last.fm applied to research

Last.fm’s service is based on aggregating the users’ existing music libraries, relationships between artists writing songs in different genres, and the users’ music listening behavior. Similarly, a service for academic researchers could be based on aggregating scholars’ existing research paper libraries, relationships between researchers writing papers in different disciplines, and the scholars’ paper reading behavior.

Along these lines, a “Last.fm for research” would be able to display statistics to each individual user about his personal library, to aggregate readership statistics about papers, authors, journals, and academic disciplines, and to recommend interesting articles and researchers to the user. We envision that such a tool consists of two parts: First, a desktop application which helps researchers manage their academic papers and anonymously tracks their reading habits and literature usage. Second, a website where the users can discover aggregated statistics, top papers, trends and charts for each discipline, paper recommendations, and introductions to people with similar research interests. Adoption of such a service would have a number of advantages for academia at large, of which we will discuss three important ones.

The creation of an open and interdisciplinary database: Similar to Last.fm’s efforts in the space of music, a tool which aggregates metadata, tags and article usage of a large number of researchers could lead to an open, interdisciplinary and ontological database of research, providing a free and invaluable source of information to every individual researcher. Working in conjunction with Open Access libraries, this would be another cornerstone in building alternatives to expensive pay-walled databases.

Usage-based reputation metrics and real-time statistics: Usage-based reputation metrics, which could be described as “Nielsen ratings for science”, would alleviate many of the problems associated with traditional citation-based reputation metrics. A starting point for usage-based metrics would be to track the pervasiveness of research papers, i.e. whether they are present on the computers of a wide-ranging, distributed sample of academics. This would be a measure of the popularity or awareness that a paper (and by association, its author, publication journal, and topic) is enjoying. A second, more fine-grained usage metric would be the actual time spent reading each research paper (on screen, e.g. in Adobe Reader), and the number of repeat readings per paper. This would be a measure of the intensity with which the paper (its author, publication journal, topic) is being examined: did readers only skim through a paper, or did they peruse it in detail? Finally, these metrics could be augmented with quality ratings and tags that help differentiate mere measures of attention from explicit quality judgments.

Collaborative filtering: Usage data would also be the basis for developing paper recommendation engines based on collaborative filtering (CF) principles. It has been argued that citations already act as reading recommendations, and recommendation models based on co-citations or citation networks (“localized graph search”) already exist. CF recommendations, however, have additional potential. First, they could increase the interdisciplinarity of research because they may be better at uncovering parallels between academic disciplines than citation networks. For example, when doing research on the psychology of emotion, a psychologist would typically also have read papers on emotion published in adjacent fields (philosophy, literature, linguistics, neurophysiology). However, due to space constraints, he would mostly limit his citations to papers published in psychology journals. Picking up such patterns, CF recommendations would thus help researchers to discover literature that could be of interest to them, even though it is not found in the citation network of their existing library. Moreover, CF would enable researchers to identify people with similar research interests (based on their paper libraries) and thus foster collaboration and academic networking. Finally, CF would start generating a rich network of relationships for a paper as soon as it is published, rather than having to wait months or years to get cited.

Of course, there are a number of obstacles to be overcome before such a model can be turned into reality in the field of academic research. Arguably the biggest obstacle is that a sufficient number of participants is needed to gather reliable usage data. So how could scholars be convinced to take part in generating research paper usage data? In our opinion, the answer is that doing so must confer some type of utility to them beyond the idea of contributing to a fuzzy “greater good of science”. More specifically, the tool that does the measuring on the researchers’ computers should do this only as a secondary purpose, and must have some other primary usage value. Moreover, whereas Last.fm openly displays each user’s listening behavior, privacy is critical in the space of research. While it isn’t much of a technical issue to hide a researcher’s library and reading data, a “Last.fm for research” would have to convince potential users that it can be trusted with such sensitive data, and that no personally identifiable data would ever be made public without the researcher’s explicit consent.

4. Mendeley – A Last.fm for research

Mendeley is an application of Last.fm’s principles to the domain of research. Mendeley Desktop, a free and cross-platform desktop application, automatically extracts metadata, full-text and cited references from research papers to minimize manual data input when setting up a local research paper database. It then enables researchers to manage, tag, full-text search, cite in Word and LaTeX, and share research papers, thus providing researchers with usage value independent of any network effects. The companion website, Mendeley Web, can be used for backing up research papers, creating a public research profile, and connecting to like-minded researchers. Mendeley Web already displays the pervasiveness of research papers, authors, journals and tags as measured by Mendeley Desktop. Reading time and quality rating metrics as well as CF recommendation mechanisms will be implemented soon.
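
To make the pervasiveness metric and the collaborative filtering idea from Section 3 more concrete, the following is a minimal Python sketch. It is not a description of how Mendeley or Last.fm implement these features. The toy libraries, the paper identifiers, the function names, and the choice of Jaccard similarity between user libraries are all assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical toy data: each anonymised user maps to the set of
# paper identifiers found in their local library.
libraries = {
    "u1": {"emotion_psych", "emotion_phil", "affect_neuro"},
    "u2": {"emotion_psych", "emotion_phil"},
    "u3": {"emotion_psych", "affect_neuro", "prosody_ling"},
    "u4": {"graph_search"},
}


def pervasiveness(libraries):
    """Return, for each paper, the fraction of libraries that contain it."""
    counts = Counter(paper for lib in libraries.values() for paper in lib)
    total = len(libraries)
    return {paper: count / total for paper, count in counts.items()}


def jaccard(a, b):
    """Overlap of two libraries: |intersection| / |union| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, libraries, top_n=3):
    """User-based CF: score papers the user lacks by the summed
    similarity of the other users who hold them."""
    own = libraries[user]
    scores = Counter()
    for other, lib in libraries.items():
        if other == user:
            continue
        similarity = jaccard(own, lib)
        if similarity == 0:
            continue  # users with no overlap contribute nothing
        for paper in lib - own:
            scores[paper] += similarity
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    print(pervasiveness(libraries))
    print(recommend("u2", libraries))
```

In this toy data, the user "u2" holds only psychology and philosophy papers on emotion, yet is pointed to a neuroscience and a linguistics paper because users with overlapping libraries hold them. This mirrors the interdisciplinary discovery described in Section 3. A real system would additionally need the reading-time and quality-rating signals, and the privacy safeguards discussed above.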