Starting with first principles: Broadening the definition of scholarly communication, review, and peers
The communications of today’s scholars encompass not only book and journal publications, but also less formal textual communications and a variety of other work products, many of them made possible by recent advances in information technology. It is not uncommon today for an academic CV or faculty activity report (the productivity report that faculty members provide each year) to list news articles, blog postings, tweets, video presentations, artworks, patents, datasets, recordings of government testimony, computer code, in-person (conference and meeting) communications, and other artifacts. Yet our entrenched systems of review and recognition are out of sync with this evolving scholarly record. New ways to track our broader scholarly record, and new methods and metrics for identifying relevant work and evaluating the quality and impact of that work, are needed to help legitimize a broader definition of academic contribution.
The term “review” refers to several different events or functions in the scholarly communications workflow. It can refer to evaluation of the aggregate contributions of a researcher as well as evaluation of specific publications describing specific research. It encompasses both qualitative and quantitative evaluation, at both pre-publication and post-publication stages.
Our vision of the future of peer review in scholarly communication sees a diminishing role for pre-publication evaluation, as the high costs associated with it have begun to outweigh the researcher incentives to contribute on a quid pro quo basis. New systems for post-publication evaluation are much better poised to leverage current and emerging scholarly workflows, and to capture multi-dimensional measures that combine qualitative and quantitative information in a way that serves the researcher-as-reader (by helping the reader to most effectively allocate limited attention to an ever more overwhelming scholarly record) and the researcher-as-author/contributor. Additionally, these new forms of review can provide valuable input into appointment (promotion and tenure) assessment processes — though we strongly oppose an over-reliance on simple quantitative measures related to publications as a substitute for thorough peer review of a scholar’s contributions within the assessment process.
Our vision of the future of peer review also sees a much broader definition of “peers” — those with a voice to participate in the review process — to include any consumer of the work, evaluating the work in any traceable context. This will add new complexity to the system, but also perhaps new transparency, and may have the effect of turning some review processes into more conversational activities.
The current scholarly evaluation system works well in many respects. It is increasingly clear that it could work better, but before suggesting alternatives we need to examine what is good about the present system. Our current methods of review persist because, at their best, they have been relatively successful at filtering out erroneous, duplicative, and less significant work, and at helping us identify significant insights, advances, and discoveries. They leverage the trust we place in our peers. The hierarchy of peer-reviewed journals, supported by quantitative metrics like the Impact Factor, offers easy-to-use reputational heuristics in the academic review process.
These systems represented the state of the art of an earlier technological era (the era of paper journals and books), and they have served the community well. However, new technologies present opportunities to do better. Where legacy systems increasingly fall short is that they do not function well across disciplines; they are subject to misuse, gaming, and misinterpretation; they can substantially slow the dissemination of research findings; and they place significant burdens on researchers (particularly in pre-publication peer review). The result is that the process can be said to fail, in some measure, for the majority of published content.
Because the review process is deeply embedded in the cultural norms of disciplines and sub-fields, innovative tools more commonly fail because of incompatibility with existing norms and behavior than because of technical shortcomings. Recent experiments that have failed have done so because they failed to respect disciplinary norms and boundaries (for instance, while a tool like arXiv has been very successful in certain fields of physics, the attempt to port it into certain fields of chemistry failed due to differing norms of sharing in the two fields), or because they attempted to source crowds where there are no interested crowds (e.g., Facebook-like social networks for scientists). Even the best tools will not succeed unless they harness the essential elements of existing review systems — particularly filtering and trust — and also support existing communities’ cultural parameters. Change of this nature is likely to come slowly, particularly if it is to have lasting impact.
Most promising technologies and practices for the future
Considering the diversity of research, its practitioners, and the quickly changing scholarly environment, we should not expect a one-size-fits-all solution to the problems of review, nor a single Holy Grail metric to emerge in the foreseeable future. Rather, the future will rely on a range of tools to filter work and assess its quality within disciplinary communities and among individual scholars.
Different disciplines will obviously rely on different tools, but scholars must take responsibility for the accuracy and completeness of their own part of the scholarly record. While Open Access is one part of the solution, scholars need to educate themselves on the specific tools used in their disciplines, and be aware of the tools used in others, to maximize the possibilities.
Scholars should monitor the traces their work leaves to ensure that their scholarship is accurately represented in the venues their disciplines value most. It is incumbent upon the individual scholar to take control of his or her bibliographic legacy and reputation. Beyond this, scholars now have vastly extended opportunities to understand how their work is being read and built upon by other scholars.
While the tools for consumers and producers of scholarship listed below provide an important overview of current possibilities, we should not expect all of them to be appropriate or relevant for every discipline.
Tools for consumers of scholarship: How to find the good stuff
● RSS feeds for scientists, shared through bookmarking
● Social Networks: Twitter, Google+, Facebook
○ Discipline-specific Twitter lists, Facebook groups, Google+ circles (making it easier to follow colleagues and colleagues’ work)
● Computing over literature: textual analysis (SEASR)
Tools for producers of scholarship: How to make sure your stuff is given its just rewards
Currently Unsolved Problems, Solutions, and Strategic Opportunities
There are several currently unsolved problems; they fall into three distinct categories:
Technical / Business Limitations
Problem: There is a lack of industry standards, meaning that the same metric cannot be easily compared across different platforms. Solution: Standards bodies (such as NISO) should define cross-platform industry standards.
Problem: Relatedly, many corpora are not open in a machine-readable fashion, making it problematic to apply a uniform metric across them. Solution: Where possible, authors should mirror their content in Open Access repositories.
Problem: Certain metrics benefit from content being available in a specific (non-universal) format (e.g., Open Access content will gain more usage; multimedia content will receive more interaction). Hence certain content will be naturally disadvantaged in any alt-metrics evaluation. Solution: Standard metadata sets should be attached to all outputs.
Problem: If we rely on third parties for data, we must accept that those sources may change over time (or disappear). This means that alt-metric evaluations may never be ‘fixed’ or ‘repeatable’. Solution: Everything decays, but permalinks and archival storage of data can limit the damage.
Problem: Metrics by and large only include so-called formal publications and do not capture the variety of informal media and data that are increasingly important to scholarly discourse. Solution: Providers of data need to open up their data sources, allowing tools to easily mine the widest possible variety of sources.
Problem: Important decision makers (e.g., tenure committees) do not use alt-metrics in their evaluation process. Solution: The utility of alt-metrics needs to be demonstrated in order to persuade decision makers to use them.
Problem: People ‘cite’ work in a wide variety of ways and with a variety of semantics, making it difficult to mine these citations automatically. Solution: Given the variability of human expression, there may be no complete solution.
Problem: Some work is never cited at all, but simply influences the work of others (e.g., a study that informs a change in government policy). This can make automated mining impossible. Solution: Automation may be impossible here, but crowdsourcing the discovery of such influence is one solution.
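The last two problems can be made concrete with a small sketch. A fixed list of patterns (a quoted title, a DOI string, a narrative “et al.” nod) catches some informal citations but inevitably misses uncited influence. The patterns, title, and posts below are all hypothetical.

```python
import re

# Hypothetical patterns for informal mentions of scholarly work.
TITLE = "Altmetrics in the Wild"        # invented title
PATTERNS = [
    re.compile('"' + re.escape(TITLE) + '"'),                   # quoted title
    re.compile(r"doi:10\.\d{4,9}/\S+", re.IGNORECASE),          # DOI string
    re.compile(r"as\s+\w+\s+et al\.\s+showed", re.IGNORECASE),  # narrative nod
]

def find_mentions(text):
    """Return substrings that look like references to scholarly work."""
    hits = []
    for pattern in PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits

posts = [
    'Great read: "Altmetrics in the Wild" changed how I track impact.',
    "See doi:10.1000/xyz123 for the full study.",
    "As Smith et al. showed, usage data is noisy.",
    "That policy change was clearly influenced by her 2010 study.",  # missed
]
mentions = [find_mentions(p) for p in posts]
```

The last post is a genuine trace of influence that no pattern catches, which is exactly the case where crowdsourced discovery would have to take over.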
Adoption / Understanding Issues
Problem: Generational, geographic, and disciplinary differences mean that not all academics have adopted new methods of dissemination and evaluation to the same extent, disadvantaging certain sectors. Solution: The metrics valued most should be those that have been most widely adopted; societies and funders should encourage best-practice adoption of tools.
Problem: The notion of long-term ‘impact’ is not well understood even in traditional metrics, and is therefore hard to replicate or improve upon in emerging methods. Different metrics have different value for different audiences, so academia will need to accept that there may never be a single metric that describes all work. Solution: We need a more nuanced, multi-dimensional understanding of impact for different groups.
Strategic Opportunities
● Propagate changes in attribution across scholarly systems (ORCID). ORCID is not in a position to handle the retrospective problem, so one option is to ‘gamify’ the disambiguation of author names;
● Develop centralized (disambiguated) tools to track when you are being cited in the widest variety of possible sources;
● Develop standards to define metrics and metadata
● Open up common social media tools to develop more open environments for data interchange
● Dedicate more attention to non-Western tools and services (including multilingual tools)
● Propagate the adoption of Open (pre-publication) Peer Review
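To illustrate why the retrospective author-name disambiguation problem above is hard, the toy sketch below clusters publication records by a crude surname-plus-initial key: it merges legitimate name variants but also conflates distinct people, which is exactly where human (perhaps gamified) judgment is needed. All names and titles are invented.

```python
from collections import defaultdict
import unicodedata

def name_key(name):
    """Crude matching key: ASCII-folded surname plus first initial.
    Real disambiguation needs far more signal (affiliation, co-authors,
    an ORCID iD where one exists)."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    parts = ascii_name.replace(",", " ").split()
    if "," in name:                       # "Surname, Given" order
        surname, given = parts[0], parts[1] if len(parts) > 1 else ""
    else:                                 # "Given Surname" order
        surname, given = parts[-1], parts[0]
    return (surname.lower(), given[:1].lower())

def group_records(records):
    """Cluster (author name, work title) pairs by the crude key."""
    clusters = defaultdict(list)
    for name, title in records:
        clusters[name_key(name)].append(title)
    return dict(clusters)

records = [
    ("Smith, Jane", "Paper A"),
    ("J. Smith", "Paper B"),
    ("Jane Smith", "Paper C"),
    ("Joan Smith", "Paper D"),   # different person, same key: a false merge
]
clusters = group_records(records)
```

The false merge in the last record shows why fully automatic approaches stall, and why crowdsourced or gamified verification is an attractive complement.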
Interdependencies with other topics
The future of peer review in scholarly communication will include new methods and metrics for evaluating quality and impact that:
● extend beyond traditional print and digital outputs,
● are dependent on a broadening definition and semantic richness of literature — that is, of the communications that form the “reviewable” scholarly record.
Efforts to broaden and filter our scholarly record will legitimize a broader definition of academic productivity and enable new models of academic recognition and assessment that better align with the actual activities and contributions of today’s scholars.
Peter Binfield, PLoS
Amy Brand, Harvard University
Gregg Gordon, SSRN
Sarah Greene, Faculty of 1000
Carl Lagoze, Cornell University
Clifford Lynch, CNI
Tom McMail, Microsoft Research
Jason Priem, UNC-Chapel Hill
Katina Rogers, Alfred P. Sloan Foundation
Tom Scheinfeldt, Roy Rosenzweig Center for History and New Media at George Mason University