Unique Author Identifiers: ORCID, Scopus ID, Researcher ID – what’s the difference?

I posted this earlier on Northumbria’s ORCID blog as part of our Jisc-funded project to embed ORCID across researcher career paths. I’m not an expert by any means on the topic of unique author IDs, but the Library, which is running the project, felt it would be good to get a non-librarian’s point of view on this issue, so I did some research and this is what I found:

Fingerprint by CPOA (CC BY-ND 2.0)

“As we’ve mentioned previously, there is a problem in overcoming “standards proliferation” fatigue when promoting any new research tool or ID standard. The web has enabled an unprecedented level of sharing and connectivity, especially with regard to research, but the downside of opening the floodgates is that you need better ways of filtering information and finding what’s relevant.

In the context of research, unique author identifiers are one way to solve the problem of author ambiguity. Being able to consistently and accurately identify the author of a piece of research is valuable for researchers (because it means you can quickly find other work by the same author and ensure your own work is linked to you) and for administrators and librarians (because it’s vital to be able to manage and report on the research your institution has produced).

Several unique author identifiers have been developed to address this issue in recent years, but the three main ones are: ResearcherID (developed by Thomson Reuters and used in Web of Science and related products); Scopus Author ID (developed by Elsevier and used in Scopus and related products); and ORCID (developed by ORCID Inc., which is a non-profit, community-driven organisation, based in the USA but with international membership).

At a basic level, all three tools do more or less the same thing; that is, uniquely identify an author and link this unique ID with his/her publications/outputs. However, there are important differences between them and reasons why you may need more than one.

The ownership of these IDs is an important signal of their use. ResearcherID is used to identify authors and enables users to build a publication profile and generate citation metrics from Web of Science (assuming you have access to this product, usually provided through your institution). Scopus Author ID is automatically assigned to all authors and also ensures this information is accurately reflected in the various Scopus tools such as search and analytics (e.g. citation tracking, h-index etc.).* Researchers benefit from this by clearly identifying their work, but so do the publishers: ensuring their information is up-to-date and accurate means that they raise the quality of their associated search, discovery and analysis products, which potentially leads to increased traffic to these services and, ultimately, sales.

ORCID, by contrast, is not owned by a publisher, but is a community-driven organisation which includes representatives from a range of stakeholders on its board of directors, including funders (Wellcome Trust), publishers (Thomson Reuters, Wiley-Blackwell, Nature, Elsevier) and universities (Cornell, Barcelona, MIT). While ResearcherID and Scopus ID are linked to the journal output of their respective publishers, ORCID is neutral in this respect; you can associate an ORCID with an output in any article from any publisher, and you can also attach the identifier to datasets, equipment, media stories, experiments, patents and notebooks. ORCID does not offer any “added value” services such as citation analysis linked to the acquisition of an ORCID (as is the case with ResearcherID and Scopus ID) – its mission statement is really very simple: “to solve the name ambiguity problem in research and scholarly communications by creating a central registry of unique identifiers for individual researchers and an open and transparent linking mechanism between ORCID and other current researcher ID schemes.”

If you are a researcher, then, depending on your discipline, you may need to sign up for all three. But this is not as difficult as it sounds. Both ResearcherID and Scopus Author ID have relatively painless ways of linking up with your ORCID (see ResearcherID-ORCID integration and Scopus ID-ORCID integration). This automatically connects the research associated with your ResearcherID/Scopus ID to your ORCID and ensures that publication lists can be imported from one ID system into another.

If you are a researcher and you haven’t yet signed up for any of these author identification tools then the best way forward is to sign up for an ORCID. ORCID is rapidly becoming the standard, and is supported by many publishers, universities and funders.”

* 07/08/14 Following recent discussion with Scopus on Twitter, I’ve clarified the wording here. All authors are/can be assigned Scopus IDs/ResearcherIDs – not just authors published in Elsevier/Thomson Reuters journals.


Web of Knowledge Changes This Weekend

A guest post from Michelle Walker, Research Support Librarian at Northumbria University, to alert researchers to upcoming changes to the Web of Knowledge database:

web by David Reid (CC BY-NC 2.0)

Web of Knowledge is undergoing a revamp this weekend and so there will be some downtime associated with the upgrade.

Between 2pm on Sunday 12th and 2am on Monday 13th, Web of Knowledge may be intermittently unavailable.

There will be some significant changes to the look and feel of the database so you may wish to consult the online help or training videos for a quick refresher. It will now be known as simply ‘Web of Science’.

See the video below for a quick tour of the new features and presentation:

More information can be found at the links below:

Information leaflet: http://wokinfo.com/nextgenwebofscience

Further training videos: http://wokinfo.com/training_support/training/web-of-science/

There should be no changes to the search engine or search logic, so if you have saved searches/alerts these should continue to work as expected. There are also no changes to inbound URLs, so all existing links/bookmarks providing access should continue to work.

For technical issues regarding the upgrade, please contact the IT helpline. For support with using and searching the database, please contact ask4help@northumbria.ac.uk


Bibliometrics for Beginners – Workshop Report

I can read a book by mitikusa.net (CC BY-NC-ND 2.0)

Our Assistant Director of Research, Ruth Hattam, recently attended a workshop jointly run and sponsored by ARMA and Thomson Reuters on Bibliometrics for Beginners. Bibliometrics – a key part of which is use of citation data – is growing in importance in Higher Education, particularly as research funding becomes more competitive and institutions need ways to analyse strengths and target key areas for support.

Bibliometrics can to some extent allow strategic oversight of research activity and performance, although it does have several drawbacks and limitations, some of which were covered in the workshop.

Read on to find out more and for an overview of the day.

Bibliometrics: What is it and how is it used?

Bibliometrics literally means ‘measure of books’.

Use of citation indexes is clearly key to bibliometrics. A number were mentioned, including Web of Science (Thomson Reuters) and Scopus (Elsevier), which are both databases, and Google Scholar, which also includes internet sources. There are also regional/subject-specific databases. As different editorial policies apply to each, only one database should be used per comparison: mixing and matching isn’t encouraged as it wouldn’t give a balanced picture.

Possible uses of bibliometric data include:

  • individual review and recruitment;
  • university rankings;
  • the REF: REF 2014 will use citations for the first time (not in all Panels, but 3, 4 and 11 will);
  • grant applications.

In terms of metrics, the h-index was developed to measure both quality and quantity. A researcher’s h-index is the largest number h such that h of their papers have each been cited at least h times (for example, an academic with 378 papers has an h-index of 48 if 48 of them have each been cited at least 48 times and the rest no more than 48 times). Self-citation is one of the possible pitfalls when using h-indices, but self-citations can be removed from the analysis.
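
The h-index calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the citation counts are made up for the example, and real tools such as Web of Science or Scopus work from their own citation data.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each.

    `citations` is a list with one citation count per paper (illustrative input).
    """
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical academic: 378 papers, 48 cited 48 times each, the rest once each.
example = [48] * 48 + [1] * 330
print(h_index(example))
```

To exclude self-citations, you would subtract them from each paper’s count before passing the list to the function.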

While citation-based analysis like this can be useful, one needs to bear in mind that it doesn’t always take into account the nature of the citation itself. For example, a paper can be cited because the author wants to: build on prior knowledge; agree or disagree with the analysis; help or hinder other researchers; disprove the conclusion; or improve their own impact factor. Outliers can also skew the results significantly. Bear in mind that people can look very good on paper even though they are no longer researching, for example Aristotle!

Data: From indexing to indicators

It’s important to understand what various bibliometric databases do and don’t include. First, they don’t contain all journals: 80% of papers are published in 40% of journals, so databases don’t try to capture 100% of journals. Google Scholar is much more inclusive and captures more publications, but the flip side is that these are not necessarily of as high quality.

Second, it is worth considering how data is collected. Thomson Reuters (TR) uses:

  • publishing standards, e.g. peer review and editorial conventions;
  • editorial content, which TR’s subject specialists assess;
  • diversity and regional influence of authors;
  • citation analysis: for new journals there is an analysis of the editor’s and authors’ prior work.

Third, what kind of outputs are indexed? The majority of citations from books are within the arts and humanities and social sciences. Science subjects nearly always cite journals, which reflects the speed at which the field moves. Where books are concerned, TR insist on original research and exclude textbooks.

Fourth, how are the data organised? TR has 249 subject areas, and has incorporated REF categories.

While there are clear advantages to using citation analysis, there are also a number of limitations:

  • productivity measures volume, not quality, although you could argue that a paper has been quality tested (i.e. peer reviewed) to get into the journal in the first place;
  • self-citations are counted: the h-index does not distinguish these;
  • it is papers, not the person, that are cited;
  • stage of career is not factored in: established researchers have higher productivity, citation counts and h-indices, so you have to normalise for publication year. One approach is to divide the citation count by the number of active years of research. TR compares each paper only with other papers from the same year, looking at the average number of citations papers received in that year;
  • subject differences need to be factored in: comparisons should be made by subject, not within a university. TR normalise by subject category and academic year. You also need to distinguish between types of output, e.g. original research versus reviews, so document type has to be considered;
  • the value of a citation is not assessed, so negative citations are included;
  • the relative contribution of each author on a paper is not known;
  • the number of authors on a paper is not known, although this can be normalised by calculating the average number of authors per paper;
  • differences between subject fields are not automatically accounted for: in the sciences there are lots of initial citations and then the field moves on, whereas mathematics is a low-cited field in which the number of citations is more constant.
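
The two normalisation ideas in the list above (dividing by active years, and comparing each paper only with papers from the same subject and year) can be sketched roughly as follows. This is an assumption-laden illustration rather than Thomson Reuters’ actual method: the function names, field names and figures are invented, and a real baseline would be the average across the whole database, not a handful of sample papers.

```python
from collections import defaultdict


def citations_per_active_year(total_citations, first_year, current_year):
    """Divide a researcher's citation count by their number of active years."""
    active_years = max(current_year - first_year + 1, 1)
    return total_citations / active_years


def normalised_citation_scores(papers):
    """Score each paper against the average for its (subject, year) group.

    `papers` is a list of dicts with keys 'subject', 'year' and 'citations'
    (hypothetical field names). A score of 1.0 means 'cited as often as the
    average paper in the same subject and year'.
    """
    totals = defaultdict(lambda: [0, 0])  # (subject, year) -> [citations, papers]
    for paper in papers:
        key = (paper["subject"], paper["year"])
        totals[key][0] += paper["citations"]
        totals[key][1] += 1

    scores = []
    for paper in papers:
        total, count = totals[(paper["subject"], paper["year"])]
        baseline = total / count
        scores.append(paper["citations"] / baseline if baseline else 0.0)
    return scores


# Made-up sample: a low-cited field (mathematics) and a high-cited one (biology).
sample = [
    {"subject": "mathematics", "year": 2010, "citations": 2},
    {"subject": "mathematics", "year": 2010, "citations": 4},
    {"subject": "biology", "year": 2010, "citations": 20},
    {"subject": "biology", "year": 2010, "citations": 40},
]
print(normalised_citation_scores(sample))
print(citations_per_active_year(120, 2004, 2013))
```

In this sample the mathematics paper with 4 citations and the biology paper with 40 citations receive the same normalised score, because each is compared only with its own subject and year.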

Two other bibliometric analyses are worth considering here: Journal Impact Factor (JIF) and Eigenfactor metrics:

JIF looks at the impact of a journal in a particular research community over the previous two years, based on the number of citations. This is normalised for the size of the journal: the impact factor is the number of citations received in a given year by articles the journal published in the previous two years, divided by the number of articles it published in those two years. This is not as good an indicator for slow-moving fields because it only goes two years back. It is good at capturing high-level activity for fast-moving subjects, e.g. natural sciences and engineering, and can inform where to publish in those subjects. A 5-year impact factor has also been developed to take account of subject differences.
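
As a rough worked example of the impact factor arithmetic, the sketch below computes a two-year and a five-year figure for an imaginary journal. The function name, its parameters and all of the numbers are assumptions made for illustration; the published JIF also depends on what counts as a citable item, which is not modelled here.

```python
def impact_factor(year, citations_received, items_published, window=2):
    """Citations received in `year` to items from the previous `window` years,
    divided by the number of items published in those years.

    Both dicts are keyed by publication year (hypothetical data layout):
    citations_received[y] = citations made in `year` to items published in y
    items_published[y]    = number of articles the journal published in y
    """
    publication_years = range(year - window, year)
    citations = sum(citations_received.get(y, 0) for y in publication_years)
    items = sum(items_published.get(y, 0) for y in publication_years)
    if items == 0:
        raise ValueError("journal published no items in the window")
    return citations / items


# Made-up figures for an imaginary journal, as counted in 2013.
cites = {2008: 50, 2009: 60, 2010: 90, 2011: 120, 2012: 80}
items = {2008: 40, 2009: 40, 2010: 50, 2011: 50, 2012: 50}

print(impact_factor(2013, cites, items))            # two-year window
print(impact_factor(2013, cites, items, window=5))  # five-year window
```

Here the two-year figure is (120 + 80) / (50 + 50) = 2.0, while the five-year figure draws on older papers as well, which is why the longer window suits slower-moving fields better.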

Eigenfactor metrics were developed at University of Washington by Jevin West and Carl Bergstrom. From Wikipedia: Eigenfactor is a rating of the total importance of a scientific journal. Journals are rated according to the number of incoming citations, with citations from highly ranked journals weighted to make a larger contribution to the eigenfactor than those from poorly ranked journals. Eigenfactor scores and article influence scores are freely viewable on eigenfactor.org.

Conclusion: International issues and the future of bibliometrics

Citation data can be used to examine the extent of international collaborations of researchers or institutions.  Data showed that working with international collaborators increases the number of citations.

It can also be used by a university to look at citation data relative to the number of published papers. Universities are starting to look at citations alongside other factors, e.g. the amount of industry income brought in per researcher and the number of doctoral degrees awarded per member of academic staff, based on data from HESA. Thomson Reuters also combine citation analysis with other data sources to perform more fine-grained analyses.

The point was made that China might be bringing down the average, due to a massive increase in its number of papers but a relatively low citation rate.

Ultimately before doing any sort of analysis or evaluation, you need to clearly define your objectives – what do you want to know and what will the data inform?


For any Northumbria staff interested in finding out more about the way bibliometrics can be used, it’s worth noting that the Library provides online instructional support in using such tools in the Measuring your Research Performance  section of Skills Plus: http://nuweb2.northumbria.ac.uk/library/skillsplus/topics.html?l3-13