Citation metrics

Publish or Perish calculates the following citation metrics:

Total number of papers
Total number of citations
Average number of citations per paper
Number of citations per author
Number of citations per author per year
Number of papers per author
Average number of authors per paper
Hirsch's h-index and related parameters, shown as h-index and Hirsch a=y.yy, m=z.zz in the output, and Zhang's e-index, shown as e-index
Egghe's g-index, shown as g-index in the output
The contemporary h-index, shown as hc-index and ac=y.yy in the output
Three variations of the individual h-index, shown as hI-index, hI,norm, and hm-index in the output
The average annual increase in the individual h-index, shown as hI,annual
The age-weighted citation rate
An analysis of the number of authors per paper

Please note that these metrics are only as good as their input. We recommend that you consult the help topics on the limitations of the citation metrics and of the underlying sources that Publish or Perish uses.

Basic metrics

Publish or Perish calculates the basic metrics as follows.

Total number of papers: This is simply the number of papers returned by Google Scholar or Microsoft Academic Search in reply to a query.
Total number of citations: The sum of the citation counts across all papers.
Average number of citations per paper: The sum of the citation counts across all papers, divided by the total number of papers. The median and mode are also calculated.
Number of citations per author: For each paper, its citation count is divided by the number of authors for that paper to give the normalized per-author citation count for the paper. The normalized citation counts are then summed across all papers to give the number of citations per author over the result set.
Number of citations per author per year: This is the number of citations per author as above, divided by the number of years covered by the result set.
Number of papers per author: For each paper, 1/author_count is calculated to give the normalized author count for the paper. The normalized author counts are then summed across all papers to give the number of papers per author.
Average number of authors per paper: The sum of the author counts across all papers, divided by the total number of papers. The median and mode are also calculated.
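
As an illustration, the calculations above can be sketched in a few lines of Python. This is a minimal sketch and not the Publish or Perish code: the function name basic_metrics, the (citations, author_count) tuple layout, and the years_covered argument are assumptions made for this example.

```python
def basic_metrics(papers, years_covered):
    """papers: non-empty list of (citations, author_count) tuples (hypothetical layout).
    years_covered: number of years covered by the result set."""
    n = len(papers)
    total_citations = sum(c for c, _ in papers)
    # Each paper's citation count is divided by its own author count, then summed.
    cites_per_author = sum(c / a for c, a in papers)
    return {
        "papers": n,
        "citations": total_citations,
        "cites_per_paper": total_citations / n,
        "cites_per_author": cites_per_author,
        "cites_per_author_per_year": cites_per_author / years_covered,
        # Each paper contributes 1/author_count to the per-author paper count.
        "papers_per_author": sum(1 / a for _, a in papers),
        "authors_per_paper": sum(a for _, a in papers) / n,
    }


sample = [(10, 2), (6, 3), (4, 1)]
result = basic_metrics(sample, years_covered=5)
print(result["cites_per_author"])  # 10/2 + 6/3 + 4/1
```

Note that the per-author figures are sums of per-paper fractions, not totals divided by an average author count.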

h-index

The h-index was proposed by J.E. Hirsch in his paper An index to quantify an individual's scientific research output, arXiv:physics/0508025 v5 29 Sep 2005. It is defined as follows:

A scientist has index h if h of his/her N_p papers have at least h citations each, and the other (N_p-h) papers have no more than h citations each.

It aims to measure the cumulative impact of a researcher's output by looking at the number of citations his/her work has received. Publish or Perish calculates and displays the h-index proper, its associated proportionality constant a (from N_c,tot = ah², where N_c,tot is the total number of citations), and the rate parameter m (from h ~ mn, where n is the number of years since the first publication).
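
The definition and the two derived parameters can be sketched as follows. This is an illustrative sketch only; the function names are made up for this example, and the handling of the edge cases (no papers, zero years) is an assumption rather than documented Publish or Perish behaviour.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def hirsch_a(citations):
    """Proportionality constant a, from N_c,tot = a * h^2."""
    h = h_index(citations)
    return sum(citations) / h**2 if h else 0.0  # assumption: report 0 when h is 0


def hirsch_m(citations, years_since_first_paper):
    """Rate parameter m, from h ~ m * n."""
    n = max(years_since_first_paper, 1)  # assumption: avoid dividing by zero
    return h_index(citations) / n


sample = [10, 8, 5, 4, 3]
print(h_index(sample))      # 4
print(hirsch_a(sample))     # 30 / 16
print(hirsch_m(sample, 8))  # 4 / 8
```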

The properties of the h-index have been analyzed in various papers; see for example Leo Egghe and Ronald Rousseau: An informetric model for the Hirsch-index, Scientometrics, Vol. 69, No. 1 (2006), pp. 121-129.

Publish or Perish also calculates the e-index as proposed by Chun-Ting Zhang in his paper The e-index, complementing the h-index for excess citations, PLoS ONE, Vol. 4, Issue 5 (May 2009), e5429. The e-index is the square root of the surplus of citations in the h-set beyond h², i.e., beyond the theoretical minimum required to obtain an h-index of h. The aim of the e-index is to differentiate between scientists with similar h-indices but different citation patterns.
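
A minimal sketch of the e-index calculation, assuming a plain list of citation counts as input (the function name e_index is hypothetical):

```python
from math import sqrt


def e_index(citations):
    """Square root of the citations in the h-set in excess of h^2."""
    ranked = sorted(citations, reverse=True)
    # h is the number of papers whose citation count is at least their rank.
    h = sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)
    excess = sum(ranked[:h]) - h**2
    return sqrt(excess)


sample = [10, 8, 5, 4, 3]
# h = 4, the h-set holds 10 + 8 + 5 + 4 = 27 citations, so the excess is 27 - 16 = 11.
print(e_index(sample))
```

Two result sets with the same h-index but more heavily cited top papers in one of them will differ in this value.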

These metrics are shown as h-index, Hirsch a=y.yy, m=z.zz, and e-index in the output.

g-index

The g-index was proposed by Leo Egghe in his paper Theory and practice of the g-index, Scientometrics, Vol. 69, No. 1 (2006), pp. 131-152. It is defined as follows:

[Given a set of articles] ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received (together) at least g² citations.

It aims to improve on the h-index by giving more weight to highly-cited articles.
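
The definition translates directly into a cumulative sum over the ranked list. The sketch below is illustrative only; in particular, it assumes that g cannot exceed the number of papers in the result set, which is a choice made for this example.

```python
def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    running_total = 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if running_total >= rank**2:
            g = rank
    return g


print(g_index([10, 8, 5, 4, 3]))  # 5, whereas the h-index of this list is 4
print(g_index([6, 2, 1, 0]))      # 3
```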

This metric is shown as g-index in the output.

Contemporary h-index

The Contemporary h-index was proposed by Antonis Sidiropoulos, Dimitrios Katsaros, and Yannis Manolopoulos in their paper Generalized h-index for disclosing latent facts in citation networks, arXiv:cs.DL/0607066 v1 13 Jul 2006.

It adds an age-related weighting to each cited article, giving less weight to older articles under the default parametrization. The weighting is parametrized; the Publish or Perish implementation uses gamma=4 and delta=1, as the authors did in their experiments. With these values, the citations of an article published during the current year count four times, the citations of an article published 4 years ago count only once (4/4), the citations of an article published 6 years ago count 4/6 times, and so on.
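
The weighting can be sketched as follows. The weight gamma / max(age, 1)^delta used here is an assumption chosen only because it reproduces the three examples given above; the exact age convention in the paper and in Publish or Perish may differ. The function names and the (citations, publication_year) layout are likewise hypothetical.

```python
def age_weight(age_in_years, gamma=4.0, delta=1.0):
    # Assumption: a paper from the current year (age 0) is treated as age 1.
    return gamma / max(age_in_years, 1) ** delta


def contemporary_h_index(papers, current_year, gamma=4.0, delta=1.0):
    """papers: list of (citations, publication_year) tuples.
    Returns the h-index computed over the age-weighted citation scores."""
    scores = sorted(
        (c * age_weight(current_year - year, gamma, delta) for c, year in papers),
        reverse=True,
    )
    hc = 0
    for rank, score in enumerate(scores, start=1):
        if score >= rank:
            hc = rank
    return hc


print(age_weight(0), age_weight(4), age_weight(6))  # weights for the three examples
sample = [(10, 2020), (12, 2016), (9, 2014), (3, 2010)]
print(contemporary_h_index(sample, current_year=2020))  # 3
```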

This metric is shown as hc-index and ac=y.yy in the output.

Individual h-index (3 variations)

The Individual h-index was proposed by Pablo D. Batista, Monica G. Campiteli, Osame Kinouchi, and Alexandre S. Martinez in their paper Is it possible to compare researchers with different scientific interests?, Scientometrics, Vol. 68, No. 1 (2006), pp. 179-189.

It divides the standard h-index by the average number of authors in the articles that contribute to the h-index, in order to reduce the effects of co-authorship; the resulting index is called h_I.

Publish or Perish also implements an alternative individual h-index, h_I,norm, that takes a different approach: instead of dividing the total h-index, it first normalizes the number of citations for each paper by dividing the number of citations by the number of authors for that paper, then calculates h_I,norm as the h-index of the normalized citation counts. This approach is much more fine-grained than Batista et al.'s; we believe that it more accurately accounts for any co-authorship effects that might be present and that it is a better approximation of the per-author impact, which is what the original h-index set out to provide.
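
The difference between the two approaches is easiest to see side by side. The following is a rough sketch under stated assumptions: the function names and the (citations, author_count) layout are invented for this example, and ties at the boundary of the h-set are resolved by sort order, which may not match Publish or Perish.

```python
def h_index(scores):
    """Largest h such that h items have a score of at least h."""
    ranked = sorted(scores, reverse=True)
    return sum(1 for rank, s in enumerate(ranked, start=1) if s >= rank)


def individual_h_indices(papers):
    """papers: list of (citations, author_count) tuples.
    Returns (h, hI, hI_norm)."""
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    h = h_index([c for c, _ in ranked])
    h_set = ranked[:h]
    # hI (Batista et al.): h divided by the mean author count within the h-set.
    mean_authors = sum(a for _, a in h_set) / h if h else 1.0
    h_i = h / mean_authors
    # hI,norm: normalize each paper's citations first, then take the h-index.
    h_i_norm = h_index([c / a for c, a in papers])
    return h, h_i, h_i_norm


sample = [(10, 2), (8, 4), (5, 1), (4, 1), (3, 3)]
print(individual_h_indices(sample))  # (4, 2.0, 3)
```

In this sample the h-set has a mean of two authors per paper, so hI halves the h-index, while hI,norm penalizes only the papers that actually have many authors.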

The third variation is due to Michael Schreiber and was first described in his paper To share the fame in a fair way, h_m modifies h for multi-authored manuscripts, New Journal of Physics, Vol. 10 (2008), 040201-1-8. Schreiber's method uses fractional paper counts instead of reduced citation counts to account for shared authorship of papers, and then determines the multi-authored h_m index based on the resulting effective rank of the papers using undiluted citation counts.
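
One way to read this description is sketched below: each paper advances the rank by 1/author_count instead of 1, and h_m is the largest effective rank that is still covered by the (undiluted) citation count of the paper at that rank. This is an illustrative reading, not the Publish or Perish code; the function name and input layout are assumptions.

```python
def hm_index(papers):
    """papers: list of (citations, author_count) tuples.
    Returns the largest effective rank r_eff with citations >= r_eff."""
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    effective_rank = 0.0
    hm = 0.0
    for citations, authors in ranked:
        effective_rank += 1.0 / authors  # fractional paper count
        if citations >= effective_rank:
            hm = effective_rank
        else:
            break  # citations only fall and the rank only grows from here on
    return hm


sample = [(10, 2), (8, 4), (5, 1), (4, 1), (3, 3)]
# Effective ranks: 0.5, 0.75, 1.75, 2.75, then 3.08 which exceeds 3 citations.
print(hm_index(sample))  # 2.75
```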

These metrics are shown as hI-index (Batista et al.'s), hI,norm (PoP's), and hm-index (Schreiber's) in the output.

Average annual increase in individual h-index

The average annual increase in the individual h-index, called hI,annual, was proposed by Anne-Wil Harzing, Satu Alakangas, and David Adams in their paper hIa: An individual annual h-index to accommodate disciplinary and career length differences, Scientometrics, Vol. 99, No. 3 (2014), pp. 811-821, which is available online on the Harzing.com web site.

As of release 4.3 Publish or Perish calculates and displays this new index. The average annual increase in the individual h-index is useful for the following reasons:

In common with the hI,norm index, it removes to a considerable extent any discipline-specific publication and citation patterns that otherwise distort the h-index.
It also reduces the effect of career length and provides a fairer comparison between junior and senior researchers.

The hI,annual is meant as an indicator of an individual's average annual research impact, as opposed to the lifetime score that is given by the h-index or hI,norm.
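
The text above does not spell out the formula. The sketch below assumes that hI,annual is hI,norm divided by the academic age, taken here as the number of years since the first publication in the result set; both the formula and the treatment of a zero-year career are assumptions for this example, as are the function names and the input layout.

```python
def h_index(scores):
    """Largest h such that h items have a score of at least h."""
    ranked = sorted(scores, reverse=True)
    return sum(1 for rank, s in enumerate(ranked, start=1) if s >= rank)


def hi_annual(papers, current_year):
    """papers: non-empty list of (citations, author_count, publication_year)."""
    hi_norm = h_index([c / a for c, a, _ in papers])
    first_year = min(year for _, _, year in papers)
    academic_age = max(current_year - first_year, 1)  # assumption: at least 1 year
    return hi_norm / academic_age


sample = [(10, 2, 2010), (8, 4, 2012), (5, 1, 2015), (4, 1, 2017), (3, 3, 2019)]
# hI,norm is 3 and the first paper appeared 10 years before 2020.
print(hi_annual(sample, current_year=2020))
```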

This metric is shown as hI,annual in the output.

Age-weighted citation rate (AWCR, AWCRpA) and AW-index

The age-weighted citation rate was inspired by Bihui Jin's note The AR-index: complementing the h-index, ISSI Newsletter, 2007, 3(1), p. 6.

The AWCR measures the number of citations to an entire body of work, adjusted for the age of each individual paper. It is an age-weighted citation rate, where the number of citations to a given paper is divided by the age of that paper. Jin defines the AR-index as the square root of the sum of all age-weighted citation counts over all papers that contribute to the h-index.

However, in the Publish or Perish implementation we sum over all papers instead, because we feel that this represents the impact of the total body of work more accurately. (In particular, it allows younger and as yet less cited papers to contribute to the AWCR, even though they may not yet contribute to the h-index.)

The AW-index is defined as the square root of the AWCR to allow comparison with the h-index; it approximates the h-index if the (average) citation rate remains more or less constant over the years.

The per-author age-weighted citation rate is similar to the plain AWCR, but is normalized to the number of authors for each paper.
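
The three age-weighted metrics can be sketched together. This is a minimal sketch, not the Publish or Perish code: it assumes that a paper published in the current year has age 1 (so that the division is always defined), and the function name and input layout are hypothetical.

```python
from math import sqrt


def age_weighted_metrics(papers, current_year):
    """papers: list of (citations, author_count, publication_year) tuples.
    Returns (AWCR, AW-index, AWCRpA), summed over all papers."""
    awcr = 0.0
    awcr_per_author = 0.0
    for citations, authors, year in papers:
        age = current_year - year + 1  # assumption: current-year papers have age 1
        awcr += citations / age
        awcr_per_author += citations / age / authors
    return awcr, sqrt(awcr), awcr_per_author


sample = [(10, 2, 2016), (8, 4, 2013), (6, 1, 2020)]
# Per-paper rates in 2020: 10/5, 8/8 and 6/1.
print(age_weighted_metrics(sample, current_year=2020))  # (9.0, 3.0, 7.25)
```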

These metrics are shown as AWCR, AWCRpA and AW-index in the output.