Citation indexes

Level: Intermediate (score: 3)

Academic research advances through published papers, i.e., documents collecting ideas and experimental results. Google Scholar -- powered by Google Search -- enables you to search through academic literature across multiple research fields.

Multiple metrics have been proposed to quantify the quality and quantity of a researcher's body of work, but they are all based on citation counts.

Basically, when you write a paper, you will likely need to mention previously published work. Each such citation increases the importance of the cited paper. As a result, each entry found in a Google Scholar search also shows its citation count: the higher the count, the more important the paper is according to the research community.

By extension, to judge an academic author's body of work, you can look at how many times their publications have been cited by other researchers: you need a citation index.

If you register with Google Scholar, Google generates (and automatically updates over time) your publication list, and gives you your h-index and i10-index, two popular citation indexes.

For instance, this is the profile of Charles Darwin (notice the indexes on the top right).

The definitions of those indexes are:

* i10-index is the number of papers with at least 10 citations.

* h-index is the maximum value of h such that the given author/journal has published at least h papers that have each been cited at least h times.

Your job in this Bite is to code the algorithms to compute such metrics. 

Your input is citations, a list (or tuple) of non-negative integers (greater than or equal to zero), each representing the citations accumulated by a different paper.

You need to compute and return h and i10 from these counts, based on the definitions above.

We further stress that both h and i10 represent a "selection of papers", thus their possible values are in the range 0..len(citations) (both edges included).

Some examples for i10-index

>>> i10_index([0, 0, 1, 1, 10])
1
>>> i10_index((0, 0, 1, 1))
0

In the first example i10 is 1 since only one paper has at least 10 citations, namely the 5th paper. Conversely, in the second example i10 is zero since there are no papers with at least 10 citations.
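
The definition of i10 translates almost directly into a count over the input. The following is only a minimal sketch of that idea, not the reference solution: it reuses the `i10_index` name from the examples above, and it assumes the input has already been validated.

```python
def i10_index(citations):
    """Count the papers that have at least 10 citations.

    Sketch only: assumes `citations` is already a valid list or tuple
    of non-negative integers (no input validation is done here).
    """
    # Each paper at or above the threshold of 10 contributes 1 to the total.
    return sum(1 for count in citations if count >= 10)
```

Because the result is a count of selected papers, it always falls in the range 0..len(citations).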

Some examples for h-index

>>> h_index([0, 0, 1, 1, 10, 5, 1, 3])
3
>>> h_index([0, 0, 1, 1])
1

In the first example h is 3 since there are 3 papers with at least 3 citations, namely the 5th paper (10 citations), the 6th paper (5 citations) and the 8th paper (3 citations).

In the second example h is 1 even though you have 2 papers with 1 citation each. In fact, to have an h-index of 2 you would need 2 papers with at least 2 citations each, but here you have 2 papers with only 1 citation each, so the maximum possible value of h is 1, which you read as "there is at least 1 paper with at least 1 citation".
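
One common way to compute h (an assumption here, since the Bite does not prescribe an algorithm) is to sort the counts in descending order and walk down the ranking: the paper at rank r supports h = r only while it has at least r citations. A minimal sketch, reusing the `h_index` name from the examples and skipping input validation:

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each.

    Sketch only: assumes `citations` is already a valid list or tuple
    of non-negative integers (no input validation is done here).
    """
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Counts only decrease from here, so no larger h is possible.
            break
    return h
```

For `[0, 0, 1, 1, 10, 5, 1, 3]` the ranking is 10, 5, 3, 1, ...: ranks 1 to 3 pass the check, while the paper at rank 4 has only 1 citation, so the loop stops with h = 3.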

As mentioned, citations can be a list or a tuple, so raise a TypeError for any other type, and specify the message "Unsupported input type: use either a list or a tuple".

Raise a ValueError instead if citations is None, empty, or contains anything other than non-negative integers, and specify the message "Unsupported input value: citations cannot be neither empty nor None, and can only have positive integers"
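
The order of the checks matters: None is neither a list nor a tuple, yet it must produce a ValueError, so it has to be tested before the type check. The sketch below shows one possible arrangement. The helper name `validate_citations` and the two message constants are hypothetical, introduced only for illustration. How booleans should be treated is not specified by the Bite, so they are simply left to pass as integers here.

```python
TYPE_MSG = "Unsupported input type: use either a list or a tuple"
VALUE_MSG = (
    "Unsupported input value: citations cannot be neither empty nor None, "
    "and can only have positive integers"
)


def validate_citations(citations):
    """Hypothetical helper: raise on invalid input, return None otherwise."""
    # None must yield ValueError, so check it before the type test.
    if citations is None:
        raise ValueError(VALUE_MSG)
    if not isinstance(citations, (list, tuple)):
        raise TypeError(TYPE_MSG)
    # An empty list or tuple is falsy.
    if not citations:
        raise ValueError(VALUE_MSG)
    # Every entry must be an integer greater than or equal to zero.
    # Assumption: bool values are not rejected (bool is a subclass of int).
    if not all(isinstance(c, int) and c >= 0 for c in citations):
        raise ValueError(VALUE_MSG)
```

Both index functions could call such a helper as their first step, which keeps the error handling in one place.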

Hint: if you are stuck and the definition of h-index is confusing to you, try having a look at the Wikipedia article linked above, as it also contains a visual explanation of the logic behind the index.

Happy Python coding!