[LeetCode] 274. H-Index
Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.
According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."
For example, given
citations = [3, 0, 6, 1, 5]
, which means the researcher has 5
papers in total and each of them had received 3, 0, 6, 1, 5
citations respectively. Since the researcher has 3
papers with at least 3
citations each and the remaining two with no more than 3
citations each, his h-index is 3
.
Note: If there are several possible values for
h
, the maximum one is taken as the h-index.
Thought process:
The definition of h-index in the problem is confusing. Actually, according to Wikipedia, "the definition of the index is that a scholar with an index of h has published h papers each of which has been cited in other papers at least h times".
A naive solution is to sort the citations. Iterate through them in reverse order. If citations[i] < i + 1, break. The h-index is i.
Solution 1:
1 2 3 4 5 6 7 8 9 10 11 12 | public class Solution { public int hIndex(int[] citations) { Arrays.sort(citations); for (int i = citations.length - 1; i >= 0; i--) { int h = citations.length - i; if (citations[i] < h) { return h - 1; } } return citations.length; } } |
Time complexity: O(nlogn).
Counting sort requires the elements are within a certain range. In this problem, there is no upper limit for citations. How do I use counting sort? Note that although there is no upper bound for citations, there is an upper bound for h-index, length of array. Therefore, whenever I see an element that's larger than length, replace it with length.
Traditionally, counting sort calculates the counts of every element, creates a less array to make the sorting stable, and derives the sorted array based on the less array. In this problem, I don't need the sorting to be stable because I don't care the relative positions of equal citations. In fact, I even don't need the sorted citations, because I only need the sum of the citations in descending order, and I can get that as I iterate through the counts array.
Iterate through the counts backwards. Keep track of the sum of counts. The first index where sum >= index is the h-index.
Solution 2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | public class Solution { public int hIndex(int[] citations) { int length = citations.length; int[] count = new int[length + 1]; for (int citation : citations) { count[Math.min(citation, length)]++; } int sum = 0; for (int h = length; h >= 0; h--) { sum += count[h]; if (sum >= h) { return h; } } return 0; } } |
Time complexity: O(n).
"A naive solution is to sort the citations. Iterate through them in reverse order. If citations[i] < i + 1, break. The h-index is i."
ReplyDeleteI know this is right. But can you prove it why? Or just explain it?