November 28, 2011 by Fridolin Wild
Comments (2)
Antti Oulasvirta used MS academic search to calculate a simple statistic for the HCI conferences: a ranking of average citations per paper. Since MS academic search offers the same conference citation counts for the category Computer Science / Computer Education (which arguably does not contain all relevant conferences for TEL, but quite a few interesting ones), I played a bit with the data to gauge what it shows.
At first glance, I was astonished to see numerous conferences at the top of the list that had never really caught my attention before. On second look, however, I discovered that all of them (up to position 20 in the 5 year stats) are CS education related, so: phew.
Going further down the list, I was astonished to see that the difference between, e.g., ECTEL and ASCILITE was still so big: ASCILITE had 2.50 citations per paper, whereas ECTEL 'only' had 1.60. Also, MS academic search listed 457 papers for ECTEL, which seemed a bit too high to me, as we have usually accepted around 30 full papers per year since the conference was founded in 2006.
So I extracted the 200 most highly cited papers from MS academic search and calculated a few more statistics. My impression was right: we had accepted an average of 29.7 full papers per year, summing up to 178 full papers from 2006 to 2011. These are the papers that the reviewers rated best and that were accepted in their original submission length. Additionally, there were 201 short papers, posters, and demo papers printed in the proceedings. These were the ones that the reviewers had still considered interesting, but that were not elaborate enough and thus got 6 or fewer pages of space in the proceedings. Together this is still a bit short of what MS academic search had counted (457 versus an actual 379 papers), but a quick look into the data explains this artefact: the search had included papers in workshop proceedings (with no guarantee of completeness, as the workshops published their papers in quite different media and forms).
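The bookkeeping above can be retraced in a few lines. This is only a sketch using the figures quoted in this post; the function and variable names are made up for illustration and are not part of any MS academic search export.

```python
# Figures quoted in the post; all names here are illustrative assumptions.
FULL_PAPERS = 178          # full papers, 2006 to 2011
SHORT_POSTER_DEMO = 201    # short papers, posters, and demo papers
YEARS = 6                  # 2006 to 2011 inclusive
MS_LISTED = 457            # papers MS academic search lists for ECTEL


def reconcile(full, other, listed, years):
    """Compare the proceedings count with the count listed by the search."""
    in_proceedings = full + other
    return {
        "full_per_year": round(full / years, 1),
        "in_proceedings": in_proceedings,
        # Papers listed by the search but not in the main proceedings.
        "not_in_proceedings": listed - in_proceedings,
    }


summary = reconcile(FULL_PAPERS, SHORT_POSTER_DEMO, MS_LISTED, YEARS)
print(summary)
```

The remaining difference of 78 papers is what the post attributes to workshop proceedings being counted under the conference name.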
Of course, these years follow the usual publication pattern: the early years have already accumulated more citations than the recent ones (2006: 390 citations in 74 publications, 2011: none). This is clear: it takes a while for the publications citing this literature to get published themselves, and one would expect that after a number of years the impact per year goes down again. But ECTEL has existed for only 6 years, so we do not have any information on that yet.
Among these 200 top cited papers, however, I find a different pattern than the one indicated by the pure MS academic search stats: 85 long papers (of 178), 66 short papers (of 201), and 49 workshop papers (out of an unknown number, maybe somewhere around 8 workshops per year over 6 years with 5 papers each = 240, but probably much more than that). So: long papers do pay off, which is not very astonishing. What is more surprising is that the impact of these long papers is actually much higher than the overall figure suggests: they received (at this rank-200 cutoff) 3.9 citations per paper, whereas short papers and workshop papers received only 3.3 and 3.6, respectively. Every second full paper was among the top 200 cited papers, but only every third short paper. And every second full paper received a lot more attention than the 1.60 citations on average mentioned in the introduction.
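As a rough check of the "every second" and "every third" claims, the shares can be computed directly. This is a sketch under stated assumptions: the dictionary layout and function names are hypothetical, the workshop total is left unknown as in the text, and the per-type averages are the rounded values quoted above, so the pooled mean is approximate.

```python
# (papers in top 200, total papers of that type, citations per paper in top 200)
# Values are taken from the post; the structure itself is an assumption.
TOP200 = {
    "full": (85, 178, 3.9),
    "short": (66, 201, 3.3),
    "workshop": (49, None, 3.6),  # total number of workshop papers unknown
}


def share_in_top(hits, total):
    """Fraction of a paper type that made the top 200, or None if unknown."""
    return None if total is None else hits / total


def pooled_mean(groups):
    """Citations per paper over all groups, weighted by papers in the top 200."""
    citations = sum(hits * cpp for hits, _, cpp in groups.values())
    papers = sum(hits for hits, _, _ in groups.values())
    return citations / papers


for kind, (hits, total, cpp) in TOP200.items():
    share = share_in_top(hits, total)
    label = "unknown" if share is None else f"{share:.1%}"
    print(f"{kind}: {hits} in top 200, share {label}, {cpp} citations per paper")

print(f"pooled citations per paper: {pooled_mean(TOP200):.2f}")
```

With these inputs, about 48% of the full papers and about 33% of the short papers land in the top 200, which matches the wording above.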
Still, even if it were 3.9 per paper, it would be less than, e.g., ICLS or CSCL. But then again, a closer look reveals that both conferences apparently do not expose their metadata: for CSCL there are only two papers listed in the last 5 years, and for ICLS only three. The two for CSCL received 14 citations, giving an average of 7, and the three for ICLS had 11.67 on average.
In comparison: the best ECTEL papers got 34 and 31 citations, and the top 10 had 18.7 citations on average.
My overall conclusion would thus be: impact stats are interesting, but typically distorted. And if the data is not open, then you can throw any impact analysis based on citation counts in the bin anyway.
And here is the table (last 5 years stats from the above MS academic search listing):
Interesting case study! I think there are two factors to blame:
1. Automated services such as MS Academic search cover a lot of publications, but there is little information on the quality and completeness of the data. Edited services such as WoS, on the other hand, offer more reliability in that direction, at a hefty price tag, of course, and with less coverage.
2. Conferences are harder to track than journals: as Fridolin pointed out, there are many types of contributions. For an automated service that draws on multiple data sources, including institutional archives and personal web pages, it is hard to distinguish between a full paper and a workshop paper if both have been attributed to EC-TEL.
The result is that in the EC-TEL case, one cannot draw conclusions about single conferences using the raw data. Citation analysis requires careful processing and a broad analysis of possible biases. If the sources are not open and transparent - as Fridolin already noted - this is not possible.
Peter Kraker 169 days ago
Wow, great insights into that, Fridolin. Thanks for sharing this.
Hendrik Drachsler 169 days ago