I spent June and a good part of July developing an exciting Synthetic Biology programme at the interface of biology and the physical sciences. The remainder of my time was spent with my research group, editing colleagues’ grants and doing a bit of departmental firefighting. Then I went on “holiday”. There was one large fly in the ointment: REF. I had an edit overdue and a meeting on papers to attend. So the first part of the hols was spent writing in my mobile office, plus an evening on Skype with my colleagues in Liverpool.
In late April, I posted “In Defence of REF” to highlight the positives of the UK’s assessment of research, though I did temper this with the statement that I am a modest fan only. Why is my enthusiasm not wholehearted? Because, while REF has pushed university hiring away from old boy networks towards a meritocracy, it causes a lot of damage.
I am a REF wallah at Liverpool and, at a personal level, REF is a voracious consumer of my time, at a point when we don’t have time to spare. What goes out of the window? Teaching? Mentoring? Research? Admin? In my case, one activity that drops off is grant writing. Semi-fatal in the long run, perhaps, but I have always done a lot on relatively slim resources, preferring a tight-knit “Alpine-style assault” to a large, cumbersome siege (for non-climbers, see this excellent interview with Doug Scott on the subject).
Something else goes out of the window: the number of seminars I attend, particularly those outside my comfort zone (often the most interesting and useful), and the time I spend idly discussing science with colleagues. Consequently, the roots of scientific creativity take a hit. It makes sense for universities to put their more prolific researchers on REF panels, but the call on their time stunts the development of research on campus. So REF directly reduces the ability of a university to deliver future high-quality research.
At an organisational level, REF is particularly corrosive. The academic units in a university generally reflect the focus of teaching, but staff in these units span a wide range of disciplines. My own department, Structural and Chemical Biology, is typical, housing staff with PhDs in biochemistry, chemistry, physics, etc. Our research institute is even more multidisciplinary, spanning biology from the electron to entire ecosystems. It is a fantastically stimulating place to work, but what a nightmare for pigeonholing papers and associated impact cases into Units of Assessment. Not everyone is relaxed about where they are returned. They feel that REF somehow reflects their research (it doesn’t, but that doesn’t necessarily change their reactions), so being returned in a Unit of Assessment they do not identify with, even though their output fits well, sometimes upsets people. Moreover, a fair number view REF as a judgement on their value. It isn’t. These members of staff produce a steady stream of papers, but one insufficient in number and/or quality to be above the bar. The result: someone who is doing a great all-round job ends up demoralised.
The “quality” word. Ouch. How do we judge the quality of a paper?
The only way to assess research is to actually read the papers; there is no adequate proxy. Impact factors are (I hope) discredited, reflecting at most popularity, fashion and the citation of reviews rather than original sources in the interests of brevity (or laziness). Some REF panels will use citation data from Scopus to “only inform academic judgement”. My interpretation is that citation data will be used when there is argument and uncertainty. This is problematic.
First, the kinetics of citation are variable. In other words, the accumulation of citations over time is not equivalent from paper to paper. Some papers are a flash in the pan, accumulating a lot of citations quickly and then burning out. Others are slow burners, steadily accumulating citations over one or more decades. Yet others are like Sleeping Beauty: no one cites them for 5–10 years, then the field wakes up and they become well cited. Papers that add but a small detail to a field are sporadically cited at a low level. Finally, there are papers that are plain wrong but, because science self-rights inefficiently (more about this another day), remain on record, sometimes in so-called top journals, and these may be very well cited.
Second, within a field, different subfields or themes will have very different citation patterns. Philip Moriarty gave a very nice example from physics on his Physicsfocus blog: “Not everything that counts can be counted”.
How well can one judge? With a 4-point scale (1 star to 4 stars) for grading, I would estimate the error to be 0.5 to 0.75 of a star when papers are read by people fairly close to the field. Further away, the error rises.
There are two corrosive effects of our cyclical judgement of papers.
The first is chasing the journal. This behaviour is strongly reinforced by poor mentoring and by the actions of many, but not all, on selection panels: “this person must be good, they are publishing in top journals”. Most of the time, the individual saying these words has not bothered to read the papers! So staff chase the chimera of “top-quality journal + impact factor”, wasting a huge amount of time and effort, when the usual outlet, e.g. the journal of a learned society or an Open Access journal, would do just fine. The result: demoralisation and the depression of their sense of worth.
The second is the destruction of fields of research. Whole areas are disappearing, from enzymology to taxonomy. This is extremely worrying. Who do you turn to when you have an enzyme problem? There used to be a whole bunch of people down the corridor, but they have all left and have often not been replaced, though they may be reappearing in chemistry departments and catalysis centres, somewhat divorced from their biological roots. Need some taxonomical advice? Many UK universities got rid of those people quite a few years ago. We can sequence the genome of anything, but few places have staff with the skills to identify the animal/plant/etc. So the notion of “quality” results in a lopsided scientific base, which ultimately cannot be successful precisely because it is lopsided.
I would not argue against measuring the quality of scientific output. However, I would argue strongly against attempts to do so on anything other than a linear scale. This is simply because the errors in measurement are substantial, and we may not know what was really important for another 10 or more years, once the froth of fashion and hype has died down.