There is indeed an open access citation advantage, but perhaps that is not the point

Response to the ScienceGuide article ‘Publicaties in open access worden minder geciteerd, maar hebben meer impact’ (‘Open access publications are cited less, but have more impact’)

Bianca Kramer & Jeroen Bosman

[This post, originally written in Dutch, is a response to a Dutch-language article in the online magazine ScienceGuide. In it we point out methodological issues in that article's calculation of citation advantage ratios for open access publications.]

A recent article in ScienceGuide, ‘Publicaties in open access worden minder geciteerd, maar hebben meer impact’, claims that open access (OA) articles are downloaded, shared and discussed more often than articles that are not available open access (mainly by readers outside academia), but are cited less often.

The article reports on research carried out by Springer Nature in cooperation with the VSNU and the Dutch university libraries. The claim that open access publications are cited less, however, is based on ScienceGuide's own analysis of the Dimensions database. In our view this analysis is open to a number of objections, which we hope to show here with a brief check.

ScienceGuide states that ‘an OA article was cited 17 times on average in 2020, whereas paid articles were referenced 20 times on average’. As average numbers of citations per article in a single year, such high figures should raise questions in any case. As far as we can tell, the ScienceGuide analysis divided the number of citations in 2020 to all OA publications in Dimensions by the number of OA publications from 2020: 51,462,310 / 3,092,745 = 16.6, and likewise for closed publications: 61,099,078 / 3,007,612 = 20.3 (data from 6 March 2021). No filter on articles was applied, although the text does speak of articles. What we want to address here, however, is that the calculation as performed is not meaningful and gives a misleading impression.

If the intention was to determine how often, on average, an OA article versus a closed article was cited in 2020, the number of citations in 2020 should be divided by the total number of articles in the database (for OA and closed articles respectively). That rough calculation, also carried out in Dimensions, points to a citation advantage for OA articles (49,664,551 / 28,393,702 = 1.7 citations per article) compared with closed articles (56,164,426 / 67,479,719 = 0.8 citations per article).
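
The difference between the two calculations comes down to the choice of denominator. The short Python sketch below simply redoes the arithmetic with the Dimensions figures quoted above. The function and variable names are ours and purely illustrative; nothing here queries Dimensions itself.

```python
# Citations received in 2020 and publication counts, as quoted in the text
# (Dimensions, data from 6 March 2021). All names are illustrative only.

def citations_per_item(citations: int, items: int) -> float:
    """Average number of citations per publication."""
    return citations / items

# ScienceGuide-style calculation: citations in 2020 to all publications,
# divided by the number of publications from 2020 only.
sg_oa = citations_per_item(51_462_310, 3_092_745)
sg_closed = citations_per_item(61_099_078, 3_007_612)

# Alternative: citations in 2020 to all articles, divided by the total
# number of articles in the database.
alt_oa = citations_per_item(49_664_551, 28_393_702)
alt_closed = citations_per_item(56_164_426, 67_479_719)

print(f"2020 publications as denominator: OA {sg_oa:.1f}, closed {sg_closed:.1f}")
print(f"All articles as denominator:      OA {alt_oa:.1f}, closed {alt_closed:.1f}")
```

With the first denominator, closed publications appear to come out ahead; with the second, OA articles do. The numerators are nearly the same, so the reversal is driven almost entirely by what the citations are divided by.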

It is also possible to look at the total number of citations per article (so not only citations from 2020). When we do this for articles from the years 2012-2020 (see data and calculations), we again see a citation advantage for OA articles, which increases the longer ago articles were published (and thus the more time they have had to be cited). When we break the articles down by type of OA, the citation advantage turns out to be strongest for green OA (articles shared in a repository) and hybrid OA (OA articles in subscription journals that also contain closed articles). Green OA here means ‘green only’: articles that are not also gold, hybrid or bronze open access.

Because average citation counts per article can be strongly affected by a small number of extremely highly cited articles, we also looked at the median number of citations per article, a parameter that is also shown in Dimensions. For articles from the most recent years, this shows no general citation advantage for OA articles versus closed articles, but still an advantage for green OA.
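
To illustrate why the median is the more robust measure here, the following minimal sketch compares mean and median for a made-up set of citation counts in which a single article is cited extremely often. The numbers are hypothetical and are not taken from Dimensions.

```python
from statistics import mean, median

# Hypothetical citation counts for ten articles; one is cited extremely often.
citations = [0, 0, 1, 1, 2, 2, 3, 3, 4, 500]

avg = mean(citations)    # pulled up strongly by the single outlier
mid = median(citations)  # middle of the distribution, barely affected by it

print(f"mean = {avg}, median = {mid}")
```

In this toy example the mean is more than twenty times the median, even though nine of the ten articles have four citations or fewer.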

Finally, we looked at the percentage of articles that (according to the information in Dimensions) have been cited at least once. The ScienceGuide piece mentions the low citation rate of articles, which we assume refers to those from 2020. That is not surprising, since articles from that year have hardly had a chance to be cited yet. Some articles from 2020 have only just appeared. As expected, the percentage of cited articles is higher the older the articles are. We see here that, compared with closed articles, OA articles older than 2 years have somewhat more often been cited at least once. This holds strongly for green OA articles, where the effect is visible for all years. All calculations and data in this post are available, incidentally.

In contrast to the calculation applied by ScienceGuide, all these data seem to point to a (slight) citation advantage for OA articles, which is in line with a number of earlier studies, including the large-scale studies by Archambault et al. (2016) and Piwowar et al. (2018) and the review by Lewis (2018). There is also a useful list from SPARC Europe of dozens of studies that examined the purported citation advantage.

The Springer Nature study discussed by ScienceGuide also looked at citations, in addition to downloads and altmetrics data. For 350K publications (articles, conference proceedings and book chapters) from 2017 related to the Sustainable Development Goals, no direct citation advantage was found in Dimensions for OA versus closed publications, but when a regression model was applied with corrections for ‘multiple variables at the level of the publication, author and journal’, there did appear to be a citation advantage for hybrid OA (see the figure below, taken from the Springer Nature report, p. 15). The Springer Nature study did not look at green OA at all.

It is worth bearing in mind that the analyses we carried out depend on the completeness of publication and citation data in Dimensions. Every database with citation data has its own limitations, but a comparable analysis in Lens (a freely available bibliographic database) gives the same picture (see data and charts). And of course statistical associations do not automatically imply causal relationships. The populations under consideration may differ from each other in aspects other than open access status alone, which may affect the patterns found. For that reason it may also be interesting, for example, to look at differences between disciplines (see data and charts). An analysis of these, however, goes beyond the scope of this post.

The notably higher values found for articles shared via green OA are consistent with the findings of Piwowar et al. and Archambault et al. Higher values for green and also hybrid OA, especially relative to articles in full gold open access journals, can possibly be explained by the fact that green and hybrid open access mainly apply to traditional journals, which on average are better known and at present often still have a stronger reputation than many of the newer full gold open access journals. Specifically for green OA, there may be two additional effects: the effect of the glossy journals in which open access publishing was not possible until recently (such as Nature, Science and Cell), so that green OA was the only option, and the effect that many journals in Life Sciences make articles available as green OA via PubMed Central and that many authors in Physical Sciences share articles on arXiv.

The claim in the ScienceGuide article that OA articles are cited less than closed articles turns out, in our analysis, not to be supported by the data used. There are indeed strong indications that open access articles are cited more often. Apart from this, we are not in favour of setting citations and ‘external impact’ against each other as goals, certainly where the latter is measured by a one-dimensional metric such as an aggregated Altmetric score. This does no justice to the many ways in which impact can be achieved, nor to the many motivations for publishing open access.

This post is licensed under CC BY 4.0.

Green OA: publishers and journals allowing zero embargo and CC-BY

Jeroen Bosman and Bianca Kramer, Utrecht University, July 2020
Accompanying spreadsheet: https://tinyurl.com/green-OA-policies

Introduction

We witness increased interest in the role of green open access and how it can contribute to the goals of open science. This interest focuses on immediacy (reducing or eliminating embargoes) and usage rights (through open licenses), as these can contribute to wider and faster dissemination, reuse and collaboration in science and scholarship. 

On July 15 2020, cOAlition S announced their Rights Retention Strategy, providing authors with the right to share the accepted manuscript (AAM) of their research articles with an open license and without embargo, as one of the ways to comply with Plan S requirements. This raises the question to what extent immediate and open licensed self archiving of scholarly publications is currently already possible and practiced. Here we provide the results of some analyses carried out earlier this year, intended to at least partially answer that question. We limit this brief study to journal articles and only looked at CC-BY licenses (not CC0, CC-BY-SA and CC-BY-ND, which can also meet Plan S requirements).

There are two possible approaches to identifying journals that currently allow immediate green archiving under a CC-BY license:

  • policy-based – by checking journal- or publisher policies, either directly or through Sherpa Romeo or Share Your Paper from Open Access Button.
  • empirically – by checking evidence for green archiving with 0 embargo and CC-BY license (with potential cross-check against policies to check for validity).

Here we only report on the first approach.

A full overview of journal open access policies and allowances (such as will be provided by the Journal Checker Tool that cOAlition S announced early July 2020) was beyond our scope here. Therefore, we carried out a policy check for a limited set of 36 large publishers to get a view of currently existing options for immediate green archiving with CC-BY license, supplemented with anecdotal data on journals that offer a compliant option. We also briefly discuss the potential and limitations of an empirical approach, and potential publisher motivations behind (not) allowing immediate sharing and sharing under a CC-BY license, respectively.

Our main conclusions are that:

  1. Based on stated policies we found very few (18) journals that currently allow the combination of immediate and CC-BY-licensed self archiving.
  2. Based on stated policies of 36 large publishers, there are currently ~2800 journals with those publishers that allow immediate green, but all disallow or do not explicitly allow CC-BY.

Large publishers – policies

We checked the 36 largest non-full-OA publishers, based on number of 2019 articles according to Scilit (which uses Crossref data), for self-archiving policies allowing immediate sharing in (institutional) repositories. Of these 36 publishers, 18 have zero-embargo allowances for at least some of their journals for green sharing of AAMs from subscription (incl. hybrid) journals in institutional or disciplinary repositories. Overall this pertains to at least 2785 journals. Elsevier only allows this in the form of updating a preprint shared on arXiv or RePEc. Among these large publishers, those with the most journals allowing zero-embargo repository sharing are Sage, Emerald, Brill, CUP, T&F (for social sciences), IOS and APA. Notably, though not a large publisher in terms of papers or journals, the AAAS also allows immediate sharing through repositories.

None of these policies allow using a CC-BY license for sharing in repositories. Three explicitly mention another CC license (NC or NC-ND); others do not mention licenses at all or ask authors to state that the copyright belongs to the publisher. Sometimes CC licenses are not explicitly mentioned, but it is indicated that AAMs shared in repositories are for personal and/or non-commercial use only.

For the data see columns F-H in the tab ‘Green OA‘ in the accompanying spreadsheet.

Other evidence

From the literature and news sources we know of a few examples of single publishers allowing zero embargo sharing in repositories combined with a CC-BY license:

  • ASCB:
    • Molecular Biology of the Cell (PV OA (CC-BY) after 2 months,
      AAM 0 embargo with CC-BY)
  • MIT Press:
    • Asian Development Review (full OA but PV has no open license)
    • Computational Linguistics (full OA but PV=CC-BY-NC-ND)
  • Microbiology Society:
    • Microbiology
    • Journal of General Virology
    • Journal of Medical Microbiology
    • Microbial Genomics
    • International Journal of Systematic and Evolutionary Microbiology
    • JMM Case Reports
  • Royal Society:
    • Biology Letters
    • Interface
    • Interface Focus
    • Notes and Records
    • Philosophical Transactions A
    • Philosophical Transactions B
    • Proceedings A 
    • Proceedings B 

A check of the long tail of smaller publishers could yield additional examples of journals that allow 0 embargo / CC-BY sharing.

Empirical analysis of green archiving

Empirical analysis of actual green archiving behaviour (e.g. using Unpaywall and/or Unpaywall data in Lens.org) could also provide leads to journals allowing early sharing.

Since Unpaywall data do not contain information on the date a green archived copy was made available in a repository, a direct empirical analysis of zero-embargo archiving is not readily possible. As a proxy, one could select articles published in the 3 months before a given database snapshot and then identify those that are only available as green OA. A period of 3 months, rather than 1 month or less, would allow for some delay in posting to a repository.

The benefit of using Lens.org for such an analysis is the availability of a user-friendly public interface to perform queries in real time. The disadvantage is that, although Lens sources OA information from Unpaywall, no license information for green OA is included, and no distinction is made between submitted, accepted and published versions. Analyses could also be done on a snapshot of the Unpaywall database directly, which includes license information for green OA (where available) and provides version information.
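
As a rough illustration of the proxy described above, the sketch below filters a list of records down to those published in the 3 months before a snapshot date and available only as green OA, and then narrows that set to accepted manuscripts with a CC-BY license. The field names (`published_date`, `oa_status`, `license`, `version`) are loosely modelled on the Unpaywall data format but should be treated as assumptions, the flat record layout is a simplification, and the records themselves are hypothetical.

```python
from datetime import date, timedelta

def recent_green_only(records, snapshot: date, window_days: int = 90):
    """Select records published within `window_days` before `snapshot`
    whose OA status is green (i.e. only a repository copy is available)."""
    start = snapshot - timedelta(days=window_days)
    selected = []
    for rec in records:
        published = date.fromisoformat(rec["published_date"])
        if start <= published <= snapshot and rec["oa_status"] == "green":
            selected.append(rec)
    return selected

# Hypothetical records, for illustration only.
records = [
    {"doi": "10.1234/a", "published_date": "2020-05-20", "oa_status": "green",
     "license": "cc-by", "version": "acceptedVersion"},
    {"doi": "10.1234/b", "published_date": "2020-06-01", "oa_status": "gold",
     "license": "cc-by", "version": "publishedVersion"},
    {"doi": "10.1234/c", "published_date": "2019-11-02", "oa_status": "green",
     "license": None, "version": "acceptedVersion"},
    {"doi": "10.1234/d", "published_date": "2020-04-15", "oa_status": "green",
     "license": None, "version": "submittedVersion"},
]

candidates = recent_green_only(records, snapshot=date(2020, 7, 1))

# Narrow down further to accepted manuscripts carrying a CC-BY license.
cc_by_aams = [r for r in candidates
              if r["license"] == "cc-by" and r["version"] == "acceptedVersion"]

print([r["doi"] for r in candidates])
print([r["doi"] for r in cc_by_aams])
```

The journals of the articles that survive both filters would then be leads to check against stated policies, since presence in a repository does not by itself show that sharing was permitted.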

Gap analysis report

In our previous gap analysis report that gave a snapshot of publication year 2017, we did harvest policies from Sherpa Romeo systematically for the subset of journals included in the gap analysis (journals in Web of Science publishing articles resulting from Plan S-funded research). As explained above, updating this approach was beyond our scope for this exercise. 

In our original gap analysis data, we found no examples of journals that allowed 0 embargo in combination with CC-BY. 

Journal policies for green OA: embargo lengths and licenses
(source: Open access potential and uptake in the context of Plan S – a partial gap analysis)

Potential publisher motivations 

From checking policies and behaviour, different publisher approaches emerge regarding embargoes and licenses for self-archived article versions. It seems that the reluctance of publishers to allow immediate sharing is weaker overall than the reluctance to allow CC-BY for green OA. That may have to do with the reasons behind these two types of reluctance. 

The reason to not allow immediate sharing may concern fears of losing subscription income and perhaps also of a decline in visitors to their platform. However, several publishers have noticed that this fear may be ungrounded, as libraries do not yet cancel subscriptions just because some percentage of articles is also immediately available as AAM, not only because of incomplete open availability but also because of the wish to provide access to published versions in their platform context. Some publishers (e.g. Sage) have also publicly stated that they do not witness a negative effect on subscriptions.

For the reluctance to allow CC-BY licenses we expect other reasons to be at play, primarily the desire to be in control over how, where and in what form content is shared. This relates to protecting income from derivative publications (reprints, printing-on-demand, anthologies etc.) and also to preventing others having any monetary gain from including content on competing platforms.

Another aspect is that publishers cannot require linking back to the publisher version when the CC-BY-licensed AAM in the repository is reused; they would instead have to depend on community norms to provide information on and links to various versions of a publication.

Looking at the empirical evidence and these considerations, it can potentially be expected that across publishers, a move towards shorter embargoes might be easier to achieve than a move towards a fully open license for green-archived versions. It should be noted that while there are examples of publishers allowing shorter embargoes in response to specific funder mandates (e.g. from Wellcome, NIH), to our knowledge there has not, prior to Plan S, been funder or institutional pressure to require open licenses for green-archived AAMs. Thus, it remains to be seen whether publishers will be inclined to move in this direction in response. The reactions to the letter cOAlition S sent to a large number of publishers to inform them of the cOAlition S Rights Retention Strategy should provide clarity on that.

In addition to funder policies, institutions and governments could further support this development through policies and legislation relating to copyright retention, as well as zero embargoes and licenses for green OA archiving of publications resulting from publicly funded research. This could provide authors with more rights and put pressure on publishers to seriously reconsider their stance on these matters. 

Towards a Plan S gap analysis? (1) Open access potential across disciplines

(NB this post is accompanied by a second post on presence of full gold open access journals in Web of Science and DOAJ)

In the proposed implementation guidelines for Plan S, it has become clear there will be, for the coming years at least, three ways to open access (OA) that are compliant with Plan S:

  • publication in full open access journals and platforms
  • deposit in open access repositories of author accepted manuscript (AAM) or publisher version (VOR)
  • publishing in hybrid journals that are part of transformative agreements

Additional requirements concern copyright (copyright retention by authors or institutions), licensing (CC-BY, CC-BY-SA or CC0), embargo periods (no embargoes) and technical requirements for open access journals, platforms and repositories.

In the discussion surrounding Plan S, one of the issues that keeps coming back is how many publishing venues are currently compliant. Or, phrased differently: how many of their current publication venues do researchers fear will no longer be available to them?

However, the current state should be regarded as a starting point, not the end point. As Plan S is meant to effect changes in the system of scholarly publication, it is important to look at the potential for moving towards compliance, both on the side of publishers as well as on the side of authors.

https://twitter.com/lteytelman/status/1067635233380429824

Method
To get a first indication of what that potential for open access is across different disciplines, we looked at a particular subset of journals, namely those in Web of Science. For this first approach we chose Web of Science because of its multidisciplinary nature, because it covers both open and closed journals, because it has open access detection and subject categories, and finally because of its functionality for generating and exporting frequency tables of journal titles. We fully recognize the inevitable bias related to using Web of Science as a source, and address this further below and in an accompanying blogpost.

For a number of (sub)disciplines, we identified the proportion of full gold, hybrid and closed journals in Web of Science, as well as the proportion of hybrid and closed journals that allow green open access by archiving AAM/VOR in repositories. We also looked at the number of publications from 2017 (articles & reviews) that were actually made open access (or not) under each of these models.

Some methodological remarks:

  • We used the data available in Web of Science for OA classification at the article level. WoS uses Unpaywall data but imposes its own classification criteria:
    • DOAJ gold: article in journal included in DOAJ
    • hybrid: article in non-DOAJ journal, with CC-license
      (NB This excludes hybrid journals that use a publisher-specific license)
    • green: AAM or VOR in repository 
  • For journal classification we did not use a journal list, but we classified a journal as gold, hybrid and/or allowing green OA if at least one article from 2017 in that journal was classified as such. This method may underestimate:
    • journals allowing green OA in fields with long embargoes (esp. A&H)
    • journals allowing hybrid or green OA if those journals have very low publication volumes (increasing the chance that a certain route is not used by any 2017 paper)
  • We only looked at green OA for closed articles, i.e. when articles were not also published OA in a gold or hybrid journal.
  • Specific Plan S criteria are not (yet) taken into account in these data, i.e. copyright retention, CC-BY/CC-BY-SA/CC0 license, no embargo period (for green OA) and being part of transformative agreements (for hybrid journals).
  • For breakdown across (sub)disciplines, we used WoS research areas (which are assigned at the journal level). We combined Physical Sciences and Technology into one to get four major disciplines. In each major discipline, we identified the 10 subdisciplines with the highest number of articles & reviews in 2017 (excluding ‘other topics’ and substituting Astronomy & Astrophysics for Mechanics because of specific interest in green OA in Astronomy & Astrophysics).
  • We used the full WoS Core collection available through our institution’s license, which includes the Science Citation Index Expanded (SCIE), the Social Sciences Citation Index (SSCI), the Arts & Humanities Citation Index (AHCI) and the Emerging Sources Citation Index (ESCI).
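
The classification rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually run in Web of Science: the record fields (`journal`, `journal_in_doaj`, `cc_license`, `in_repository`) and the sample articles are hypothetical.

```python
from collections import defaultdict

def classify_article(article: dict) -> str:
    """Article-level OA class, following the criteria listed above."""
    if article["journal_in_doaj"]:
        return "gold"
    if article["cc_license"]:
        return "hybrid"
    if article["in_repository"]:
        return "green"  # green is counted only for otherwise closed articles
    return "closed"

def classify_journals(articles: list) -> dict:
    """A journal receives a label if at least one of its articles has that class."""
    labels = defaultdict(set)
    for art in articles:
        labels[art["journal"]].add(classify_article(art))
    return dict(labels)

# Hypothetical 2017 articles, for illustration only.
articles = [
    {"journal": "Journal A", "journal_in_doaj": True,
     "cc_license": True, "in_repository": False},
    {"journal": "Journal B", "journal_in_doaj": False,
     "cc_license": True, "in_repository": True},
    {"journal": "Journal B", "journal_in_doaj": False,
     "cc_license": False, "in_repository": True},
    {"journal": "Journal B", "journal_in_doaj": False,
     "cc_license": False, "in_repository": False},
    {"journal": "Journal C", "journal_in_doaj": False,
     "cc_license": False, "in_repository": False},
]

journal_labels = classify_journals(articles)
for journal, labels in sorted(journal_labels.items()):
    print(journal, sorted(labels))
```

The sketch also makes the stated limitation visible: a journal such as the hypothetical Journal C, which may well allow green OA, is labelled closed simply because none of its 2017 articles happened to use that route.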

All data underlying this analysis are available on Zenodo:
https://doi.org/10.5281/zenodo.1979937

Results

As seen in Figure 1A-B, the proportion of full gold OA journals is relatively consistent across major disciplines, as is the proportion of articles published in these journals. Both are between 15% and 20%. Despite a large proportion of hybrid journals in Physical Sciences & Technology and Life Sciences & Biomedicine, the actual proportion of articles published OA in hybrid journals is quite low in all disciplines. The majority of hybrid journals (except in Arts & Humanities) allow green OA, as do between 30% and 45% of closed journals (again except in Arts & Humanities). However, the actual proportion of green OA at the article level is much lower. As said, embargo periods (esp. those exceeding 12 months) might have an overall effect here, but the difference between potential and uptake remains striking.

Fig 1A-B. OA classification of journals and publications (Web of Science, publication year 2017)

Looking at subdisciplines reveals interesting differences both in the availability of open access options and the proportion of articles & reviews using these options (Fig 2).

  • In Physical Sciences and Technology, the percentage of journals that are fully gold OA is quite low in most fields, with slightly higher levels in energy fuels, geology, optics and astronomy. Uptake of these journals is lower still, with only the optics and geology fields slightly higher. Hybrid journals are numerous in this discipline, but their gold and green open access options are used quite infrequently. The use of green OA for closed journals, where allowed, is also limited, with the exception of astronomy (but note that green sharing of preprints is not included in this analysis). In all fields in this discipline, over 25% of WoS-indexed journals seem to have no open options at all. Of all subdisciplines in our analysis, those in the physical sciences display the starkest contrast between the ample OA options and their limited usage.
  • In Life Sciences & Biomedicine, penetration of full gold OA journals is higher than in Physical Sciences, but with starker differences, ranging from very low levels in environmental science and molecular biochemistry to much higher levels for general internal medicine and agriculture. Uptake of gold OA journals in this discipline is quite good, again especially in general internal medicine. Availability of hybrid journals is quite high but their use is limited; exceptions are cell biology and cancer studies, which do show high levels of open papers in hybrid journals. Green sharing is clearly better than in Physical Sciences, especially in fields like neurosciences, oncology and cell biology (likely also due to PMC / EuropePMC), but still quite low given the number of journals allowing it.
  • In Social Sciences there is a large percentage of closed non-hybrid subscription journals, but many allow green OA sharing. Alas, the uptake of that is limited, as far as detected using Unpaywall data. In this regard the one exception is psychology, with a somewhat higher level of green sharing. Hybrid OA publishing is available less often than in Physical Sciences or Life Sciences, but with relatively high shares in psychology, sociology, geography and public administration. The fields with the highest shares of full gold OA journals are education, linguistics, geography and communication, with usage of gold in Social Sciences more or less corresponding with full gold journal availability.
  • In Arts & Humanities, the most striking fact is the very large share of journals offering no open option at all. Like in Social Sciences, usage of gold across Humanities fields more or less corresponds with full gold journal availability. Hybrid options are limited and even more rarely used, except in philosophy fields. Green sharing options are already limited, but their use is even lower.

Fig 2. OA classification of journals and publications in different subdisciplines (Web of Science, publication year 2017)

Increasing Plan-S compliant OA 

Taking these data as a starting point (and taking into account that the proportion of Plan S compliant OA will be lower than the proportions of OA shown here, both for journals and publications), there are a number of ways in which both publishers and authors can increase Plan S-compliant OA (see Fig 3):

  • adapt journal policies to make existing journals compliant
    (re: license, copyright retention, transitional agreements, 0 embargo)
  • create new journals/platforms or flip existing journals to full OA (preferably diamond OA)
  • encourage authors to make use of existing OA options (by mandates, OA funding (including for diamond OA) and changes in evaluation system)

We also made a more detailed analysis of nine possible routes towards Plan S compliance (including potential effects on various stakeholders) that might be of interest here.

Fig 3. Ways to increase Plan S-compliant OA

Towards a gap analysis? Some considerations

In their implementation guidance, cOAlition S states it will commission a gap analysis of Open Access journals/platforms to identify fields and disciplines where there is a need to increase their share. In doing so, we suggest it would be good to look not only at the share of currently existing gold OA journals/platforms, but to view this in the context of the potential to move towards Plan S compliance, both on the side of publishers and authors. Filling any gaps could thus involve supporting new platforms, but also supporting flipping of hybrid/closed journals and supporting authors in making use of these options, or at least considering the effect of the latter two developments on the expected gap size(s).

Another consideration in determining gaps is whether to look at the full landscape of (Plan S-compliant) full gold journals and platforms, or whether to make a selection based on relevance or acceptability to Plan S-funded authors, e.g. by impact factor, by inclusion in an ‘accepted journal list’ (e.g. the Nordic list(s) or the ERA-list) or by other criteria. In our opinion, any such selection should be presented as an optional overlay/filter view, and preferably be based on criteria other than journal prestige, as this is exactly what cOAlition S wants to move away from in the assessment of research. Some more neutral criteria that could be considered are:

  • Language: English and/or at least one EU language accepted?
  • Content from cOAlition S or EU countries?
  • Readership/citations from cOAlition S or EU countries?
  • Editorial board (partly) from cOAlition S or EU countries?
  • Volume (e.g. papers per annum)

Of course we ourselves already made a selection by using WoS, and we fully recognize that this practical decision leads to limitations and bias in the results. For a further analysis of inclusion of DOAJ journals in WoS per discipline, as well as the proportion of DOAJ journals in ESCI vs SCIE/SSCI/AHCI, see the accompanying blogpost ‘Gold OA journals in WoS and DOAJ‘.

To further explore bias in coverage, there are also other journal lists that might be worthwhile to compare (e.g. ROAD, EZB, JournalTOCs, Scopus sources list). Another interesting initiative in this regard is the ISSN-GOLD-OA 2.0 list, which provides a matching list of ISSNs for Gold Open Access (OA) journals from DOAJ, ROAD, PubMed Central and the Open APC initiative. It is especially important to ensure that existing (and future) publishing platforms, diamond OA journals and overlay journals will be included in any analysis of gold OA publishing venues. One initiative in this area is the crowdsourced inventory of (sub)areas within mathematics where there is the most need for Fair Open Access journals.

There are multiple ways in which the rough analysis presented here could be taken further. First, a check on specific Plan S compliance criteria could be added, i.e. on CC-license type, copyright retention, embargo terms, and potentially on inclusion of hybrid journals in transitional agreements. Many of these (though not the latter) could be derived from existing data, e.g. in DOAJ and SherpaRomeo. Furthermore, an analysis such as this would ideally be based on fully open data. While not yet available in one interface that enables the required filtering, faceting and export functionality, a combination of the following sources would be interesting to explore:

  • Unpaywall database (article, journal, publisher and repository info, OA detection)
  • Lens.org (article, journal, affiliation and funder info, integration with Unpaywall)
  • DOAJ (characteristics of full gold OA journals)
  • SherpaRomeo (embargo information)

Ultimately, this could result in an open database that would allow multiple views on the landscape of OA publication venues and the usage thereof, enabling policy makers, service providers (including publishers) and authors alike to make evidence-based decisions in OA publishing. We would welcome an open (funding) call from cOAlition S funders to get people together to think and work on this.