Challenging but rewarding – Wellcome Trust Data Re-use Prize winner, Quentin Leclerc, on reusing open data

Last November the Wellcome Trust launched the Data Re-use Prize to celebrate innovative reuse of open data in either antimicrobial resistance (AMR) or malaria. Entrants were asked to generate a new insight, tool or health application from one of two open data resources: the AMR ATLAS dataset or the Malaria ROAD-MAP dataset.

MRC-LID PhD student and member of the winning team for AMR, Quentin Leclerc, dropped by the SGUL RDM Service to talk about the prize and the challenging but rewarding process of reusing open data.

Quentin, congratulations on the win. Can you tell me a little bit about your team’s entry for the Data Re-Use Prize?

Sure. We developed a tool to help inform empiric therapy. Empiric therapy is basically when physicians pool multiple sources of data together to make the best informed guess about how to treat a patient. This is before they know exactly what bacteria a patient is infected with and its potential resistance to antibiotics. Say, for example, a patient has sepsis and needs to be treated right away. A physician might determine the most likely causes as E. coli and S. aureus infection and then make an informed guess about the best antibiotic to prescribe to treat both of these bacteria, bearing in mind regional estimates of each pathogen’s resistance to different antibiotics. The physician is basically thinking, “given what we know about the common causes of this condition and antibiotic resistance, which antibiotic is likely to work best?”

Our proof of concept web app integrates data from a range of open data sources to visualise antibiotic resistance rates for common infections to help physicians prescribe faster and more accurately. If developed, the tool can potentially be used to inform national guidelines on how to treat common infections in many countries, particularly in low and middle income countries where data aren’t always available to inform empiric therapy at the local or hospital level.
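To make the physician's reasoning above concrete, here is a minimal sketch of one way such a calculation could look. It is not the team's method: the pathogen shares, resistance rates, antibiotic names and function names are all made up for illustration. It weights each antibiotic's chance of working by how likely each pathogen is to be the cause.

```python
# Illustrative numbers only: NOT taken from ATLAS or any real dataset.
# Hypothetical share of infections attributed to each pathogen for one syndrome.
pathogen_share = {"E. coli": 0.6, "S. aureus": 0.4}

# Hypothetical fraction of isolates resistant to each (made-up) antibiotic.
resistance = {
    "antibiotic_A": {"E. coli": 0.30, "S. aureus": 0.10},
    "antibiotic_B": {"E. coli": 0.10, "S. aureus": 0.50},
}


def expected_coverage(shares, resistance_rates):
    """Chance that a drug works, averaged over the likely pathogens."""
    return sum(
        share * (1.0 - resistance_rates[bug]) for bug, share in shares.items()
    )


def rank_antibiotics(shares, resistance_by_drug):
    """Return (drug, coverage) pairs sorted from highest to lowest coverage."""
    scored = [
        (drug, expected_coverage(shares, rates))
        for drug, rates in resistance_by_drug.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_antibiotics(pathogen_share, resistance)
for drug, coverage in ranking:
    print(f"{drug}: {coverage:.2f}")
```

With these invented inputs the first antibiotic comes out slightly ahead, even though the second looks better against E. coli alone, which is the kind of trade-off the physician is weighing.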

Some visualisations from the team’s AR.IA app

Sounds very exciting. As a first year PhD student, what was it like to win a prize like this?

It was really unexpected. We didn’t expect to win, we just thought, ‘we’ll publish our findings anyway so let’s see how this goes’. The other entries for the prize were very specific while our entry was pretty broad so we weren’t very confident. It was a real surprise and a great effort from everyone on the team.

Team photo (l to r): Gwen Knight, Quentin Leclerc, Nichola Naylor and Alexander Aiken
Missing: Francesc Coll

As a PhD student, it was an interesting experience overall. This project is very different from my PhD but working on this tool helped me to get used to the various datasets out there and to look at the big picture of antimicrobial resistance and antibiotic prescribing. It was an enlightening process.

Can you tell me a little bit more about the process of reusing existing data? What was it like?

It was surprising. The thing with data is that it’s collected for a purpose. When someone comes in trying to use that data for a different purpose, they start to see what’s missing. They start to make approximations and assumptions to use the data for something it wasn’t intended for. The ATLAS dataset is very accurate and it’s very rich but it suits its original purpose. For example, we needed to group the data in increasingly complex ways. Once we started doing this, the sample sizes started to look quite small. The dataset wasn’t suited to those kinds of groupings.
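The shrinking sample sizes described above can be sketched in a few lines. The records below are hypothetical and are not rows from ATLAS; the field layout and function names are assumptions for illustration. The point is only that each extra grouping field splits the same records into smaller groups.

```python
from collections import Counter

# Hypothetical isolate records: (country, year, species). Not real ATLAS data.
isolates = [
    ("UK", 2016, "E. coli"),
    ("UK", 2016, "E. coli"),
    ("UK", 2017, "E. coli"),
    ("UK", 2017, "S. aureus"),
    ("France", 2016, "E. coli"),
    ("France", 2016, "S. aureus"),
    ("France", 2017, "E. coli"),
    ("France", 2017, "S. aureus"),
]


def group_sizes(records, key_indexes):
    """Count how many records fall into each group defined by the chosen fields."""
    return Counter(tuple(rec[i] for i in key_indexes) for rec in records)


def smallest_group(records, key_indexes):
    """Size of the smallest group, i.e. the weakest estimate we would rely on."""
    return min(group_sizes(records, key_indexes).values())


by_country = smallest_group(isolates, [0])
by_country_year = smallest_group(isolates, [0, 1])
by_country_year_species = smallest_group(isolates, [0, 1, 2])
print(by_country, by_country_year, by_country_year_species)
```

Checking the smallest group before trusting a resistance rate is one simple guard against reading too much into a handful of isolates.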

When we started comparing the ATLAS dataset to other datasets, the AMR data appeared to show slightly different information. So we started to ask, who collected this data? In what contexts would this data have been collected? Might there be a sampling bias that explains this difference we’re seeing between the datasets? There was a legitimate reason for the difference we were seeing, but that’s why it’s really important to think about why you’re using a dataset and exactly what you want to achieve because the data may not suit your purpose.

Also, we integrated data from a range of sources. When you start doing this, comparing available datasets, you realise the heterogeneity of the data that’s out there; they are all in different formats, they have different naming conventions, even the bacteria aren’t named in the same way and we had to work out exactly which bacteria different datasets were referring to. There aren’t any standards across the different sources to make integrating the datasets easy.
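As a rough illustration of the naming problem, here is a minimal sketch of mapping differently written bacteria names onto one label before combining datasets. The synonym table and function names are hypothetical, and a real project would need a properly curated taxonomy rather than a hand-written dictionary.

```python
import re

# Hypothetical synonym table, for illustration only.
CANONICAL = {
    "e coli": "Escherichia coli",
    "escherichia coli": "Escherichia coli",
    "s aureus": "Staphylococcus aureus",
    "staph aureus": "Staphylococcus aureus",
    "staphylococcus aureus": "Staphylococcus aureus",
}


def normalise(name):
    """Lower-case, replace punctuation with spaces and collapse whitespace."""
    cleaned = re.sub(r"[^a-z ]+", " ", name.lower())
    return " ".join(cleaned.split())


def harmonise(name):
    """Return the canonical species name, or None if the label is unknown."""
    return CANONICAL.get(normalise(name))


for raw in ["E.coli", "S. aureus", "Staph. aureus", "Klebsiella"]:
    print(raw, "->", harmonise(raw))
```

Returning None for an unrecognised label, rather than guessing, means unknown names can be reviewed by hand instead of being silently merged into the wrong species.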

So there were a lot of challenges to reusing data that someone else created?

Yes, we needed to keep in mind that the data was not created to answer our research question. We also found that there was a lack of information in the available literature around the common causative pathogens of several infections to help us understand and use the data correctly.

What advice would you give to researchers who want to reuse open datasets but are hesitant?

It is important to look at the dataset and really understand it. Ask yourself why it was collected, where it was collected, how it was collected. Don’t take anything for granted. Open datasets are incredible resources but you can’t blindly go in there.

Once you understand the dataset you’ll naturally get the confidence to use it and ask the right questions of it. You won’t be scared or overwhelmed by it. You’ll also save a lot of time once you start working on the data and better understand how to combine it with other datasets.

Quentin and his team’s winning entry, Antibiotic Resistance: Interdisciplinary Action (AR:IA), is openly available here. The team was led by Dr Gwen Knight at the London School of Hygiene and Tropical Medicine and included Nichola Naylor, Francesc Coll and Alexander Aiken.    

If you have any questions about finding and reusing open data contact Michelle Harricharan, Research Data Support Manager.

UPDATE 03/05/2019: You can read the official SGUL news release on this prize here.


A year’s worth of Open Research and SGUL

If you are a researcher at SGUL, we are here to help you share and preserve your data, and publish in a way that meets your funder open access mandates, as many have a commitment to making data and publications as openly available as possible.

SGUL has two repositories to enable researchers to share and preserve both data and publications: read on for more facts and figures about how adding your work to these ties in with our Strategic Plan to maximise the impact of our research.

Research Data

In late 2017 the Research Data Management Service announced our pilot Research Data Repository. In 2018 we published more than 20 outputs to the repository including the official proceedings from SGUL’s Education Day (2017), presentations from Infection and Immunity’s annual INTERTB symposium, and, to mark World AIDS Day this December, the Centre for Global Health released the first of six free training modules to share SGUL expertise on treating one of the biggest causes of HIV-related mortality in Africa. Our work has been viewed, downloaded and shared locally and internationally.

Contact the Research Data Management Service to talk about sharing your data, PowerPoint presentations, posters and videos on the repository.

This year also saw the introduction of new Europe-wide data protection legislation. How could we forget that? Our team worked closely with colleagues across St George’s and external organisations to support our researchers in the run-up to 25 May. Our GDPR and Health Research blog post was part of that awareness raising campaign.

In 2018 SGUL’s Information Management (IM) Team was also formed. Made up of our Information Governance Manager, Data Protection Officer, Freedom of Information Officer, Archivist, Records Manager and Research Data Manager, the IM Team looks to streamline information flows across St George’s and raise awareness of information policies and good practice. We run regular seminars on IM.

Contact our Records Manager for more information.

 

Open Access publications

On the publications front, the number of articles now free to read via SORA (St George’s Online Research Archive) has been steadily increasing, driven by the open access mandate for the 2021 REF (for more on this, see our webpages). We now have nearly 3000 articles publicly accessible via SORA with more being added all the time. Downloads of the articles are also rising: up to 2,300+ downloads per month on average in 2018 (from 1,800+ downloads per month on average in 2017). As with data, the articles have a global reach, being downloaded by readers in all parts of the world.

SORA records are included in the open access aggregation platform CORE, which contains over 11 million full-text articles. CORE works with trusted parties such as institutional and subject repositories and journals (other sources of articles such as Sci-Hub [1] and ResearchGate [2] have been subject to action by publishers due to copyright infringement). CORE also allows for text mining of the corpus.

This year we also upgraded our CRIS (Current Research Information System). Among other improvements, if you confirm your ORCiD in your CRIS profile, any publications matched in our data sources with your ORCiD will be automatically claimed for you. For more on ORCiDs and the benefits of having one, see our blogpost from earlier this year.

Contact us at sora@sgul.ac.uk if you would like guidance on keeping your CRIS publication lists & metrics up to date.

 

Funder initiatives

Funder mandates and publisher policies around open access to research are an area of constant evolution. This year has seen the announcement of Wellcome Trust’s plans to update their open access policy for 2020, to ensure all Wellcome-funded research articles are made freely available at the time of publication, and Plan S, which aims to require all research articles funded by the coalition of research funding organisations behind the plan be published in open access journals, or on open access platforms.

Plan S has certainly caught the attention of publishers – for example it has been welcomed with caveats by the International Association of Scientific, Technical and Medical Publishers [3], and Nature recently reported it has support in China [4].

SGUL researchers have benefited from negotiations by Jisc Collections [5] with publishers around subscriptions and open access charges; for instance in being able to publish open access for free under the Springer Open Choice agreement.

Contact us via openaccess@sgul.ac.uk if you have any questions about how to meet your funder open access policies.

 

Lastly, special thanks to all of our researchers who have answered our calls to be involved with open research.

In particular, to the laboratory researchers who opened up their groups, projects and labs to us earlier this year and told us all about their data and records management practices. We have now produced a report on our findings and will be building on this work in the New Year.

And to all who have been making their papers open access, as we work towards the next REF.

We hope to see or hear from you in 2019.

Michelle Harricharan, Research Data Support Manager
Jenni Hughes, Research Publications Assistant
Jennifer Smith, Research Publications Librarian

 

Contacts

CRIS & Deposit on acceptance: sora@sgul.ac.uk

Open Access Publications: openaccess@sgul.ac.uk

Research Data Management: researchdata@sgul.ac.uk

 

References

1. Page B. Publishers succeed in getting Sci-Hub access blocked in Russia. The Bookseller [Internet]. 2018 Dec 11 [cited 2018 Dec 13]. Available from: https://www.thebookseller.com/news/sci-hub-blocked-russia-following-court-action-publishers-911571

2. McKenzie L. Publishers escalate legal battle against ResearchGate. Inside Higher Ed [Internet]. 2018 Oct 4 [cited 2018 Dec 13]. Available from: https://www.insidehighered.com/news/2018/10/04/publishers-accuse-researchgate-mass-copyright-infringement

3. STM. STM statement on Plan S: Accelerating the transition to full and immediate Open Access to scientific publications [Internet]. The Hague: International Association of Scientific, Technical and Medical Publishers; 2018 [cited 2018 Dec 13]. Available from: https://www.stm-assoc.org/2018_09_04_STM_Statement_on_PlanS.pdf

4. Schiermeier Q. China backs bold plan to tear down journal paywalls. Nature [Internet]. 2018 Dec 13 [cited 2018 Dec 14]. Available from: http://dx.doi.org/10.1038/d41586-018-07659-5

5. Earney L. National licence negotiations advancing the open access transition – a view from the UK. Insights [Internet]. 2018 [cited 2018 Dec 14]; 31 (11). Available from: http://doi.org/10.1629/uksg.412

 


If you are interested in receiving updates from the Library on all things open access, open data and scholarly research communications, you can subscribe to the Library Blog using the Follow button or click here for further posts from us.

Open Access Week 2018: Medical charities collaborate further to ensure results are shared.


As the theme of International Open Access Week 2018, “Designing Equitable Foundations for Open Knowledge”, acknowledges, “setting the default to open is an essential step toward making our system for producing and distributing knowledge more inclusive”.

Following on the heels of Wellcome Trust setting up Wellcome Open Research in 2016 – which publishes scholarly articles reporting any basic scientific, translational and clinical research that has been funded (or co-funded) by Wellcome – a group of funders have come together to launch AMRC Open Research.

This is a platform “for rapid author-led publication and open peer review of research funded by AMRC member charities” – which include Parkinson’s UK, Stroke Association, Alzheimer’s Research UK and many more.

All articles benefit from immediate publication, transparent refereeing and the inclusion of all source data.

If you are an SGUL researcher in receipt of a grant from these funders, take a moment to look at How it Works.

The AMRC platform levies relatively minimal charges for publication by researchers funded by the participating charities – much lower than the cost of publishing in traditional journals (see Wellcome is going to review its open access policy blog post, March 2018).

For any questions about making your publications open access, please visit our Open Access FAQs or contact us on openaccess@sgul.ac.uk

For any questions about sharing or preserving data, please visit our Research Data Management pages or contact us on researchdata@sgul.ac.uk

Jennifer Smith

Research Publications Librarian



Open Access Open Research

SGUL’s open access institutional repository SORA now has over two thousand full text publications written by SGUL researchers freely available online, a great milestone for SGUL to celebrate in International Open Access Week 2017.

On average there are over 1800 downloads of papers per month from SORA. The papers are indexed in SGUL’s Hunter and in Google for maximum discoverability.

Win a £30 Amazon voucher: follow the library’s Twitter account @sgullibrary to enter our competition on this year’s OA week theme “Open in order to…” – tell us why you think ‘Open’ is good. (See our blog post and Terms and Conditions for how to enter).

Open access publication is a requirement of many of the big funders in biomedical and life sciences research due to its role in making research more accessible, more discoverable and more impactful [1].

On the 4th October the Wellcome Trust released a new science strategy, Improving health through the best research. In it, they reaffirm their commitment to open research:

“Scientific knowledge achieves its greatest value when it is readily available to be used by others. And if knowledge generated with Wellcome support can be used for the improvement of health, it should be.”

Open research is an umbrella term bringing together a variety of efforts to make scientific research transparent and reproducible, and to increase its impact on policy, practice and technological advances. Open access publication is an important part of open research, helping to make research outputs accessible and useable by anyone. Another key tenet of open research is open data, and St George’s has recently launched a data repository to enable researchers to share, store and preserve their research content.

Queen’s University, Belfast, has put together some examples of how open access has benefitted their researchers.


For further information, please visit our open access webpage or contact openaccess@sgul.ac.uk.

1 The Open Access Citation Advantage Service, SPARC (Scholarly Publishing and Academic Resources Coalition) Accessed 19 October 2017



Open in order to…

The theme of this year’s International Open Access Week, which runs from 23rd-29th October, is “Open in order to…”. This year the focus is on thinking about what possibilities are opened up by making research outputs open access.

Win a £30 Amazon voucher: follow the library’s Twitter account @sgullibrary to enter our competition on this year’s OA week theme “Open in order to…” – tell us why you think ‘Open’ is good. (For terms and conditions, and how to enter, see the end of this post.)


Here are some reasons why research is made “open in order to…”

…improve public health

Breakthroughs in medical science are frequently in the news, but the research publications underpinning the headlines are often locked away behind a publisher’s paywall. For example, the research article referred to in this recent article from the BBC is currently only available to subscribers to the British Journal of Obstetrics and Gynaecology, and many publications cited in the recent award of the Nobel Prizes for Chemistry and Physics are not publicly accessible. By contrast, a recent study by SGUL researchers on meningitis in children was published in an open access journal, meaning that the full article can be read by anyone, anywhere in the world, at any time.

Open access research allows anyone who is interested to read and evaluate the research for themselves. This might include:

  • Medical professionals wanting to improve patient care;
  • Members of the public wanting to learn more about a condition they have;
  • Journalists wanting to report more accurately on the story;
  • Policy makers;
  • Researchers whose institutions don’t subscribe to the journal the research is published in, or who are operating outside an institution.

Opening up research helps improve public health by increasing access to academic research.

 

…raise the visibility of my research

Studies [1] have consistently shown a citation advantage for open access publications over closed access ones. Depositing your work in a repository increases the avenues by which your research can be discovered, as well as helping readers to follow your research from paper to paper more easily by collecting them all together.

 

…enable global participation in research

Making research open enables all researchers to access it and removes the financial barrier for those working in less well funded institutions, as well as independent researchers working outside institutions. Making your data and publications accessible for free and licensing them under terms which allow for reuse means that other researchers can pick up and build on your research, benefitting the global research community as a whole.

 

…find new collaborators

Making your work open helps researchers on related topics find it and identify possibilities for collaboration. Open access can also promote cross-disciplinary working by making it easier for researchers to access work outside their own discipline.

1 The Open Access Citation Advantage Service, SPARC (Scholarly Publishing and Academic Resources Coalition) Accessed 19 October 2017

 

How to enter:

Follow @sgullibrary on Twitter and complete the phrase “Open in order to…” using the hashtag #openinorderto and @sgullibrary’s Twitter handle.

Terms and Conditions:

  1. The competition will run from Monday 23 October 2017 until Sunday 29 October 2017.
  2. The prize draw is open to anyone with a valid SGUL ID.
  3. Winners will be chosen from all valid entries once the competition has closed on Sunday 29 October 2017.
  4. Winners will be contacted via Twitter. Be sure to check your account.
  5. The prize can only be collected in person from St George’s Library on production of a valid ID card.
  6. Prizes must be collected within two weeks of notification.
  7. The Judges’ decision is final and no correspondence will be entered into.
  8. Photos of the prize winners will be taken to be used in publicity on Library media channels.
  9. One prize winner will be selected, unless the prize is not collected by the deadline, in which case a new winner will be selected for the uncollected prize (once only).
  10. Your tweets may be reused by St George’s Library for future promotional or informational purposes.
  11. Entries must contain the hashtag #openinorderto and must tag the library’s Twitter account @sgullibrary.

 


To find out more about open access, contact openaccess@sgul.ac.uk or visit the Library open access webpages.



St George’s announces new research data repository

The Research Data Management Service has launched a research data repository for use by St George’s researchers, including our doctoral student researchers.


Powered by figshare, the repository is the first phase of a pilot project to develop a shared research data management infrastructure for UK higher education. The pilot is headed by Jisc, and St George’s is proud to be one of just 13 higher education organisations included in the project. More information about this can be found on the project website.

The SGUL data repository is a digital archive for sharing, storing and preserving research content produced at St George’s. It was acquired to enable our researchers to better engage in Open Science and to respond to funder and publisher requirements for data sharing and preservation.

Researchers can use the repository to share research data, source code, posters, PowerPoint presentations, images, videos, electronic lab notebooks and a range of other digital research outputs. The repository can also be used to catalogue and link to items that are already in the public domain, but are difficult to discover, cite and measure for impact. Each deposit in the repository is provided with a persistent identifier, which allows items to be uniquely identified, cited and measured for impact.

All items deposited with us will be preserved for the lifetime of the repository.

Depositing to the repository is easy. All research staff and doctoral students are automatically registered for the service. Just log in to the repository using your institutional credentials and deposit your items following figshare’s normal deposit procedures. All deposits will be checked by a member of the research data management team before your research is published, giving you added peace of mind.

It is advisable to contact the Research Data Management Service if you intend to deposit your data in the repository to avoid any delay in publishing your research.



Evidence based healthcare resources

BMJ Case Reports

BMJ Case Reports is an international, peer reviewed collection of over 13,500 clinical cases covering all disciplines for clinicians and researchers.

Search by keyword or browse by specialty to find clinically important information on common and rare conditions, or subscribe to the RSS feed to receive updates on latest articles, most read articles or new blog posts.

Access is available to NHS staff via their NHS OpenAthens account (self-register here), and to SGUL staff and students via their university login details.

If you have an interesting case, you can receive peer reviews and rapid publication by submitting it for inclusion to BMJ Case Reports Journal. See the BMJ website for submission templates and full details on how to submit your case.

For more information about the SGUL subscription, or to obtain the institutional fellowship code, contact journals@sgul.ac.uk.

Drugs and Therapeutics Bulletin

The DTB is independent of the pharmaceutical industry, Government and regulatory authorities, and each article has been evaluated by a wide range of specialist and generalist commentators.

By providing rigorous, unbiased assessments and recommendations of drugs and other treatments for diseases, this journal can be relied upon by doctors, pharmacists and other healthcare professionals who are looking to make evidence based decisions to ensure their patients receive the best possible care.

Access is granted via Shibboleth for SGUL staff and students, and via OpenAthens for NHS staff.