Challenging but rewarding – Wellcome Trust Data Re-use Prize winner, Quentin Leclerc, on reusing open data

Last November the Wellcome Trust launched the Data Re-use Prize to celebrate innovative reuse of open data in either antimicrobial resistance (AMR) or malaria. Entrants were asked to generate a new insight, tool or health application from one of two open data resources: the AMR ATLAS dataset or the Malaria ROAD-MAP dataset.

MRC-LID PhD student and member of the winning team for AMR, Quentin Leclerc, dropped by the SGUL RDM Service to talk about the prize and the challenging but rewarding process of reusing open data.

Quentin, congratulations on the win. Can you tell me a little bit about your team’s entry for the Data Re-Use Prize?

Sure. We developed a tool to help inform empiric therapy. Empiric therapy is basically when physicians pool multiple sources of data together to make the best informed guess about how to treat a patient. This is before they know exactly what bacteria a patient is infected with and its potential resistance to antibiotics. Say, for example, a patient has sepsis and needs to be treated right away. A physician might determine the most likely causes as E. coli and S. aureus infection and then make an informed guess about the best antibiotic to prescribe to treat both of these bacteria, bearing in mind regional estimates of each pathogen’s resistance to different antibiotics. The physician is basically thinking, “given what we know about the common causes of this condition and antibiotic resistance, which antibiotic is likely to work best?”

Our proof-of-concept web app integrates data from a range of open data sources to visualise antibiotic resistance rates for common infections to help physicians prescribe faster and more accurately. If developed, the tool could be used to inform national guidelines on how to treat common infections in many countries, particularly in low- and middle-income countries where data aren’t always available to inform empiric therapy at the local or hospital level.
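The reasoning Quentin describes can be sketched in a few lines of Python. This is only an illustration of the idea, not the team's method or the AR.IA code: the function name, the antibiotic labels and every number below are made up for the example. It assumes we know roughly how often each pathogen causes the condition and how often each pathogen is resistant to each antibiotic, and it weights one by the other.

```python
def expected_coverage(pathogen_share, resistance):
    """Return {antibiotic: chance that the drug works against the infecting pathogen}.

    pathogen_share: {pathogen: share of cases caused by that pathogen}
    resistance:     {antibiotic: {pathogen: share of isolates that are resistant}}
    """
    coverage = {}
    for antibiotic, rates in resistance.items():
        coverage[antibiotic] = sum(
            share * (1.0 - rates[pathogen])
            for pathogen, share in pathogen_share.items()
        )
    return coverage


# Hypothetical figures for illustration only; not taken from ATLAS or any real source.
pathogen_share = {"E. coli": 0.6, "S. aureus": 0.4}
resistance = {
    "antibiotic A": {"E. coli": 0.30, "S. aureus": 0.10},
    "antibiotic B": {"E. coli": 0.10, "S. aureus": 0.50},
}

coverage = expected_coverage(pathogen_share, resistance)
best = max(coverage, key=coverage.get)
print(coverage, best)
```

With these invented numbers, antibiotic A comes out ahead even though antibiotic B works better against E. coli, because B does poorly against the second likely cause.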

Some visualisations from the team’s AR.IA app

Sounds very exciting. As a first year PhD student, what was it like to win a prize like this?

It was really unexpected. We didn’t expect to win, we just thought, ‘we’ll publish our findings anyway so let’s see how this goes’. The other entries for the prize were very specific while our entry was pretty broad so we weren’t very confident. It was a real surprise and a great effort from everyone on the team.

Team photo (l to r): Gwen Knight, Quentin Leclerc, Nichola Naylor and Alexander Aiken
Missing: Francesc Coll

As a PhD student, it was an interesting experience overall. This project is very different from my PhD but working on this tool helped me to get used to the various datasets out there and to look at the big picture of antimicrobial resistance and antibiotic prescribing. It was an enlightening process.

Can you tell me a little bit more about the process of reusing existing data? What was it like?

It was surprising. The thing with data is that it’s collected for a purpose. When someone comes in trying to use that data for a different purpose, they start to see what’s missing. They start to make approximations and assumptions to use the data for something it wasn’t intended for. The ATLAS dataset is very accurate and it’s very rich but it suits its original purpose. For example, we needed to group the data in increasingly complex ways. Once we started doing this, the sample sizes started to look quite small. The dataset wasn’t suited to those kinds of groupings.
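A small Python sketch shows why finer groupings shrink sample sizes. The records and the `group_sizes` helper are hypothetical and bear no relation to the real ATLAS data; the point is only that each extra grouping variable splits the same records into more, smaller groups.

```python
from collections import Counter

# Hypothetical isolate records: (country, year, species). Invented for illustration.
isolates = [
    ("UK", 2016, "E. coli"),
    ("UK", 2016, "E. coli"),
    ("UK", 2017, "E. coli"),
    ("UK", 2017, "S. aureus"),
    ("France", 2016, "E. coli"),
    ("France", 2017, "S. aureus"),
]


def group_sizes(records, key_indexes):
    """Count how many records fall into each combination of the chosen fields."""
    return Counter(tuple(record[i] for i in key_indexes) for record in records)


by_country = group_sizes(isolates, [0])        # group by country only
by_all = group_sizes(isolates, [0, 1, 2])      # group by country, year and species

print(min(by_country.values()), min(by_all.values()))
```

Grouping by country alone leaves at least two records per group here; grouping by all three fields leaves most groups with a single record, which is too few to estimate a resistance rate.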

When we started comparing the ATLAS dataset to other datasets, the AMR data appeared to show slightly different information. So we started to ask, who collected this data? In what contexts would this data have been collected? Might there be a sampling bias that explains this difference we’re seeing between the datasets? There was a legitimate reason for the difference we were seeing, but that’s why it’s really important to think about why you’re using a dataset and exactly what you want to achieve because the data may not suit your purpose.

Also, we integrated data from a range of sources. When you start doing this, comparing available datasets, you realise the heterogeneity of the data that’s out there; they are all in different formats, they have different naming conventions, even the bacteria aren’t named in the same way and we had to work out exactly which bacteria different datasets were referring to. There aren’t any standards across the different sources to make integrating the datasets easy.
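One common way to handle inconsistent naming is to map every spelling to a single canonical name before combining datasets. The sketch below is an assumption about how that could be done, not a description of what the team did; the synonym table and the `normalise_name` function are hypothetical and would need to be built by hand for real sources.

```python
import re

# Hypothetical synonym table: cleaned spelling -> canonical species name.
CANONICAL = {
    "e coli": "Escherichia coli",
    "escherichia coli": "Escherichia coli",
    "s aureus": "Staphylococcus aureus",
    "staph aureus": "Staphylococcus aureus",
    "staphylococcus aureus": "Staphylococcus aureus",
}


def normalise_name(raw):
    """Return the canonical species name, or None if the spelling is unknown."""
    # Lower-case, then collapse anything that is not a letter into a single space.
    key = re.sub(r"[^a-z]+", " ", raw.lower()).strip()
    return CANONICAL.get(key)


for name in ["E.coli", "S. aureus", "ESCHERICHIA COLI ", "Klebsiella sp."]:
    print(repr(name), "->", normalise_name(name))
```

Returning None for unknown spellings, rather than guessing, makes it obvious which names still need a person to check them.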

So there were a lot of challenges to reusing data that someone else created?

Yes, we needed to keep in mind that the data was not created to answer our research question. We also found that there was a lack of information in the available literature around the common causative pathogens of several infections to help us understand and use the data correctly.

What advice would you give to researchers who want to reuse open datasets but are hesitant?

It is important to look at the dataset and really understand it. Ask yourself why it was collected, where it was collected, how it was collected. Don’t take anything for granted. Open datasets are incredible resources but you can’t blindly go in there.

Once you understand the dataset you’ll naturally get the confidence to use it and ask the right questions of it. You won’t be scared or overwhelmed by it. You’ll also save a lot of time once you start working on the data and better understand how to combine it with other datasets.

Quentin and his team’s winning entry, Antibiotic Resistance: Interdisciplinary Action (AR:IA), is openly available here. The team was led by Dr Gwen Knight at the London School of Hygiene and Tropical Medicine and included Nichola Naylor, Francesc Coll and Alexander Aiken.    

If you have any questions about finding and reusing open data, contact Michelle Harricharan, Research Data Support Manager.

UPDATE 03/05/2019: You can read the official SGUL news release on this prize here.


St George’s Library in Numbers: 2018

2018 was a year of change for St George’s Library. We introduced a new library management system, which underpins the circulation of library items. In the summer, we upgraded Hunter’s interface for a more intuitive search tool. Alongside these changes, we introduced automatic renewals and additional loans. This means our users can now borrow more books for longer.

As well as improving access to resources, we continued to offer support to our users. Our new Subject Library Guides provide targeted online support to students and our refreshed information skills training sessions offer face-to-face workshops on a range of topics. Our institutional open access and research data repositories have continued to expand.

It’s not just the library staff who were busy in 2018. Our users made great use of the library: there was more footfall in the library, and there were more searches in Hunter and more downloads of e-resources than in 2017. The infographic below shows some stats from the library in 2018. Click on the link underneath to download the PDF.

Resources

We developed our collection throughout the past year. We purchased 2246 new books which were added to the library shelves. After a successful trial of JOVE (Journal of Visualized Experiments) we added this resource to our subscribed databases. Library members ran 353,069 Hunter searches in 2018 – that’s 29,422* searches every single month! Well over half a million journal articles were downloaded, 691,858 to be precise, and 26,784 books were borrowed.
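For anyone who wants to check the arithmetic behind the starred average, here is a quick Python sketch. It assumes the average is simply the yearly total of Hunter searches divided evenly across the year; the variable names are ours.

```python
hunter_searches_2018 = 353_069  # yearly total quoted above

per_month = round(hunter_searches_2018 / 12)  # spread evenly over 12 months
per_week = round(hunter_searches_2018 / 52)   # spread evenly over 52 weeks

print(f"{per_month:,} searches per month, {per_week:,} searches per week")
```

Dividing by 12 gives about 29,422, and dividing by 52 gives about 6,790.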

Services

Footfall was high last year and 45,000* of you visited the library every month. Over 1000 new students attended library inductions at the start of the year and many more students attended further library sessions throughout the year. The NHS Liaison team conducted 88 Cares searches to support clinical activity and decision making.

Research

St George’s Data Repository, powered by figshare, was launched in 2017. Last year, it gained 24 new public deposits and had 661* monthly views. St George’s Online Research Archive (SORA) had 2325* downloads per month and 2980 full-text items publicly available.

We’ve enjoyed looking back on 2018 but we’re also excited for what 2019 will bring. It’s not even mid-way through January and already we’ve seen the arrival of new self-service machines. These machines will make it easier to borrow multiple items – simply stack your books on top of each other and they will all be issued. As we increased the number of items that you could borrow last year, this new feature should come in handy!

*approximate average based on 2018 figures

Open Access: Green and Gold

St George’s researchers: read on to find out how to make research open access, and how to win a £30 Amazon voucher…

There are two different ways to make your research articles open access: the green route and the gold route.

Green Open Access

Green Open Access: What is it?

Green open access means making your research articles freely available via a subject or institutional repository (such as SORA, SGUL’s institutional repository), after any embargo period required by the publisher has passed.

What do I need to do?

When your article is accepted for publication, create a basic record in the CRIS (Current Research Information System for St George’s Researchers) and upload your author’s accepted manuscript to it. (This is the version after any changes resulting from peer review, but before the publisher’s formatting and copy editing.) We will then check the record and apply any embargo period before making it live in SORA.

For more guidance, please log in to your CRIS profile and click on the Help tab at the top right hand side.

If you have any questions, see our website or contact sora@sgul.ac.uk

 

Gold Open Access

Gold Open Access: What is it?

Gold open access means making your research articles freely available on the publisher’s website when they’re published, usually under a license which allows for reuse.

What do I need to do?

Find out if the journal you’re publishing in has an open access option, and then see if you have any funding available to pay for it.

Some publishers offer discounts or waivers for SGUL researchers: check our page on open access fees to see if any of them apply to you.

If your research is funded:

RCUK and COAF (a partnership of six health research charities) have provided us with funds to make articles arising from that research open access. To find out if you’re eligible, see our website or email openaccess@sgul.ac.uk

If your research is funded by another grant, check with your grants officer to see if there are any funds in it for open access publications.

If your research is unfunded:

Consider applying to our new Institutional Fund for open access publication fees – see the link on our open access webpage.


 

Open Access Week Competition

Win a £30 Amazon voucher: follow our Twitter account @sgullibrary to enter our competition on this year’s OA week theme “Open in order to…”  – tell us why you think ‘Open’ is good. (See our blog post and Terms and Conditions for how to enter).


If you are interested in receiving updates from the Library on all things open access, open data and scholarly research communications, you can subscribe to the Library Blog using the Follow button or click here for further posts from us.

Open Access Open Research

SGUL’s open access institutional repository SORA now has over two thousand full text publications written by SGUL researchers freely available online, a great milestone for SGUL to celebrate in International Open Access Week 2017.

On average there are over 1800 downloads of papers per month from SORA. The papers are indexed in SGUL’s Hunter and in Google for maximum discoverability:

Screenshot of St George's Online Research Archive website

Win a £30 Amazon voucher: follow the library’s Twitter account @sgullibrary to enter our competition on this year’s OA week theme “Open in order to…” – tell us why you think ‘Open’ is good. (See our blog post and Terms and Conditions for how to enter).

Open access publication is a requirement of many of the big funders in biomedical and life sciences research due to its role in making research more accessible, more discoverable and more impactful¹.

On the 4th October the Wellcome Trust released a new science strategy, Improving health through the best research. In it, they reaffirm their commitment to open research:

“Scientific knowledge achieves its greatest value when it is readily available to be used by others. And if knowledge generated with Wellcome support can be used for the improvement of health, it should be.”

Open research is an umbrella term bringing together a variety of efforts to make scientific research transparent and reproducible, and to increase its impact on policy, practice and technological advances. Open access publication is an important part of open research, helping to make research outputs accessible and useable by anyone. Another key tenet of open research is open data, and St George’s has recently launched a data repository to enable researchers to share, store and preserve their research content.

Queen’s University, Belfast, has put together some examples of how open access has benefitted their researchers.


For further information, please visit our open access webpage or contact openaccess@sgul.ac.uk.

¹ The Open Access Citation Advantage Service, SPARC (Scholarly Publishing and Academic Resources Coalition). Accessed 19 October 2017.



Open in order to…

The theme of this year’s International Open Access Week, which runs from 23rd-29th October, is “Open in order to…”. This year the focus is on thinking about what possibilities are opened up by making research outputs open access.

Win a £30 Amazon voucher: follow the library’s Twitter account @sgullibrary to enter our competition on this year’s OA week theme “Open in order to…” – tell us why you think ‘Open’ is good. (For terms and conditions, and how to enter, see the end of this post.)

Open in Order to Open Access banner for 2017

Here are some reasons why research is made “open in order to…”

…improve public health

Breakthroughs in medical science are frequently in the news, but the research publications underpinning the headlines are often locked away behind a publisher’s paywall. For example, the research article referred to in this recent article from the BBC is currently only available to subscribers to the British Journal of Obstetrics and Gynaecology, and many publications cited in the recent award of the Nobel Prizes for Chemistry and Physics are not publicly accessible. By contrast, a recent study by SGUL researchers on meningitis in children was published in an open access journal, meaning that the full article can be read by anyone, anywhere in the world, at any time.

Open access research allows anyone who is interested to read and evaluate the research for themselves. This might include:

  • Medical professionals wanting to improve patient care;
  • Members of the public wanting to learn more about a condition they have;
  • Journalists wanting to report more accurately on the story;
  • Policy makers;
  • Researchers whose institutions don’t subscribe to the journal the research is published in, or who are operating outside an institution.

Opening up research helps improve public health by increasing access to academic research.

 

…raise the visibility of my research

Studies¹ have consistently shown a citation advantage for open access publications over closed access ones. Depositing your work in a repository increases the avenues by which your research can be discovered, as well as helping readers to follow your research from paper to paper more easily by collecting them all together.

 

…enable global participation in research

Making research open enables all researchers to access it and removes the financial barrier for those working in less well funded institutions, as well as independent researchers working outside institutions. Making your data and publications accessible for free and licensing it under terms which allow for reuse means that other researchers can pick up on and build on your research, benefitting the global research community as a whole.

 

…find new collaborators

Making your work open helps researchers on related topics find it and identify possibilities for collaboration. Open access can also promote cross-disciplinary working by making it easier for researchers to access work outside their own discipline.

¹ The Open Access Citation Advantage Service, SPARC (Scholarly Publishing and Academic Resources Coalition). Accessed 19 October 2017.

 

How to enter:

Follow @sgullibrary on Twitter and complete the phrase “Open in order to…” using the hashtag #openinorderto and @sgullibrary’s Twitter handle.

Terms and Conditions:

  1. The competition will run from Monday 23 October 2017 until Sunday 29 October 2017.
  2. The prize draw is open to anyone with a valid SGUL ID.
  3. Winners will be chosen from all valid entries once the competition has closed on Sunday 29 October 2017.
  4. Winners will be contacted via Twitter. Be sure to check your account.
  5. The prize can only be collected in person from St George’s Library on production of a valid ID card.
  6. Prizes must be collected within two weeks of notification.
  7. The Judges’ decision is final and no correspondence will be entered into.
  8. Photos of the prize winners will be taken to be used in publicity on Library media channels.
  9. One prize winner will be selected, unless the prize is not collected by the deadline, in which case the uncollected prize will be reselected (once only).
  10. Your tweets may be reused by St George’s Library for future promotional or informational purposes.
  11. Entries must contain the hashtag #openinorderto and must tag the library’s Twitter account @sgullibrary.

 


To find out more about open access, contact openaccess@sgul.ac.uk or visit the Library open access webpages.



SORA has passed 1000 full text publications freely available online!

We’re pleased that SGUL’s open access institutional repository, SORA (St George’s Online Research Archive), has now made over one thousand full text publications written by researchers at SGUL freely available online.

Many of the big funders in biomedical and life sciences research require publications reporting the results of research they’ve funded to be available on open access, because open access will

  • Allow research to have maximum impact around the world, by letting researchers read and build on work already done (YouTube: How Open Access Empowered a 16-Year-Old to Make Cancer Breakthrough)
  • Increase citation advantage (PLOS One article on citation advantage)
  • Increase visibility and discoverability of your research (SHERPA FAQs explain how Google and Google Scholar searches favour OAI-repository material and normally rank it higher than individuals’ websites)
  • Engage more members of the public with your research (podcast with Peter Suber)

For more information on open access please visit the Library open access webpages, or contact openaccess@sgul.ac.uk

Nora Mulvaney
Jennifer Smith
Research Publications and Open Access



Useful books for Researchers

We’ve put together a selection of books that would be useful to researchers. Please click on the image to access the book details on the Library Catalogue.

Managing and sharing research data: a guide to practice
Managing and Sharing Research Data: H62 COR

Research design explained
Research Design Explained: H62 MIT

Doing a literature review in health and social care: a practical guide
Doing a Literature Review in Health and Social Care: LB2395 AVE

Conducting research literature reviews: from the internet to paper
Conducting Research Literature Reviews: LB2395 FIN

Getting Research Published: an A-Z of publication strategy
Getting research published: PE1475

How to write a grant application
How to write a grant application: W20.5 HAC

Introduction to research methods and data analysis
Introduction to Research Methods and Data Analysis in the Health Sciences: W25 HAG

Searching skills toolkit: finding the evidence
Searching Skills Toolkit: Finding the Evidence: WB25 DEB

Beyond Bibliometrics: harnessing multidimensional indicators of scholarly impact
Beyond Bibliometrics: Z669 CRO