Improve research discoverability to support health literacy

Catherine Skobe, Sally Dews

The World Health Organization defines health literacy as the achievement of a level of knowledge, personal skills and confidence to take action to improve personal and community health.1 Discoverability and accessibility initiatives that make health data easier to find and understand can support health literacy and improve patient empowerment. 

How can metadata improve research discoverability?

The digital space is awash with health information and misinformation; differentiating between the two can be far from easy. Although concerns over predatory journals have diminished thanks to growing awareness of their existence across the research community, it remains important to distinguish between evidence-based, peer-reviewed information and less reliable sources. Appropriate and accurate use of metadata is instrumental in this pursuit.

Metadata literally means data about data.2 In the context of scientific publications, metadata are information about a publication that helps search engines and scientific databases find the research. Examples include the authors' names, affiliations and unique identifiers; keywords; the abstract; publication details; and peer review status.3

Metadata are an important component of the discoverability of a publication. For example, an established element of clinical study publication metadata is the inclusion of the National Clinical Trial (NCT) number within the abstract of trial publications. The NCT number is a unique identifier that the US National Library of Medicine assigns to each trial registered on ClinicalTrials.gov. Clear citation of the NCT number in trial publications helps to identify and link the research outputs from a specific trial.
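To illustrate why a consistently cited NCT number helps link outputs, here is a minimal Python sketch that pulls NCT numbers out of abstract text. It relies on the identifier format of "NCT" followed by eight digits; the function name and the sample abstract are invented for illustration.

```python
import re

# An NCT number is the letters "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")


def find_nct_numbers(text: str) -> list[str]:
    """Return the distinct NCT numbers in text, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in NCT_PATTERN.findall(text):
        seen.setdefault(match, None)
    return list(seen)


# Invented abstract text, for illustration only.
abstract = (
    "Trial registration: ClinicalTrials.gov NCT01234567. "
    "An extension study (NCT07654321) is ongoing; see also NCT01234567."
)
print(find_nct_numbers(abstract))
```

A database can use the same kind of matching to group every publication that cites a given trial, which only works if authors cite the number in a consistent form.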

How can plain language summary (PLS) metadata support health literacy?

Metadata in the context of accessible summaries include use of appropriate PLS metadata tags when a scientific article with a PLS is published. Use of PLS metadata tags is important in the context of health literacy, as they can help non-expert readers to find available accessible summaries and better understand the data being reported.

Poor PLS tagging may result in these accessible summaries never reaching their intended audience or benefiting the health literacy of interested non-expert readers. Open Pharma research into PLS discoverability found that, of nearly 32 million medical articles indexed in the PubMed bibliographic database in 2022, only 3217 (a mere 0.01%) had a formal PLS metadata tag (i.e. an XML <plain-language-summary> tag in the ‘Other Abstract’ field).4 However, about half (51.1%) of all articles with a correct XML PLS tag had been published in 2021, suggesting that both the inclusion of publication PLS and the use of PLS metadata tags may be starting to improve.
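The following Python sketch shows how a tagged PLS could be detected programmatically. The record is invented and heavily simplified, and the element and attribute names (an OtherAbstract element carrying Type="plain-language-summary") are an assumption about how the tag appears in PubMed-style XML rather than a verified schema.

```python
import xml.etree.ElementTree as ET

# Invented, simplified record. The element and attribute names are an
# assumption modelled on PubMed-style citation XML, not a verified schema.
RECORD = """
<Article>
  <Abstract><AbstractText>Technical abstract.</AbstractText></Abstract>
  <OtherAbstract Type="plain-language-summary" Language="eng">
    <AbstractText>This study looked at how well a new medicine works.</AbstractText>
  </OtherAbstract>
</Article>
""".strip()


def plain_language_summaries(xml_text: str) -> list[str]:
    """Return the text of every abstract tagged as a plain language summary."""
    root = ET.fromstring(xml_text)
    summaries = []
    for other in root.iter("OtherAbstract"):
        if other.get("Type") == "plain-language-summary":
            # Join nested text and collapse runs of whitespace.
            summaries.append(" ".join("".join(other.itertext()).split()))
    return summaries


print(plain_language_summaries(RECORD))
```

An untagged PLS, for example one placed inside the main abstract, would not be found by a check like this, which is the discoverability gap described above.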

How can authors contribute to improved metadata?

Metadata may not be the first thing that a researcher thinks of when developing their scientific publication, but authors play an integral role in providing complete and accurate metadata that will support the discoverability of their research.

At the time of manuscript submission, journals prompt authors to provide key metadata about their article. Journal prompts reflect the types of queries that readers might use when searching for published literature (e.g. author names, therapy area, publication type). Complete and accurate provision of these data by the authors is therefore critical to successful reader searches. Authors can also take some other, proactive measures to improve publication discoverability.
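As a rough illustration of what a completeness check on submission metadata might look like, the Python sketch below flags fields left empty. The class, its field names and the sample values are hypothetical; real journal submission systems define their own fields.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SubmissionMetadata:
    """Hypothetical record of metadata a journal might request at submission."""
    author_names: list[str] = field(default_factory=list)
    affiliations: list[str] = field(default_factory=list)
    keywords: list[str] = field(default_factory=list)
    abstract: str = ""
    therapy_area: str = ""
    publication_type: str = ""


def missing_fields(meta: SubmissionMetadata) -> list[str]:
    """Return the names of fields left empty, in declaration order."""
    return [f.name for f in fields(meta) if not getattr(meta, f.name)]


# Invented draft submission with some fields not yet filled in.
draft = SubmissionMetadata(
    author_names=["A. Author"],
    abstract="Background and methods of the study.",
    publication_type="Original research",
)
print(missing_fields(draft))
```

Each empty field corresponds to a type of reader query that could not match this article.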

For articles with a publication PLS, authors can include in the journal submission letter a request for use of the recommended XML PLS tags at the time of publication. The use of these tags will make it more likely that their PLS appears alongside their scientific (or technical) abstract on PubMed, aiding its discoverability and accessibility to a non-expert audience.

Authors can also increase the discoverability of their research by registering for an Open Researcher and Contributor ID (ORCID). An ORCID is a unique identifier that, if cited in all publications, links an author’s research outputs and ensures they are correctly attributed. ORCID is a free and easy method of increasing discoverability of an author’s full body of work. It allows readers (patients, healthcare professionals and researchers) to identify a researcher’s back catalogue, which in turn helps in verifying the authenticity of a particular piece of work and discovering other publications by the same author that might further support the reader’s health literacy.
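Because an ORCID only links outputs when it is cited correctly, it can help to check the identifier before submission. The Python sketch below assumes the standard 16-character format, in which the final character is an ISO 7064 MOD 11-2 check character; the function names are illustrative, and the check confirms only that an identifier is well formed, not that it belongs to a particular person.

```python
import re

# Four hyphen-separated groups of four; the last character may be "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", re.ASCII)


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the identifier is well formed and its checksum matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample identifier used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A single mistyped digit changes the expected check character, so a check like this catches most transcription errors before they break the link to the author's record.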

How can stakeholder collaboration help optimize metadata?

Collaborative working lies at the heart of initiatives to improve the metadata tagging of scientific publications. We have outlined some proactive steps that authors can take to support the metadata for their research, but journals and publishers need to act on these author requests if they are to translate into improved research discoverability. In addition, research sponsors can play a role in raising awareness within the community of the value of metadata and supporting authors to adopt easily implementable discoverability initiatives.

At Pfizer, we support the researchers with whom we work by providing resources that explain the benefits of ORCID and outline the merits of open access publishing. We also established a health literacy community of practice to develop best practices across the enterprise, which led to a range of in-house health literacy initiatives. These included the development of a health literacy style checker – a tool to help assess PLS from a health literacy perspective. In addition, we established Pfizer’s publications collaborative board with the objective of proactively engaging patients in the end-to-end publication process. The board includes global patient advocates from a variety of therapeutic areas and provides valuable guidance to ensure we meet the needs of our audiences. Lastly, we are using our new publications management system to help ensure that the metadata for our scientific publications are accurate and robust so that they maximize the discoverability of our research. Through these various methods, we aim to meet the needs of the patients we serve.

Finally, we believe that publication discoverability, journal peer review and open access publishing are a trifecta in the pursuit of enhanced access to scientific research for all, supporting both health literacy and health equity.

Catherine Skobe is Senior Director and Pfizer Publications Innovative Solutions Lead, and Sally Dews is Senior Medical Affairs Manager at Pfizer Patient Partnerships.


  1. Nutbeam D. Health promotion glossary. Health Promot Int 1998;13:349–64.   
  2. Riley J. Understanding metadata: what is metadata, and what is it for? Washington DC: National Information Standards Organization. (Accessed 13 January 2023).
  3. de Waard A, Kircz J. Metadata in Science Publishing. (Accessed 8 February 2023).
  4. Rosenberg A et al. Assessing PubMed meta-tag usage for plain language summary discoverability [poster]. Presented at the 9th International Congress on Peer Review and Scientific Publications, 8–9 September 2022, Chicago, IL, USA.