Update 2026 - Interview with WP5 leader: Arthur Pratt


“For me, one of the most exciting aspects of SPIDeRR is its use of electronic health record (EHR) data — and specifically, the enormous potential these data hold for improving patient care”



We had the opportunity to speak once again with Dr. Arthur Pratt, the leader of work package 5, which focuses on collecting both genetic data and general clinical information — such as symptoms — across diverse populations and jurisdictions within SPIDeRR. In this interview, he discusses the progress made over the past year, key research findings from the work package, ongoing challenges, collaborative efforts, and the planned next steps for the year ahead.


Progress on genetic data integration 

What added value does genetic information provide to rheumatologists when making a diagnosis? To address this core question, Pratt and his team are assessing the availability and integration of genetic data and accompanying clinical information across various countries and collaborators. They now have a clearer understanding of how many participating partners possess sufficient genetic and clinical data to support meaningful analysis. “At present, we have been able to conduct analyses using data from Newcastle alone. Progress in integrating data from other centres has been a bit delayed due to the time-consuming process of finalising data sharing agreements — a necessary legal step that is advancing. We hope these agreements will be completed within the next two to three months, which would allow us to begin harmonising datasets from all participating institutions. This will enable us to explore our research question on a broader, multinational scale, which is very exciting”, says Dr. Pratt.


Refining RA diagnosis with genetics
The literature shows that polygenic risk scores (PRS), which use genetic information to estimate the likelihood of specific outcomes, may distinguish individuals with long-established rheumatoid arthritis (RA) from healthy controls with reasonable reliability. However, these studies typically involve well-defined cases and healthy individuals, rather than the more complex presentations encountered in early arthritis clinics. In real-world clinical settings, patients may have RA or other musculoskeletal conditions, and clinicians must rely on a combination of signs and symptoms. According to Dr. Pratt, the added value of genetic information in this more nuanced context remains unclear. In his words: “Over the past year, we have begun addressing this question using data from just a few hundred individuals in the Newcastle dataset, roughly half of whom have RA and half other diagnoses. Our findings indicate that clinical variables alone can reasonably distinguish RA from non-RA cases, an encouraging sign that clinical judgment is robust. Adding genetic information does improve the model’s discriminatory ability slightly, but the incremental benefit is small”. The team is now also working on refining the statistical model that incorporates genetic data, to determine whether its performance can be meaningfully enhanced and to make such tools more practical and informative in routine clinical care. It will then test the refined model in the full, EU-wide dataset.


Integrating AI and genetics for better care
Ultimately, SPIDeRR aims to develop digital tools that help doctors triage and refer patients to the right healthcare providers as quickly as possible, enabling earlier initiation of appropriate treatments. Currently, the research team’s focus is on secondary care, exploring whether genetic tools can help identify people with rheumatoid arthritis more rapidly than current methods allow. If successful, this could pave the way for applying genotyping earlier in the care pathway, such as in primary care settings. In parallel, work package 5 is exploring causal modelling, a statistical genetics approach, to better understand the underlying causes of different types of arthritis. By integrating genetic information with detailed clinical data, including laboratory results and lifestyle factors like smoking, the team hopes to distinguish genuine causal factors from mere associations. For example, genetic data could clarify whether smoking directly causes poor outcomes or simply correlates with them. Dr. Pratt mentions: “Ultimately, this could lead to models that identify modifiable risk factors, offering new opportunities for early intervention and prevention of adverse disease trajectories”.


Unexpected insights
There has been considerable excitement in the field about the potential clinical value of genetic tools, particularly PRS, notes Dr. Pratt. While PRS hold promise in certain areas, the team’s experience suggests that, in the context of diagnosing and prognosticating musculoskeletal conditions, their added value over clinical information is relatively modest unless new applications emerge. Dr. Pratt explains: “This may be somewhat disappointing compared to initial expectations, but it is not entirely surprising, especially to those of us in the SPIDeRR group, where we are beginning to see signs that PRS have limited impact in this specific setting. However, novel approaches, such as tools like G-PROB (Genetic Probability tool, for calculating the probability of different diseases for a patient using genetic risk scores) published previously by Dr. Rachel Knevel, may enhance the utility of genetic scores. These approaches require rigorous evaluation in real-world clinical environments, which SPIDeRR is well-positioned to provide”.


Overcoming data access challenges
One of the work package’s main challenges has been navigating the legal and administrative requirements needed to access large-scale datasets. So far, the team’s work has relied solely on Newcastle data, which limits the scope of their findings. “Incorporating datasets from other centres, particularly across different countries, would significantly enrich the data and provide a more representative view of the general population. Once we overcome the technical and legal hurdles, particularly around data-sharing agreements, we’ll be in a much stronger position to answer the key questions with greater confidence”, says Dr. Pratt.


Collaborations
Within work package 5, Dr. Pratt currently collaborates with colleagues at Karolinska Institute (Sweden) and Leiden University Medical Centre (LUMC), and with partners at Manchester University and the University of East Anglia, who have joined SPIDeRR as associate partners. Beyond its core tasks, work package 5 also contributes to other work packages. For example, the team plans to provide substantial data for work package 2 and, ultimately, for work packages 3 and 4. As mentioned before, the team also has a lively group of patient partners in Newcastle who collaborate with colleagues across the consortium.


Future steps
By this time next year, Pratt and his team hope to better understand how PRS can help diagnose musculoskeletal problems earlier, particularly in cases of early arthritis. He explains that they have built analysis tools and are now aiming to apply them to much larger datasets.
In October 2024, they welcomed Dr. Karina Patasova, a postdoctoral researcher who has been working on demonstrating how genetics can shed light on what triggers conditions like rheumatoid arthritis. Furthermore, the team is exploring an exciting area called causal modelling, which helps untangle what actually causes complex diseases. As Dr. Pratt explains: “Causal modelling may be complex, but its purpose is simple: to identify what truly causes disease. This helps ensure treatments target the real root of the problem, not just factors that seem related”.


Breaking new ground in health data
For Dr. Pratt, one of the most exciting aspects of SPIDeRR is its use of electronic health record (EHR) data and, specifically, the enormous potential these data hold for improving patient care. In the UK, for example, the EHR system of the National Health Service (NHS) contains a vast amount of latent information. “If harnessed effectively, this could fundamentally transform the patient journey through the healthcare system”, says Dr. Pratt, adding that SPIDeRR is at the forefront of efforts to unlock this untapped resource.
According to Dr. Pratt, it is logical that the very challenges faced by projects like SPIDeRR — from data access to securing permissions and engaging stakeholders — are what make the work both essential and compelling. “We shouldn’t be surprised that these efforts are difficult; in fact, the difficulty is a sign that we are truly breaking new ground. No one has attempted this at such scale or in quite the same way before. And in science, as is often the case, the hardest problems are the ones most worth solving.”


Concluding remarks
According to Dr. Pratt, unlocking the full potential of electronic health records and genetic data will require not just technological innovation, but also strong regulatory frameworks and trust structures that ensure data are used safely and meaningfully. He reflects: “Importantly, patients are often willing participants, eager to see their data used to advance research and improve care. Projects like SPIDeRR are paving the way by demonstrating how data can be used responsibly to deliver real clinical impact. By making the use of health and genomic data more routine, transparent, and patient-centred, we can begin to bridge the gap between scientific promise and tangible benefits in everyday care”.


