ProCan can

A new way of using artificial intelligence to analyse thousands of patients’ cancer samples has been developed by scientists at the Children’s Medical Research Institute (CMRI) in an international collaboration published in the prestigious journal Cancer Discovery.
The ProCan cancer research program at CMRI is analysing thousands of different types of proteins (the proteome) in childhood and adult cancers to help cancer clinicians match their patients with the best treatment available.
They are a step closer to that goal with this study that involves 30 collaborating research groups in six countries (Austria, Australia, Canada, Greece, Spain and the USA), and cancer proteomic data obtained by the ProCan team from 7,525 cancers, which is the largest set of cancer proteomes generated in a single centre.
The size of the dataset matters because predicting how a cancer will behave from its proteome, including how it will respond to treatment, requires advanced computational techniques. These include AI, which needs to be trained on large datasets that contain both the proteome and clinical information about the patient.
However, data privacy regulations and other restrictions on the transfer of data across geographical boundaries make it challenging to assemble large sets of patient data, especially when multiple countries are involved.
The ProCan team have shown how this problem can be overcome by simulating the situation where permission is given to use proteomic and clinical data, but with very restricted access.
Using an AI technique called federated deep learning, they trained AI models on datasets stored at several local sites held behind firewalls. Instead of sharing clinical data, these AI models were sent to a central server to update a global model. Repeating this process multiple times resulted in a diagnostic test that has essentially the same accuracy as when the data was all brought together in one centralised database.
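The train-locally, aggregate-centrally loop described above can be illustrated with a minimal sketch. It is not the ProCan team's implementation: it assumes a toy linear model in place of a deep network, synthetic data in place of proteomes, and federated averaging (a sample-size-weighted mean of the site models) as the server's update rule, which the article does not specify. All function names here are hypothetical.

```python
import random

def local_update(weights, data, lr=0.5, epochs=5):
    """Train a copy of the global model on one site's private data.

    Only the resulting weights leave the site; `data` never does.
    """
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def federated_average(site_weights, site_sizes):
    """Server step (assumed rule): average site models, weighted by sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]

def train_federated(sites, dim, rounds=50):
    """Repeat: send global model out, train locally, aggregate centrally."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in sites]
        global_w = federated_average(updates, [len(d) for d in sites])
    return global_w

def make_site(n, true_w, rng):
    """Synthetic stand-in for one institution's dataset (illustration only)."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x))
        data.append((x, y))
    return data

if __name__ == "__main__":
    rng = random.Random(0)
    true_w = [2.0, -1.0]
    # Three "sites" of different sizes, each keeping its data to itself.
    sites = [make_site(n, true_w, rng) for n in (30, 50, 20)]
    print(train_federated(sites, dim=2))
```

In this toy setting the global model recovers the same weights that training on the pooled data would give, which mirrors, in a very simplified way, the article's point that restricted-access training need not cost accuracy.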
Professor Roger Reddel, a senior author of the publication, said: “It was a very exciting moment when we first saw that the results from data with highly restricted access were just as accurate as the results obtained when the data was all stored in one place.”
In addition, this work has overcome another problem regarding proteomic data that has made it very difficult to build large datasets. Different research institutions use different methods for obtaining proteomic data from cancer samples, and this is a major barrier to combining proteomic data from different research centres.
As part of the research reported in this publication, the team showed that federated deep learning made it possible to successfully combine the proteomic data generated at CMRI from the 7,525 cancer samples with proteomic data generated at other research institutions with different techniques, and that this further improved the accuracy of the cancer diagnosis.
These advances will speed up the achievement of ProCan’s mission to use proteomic data to improve outcomes for cancer patients.
Professor Reddel said: “The purpose of CMRI’s ProCan research program is to develop proteomic tests that will assist cancer clinicians to choose the best treatment available for each of their patients. By overcoming several major barriers to assembling and analysing large cancer proteomic datasets, we have made a major step towards achieving this goal.”

Open Forum is a policy discussion website produced by Global Access Partners – Australia’s Institute for Active Policy. We welcome contributions and invite you to submit a blog or follow us on LinkedIn, Mastodon and Bluesky.