Protons 101: Radiation Oncology OAR metrics: Which is the strongest?
I reached out and ran a "scientific poll". Today we discuss the results and how I mentally categorize them.
First off, Twitter limits polls to 4 options, so I narrowed the field to the four in the poll. My other top candidates were hippocampus, esophagus, parotid, and LAD. Those 8 are top-tier metrics with pretty good data for affecting clinical outcomes in radiotherapy.
Overall, I agree with the poll. Heart and pharyngeal constrictors are true front runners. Like any best-of list, the results are debatable. But heart has clearly had traction in the broader literature for the past decade, so it is not surprising that it won.
The poll was based on my prior OAR reference document linked here:
In this more complete reference document, I reviewed 17 of the best metrics with the opportunity to demonstrate differences between protons and photons - much as I proposed back in 2019 with the PRODOSC Trial structure.
Here’s how I categorize OAR metrics from a strength of data perspective and my thoughts on the simple poll I posted.
Radiation Oncology OAR Metrics: Data Tiers
Group 5: Retrospective Correlations:
Examples: CNS tissue, cochlea, bone marrow.
The weakest general category consists of OAR metrics like CNS tissue, cochlea, or bone marrow. These are based upon retrospective studies alone. We look back at a patient population, see if dose to the cochlea seemed to impact hearing, and try to develop a metric in that fashion. They are interesting and likely relevant on some level, but from a data perspective they are rather unproven entities.
Group 4: Lower is better with validation in prospective datasets:
Examples: Heart and Immune System
The next tier is the next weakest in terms of data, but both of its metrics are quite unique, which blurs their value as OAR metrics. These two metrics have general endpoints - lower is better - but they have been validated in a prospective dataset. The “validation” for both actually comes from within the same trial - RTOG 0617 - the lung dose escalation trial that showed no benefit to higher doses.
This trial then has two secondary analyses showing that both mean heart dose and dose to the immune system correspond to outcomes. So these are stronger than the first group in that they have been validated in a prospective setting. The other interesting part about these two metrics is that they were validated by showing that lower doses improve overall survival. So they are stronger based on validation but also on the strength of the outcome. The next three tiers are stronger from a process / data perspective but ultimately are less important clinically than the gold standard of overall survival.
Between the two, I think the heart literature is more developed and has more depth. So even though the highest level of data is equal between these two metrics, the breadth of the data for heart relating to MACE and OS makes it the stronger choice of the two.
Group 3: Prospective studies in which lower is better, but with no specific pre-defined metric:
Examples: Parotids, small bowel, rectum
The classic example would be the parotid glands. They have Group 5 and Group 3 data. The Group 3 data stems from the prospective HN trials comparing IMRT to 3D. The goal was to minimize dose to the parotids with IMRT and, by doing so, the trials demonstrated less xerostomia.
The trials did not require dose to the opposite parotid of <10 Gy for example. They just pushed for lower. Similar situations are in the small bowel where IMRT showed benefit over 3D in pelvic treatment and in the rectal spacing data for prostate where rectal spacing improves rectal dosimetry which then results in less rectal toxicity.
Group 2: Prospective data with pre-defined metrics, but in palliative setting:
Examples: Esophagus and Hippocampus
Actually, two of our final 3 metrics were established in the palliative setting. Pretty crazy really, but on my review that is the case.
The esophageal dose metric is from the palliative chest radiation trial. The pre-defined randomized intervention was keeping Dmax to the esophagus to no more than 80% of the prescription dose (either 20 Gy / 5 fx or 30 Gy / 10 fx). Limiting dose limited esophagitis. They prioritized the OAR over tumor coverage, which is far easier to do in a palliative setting but is something we are now considering in SABR chest treatments.
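To make the arithmetic concrete, here is a minimal sketch of that 80% cap for the two prescriptions mentioned above. The function names are my own, purely for illustration, and are not taken from the trial protocol.

```python
def esophagus_dmax_cap(prescription_gy: float, fraction: float = 0.80) -> float:
    """Esophageal Dmax cap, expressed as a fraction of the prescription dose."""
    if prescription_gy <= 0:
        raise ValueError("prescription dose must be positive")
    return prescription_gy * fraction


def meets_esophagus_cap(planned_dmax_gy: float, prescription_gy: float,
                        fraction: float = 0.80) -> bool:
    """True if the planned esophageal Dmax is at or below the cap."""
    return planned_dmax_gy <= esophagus_dmax_cap(prescription_gy, fraction)


if __name__ == "__main__":
    # The two palliative prescriptions named in the text (total dose in Gy).
    for rx_gy in (20.0, 30.0):
        cap = esophagus_dmax_cap(rx_gy)
        print(f"Rx {rx_gy:.0f} Gy: esophagus Dmax cap = {cap:.1f} Gy")
```

So the cap works out to 16 Gy for 20 Gy / 5 fx and 24 Gy for 30 Gy / 10 fx.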
The other is more broadly known, I think - the hippocampal-sparing whole brain trial, where the hippocampi were spared with predefined goals of D100 < 9 Gy and Dmax < 16 Gy. That arm resulted in less cognitive failure. A clear step forward for patients who require whole brain radiation.
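As a rough sketch of how those two hippocampal goals could be checked against a plan: D100 is the dose received by 100% of the structure, which is simply the minimum voxel dose. I am assuming a plain list of per-voxel doses in Gy and a simple voxel maximum for Dmax; real planning systems may report Dmax over a small volume instead, and the function and key names here are hypothetical.

```python
from typing import Dict, Sequence


def hippocampal_goals(voxel_doses_gy: Sequence[float],
                      d100_limit_gy: float = 9.0,
                      dmax_limit_gy: float = 16.0) -> Dict[str, object]:
    """Check the D100 and Dmax goals quoted in the text for one structure."""
    if len(voxel_doses_gy) == 0:
        raise ValueError("need at least one voxel dose")
    d100 = min(voxel_doses_gy)  # dose covering 100% of the volume = minimum dose
    dmax = max(voxel_doses_gy)  # assumption: simple voxel maximum
    d100_ok = d100 < d100_limit_gy
    dmax_ok = dmax < dmax_limit_gy
    return {
        "D100_gy": d100,
        "Dmax_gy": dmax,
        "d100_ok": d100_ok,
        "dmax_ok": dmax_ok,
        "all_met": d100_ok and dmax_ok,
    }


if __name__ == "__main__":
    # Made-up voxel doses, for illustration only.
    print(hippocampal_goals([7.5, 8.2, 10.1, 12.4, 15.3]))
```

The point is that this is a pass / fail check against fixed numbers, which is exactly what separates this tier from the "just push lower" approach of Group 3.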
Group 1: Prospective data with pre-defined metrics in a definitive setting
Single Example: Pharyngeal Constrictors
There is one metric with arguably two trials that meet this level of data. Ironically, both studies have “issues” that then lessen their impact.
The first is the DARS trial, in which HN patients were randomized to standard IMRT or dysphagia-optimized IMRT where the constrictors were kept to a lower dose. To me, it is the clear gold standard but… to date the data has only been presented and appears in the literature only in abstract form. When published, I think it clearly moves to the top of the pyramid of OAR data strength. Definitive treatment, prospectively studied, with a clearly pre-defined metric objective that you either meet or do not - i.e., not just lower.
An Easter Egg: one extra new trial - maybe even Group 1 data: It is elegant but virtually unknown.
This final trial is one I bet you haven’t seen. Zero cross citations and only 500+ views and a good handful of those views are mine. This is a randomized study out of Egypt (ref 1) where patients with nasopharynx cancer were either treated with standard parotid sparing IMRT or swallowing sparing IMRT where they used specific OAR reference points as shown below.
The trial demonstrated improvements in swallowing function at 1, 3, 6, and 12 months. They ran additional statistics that are very close to, but not perfectly aligned with, the table above. For example, the middle and superior constrictors were separated in the table yet combined in the planning metrics. Middle constrictor dose alone was then evaluated at 55, 60, and 65 Gy - not just at the cut point of 55 Gy. IPCM was also evaluated at these doses, but per the Supplemental data shown above, the constraint is 54 Gy. I think one can therefore argue it isn’t quite Gr1 data but rather somewhere between Gr1 and Gr3 data. The lack of any cross references is also a bit of a concern, but it is out there and seems well done. On my review it certainly is as good as many studies in the IMRT vs. 3D conformal reference article I created earlier.
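For readers who like to see the logic spelled out, the five data tiers can be summarized as a simple decision rule. This is only my shorthand for the categories described in this post; the field names are invented for illustration, and real evidence rarely fits four yes / no questions this cleanly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    prospective: bool            # data come from a prospective trial dataset
    dose_reduction_tested: bool  # trial was designed to lower OAR dose in one arm
    predefined_metric: bool      # a specific constraint had to be met, not just "lower"
    definitive_setting: bool     # definitive rather than palliative treatment


def data_tier(e: Evidence) -> int:
    """Return the data tier (5 = weakest, 1 = strongest) as described in the post."""
    if not e.prospective:
        return 5  # retrospective correlations only
    if not e.dose_reduction_tested:
        return 4  # lower is better, validated within a prospective dataset
    if not e.predefined_metric:
        return 3  # prospective, lower is better, no specific metric
    if not e.definitive_setting:
        return 2  # pre-defined metric, palliative setting
    return 1      # pre-defined metric, definitive setting


if __name__ == "__main__":
    examples = {
        "cochlea": Evidence(False, False, False, False),
        "heart": Evidence(True, False, False, False),
        "parotids": Evidence(True, True, False, False),
        "hippocampus": Evidence(True, True, True, False),
        "pharyngeal constrictors": Evidence(True, True, True, True),
    }
    for name, ev in examples.items():
        print(f"{name}: Group {data_tier(ev)}")
```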
My Summary:
I agree with the poll. I’d choose either the pharyngeal constrictors or heart. Constrictors based on strength of data and heart for strength of endpoint. After all, OS is more important than dysphagia.
Obviously (by the name of the Substack) I’m looking at it with at least some proton therapy slant :). Hippocampus is arguably just as strong as the pharyngeal constrictor data - very similar trial structures. But from a proton therapy viewpoint, if you consider that the most likely use of protons is in glioma treatment (rather than in a whole brain setting), then I think it makes sense to rank hippocampus below pharyngeal constrictors (and I don’t think the palliative vs definitive trial setting is trivial).
As I look back at this list, we must continue to improve our dosimetry metrics data. The DAHANCA 35 trial will utilize normal tissue complication modeling to choose and randomize patients. The UK breast trial only randomizes those at risk. The very premise of those trials depends on the data discussed above being correct. In fact, from a purely scientific, isolated viewpoint, if these two trials are negative you have two possible explanations - protons aren’t superior to IMRT OR the metrics chosen do not differentiate the patient populations that benefit.
And as we reviewed today, OAR metric data is quite limited. But for protons or any other technology to consistently demonstrate clinical value, it must do so via improvements in dosimetry that impact outcomes. We should work to define these metrics precisely and robustly. They will determine the approaches we can use to demonstrate which patients benefit the most from improvements in technology moving forward. This is critical to the long-term success of our technology-based field.
REFERENCES:
Swallowing sparing intensity modulated radiotherapy versus standard parotid sparing intensity-modulated radiotherapy for treatment of head and neck cancer: a randomized clinical trial
https://www.tandfonline.com/doi/full/10.1080/0284186X.2021.2022198
Further references are contained in the post below: