Prospective Evaluation of Health Care Provider and Patient Assessments in Chemotherapy-Induced Peripheral Neurotoxicity

Background and Objective
There is no agreement on the gold standard for detection and grading of chemotherapy-induced peripheral neurotoxicity (CIPN) in clinical trials. The objective is to perform an observational prospective study to assess and compare patient-based and physician-based methods for detection and grading of CIPN.

Methods
Consecutive patients, aged 18 years or older, candidates for neurotoxic chemotherapy, were enrolled in the United States, European Union, or Australia. A trained investigator performed physician-based scales (Total Neuropathy Score–clinical [TNSc], used to calculate Total Neuropathy Score–nurse [TNSn]) and supervised the patient-completed questionnaire (Functional Assessment of Cancer Treatment/Gynecologic Oncology Group–Neurotoxicity [FACT/GOG-NTX]). Evaluations were performed before and at the end of chemotherapy. On participants without neuropathy at baseline, we assessed the association between TNSc, TNSn, and FACT/GOG-NTX. Considering a previously established minimal clinically important difference (MCID) for FACT/GOG-NTX, we identified participants with and without a clinically important deterioration according to this scale. Then, we calculated the MCID for TNSc and TNSn as the difference in the mean change score of these scales between the 2 groups.

Results
Data from 254 participants were available: 180 (71%) had normal neurologic status at baseline. At the end of the study, 88% of participants developed any grade of neuropathy. TNSc, TNSn, and FACT/GOG-NTX showed good responsiveness (standardized mean change from baseline to end of chemotherapy >1 for all scales). On the 153 participants without neuropathy at baseline and treated with a known neurotoxic chemotherapy regimen, we verified a moderate correlation in both TNSc and TNSn scores with FACT/GOG-NTX (Spearman correlation index r = 0.6). On the same sample, considering as clinically important a change in the FACT/GOG-NTX score of at least 3.3 points, the MCID was 3.7 for TNSc and 2.8 for the TNSn.

Conclusions
MCID for TNSc and TNSn were calculated and the TNSn can be considered a reliable alternative objective clinical assessment if a more extended neurologic examination is not possible. The FACT/GOG-NTX score can be reduced to 7 items and these items correlate well with the TNSc and TNSn.

Classification of Evidence
This study provides Class III evidence that a patient-completed questionnaire and nurse-assessed scale correlate with a physician-assessed scale.

Chemotherapy-induced peripheral neurotoxicity (CIPN) from widely used anticancer drugs is a major issue in oncology daily practice. [1][2][3] CIPN has a significant effect on participants both during 4 and after antineoplastic treatment. [5][6][7][8][9][10][11][12] Prevention or treatment of CIPN are important unmet clinical needs. 13 A major reason for the lack of effective treatments is the incomplete knowledge of CIPN pathogenesis. [6][7][8][9]14 However, another issue in clinical trials is the lack of a gold standard for CIPN detection and grading, 15 leading to multiple and different rating instruments. To fill these gaps, we performed a longitudinal study on a real-life population of participants with cancer from baseline (i.e., before chemotherapy administration) to treatment completion. Based on several previous methodologic studies, [16][17][18][19][20][21][22][23][24] a combination of clinician-reported outcome (CRO) as well as patient-reported outcome (PRO) measures seems to be the most reliable approach. Based on these results, our aim was to address several questions about currently used assessment tools: Are the National Cancer Institute Common Terminology Criteria for Adverse Events (NCI-CTCAE), the Total Neuropathy Score clinical version (TNSc) and its novel nurse-assessed version (TNSn), and the Functional Assessment of Cancer Treatment/Gynecologic Oncology Group-Neurotoxicity (FACT/GOG-NTX) scales responsive to the occurrence of CIPN in this population? How do the TNSc and TNSn compare? What are the correlations among the variation from baseline to end of treatment of TNSc, TNSn, and FACT/GOG-NTX in a population of participants receiving anticancer drugs? Are there shorter versions of FACT/GOG-NTX that might be as valuable as the complete version? What is the minimal clinically important difference (MCID) for the TNSc and TNSn?

Study Design
This is an international, multicenter (14 sites) trial involving European, American, and Australian centers. Its primary aims were to define the MCID for TNSc and TNSn and to assess whether reduced FACT/GOG-NTX versions can be used at the same level of reliability as the full version.

Standard Protocol Approvals, Registrations, and Patient Consents
Adult participants were enrolled at each participating center after approval from local institutional review boards/ethics committees, and written informed consent was obtained from each participant before entering the study.

Study Design
Consecutive participants were aged 18 years or older and candidates for neurotoxic chemotherapy for colorectal, breast, or lung cancers with noninvestigational drugs. Participants with potential confounding factors for CIPN were excluded (i.e., brain metastases, peripheral nerve damage due to other causes). At each center, a specifically trained investigator performed the selected health care provider-assessed scales, NCI-CTCAE (items "peripheral neuropathy-motor" and "peripheral neuropathy-sensory" of NCI-CTCAE v4.0 were used) and TNSc, and supervised the patient-completed questionnaire, FACT/GOG-NTX (version 4, items NTX1-9, item HI12, and item An6), at baseline (before the first chemotherapy cycle, T0) and at the end of all chemotherapy cycles (T1). Demographic data and medical history were recorded. Because the TNSn is calculated from 5 of the 7 items of the TNSc, the TNSn was calculated for each participant at each visit where the TNSc was obtained. eTable 1 (available at Bicocca Open Archive Research Database [BOARD], 25 board.unimib.it/research-data/) provides a detailed description of TNSc and TNSn items, and eTable 2 (available at BOARD 25) describes the FACT/GOG-NTX items.
We first assessed the internal responsiveness of NCI-CTCAE, TNSc, TNSn, and FACT/GOG-NTX on the complete sample including participants with neuropathy at entry. We then used the sample of participants without neuropathy at entry (i.e., TNSc score 0 at baseline) to compare TNSc and TNSn; to assess correlations among TNSc, TNSn, and FACT/GOG-NTX in a population of participants receiving platinum, taxanes, or a combination of the 2 drugs; to assess whether shorter versions of FACT/GOG-NTX might provide the same information as the complete version; and to calculate the MCID for TNSc and TNSn.

Statistical Analysis
Characteristics of the participants were summarized using numbers and percentages for categorical variables and mean with SD for continuous variables. The flow chart depicted in eFigure 1 (available at BOARD, 25 board.unimib.it/research-data/) describes the size of the subsample of participants used in each analysis.
The responsiveness of TNSc, TNSn, and FACT/GOG-NTX scales was assessed by estimating several measures of the effect size of the score change between baseline and the end of treatment (table 1). The analysis of the internal responsiveness of NCI-CTCAE, TNSc, TNSn, and FACT/GOG-NTX was performed using all the available information, i.e., for each scale, data of participants with nonmissing values on every item at all visits were used. For NCI-CTCAE, a binomial test comparing the proportion of worsened participants according to each item was applied. For the other scales, a paired t test was performed and the effect size measures described in Husted et al. 26 were estimated together with a 95% confidence interval (CI). All these measures consist of a ratio between the mean score change from T0 to T1 and an estimate of the score variability.
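As an illustration of the effect size measures described above, the following minimal Python sketch (the study itself used R) divides the mean score change from T0 to T1 by four alternative SD estimates. The function name `responsiveness`, the dictionary keys, and the use of the root mean square of the two SDs as the "average" SD are assumptions made for this example, not details taken from the study, and the toy scores are invented.

```python
from math import sqrt
from statistics import mean, stdev


def responsiveness(t0, t1):
    """Standardized mean change from T0 to T1 under four SD choices.

    t0, t1: paired scores (same participants, same order).
    A measure whose SD estimate is zero is returned as None.
    """
    if len(t0) != len(t1) or len(t0) < 2:
        raise ValueError("need at least 2 paired observations")

    change = [end - start for start, end in zip(t0, t1)]
    mean_change = mean(change)

    sd_t0 = stdev(t0)
    sd_t1 = stdev(t1)
    sds = {
        "sd_baseline": sd_t0,
        "sd_end": sd_t1,
        # Assumption: "average" SD taken as the root mean square of the two SDs.
        "sd_average": sqrt((sd_t0 ** 2 + sd_t1 ** 2) / 2),
        "sd_change": stdev(change),
    }
    return {name: (mean_change / sd if sd > 0 else None)
            for name, sd in sds.items()}


# Invented toy data, for illustration only.
baseline = [0, 1, 2, 3]
end_of_treatment = [2, 4, 3, 7]
effect_sizes = responsiveness(baseline, end_of_treatment)
print(effect_sizes)
```

Each returned value is a ratio of the same mean change to a different variability estimate, so the four measures differ only in their denominator. Confidence intervals, which the study reports, are not computed in this sketch.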
The following analyses were performed on the 153 participants treated with a specified neurotoxic chemotherapy regimen, with a normal neurologic status at baseline, and with nonmissing items of TNSc, TNSn, and FACT/GOG-NTX at every time point. We compared the TNSc and the TNSn at T1 both graphically and using the Spearman correlation index. Differences between chemotherapy regimens in neurologic deterioration at the end of follow-up according to TNSc and TNSn were assessed using the Fisher test. Categorization was based on a TNSc severity group subdivision, 27 as follows: score 0, score 1-8, score 9-16, and higher than 16; however, because the highest score in our population was 15, we had 3 groups (0, 1-8, and 9-15) according to the Total Neuropathy Score (TNS). The association between deterioration according to FACT/GOG-NTX and TNSc or TNSn groups at T1 was checked using the Kruskal-Wallis test and boxplots. This analysis was repeated after stratifying by chemotherapy regimen. We then assessed whether shorter versions of FACT/GOG-NTX might provide the same information as the complete version. This was done by checking the association between deterioration of each single FACT/GOG-NTX item and TNSc or TNSn groups at T1, using the χ² test for trend. Lastly, an anchor-based approach was applied to assess the MCID for the TNSc and TNSn scales. This approach is recommended over distribution-based approaches (focusing purely on a "statistically relevant" change) when at least 1 external indicator of the smallest clinically meaningful change, serving as the anchor, is available. 28 The idea consists of defining a group of participants with a relevant change based on the anchor measure and then comparing values of the scale of interest in this group with the group of participants where no change was observed. The direction of change (i.e., participants getting worse or getting better) should be taken into account.
We relied on a previously established MCID for FACT/GOG-NTX to identify participants with and without a clinically important deterioration according to this scale. Then, we calculated the MCID for TNSc and TNSn as the difference in the mean change score of these scales between the 2 groups.
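The anchor-based calculation described above can be sketched in a few lines of Python (the study itself used R). This is a minimal illustration, not the authors' code: the function name `anchor_based_mcid`, the argument names, and the toy data are assumptions; only the 3.3-point FACT/GOG-NTX threshold and the "difference in mean change score between the 2 groups" definition come from the text.

```python
from statistics import mean

# FACT/GOG-NTX change regarded as clinically important, as stated in the text.
ANCHOR_THRESHOLD = 3.3


def anchor_based_mcid(anchor_change, scale_change, threshold=ANCHOR_THRESHOLD):
    """MCID of a scale as the difference in its mean change score between
    participants with and without a clinically important deterioration
    on the anchor.

    anchor_change: per-participant change (T1 - T0) on the anchor scale.
    scale_change:  per-participant change (T1 - T0) on the scale of interest.
    """
    if len(anchor_change) != len(scale_change):
        raise ValueError("inputs must be paired per participant")

    pairs = list(zip(anchor_change, scale_change))
    # Group "deterioration": anchor increased by at least the threshold.
    deteriorated = [s for a, s in pairs if a >= threshold]
    # Group "no change": anchor changed by less than the threshold.
    no_change = [s for a, s in pairs if a < threshold]
    if not deteriorated or not no_change:
        raise ValueError("both groups must contain at least one participant")

    return mean(deteriorated) - mean(no_change)


# Invented toy data, for illustration only.
fact_change = [0, 1, 5, 8, 2, 4]
tns_change = [1, 0, 4, 6, 2, 5]
print(anchor_based_mcid(fact_change, tns_change))
```

The sketch only handles the deterioration direction, which is the one relevant to this study; an analysis of improvement would need its own grouping rule.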
All analyses were carried out using the R statistical package (version 3.6.0).

Data Availability
Data will be made available upon request to the corresponding author.

Description of the Study Population
Among the whole sample of 254 participants, about 50% had breast cancer (eTable 3, available at BOARD, 25 board.unimib.it/research-data/). About 80% of participants were women with a mean age of ≈56 years. Colorectal cancer made up the next largest group, with about 22% in each population. About 50% of participants received a taxane alone, ≈34% received a platinum-containing agent, and just under 20% received both.

Analysis of the Internal Responsiveness of NCI-CTCAE, TNSc, TNSn, and FACT/GOG-NTX Based on the Whole Sample of Participants With Nonmissing Values of the Scales at T0 and T1
As an initial analysis, we evaluated the internal responsiveness of the scales selected as study outcome measures in participants of the whole population with completely measured scales at T0 and T1: NCI-CTCAE (218 participants), TNSc (231), TNSn (231), and FACT/GOG-NTX (214). A description of the overall population and of the populations analyzed for each scale is provided in eTable 3 and eFigure 1 (available at BOARD, 25 board.unimib.it/research-data/). Concerning the responsiveness of NCI-CTCAE, the percentage of participants with an increased score was 23.4% and 78.0% for motor and sensory items, respectively (both significantly higher than 0) (table 1). For all scales, the final score consistently increased on average by more than 1 SD, regardless of which type of SD is considered in the calculation (SD of the score at T0, SD at T1, an average of the previous 2, or SD of the change T1-T0). In other words, all effect sizes were greater than 1 and all the lower bounds of the corresponding 95% CIs were above 0.8, which is commonly considered as a threshold for large responsiveness. 26

Descriptive Statistics of the Selected Study Population

Relationship Between Physician and Patient-Reported Outcome Measure
As shown in table 3, we then explored the association between the deterioration of each single FACT/GOG-NTX item and TNSc or TNSn. Table 3 shows data for the overall population; data stratified by drug class are given in eTable 5 (available at BOARD, 25 board.unimib.it/research-data/): significance within each class is the same as for the overall population. Again, the triple categorization of TNSc or TNSn was used, while deterioration for FACT/GOG-NTX items was defined as a score at end of treatment higher than baseline by at least 1 point.
Only the first 4 items of FACT/GOG-NTX (items Ntx1-4) and the last 3 items (Ntx8, Ntx9, and An6) showed a moderate grade of association with TNSc and TNSn, both in the whole population and in each chemotherapy regimen subgroup. A strong association between deterioration of FACT/GOG-NTX taken as a whole and TNSc was observed. As shown in figure 2, the number of deteriorated FACT/GOG-NTX items tended to increase along with TNSc score, overall and in all the chemotherapy regimen subgroups. Again, these findings largely overlap with results regarding the association between FACT/GOG-NTX and TNSn (figure 3).
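To make the two definitions used in this analysis concrete, the following minimal Python sketch (the study itself used R) assigns a TNS score to its severity group and counts deteriorated FACT/GOG-NTX items for one participant. The function names `tns_group` and `count_deteriorated_items` and the toy item scores are assumptions for illustration; the group cutoffs (0, 1-8, 9-16, higher than 16) and the "at least 1 point higher than baseline" rule are those stated in the text.

```python
def tns_group(score):
    """Map a TNSc or TNSn total score to its severity group label."""
    if score < 0:
        raise ValueError("score must be non-negative")
    if score == 0:
        return "0"
    if score <= 8:
        return "1-8"
    if score <= 16:
        return "9-16"
    return ">16"


def count_deteriorated_items(baseline_items, end_items, min_increase=1):
    """Number of questionnaire items whose score at end of treatment is
    higher than baseline by at least `min_increase` points."""
    if len(baseline_items) != len(end_items):
        raise ValueError("item lists must have the same length")
    return sum(1 for start, end in zip(baseline_items, end_items)
               if end - start >= min_increase)


# Invented toy participant: 11 item scores at T0 and T1, TNSc of 10 at T1.
items_t0 = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
items_t1 = [2, 1, 0, 3, 1, 0, 0, 1, 0, 0, 1]
print(tns_group(10), count_deteriorated_items(items_t0, items_t1))
```

Tabulating the deteriorated-item count against the severity group across participants gives the kind of summary that figures 2 and 3 display; the statistical tests themselves are not reproduced here.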

Minimal Clinically Important Difference
Using an anchor-based approach, considering as clinically important a change in the FACT/GOG-NTX score of at least 3.3 points (0.3 per item) as described by Yost, the MCID was 3.7 for TNSc and 2.8 for TNSn.
Haryani et al. 38 performed a detailed psychometric evaluation of different available assessment tools in CIPN addressing validity (criterion, construct, discriminant validity), reliability, and practicability; from their extensive investigation, 2 tools emerged as most adequate: a PRO (the FACT/GOG-NTX 45 ) and a CRO (the TNS or one of its versions such as the TNSc). 17,31,38 FACT/GOG-NTX, compared with other scales such as EORTC CIPN20, has been suggested to be easier to use, 38,46 and the TNS has been recognized as a fair option for CIPN evaluation by a Delphi survey, 30 as well as in reviews by CIPN experts. 15,33 Therefore, in our study we focused our attention on these 2 assessment tools.
The original TNS was designed to be performed by trained neuromuscular physicians and included the results of nerve conduction studies (NCS) and a specific quantitative sensory testing (QST) device. 47
Another important concept emerging in the assessment of CIPN and of the effects of treatments is the MCID, i.e., the smallest difference in score in the domain of interest that participants perceive as important, either beneficial or harmful, and that would lead the clinician to consider a change in the patient's management. e9 The MCID has recently been calculated for FACT/GOG-NTX and EORTC CIPN20, 48,e10 but this has not been done for any physician-based assessment in CIPN, including any TNS version.
Our data are intended to explore all these issues related to CIPN assessment and to shed light on the best clinimetric approach to this nosographic entity in clinical trials; in the same population of patients with cancer undergoing neurotoxic chemotherapy, we used one of the most recommended PROs, the FACT/GOG-NTX, and the most recommended physician-based outcome scale, TNSc, together. Because of its frequent use in industry and government-sponsored trials, we also employed the NCI-CTCAE.
All 3 scales show that CIPN is a frequent occurrence in this population. We confirmed the internal responsiveness of the 3 outcome measures. However, other studies have shown that the NCI-CTCAE neurotoxicity scales, commonly used in clinical trials, are poorly informative in terms of quality of neurologic impairment. 15 Thus, we would endorse the growing consensus that FACT/GOG-NTX and a form of the TNS be the primary assessment tools in CIPN without NCI-CTCAE.
The original version of the FACT/GOG-NTX is an 11-item questionnaire aimed at exploring positive and negative neuropathy symptoms in CIPN and the consequent functional impairment. 45,e11 Its clinimetric properties are known e10 and the MCID for the FACT/GOG-NTX has been calculated. 48 Huang et al. e11 reexamined the scale with the hypothesis that some of the 11 items might be redundant. They validated a reduced version of the questionnaire based on the first 4 items only (positive and negative neuropathy symptoms in upper and lower limbs). To verify whether other questions might better characterize CIPN, we tested the association between neurologic examination, as assessed by the TNSc, and all single FACT/GOG-NTX items. We confirmed the results obtained by Huang et al., e11 who described a significant association between worsening of neurologic status and the first 4 items of the FACT/GOG-NTX; moreover, we verified that the same association is present with the last 3 items of FACT/GOG-NTX, the ones exploring fine sensory perception and sensory ataxia (i.e., loss of proprioception, which hampers manipulation and balance): Ntx8 "having trouble buttoning buttons," Ntx9 "having trouble feeling the shape of small objects," and An6 "having trouble walking." No association was observed with the remaining items. We conclude that the complete 11-item FACT/GOG-NTX is not needed, but rather a 7-item reduced version is the most informative.
Abbreviation: CI = confidence interval. For each scale, this measure was calculated as the difference between the mean change score in the group of patients who had a clinically important increment according to the Functional Assessment of Cancer Treatment/Gynecologic Oncology Group-Neurotoxicity (FACT/GOG-NTX) (group "deterioration") and the mean change score in the group of patients who did not have a clinically important increment according to the FACT/GOG-NTX (group "no change").
a Patients with a change of <3.3 points in the total FACT/GOG-NTX score from T0 to T1.
b Patients with a change (increase) of at least 3.3 points in the total FACT/GOG-NTX score from T0 to T1.
We also tested a shorter version of the TNSc, the TNSn, that could be easily and rapidly employed in any oncologic center by a trained health care professional. The TNSn had significant responsiveness and showed the same association with FACT/GOG-NTX items as observed with the full TNSc.
As a final analysis aimed at providing information regarding a widely used physician-based outcome measure in CIPN, we defined the MCID for both the TNSc and TNSn scales. This provides cutoff values for a relevant change that could drive clinical practice and allow better definition of relevant endpoints in CIPN clinical trials. In order to perform this analysis, we used the MCID for the FACT/GOG-NTX 48 as a reference. As expected, the MCID was higher for the TNSc than for the TNSn (approximately 3.7 vs 2.8), reflecting the different value ranges of the 2 scales (0-28 vs 0-20, respectively), while the "relative" MCID remained similar (3.7/28 ≈ 13% vs 2.8/20 = 14%).
This study provides Class III evidence that for participants receiving neurotoxic chemotherapy, a patient-completed questionnaire and nurse-assessed scale moderately correlate with a physician-assessed neuropathy scale. Our study adds important and new information to an evidence-based selection of the most appropriate tools in the assessment of CIPN. We show that both FACT/GOG-NTX and TNSc can measure neuropathy in a real-life population of participants with cancer recruited in a multisite, international study. These results were consistent among different drugs and drug combinations, suggesting they could be used across multiple cancer treatment regimens. Our data support the use of a shorter FACT/GOG-NTX scale, indicating that a 7-item scale would be the most suitable option to capture sensory ataxia and its effect on daily life activities. Lastly, we defined the MCID for the TNSc and demonstrated that the TNSn can be considered a reliable alternative if a formal neurologic examination by physicians or specifically trained nurses is not possible in a specific center. The selected simple set of measures for CIPN is clinimetrically valid, does not need complex training, and can be used easily in trials anywhere.