Gene signature predictive of hepatocellular carcinoma patient response to transarterial chemoembolization

Transarterial chemoembolization (TACE) is a commonly used treatment modality in hepatocellular carcinoma (HCC). The ability to identify patients who will respond to TACE represents an important clinical need, and tumor gene expression patterns may be associated with TACE response. We investigated whether the tumor transcriptome is associated with TACE response in patients with HCC. We analyzed transcriptome data of treatment-naïve tumor tissues from a Chinese cohort of 191 HCC patients, including 105 patients who underwent TACE following resection with curative intent. We then developed a gene signature, TACE Navigator, which was associated with improved survival in patients who received either adjuvant or post-relapse TACE. To validate our findings, we applied our signature in a blinded manner to three independent cohorts comprising an additional 130 patients with diverse ethnic backgrounds, enrolled in three different hospitals, who received either adjuvant TACE or palliative TACE. TACE Navigator stratified patients into Responders and Non-Responders, and Responder status was associated with improved survival following TACE in our test cohort (Responders: 67 months vs Non-Responders: 39.5 months, p<0.0001). In addition, a multivariable Cox model demonstrated that TACE Navigator was independently associated with survival (HR: 9.31, 95% CI: 3.46-25.0, p<0.001). In our validation cohorts, the association between TACE Navigator and survival remained robust in both Asian patients who received adjuvant TACE (Hong Kong: 60 months vs 25.6 months, p=0.007; Shandong: 61.3 months vs 32.1 months, p=0.027) and European patients who received TACE as primary therapy (Mainz: 60 months vs 41.5 months, p=0.041). These results indicate that a TACE-specific molecular classifier is robust in predicting TACE response. This gene signature can be used to identify patients who will have the greatest survival benefit after TACE treatment and to enable personalized treatment modalities for patients with HCC.


Test Cohort
The training/validation cohort was derived from the Liver Cancer Institute (LCI) cohort, in which a total of 247 HCC patients were prospectively recruited and underwent radical resection at the Liver Cancer Institute and Zhongshan Hospital (Fudan University) between 2002 and 2003. Microarray profiling of LCI cohort patients was performed previously. Briefly, gene expression was profiled from RNA extracted from flash-frozen tumor tissue using the Affymetrix Human Genome U133 2.0 microarray platform in one of two formats, the Affymetrix GeneChip HG-U133A 2.0 or the 96 HT HG-U133A 2.0 microarray platform, each containing the same probesets, as described (NCBI GEO accession number GSE14520). Data were processed by combining the CEL files from the two Affymetrix series using the matchprobes package in the R programming environment. Thereafter, the RMA method in the R affy package was used to obtain probe set expression summaries. Both raw and processed data are available under accession GSE14520 at NCBI GEO. This dataset contains 488 samples: 247 tumor samples and 239 non-tumor samples, with expression information for the 13,101 genes for which signal could be measured. Of the 488 tumor and non-tumor samples contained in this dataset, all 247 patients with tumor tissue available were considered for this study. Archived RNA extracted from the flash-frozen tumor tissue was stored at -80°C.
All TACE patients from the LCI cohort received a combination of cisplatin, fluorouracil and mitomycin C. Of the remaining patients, 86 received no additional therapy (Resection Only) and 51 received other forms of therapy (Other Therapy), not including TACE, following surgical resection. Patients who were administered TACE as adjuvant therapy following resection were those deemed to have a high probability of relapse (e.g., tumor size >10 cm, more than one tumor nodule, or vascular invasion). In this context, TACE is used for both diagnosis and treatment: digital subtraction angiography is performed to identify any tumor staining in the liver following resection. If tumor stains are noted, their size, location and number are evaluated and TACE is performed with superselective catheterization. If no tumor stains are noted, one third of the standard dose of chemotherapy and lipiodol is injected into the hepatic artery. Patients designated as Other Therapy did not receive TACE during their treatment. Following surgical resection, Other Therapy patients received portal vein chemotherapy, interferon alfa therapy, radiofrequency ablation, percutaneous ethanol injection, traditional Chinese medicine, or a combination thereof, and were treated outside usual clinical guidelines.

Validation
In the Hong Kong test cohort, patients who received TACE were those judged by the operating surgeons to have a high risk of recurrence following resection. The presence of tumor vs. non-tumor tissue was verified by H&E staining, and tumor tissue was collected by scraping five 5-μm tumor sections for each patient. Total RNA was isolated using the Roche High Pure FFPET RNA Isolation Kit (Indianapolis, IN) according to the manufacturer's instructions. All patients in the Hong Kong test cohort received cisplatin during the TACE procedure.
For the Shandong test cohort, patients who received TACE were those judged by the operating surgeons to have a high risk of recurrence following resection. The presence of tumor tissue was verified by H&E staining. Tumor tissue was collected by scraping five 10-μm tumor sections for each patient. Total RNA was isolated using a MasterPure RNA Purification Kit (Epicentre, Madison, WI) according to the manufacturer's instructions. In the Shandong cohort, doxorubicin and cisplatin-based regimens were predominantly used.
For the Mainz test cohort, patients were treated with palliative TACE in accordance with BCLC guidelines. Total RNA was isolated using a peqGOLD Total RNA Kit (VWR, Darmstadt, Germany) according to the manufacturer's instructions. Most patients in the Mainz cohort received doxorubicin with drug-eluting beads (DEB TACE), while a minority received TACE with mitomycin C.

Signature Development and Patient Assignment
Bioinformatic analyses, including class comparison and survival risk prediction algorithms, were used to identify genes that were predictive of overall survival in the group of 105 patients receiving TACE, but not in the 86 other patients who received no additional therapy following resection. All bioinformatic analyses were performed using BRB-ArrayTools (Bethesda, MD). TACE Navigator was developed using a custom nCounter Gene Expression Codeset from NanoString (Seattle, WA), consisting of 15 signature genes and six control genes. NanoString Digital Gene Expression Analysis was performed by the Center for Cancer Research Genomics Core in 93 TACE patients from the training/validation cohort. A prognostic index equation prediction module based on the expression of each signature gene was created using the survival risk prediction function in BRB-ArrayTools. Validation was performed using 10-fold cross-validation. NanoString analysis was then performed in a double-blind manner in the test and verification cohorts. Gene expression, measured by NanoString counts, was Log2-transformed and then converted to a Z-score within each cohort. Patients were assigned to the predicted Responder or Non-Responder group using the prognostic index equation. Data were subsequently decoded, and clinical data for each patient were obtained.
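The assignment step can be illustrated with a minimal sketch. It assumes the prognostic index is a weighted sum of per-gene Z-scores and that patients whose index exceeds a threshold are called Non-Responders. The gene names, counts, weights, threshold, and direction of the cut-off below are hypothetical placeholders, not the TACE Navigator values, and the use of the sample standard deviation for the Z-score is likewise an assumption.

```python
import math
from statistics import mean, stdev

def zscore_log2(counts):
    """Log2-transform raw counts, then z-score across patients within one cohort.

    Counts must be positive. The sample standard deviation is an assumption.
    """
    logged = [math.log2(c) for c in counts]
    mu, sd = mean(logged), stdev(logged)
    return [(v - mu) / sd for v in logged]

def prognostic_index(z_by_gene, weights):
    """Weighted sum of per-gene z-scores, one value per patient."""
    n_patients = len(next(iter(z_by_gene.values())))
    return [
        sum(weights[g] * z_by_gene[g][i] for g in weights)
        for i in range(n_patients)
    ]

def assign_groups(indices, threshold):
    """Assumed rule: an index above the threshold marks a Non-Responder."""
    return ["Non-Responder" if pi > threshold else "Responder" for pi in indices]

# Hypothetical counts for four patients and two placeholder genes.
counts = {
    "GENE_A": [120, 340, 95, 800],
    "GENE_B": [60, 45, 300, 52],
}
weights = {"GENE_A": 0.8, "GENE_B": -0.5}  # hypothetical gene weights
threshold = 0.0                            # hypothetical prognostic threshold

z = {gene: zscore_log2(vals) for gene, vals in counts.items()}
pi = prognostic_index(z, weights)
groups = assign_groups(pi, threshold)
```

Because the Z-score is computed within each cohort, the same weights and threshold can be applied to cohorts whose raw count scales differ.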

Univariable and Multivariable Analysis
Univariable and multivariable analyses were performed with Cox proportional hazards regression analysis using STATA 14.0 (College Station, TX). The association of each clinical variable with survival was first evaluated with univariable analysis, followed by multivariable analysis, which included clinical variables that were significantly associated with survival in the univariable analysis. Age grouping was chosen by median age in the training/validation cohort. Alanine aminotransferase and alpha-fetoprotein groupings were chosen based on commonly used normal vs. abnormal clinical values. For TNM staging, stage I indicates a single tumor with no vascular invasion, whereas stage II and greater indicates that multiple tumors or vascular invasion are present; stages II and III were therefore grouped together. No multicollinearity of covariates was found, and the proportional hazards assumption was met in the final models.
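As a worked illustration of how a Cox coefficient relates to a reported hazard ratio, the sketch below back-calculates the log-hazard coefficient and its standard error from the multivariable estimate quoted in the abstract (HR: 9.31, 95% CI: 3.46-25.0), then rebuilds the interval. It assumes a Wald-type interval on the log scale with a normal quantile of 1.96; the function names are illustrative, and the derived coefficient and standard error are approximations recovered from rounded figures, not values reported by the authors.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coefficient_from_report(hr, lower, upper, z=1.96):
    """Back-calculate (beta, se) from a reported HR and its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald-type).
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Figures quoted in the abstract for TACE Navigator in the multivariable model.
beta, se = coefficient_from_report(9.31, 3.46, 25.0)
hr, lower, upper = hazard_ratio_ci(beta, se)
```

The rebuilt bounds agree with the reported interval to within rounding, which is consistent with an interval that is symmetric around the coefficient on the log scale.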

Supplementary Figure 1. Affymetrix expression of TACE Navigator genes is correlated with NanoString expression
Correlation between gene expression (Log2), as measured by Affymetrix chip and NanoString, is shown for (A) TACE Navigator signature genes and (B) accompanying housekeeping genes. P and R values shown in each panel were calculated by Pearson Correlation, with a P value of less than 0.05 indicating statistical significance.

Supplementary Figure 2. The TACE Navigator gene signature does not predict overall survival in patients who did not receive TACE
HCC patients who did not receive TACE from two independent cohorts, (A) TIGER-LC and (B) Korean Cohort, were assigned to predicted Responder or Non-Responder groups using the prognostic index equation and prognostic threshold that we developed. In both cohorts, no significant difference in overall survival was seen between the two groups, as shown by Kaplan-Meier curves.

Supplementary Figure 3. Responders and Non-Responders exhibit differential expression of hypoxia-related genes
Heat map of 155 hypoxia target genes in TACE Responders and Non-Responders, with columns representing individual patients and rows representing expression of each variable gene (A). Both patients and genes were clustered using Pearson correlation distance and average linkage in the Genesis program. 100% concordance of TACE Responder and Non-Responder groups was observed following clustering. Expression values are Log2; yellow indicates relative under-expression and purple indicates relative over-expression of each gene.

*A P value of less than 0.05 was considered to indicate statistical significance. P values were calculated with the use of Fisher's exact tests, except for age, which was calculated with 2-tailed Student's t-test, and survival, which was calculated with the log-rank test.

PPP5C, SERPINE1, SLC2A1, TERT, TF, TFF3, TFRC, TGFA, TGFB3, TGM2, TPI1, VEGFA, VIM
RAB8B, RARA, RBPJ, RRAGD, RSBN1, SEC61G, SFRS7, SLC16A1, SLC7A6, SNRPD1, SPAG4, STC2, TGFBR1, TMEM45A, TPCN1, TUBG1, VDAC1, WSB1

* Genes denoted with an asterisk indicate genes from the core hypoxia response that are previously known HIF-1α targets. Other genes in this column are predicted hypoxia response genes.