Blood, Vol. 113, Issue 12, 2765-3775, March 19, 2009

Molecular profiling of classical Hodgkin lymphoma tissues uncovers variations in the tumor microenvironment and correlations with EBV infection and outcome
Blood Chetaille et al.
113: 2765
Supplemental materials for: Chetaille et al
Gene expression analysis

Total RNA was extracted from frozen samples using guanidinium isothiocyanate and a cesium chloride gradient, as described (23). RNA integrity was checked by Agilent analysis (Bioanalyzer, Palo Alto, CA). For each sample, first-strand cDNA was synthesized from 3 µg total RNA by T7-oligo(dT) priming, followed by second-strand cDNA synthesis. After purification, in vitro transcription associated with amplification generated cRNA containing biotinylated pseudo-uridine. Biotinylated cRNA was purified, quantified, and chemically fragmented at 95°C for 35 minutes, then hybridized to microarrays in 130 µL hybridization buffer at 45°C for 16 hours. Automated washes and staining with streptavidin-phycoerythrin were done as recommended. Double signal amplification was done with biotinylated antistreptavidin antibody and goat-IgG blocking antibody. Scanning was done with an Affymetrix GeneArray scanner and quantification with Affymetrix GCOS software.

Gene expression data were analyzed by the Robust Multichip Average method in R using Bioconductor and associated packages (25). Robust Multichip Average performed background adjustment, loess normalization, and summarization of 11 oligonucleotides per gene. Before analysis, a filtering step removed genes with low and poorly measured expression, defined as an expression value below 100 units in all 73 tissue samples; 10,394 genes/ESTs were retained. Data were then log2-transformed before unsupervised and supervised analyses. Unsupervised hierarchical clustering was done using the Cluster program (26) with data median-centered on genes, Pearson correlation as the similarity metric, and centroid linkage clustering. Results were displayed using the TreeView program (26).
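The filtering, log2 transformation, median centering, and Pearson similarity described above can be illustrated with a minimal Python sketch. It is not the authors' R/Bioconductor or Cluster code: the function names, the gene names, and the toy expression values are hypothetical, and the sketch assumes strictly positive expression values so that the log2 transform is defined.

```python
import math
from statistics import median


def filter_low_expression(matrix, threshold=100.0):
    """Drop genes whose expression is below `threshold` in every sample."""
    return {
        gene: values
        for gene, values in matrix.items()
        if any(v >= threshold for v in values)
    }


def log2_transform(matrix):
    """Apply log2 to every value (assumes all values are > 0)."""
    return {gene: [math.log2(v) for v in values] for gene, values in matrix.items()}


def median_center(matrix):
    """Subtract each gene's median across samples from its values."""
    centered = {}
    for gene, values in matrix.items():
        m = median(values)
        centered[gene] = [v - m for v in values]
    return centered


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy matrix: gene -> expression value in each of 4 samples.
raw = {
    "gene_a": [50.0, 80.0, 60.0, 90.0],        # below 100 everywhere: removed
    "gene_b": [128.0, 256.0, 512.0, 1024.0],
    "gene_c": [1024.0, 512.0, 256.0, 128.0],
}

filtered = filter_low_expression(raw)
centered = median_center(log2_transform(filtered))
similarity = pearson(centered["gene_b"], centered["gene_c"])
```

In this toy example the two retained genes have exactly opposite profiles, so a correlation-based clustering would place them far apart.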
To identify the gene clusters most responsible for the resulting subdivision of samples, we used quality-threshold (QT) clustering (27) and selected gene clusters with a minimum size of 50 genes and a minimum correlation of 0.75. To identify and rank genes differentially expressed between two subgroups of samples, supervised analysis was applied using a discriminating score (DS), except for survival analysis, where we used Cox analysis. DS was calculated for each gene (28) as DS = (M1 − M2) / (S1 + S2), where M1 and S1 represent the mean and standard deviation (SD) of the expression levels of the gene in subgroup 1, and M2 and S2 those in subgroup 2. Confidence levels were estimated by 100 random permutations of samples, as previously described (29). Several analyses were done, each centered on a different tumour feature. The DS threshold varied with the analysis, but in each case it was selected to ensure that the number of genes selected by chance never exceeded the number of identified differential genes. To identify genes associated with survival, the Cox proportional hazards model was applied to continuous survival data with a false discovery rate (FDR) threshold of 5%. Based on the significant genes, the distinction between subgroups was assessed by linear discriminant analysis (LDA) (30). Briefly, this method computes a discriminating function as the linear combination of variables that maximizes the ratio of between-group variance to within-group variance, given the subgroup labels.
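As an illustration of the discriminating score and the permutation check described above, here is a minimal Python sketch. It is not the authors' implementation: the function names and the toy data are hypothetical, the use of the sample SD (n − 1 denominator) and of the absolute value of DS for thresholding are assumptions, and the permutation scheme of reference 29 may differ in detail.

```python
import random
from statistics import mean, stdev


def discriminating_score(group1, group2):
    """DS = (M1 - M2) / (S1 + S2) for one gene (sample SD assumed)."""
    return (mean(group1) - mean(group2)) / (stdev(group1) + stdev(group2))


def count_selected(expr, labels, threshold):
    """Count genes whose |DS| reaches `threshold` for the given labels."""
    selected = 0
    for values in expr.values():
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        g2 = [v for v, lab in zip(values, labels) if lab == 2]
        if abs(discriminating_score(g1, g2)) >= threshold:
            selected += 1
    return selected


def permutation_counts(expr, labels, threshold, n_perm=100, seed=0):
    """Genes selected by chance in each of `n_perm` random label shuffles."""
    rng = random.Random(seed)
    shuffled = list(labels)
    counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        counts.append(count_selected(expr, shuffled, threshold))
    return counts


# Hypothetical log2 expression values for 8 samples (4 per subgroup).
labels = [1, 1, 1, 1, 2, 2, 2, 2]
expr = {
    "gene_up": [8.0, 8.2, 7.9, 8.1, 5.0, 5.2, 4.9, 5.1],
    "gene_flat": [6.0, 6.3, 5.9, 6.2, 6.1, 5.8, 6.4, 6.05],
}

observed = count_selected(expr, labels, threshold=2.0)
chance = permutation_counts(expr, labels, threshold=2.0)
# A threshold is acceptable in this sketch if max(chance) <= observed.
```

Comparing `max(chance)` with `observed` is one way to read the requirement that the number of genes selected by chance never exceed the number of identified differential genes.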
Files in this Data Supplement:
- Table S1. Clinical characteristics of the training set of 63 cHL patients analyzed by DNA microarray (PDF, 24.4 KB)
- Table S2. Probe sets associated with clinical outcome within the 61 adult cHL samples (XLS, 106 KB)
- Table S3. Genes differentially expressed between the 18 EBV+ cHL samples and the 35 EBV− cHL samples (XLS, 99.5 KB)
- Table S4. Genes differentially expressed between the 42 nodular sclerosis cHL samples and the 17 mixed cellularity cHL samples (XLS, 27 KB)
- Table S5. Genes differentially expressed between the 5 H/TCRBCL samples and the 63 cHL samples (XLS, 146 KB)
- Figure S1. Supervised classification of cHL samples based on the histological type signature (JPG, 56.8 KB)
  (A) illustrates the classification of 59 cHL samples using the 76 probe sets identified as differentially expressed between the 42 nodular sclerosis (NS) samples and the 17 mixed cellularity (MC) samples. Expression data are depicted as in Fig. 1, and histological type is coloured in white for NS and in black for MC. Probe sets are ordered from top to bottom by decreasing discriminating score. Tumor samples are ordered from left to right according to their LDA function score. The solid orange line indicates the threshold 0 that separates the two predicted classes of samples, referred to as “predicted NS group” (to the left of the line) and “predicted MC group” (to the right of the line). The bottom panel represents the LDA posterior probability for each sample to belong to the NS group (y axis), with the samples (x axis) ranked according to their LDA function score. The observed histological type is indicated by coloured dots: green for NS and red for MC. There was a statistically significant correlation between the observed histological type and the histological type predicted from the corresponding expression signature, as shown by the resulting cross table (B).

- Figure S2. Supervised classification of cHL samples based on the H/TCRBCL signature (JPG, 78.7 KB)
  Legend similar to Fig. S1 but applied to the 666 probe sets differentially expressed between the 5 H/TCRBCL and the 63 cHL samples. The histology is coloured in white for cHL and in black for H/TCRBCL samples. The solid orange line indicates the threshold 0 that separates the two predicted classes of samples, referred to as “predicted cHL group” (to the left of the line) and “predicted H/TCRBCL group” (to the right of the line). The bottom panel represents the LDA posterior probability for each sample to belong to the cHL group (y axis), with the samples (x axis) ranked according to their LDA function score.
