Int J Biol Sci 2009; 5(4):331-337. doi:10.7150/ijbs.5.331
Short Research Communication
Identification of candidate genes for congenital splay leg in piglets by alternative analysis of DNA microarray data
1. Research Institute for the Biology of Farm Animals (FBN) Dummerstorf, D-18196 Dummerstorf, Germany
This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY-NC) License. See http://ivyspring.com/terms for full terms and conditions.
How to cite this article:
Maak S, Boettcher D, Tetens J, Wensch-Dorendorf M, Nürnberg G, Wimmers K, Swalve HH, Thaller G. Identification of candidate genes for congenital splay leg in piglets by alternative analysis of DNA microarray data. Int J Biol Sci 2009; 5(4):331-337. doi:10.7150/ijbs.5.331. Available from http://www.ijbs.com/v05p0331.htm
The congenital splay leg syndrome in piglets is characterized by a temporarily impaired functionality of the hind leg muscles immediately after birth. The etiology and pathogenetic mechanisms of the disease are still not well understood. We compared genome wide gene expression of three hind leg muscles (Mm. adductores, M. gracilis and M. sartorius) between affected piglets and their healthy littermates with the GeneChip® Porcine Genome Array (Affymetrix) in order to identify candidate genes for the disease. Data analysis with standard algorithms revealed no significant differences between the two groups. By applying an alternative approach, we identified 63 transcripts with differences in two muscles and 5 genes differing between the groups in three muscles. The expression of six selected genes (SQSTM1, SSRP1, DDIT4, ENAH, MAF, and PDK4) was investigated with SYBR Green RT - Real time PCR. The differences obtained with the microarray analysis could be confirmed, which demonstrates the validity of the alternative approach to microarray data analysis. Four genes with different expression levels in at least two muscles (SQSTM1, SSRP1, DDIT4, and MAF) are assigned to transcriptional cascades related to cell death and may thus indicate pathways for further investigations on congenital splay leg in piglets.
Keywords: congenital splay leg, piglet, microarray, RT - Real time PCR
Congenital splay leg syndrome in newborn piglets is the most frequently observed hereditary disorder in swine (1). The phenotype is characterized by an impaired ability to stand and walk due to muscular weakness of the hind limbs (2). Losses among affected piglets can amount to 50%, making congenital splay leg a source of considerable economic losses in pig production (3).
Despite numerous investigations, the pathogenesis and the etiology of the disease are still poorly understood. Histomorphological investigations, analysis of biochemical criteria as well as investigations on putative candidate genes led to contradictory results (4-8). Ooi et al. (2006) described congenital splay leg as muscle fiber atrophy characterized by an increased expression of the atrophy marker FBXO32 (atrogin, MAFbx) and histological signs of a generalized muscle fiber hypoplasia in skeletal muscles of splay leg piglets (6). However, our investigations could not fully confirm these observations (8). Instead, we observed a large individual variability in FBXO32 expression as well as in histological characteristics of hind limb muscles, which has previously been described (9, 10). A congenital, impaired functionality of skeletal hind limb muscles due to immaturity and/or atrophic properties is therefore likely to be the major pathomorphological feature in splay leg syndrome.
Recent advances in transcriptomics in swine have opened new opportunities for a global survey on the genetic background of complex traits [see (11) for review, (12)]. Consequently, we employed comparative, genome wide expression profiling of individual hind leg muscles derived from affected piglets and their healthy littermates. The objective of this investigation was to detect expression differences and, subsequently, to identify candidate genes for further investigations on congenital splay leg in piglets.
2. Materials and Methods
2.1 Sample collection and extraction of total RNA
Three male splay leg piglets and 3 healthy littermates of similar birth weight (1,795 ± 217 g) were euthanized immediately post partum in accordance with German animal protection legislation. Samples of M. sartorius, M. gracilis and Mm. adductores were prepared from each animal, snap frozen and stored at -70°C for further preparation.
Total RNA was isolated using TRIzol Reagent (Sigma, Taufkirchen, Germany) according to the manufacturer's protocol. After DNaseI treatment the RNA was cleaned up with the RNeasy Kit (Qiagen, Hilden, Germany). The quantity of RNA was established using the NanoDrop ND-1000 spectrophotometer (Peqlab, Erlangen, Germany) and the integrity was checked by running 1 μg of RNA on a 1% agarose gel. In addition, absence of DNA contamination was checked using the RNA as a template in standard PCR amplifying fragments of PRL32 and HPRT. The RNA samples were stored at -70°C until processing.
2.2 Array analysis
Muscle expression patterns were assessed using the GeneChip® Porcine Genome Array (Affymetrix, Santa Clara, CA, USA). This array contains 24,123 probe sets representing transcripts from 20,201 Sus scrofa genes. Tsai et al. (2006) improved the annotation of the array by assigning approximately 82% of the transcripts to 11,265 different porcine genes (13). Fragmentation and labeling were performed with the GeneChip® Terminal Labeling Kit (Affymetrix, Santa Clara, CA, USA) according to the manufacturer's recommendations. Five µg of total RNA per sample were used for preparation of antisense, biotinylated RNA for hybridization.
Hybridization, washing and scanning of the arrays as well as primary data analysis with Affymetrix GCOS 1.3 software were done by Atlas Biolabs (Berlin, Germany). The raw data files were provided along with a summary of the analysis containing probe set identification, quality measures for the hybridization, the relative expression value and a qualitative measure for the probe sets (present, absent or marginal) for each individual array.
2.3 Data analysis
Bioinformatic analysis of the microarray data was done in 3 steps: (A) quality control of all arrays, (B) preprocessing of all arrays (background correction, normalization, summary measures for probe sets), and (C) identification of differently expressed genes.
Quality control, data preprocessing and statistical analysis were performed using the R statistical language (Bioconductor packages, http://www.bioconductor.org/), employing methods described by Bolstad et al. (14, 15).
After quality control all arrays could be used for further analysis. Background correction was performed using GCRMA (16), normalization by quantile normalization, and summary measures for probe sets were obtained by median polish.
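The quantile normalization step can be illustrated with a minimal sketch. It is not the Bioconductor implementation used in the study; the function name `quantile_normalize` and the toy intensities are assumptions for illustration, ties are broken by position rather than averaged, and background correction and median polish are not shown.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per array).

    Hypothetical helper for illustration only: every array is forced onto
    the same empirical distribution, namely the mean of the sorted values
    at each rank. Ties are broken by probe position, not averaged.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Sort each array and average across arrays at every rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy example with two arrays of three probes each (made-up intensities).
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization each array contains the same set of values, so remaining differences between arrays reflect only the rank order of the probes.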
Affymetrix IDs were mapped to the corresponding gene symbols based on the assignments available from the Ensembl database (http://www.ensembl.org), and mean values over all Affymetrix IDs belonging to the same gene symbol were calculated. Because pairs of "Control" and "Splay leg" piglets are full siblings, a paired t-test was used to identify significantly differentially expressed genes (significance level p = 0.05). These test results were adjusted for multiple testing using the false discovery rate (FDR), q-value method (17).
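The logic of a paired test followed by a multiple-testing adjustment can be sketched as follows. Note the substitution: the study used the q-value method, whereas this sketch uses the simpler Benjamini-Hochberg step-up adjustment as a stand-in. The function names and the numbers in the example are hypothetical.

```python
import math


def paired_t_statistic(control, affected):
    """Paired t statistic for sibling pairs (control[i] is paired with affected[i])."""
    diffs = [a - c for c, a in zip(control, affected)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean / math.sqrt(var / n)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (a stand-in for the q-value method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up values: three sibling pairs for one gene, and four raw p-values.
print(paired_t_statistic([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

With tens of thousands of probe sets and only three sibling pairs, raw p-values have to be very small to survive such an adjustment.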
As an alternative method for the detection of differentially expressed transcripts we used the data processed with Affymetrix GCOS 1.3 for further analysis with different options of SAS (v. 9.1; 18). (I) Probe sets with "present" signals in all six samples per muscle were filtered and used for subsequent processing. (II) A rank sum test (Wilcoxon, p < 0.05) identified those probe sets with no overlap of the values between the two groups ("Control", n=3 vs. "Splay leg", n=3). (III) This data subset was further reduced by pairwise comparisons of the means within each probe set between the experimental groups (Student's t-test; p < 0.05). The resulting lists for each of the three muscles (M. sartorius, M. gracilis and Mm. adductores) were compared for overlaps and analyzed with different options of the DAVID Bioinformatic resources (19) for the identification of functional pathways overrepresented in the data set. A total of 6 probe sets were selected for validation experiments with reverse transcription - real time PCR (RT - Real time PCR).
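The three filter steps can be sketched for a single probe set as shown below. This is an illustrative reconstruction, not the SAS code used in the study. With three samples per group, the rank sum statistic takes its most extreme value exactly when the two groups do not overlap, so step II is written as a direct overlap check. The detection call code "P", the function names, and the example values are assumptions. The critical value 2.776 is the two-sided 5% point of Student's t with 4 degrees of freedom and therefore applies only to a 3 vs. 3 comparison.

```python
import math

# Two-sided 5% critical value of Student's t with 4 degrees of freedom (3 + 3 - 2).
T_CRIT_DF4 = 2.776


def all_present(calls):
    """Step I: keep probe sets called present ("P", assumed code) in all samples."""
    return all(c == "P" for c in calls)


def no_overlap(control, affected):
    """Step II: the value ranges of the two groups must not overlap."""
    return max(control) < min(affected) or max(affected) < min(control)


def student_t(control, affected):
    """Step III: two-sample Student's t statistic with pooled variance."""
    n1, n2 = len(control), len(affected)
    m1, m2 = sum(control) / n1, sum(affected) / n2
    ss1 = sum((x - m1) ** 2 for x in control)
    ss2 = sum((x - m2) ** 2 for x in affected)
    pooled = (ss1 + ss2) / (n1 + n2 - 2)
    if pooled == 0:
        return math.inf if m1 != m2 else 0.0
    return (m2 - m1) / math.sqrt(pooled * (1 / n1 + 1 / n2))


def passes_filter(calls, control, affected, t_crit=T_CRIT_DF4):
    """Apply steps I to III in order; a probe set must pass all of them."""
    if not all_present(calls):
        return False
    if not no_overlap(control, affected):
        return False
    return abs(student_t(control, affected)) > t_crit


# Made-up signal values for one probe set (3 control, 3 splay leg).
print(passes_filter(["P"] * 6, [100.0, 110.0, 120.0], [200.0, 210.0, 220.0]))
```

The third step matters because non-overlapping groups can still have a small difference in means relative to their spread.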
2.4 RT - Real time PCR
Each 250 ng of the RNA preparation used for Array analysis was subjected to reverse transcription with TaqMan® Reverse Transcription Reagents (Applied Biosystems, Darmstadt, Germany) essentially as described by the manufacturer with the supplied random hexamer primers in a ThermalCycler TC1 (Biometra, Weiterstadt, Germany). The resulting cDNA samples were used for Real time PCR amplification of six genes.
The primers were derived from the same expressed sequence tags (ESTs) used for the development of the respective probe sets on the Affymetrix GeneChip® Porcine Genome Array. A fragment of the porcine 18S rRNA gene was used for normalization (Table 1).
Table 1. Primer sequences for the genes analyzed with RT - Real time PCR
* These porcine ESTs are also the basis for the following probe sets on the array: SQSTM1: Ssc.3612.1.S1_at; SSRP1: Ssc.4170.1.1S1_at; ENAH: Ssc.3771.1.A1_at; PDK4: Ssc.10131.1.A1_at; MAF: Ssc.15325.1.S1_at; DDIT4: Ssc.4104.1.S1_at; 18S rRNA: AFFX-SSC-18SrRNA_at.
Real time PCR amplification was performed for all genes under the following conditions on an ABI Prism 7000 SDS (Applied Biosystems, Darmstadt, Germany): 2 min at 50 °C, 15 min at 95 °C, followed by 45 cycles of 15 s at 95 °C, 30 s at Ta (Table 1), and 30 s at 72 °C. Melting curve analysis (60-95 °C) and gel electrophoresis (3% agarose) were used for assessing amplification specificity. The reaction volume of 25 µl contained 12.5 µl ImmoMix™ (Bioline, Luckenwalde, Germany) with SYBR® Green and ROX as internal standard, 300 nM of the respective primers (3 µl each), 0.5 µl UNG (uracil-DNA glycosylase), 3.5 µl nuclease free water and 2.5 µl cDNA. All samples were run in triplicate. Analysis of the expression data was done according to the relative standard curve method (20). A standard curve was derived for each single gene from a serial dilution of the cDNA. Expression values were normalized to the individual expression of 18S rRNA.
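The relative standard curve calculation can be sketched as follows: a line is fitted to threshold cycle (Ct) against log10 of the relative input amount of the dilution series, sample Ct values are converted back to relative amounts, and the target amount is divided by the 18S rRNA amount. The function names and all numbers are hypothetical, and averaging the triplicate Ct values before conversion is an assumption, since the text does not state how the triplicates were combined.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a relative quantity."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_cts, reference_cts, target_curve, reference_curve):
    """Target quantity divided by reference (here 18S rRNA) quantity.

    Replicate Ct values are averaged first (an assumption of this sketch).
    """
    target_ct = sum(target_cts) / len(target_cts)
    reference_ct = sum(reference_cts) / len(reference_cts)
    target_qty = quantity_from_ct(target_ct, *target_curve)
    reference_qty = quantity_from_ct(reference_ct, *reference_curve)
    return target_qty / reference_qty


# Made-up tenfold dilution series and triplicate Ct values.
gene_curve = fit_standard_curve([1.0, 0.1, 0.01], [20.0, 23.32, 26.64])
rrna_curve = fit_standard_curve([1.0, 0.1, 0.01], [10.0, 13.32, 16.64])
print(normalized_expression([23.3, 23.32, 23.34], [10.0, 10.0, 10.0],
                            gene_curve, rrna_curve))
```

A separate curve per gene absorbs differences in amplification efficiency between primer pairs, which is why the method does not assume a perfect doubling per cycle.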
3. Results and Discussion
3.1 Gene expression microarray analysis
The analysis of microarrays in different skeletal muscles of the hind legs of splay leg piglets aimed at the generation of a list of genes differentially expressed between affected and healthy muscle. We initially selected two muscles involved in adduction of the hind legs (Mm. adductores, M. gracilis) for the analysis of gene expression profiles immediately after birth. Additionally, we investigated a muscle involved in abduction (M. sartorius) in order to detect more generalized defects in functionally different muscles of the hind leg.
Standard data analysis (PLIER or GCRMA with correction for multiple testing, false discovery rate [FDR] < 0.10) revealed no significant differences in gene expression between the two groups of piglets. Although we expected significant differences between affected piglets and controls, similar results have been reported for other species. Bye et al. (2008) investigated soleus muscle of divergently selected rats (maximum oxygen uptake) under exercise conditions and found only three genes regulated in this experiment (21). Thus, we employed an alternative approach for the identification of potential candidate genes.
After three consecutive filter steps, 230 (Mm. adductores: 88 upregulated and 142 downregulated in splay leg piglets), 300 (M. gracilis: 118 up, 172 down), and 412 genes (M. sartorius: 204 up, 208 down) remained for further analyses, respectively (Figure 1a). Merging the three lists resulted in 18 - 24 genes being regulated in two different muscles and 5 genes with differences in three muscles (Figure 1b).
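Merging the per-muscle lists amounts to set intersections, as in the minimal sketch below. The function name and the placeholder gene identifiers are hypothetical, and counting a gene shared by all three muscles separately from the pairwise overlaps is an assumption about how the overlaps were tallied.

```python
from itertools import combinations


def overlap_summary(gene_lists):
    """Return (pairwise overlaps, genes found in all lists).

    gene_lists maps a muscle name to an iterable of gene identifiers.
    Genes present in every list are excluded from the pairwise overlaps
    (an assumption of this sketch).
    """
    names = sorted(gene_lists)
    sets = {name: set(gene_lists[name]) for name in names}
    in_all = set.intersection(*sets.values())
    pairwise = {}
    for a, b in combinations(names, 2):
        pairwise[(a, b)] = (sets[a] & sets[b]) - in_all
    return pairwise, in_all


# Placeholder identifiers, not genes from the study.
lists = {
    "adductores": ["g1", "g2", "g3", "g5"],
    "gracilis": ["g1", "g2", "g4"],
    "sartorius": ["g1", "g3", "g4", "g6"],
}
pairwise, in_all = overlap_summary(lists)
print(pairwise)
print(in_all)
```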
Interestingly, none of six atrophy related genes on the array (SMN1, CBLB, CAST, FBXO32, FOXO1A, SGCD) was differentially expressed in the investigated muscles. Mutations in the human SMN1 gene cause spinal muscular atrophy (22), whereas defects in the genes for sarcoglycans (e.g. SGCD) are related to human limb-girdle muscular dystrophy (23). FBXO32, FOXO1A and CBLB were found to be highly upregulated in patients with atrophy of skeletal muscle (24-26). Upregulation of calpastatin (CAST) was demonstrated to slow atrophic processes in transgenic mice (27). The porcine homologs of these genes were unaltered in our investigation, which makes the assumed atrophy (6) unlikely to be the cause of congenital splay leg.
Functional clustering of the filtered genes [DAVID, (19)] revealed different pathways significantly regulated in the individual muscles. The biological processes with the highest enrichment scores are "programmed cell death" (GO:0012501 in Mm. adductores), "regulation of transcription, DNA-dependent" (GO:0006355 in M. gracilis) and "cellular protein catabolic process" (GO:0044257 in M. sartorius). However, the term "programmed cell death" ranked second in M. gracilis indicating similarities between both adducting muscles. We selected one gene from the top ranked pathways in each muscle (DDIT4, SSRP1 and SQSTM1).
Figure 1. (A) Three-step filtering of the array data. The GeneChip® Porcine Genome Array contains 23,903 probe sets excluding controls. The solid line encircles the number of probe sets identified as present in all 6 samples per muscle. The dotted line encloses the probe sets with significant differences (p<0.05) between the groups "Control" and "Splay leg" within each muscle according to the rank sum test (Wilcoxon). The dashed lines include the probe sets with significant differences (p<0.05) of the means between the groups (t-test) within each muscle. (B) Overlapping genes in different hind limb muscles with differential expression between control and splay leg piglets. Of the 412 (M. sartorius), 300 (M. gracilis) and 230 (Mm. adductores) probe sets identified as different by the filtering procedure, between 18 and 24 genes (boxed) showed overlap between two muscles whereas 5 probe sets were identified in all three muscles (below the line in the boxes). The official gene symbols are given in the boxes. In case of false or unclear annotation of the porcine array, the GenBank accessions of the respective porcine ESTs (1) are given.
DDIT4 (DNA-damage-inducible transcript 4) was shown to be a p53 target gene and is part of a universal expression response to oxidative stress (28). SQSTM1 (sequestosome 1) was among the few genes that were upregulated in skeletal muscle after severe undernutrition in cattle. The putative function of SQSTM1 in skeletal muscle is the regulation of protein degradation (29). SSRP1 (structure specific recognition protein 1) was identified as a co-activator of the transcriptional activator p63 (30).
Additionally, we chose three further genes with expression ratios > 2 from M. gracilis (MAF), Mm. adductores (ENAH), and M. sartorius (PDK4) for confirmation of differential expression by reverse transcription - real time PCR. MAF (v-maf musculoaponeurotic fibrosarcoma oncogene homolog) is important for developmental processes and is able to induce p53-mediated cell death (31, 32), thus being part of the same signal cascade as SSRP1. In contrast, ENAH (enabled homolog) is required for neural development (33) and has not yet been described in skeletal muscle. PDK4 encodes a subunit of the pyruvate dehydrogenase kinase and was recently associated with muscle water content and intramuscular fat in swine (34).
MAF was identified as different in Mm. adductores and M. gracilis in the array experiment, DDIT4 was upregulated in Mm. adductores and M. sartorius of splay leg piglets, whereas SQSTM1 and SSRP1 were differently expressed in all investigated muscles.
Figure 2. Comparison of relative expression values from microarray analysis (solid bars) and RT - Real time PCR (dashed bars) for six potential candidate genes for congenital splay leg in piglets. The array data represent the means (relative expression x 10³) from three samples per muscle for control (blue) and splay leg piglets (red). The RT - Real time PCR results are based on triplicate measurements (relative expression) for each sample and three samples per group (control: blue dashed, splay leg: red dashed). Significant group differences (p < 0.05) are denoted by asterisks, trends (0.05 < p < 0.10) are indicated by hash marks.
3.2 Confirmation of differential expression by RT - Real time PCR
RT - Real time PCR was performed with 6 genes identified as potential candidate genes for congenital splay leg from the alternative analysis of the genome wide expression data. The fragments used for primer design were derived from the same porcine ESTs underlying the respective, differently expressed probe sets on the microarray. Before primer design, the porcine ESTs were tested against the current human genome build 36.3 for correctness of the annotation. This could be confirmed for the investigated ESTs (Table 1). Whereas SSRP1 and DDIT4 were each represented by a single probe set on the array, two (SQSTM1, PDK4), three (MAF) and four (ENAH) probe sets were annotated for the other genes. The second probe set annotated as SQSTM1 (Ssc.6231.1.A1_at), however, revealed no similarity with the gene but with ORAI2. This demonstrates that errors in the current annotation of the Affymetrix GeneChip® Porcine Genome Array (13) could bias subsequent analyses.
The comparison of the results from microarray analysis and RT - Real time PCR is given in Figure 2. The numerical trend (up- or down-regulation) of the differences could be confirmed for all investigated genes. However, due to the higher dynamic range of the RT - Real time PCR results compared to the array data, only the differences for ENAH (M. gracilis) and MAF (Mm. adductores) reached the significance level (Figure 2).
The RT - Real time PCR for MAF was performed in two independent experiments as an example to test the reproducibility of the results (Figure 3). The correlation between the two series over three muscles was 0.97 (n = 18) with a range from 0.92 (Mm. adductores) to 0.99 (M. sartorius). Relative expression values measured with RT - Real time PCR were correlated with the relative expression data on the array, with coefficients ranging from 0.73 to 0.90. This is in accordance with investigations of Arikawa et al. (2008) demonstrating high comparability of data derived from SYBR Green based Real time PCR with high density microarray results (35). The confirmed differential expression of the selected genes in this study provides the basis for their further evaluation as candidate genes for congenital splay leg in piglets.
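The text does not name the correlation coefficient used. Assuming the Pearson product-moment coefficient, agreement between two measurement series could be computed as in this sketch; the function name and the example values are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up relative expression values from two hypothetical runs.
run_1 = [1.0, 2.0, 3.0, 4.0]
run_2 = [1.1, 1.9, 3.2, 3.9]
print(pearson_r(run_1, run_2))
```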
Figure 3. Repeatability of RT - Real time PCR results for MAF. Two independent experiments resulted in comparable differences in the relative mRNA abundance of MAF between control (C: solid bars) and splay leg piglets (S: dashed bars) in three muscles. The numbers above the brackets denote the p-values for the comparison of the means within each muscle (see legend).
This study demonstrates a successful application of alternative analysis methods to detect differentially expressed genes in a genome wide expression study when standard methods fail. SYBR Green RT - Real time PCR proved to be a reliable method to confirm the results of the array analysis. Four genes with different expression levels in at least two muscles (SQSTM1, SSRP1, DDIT4, and MAF) are assigned to transcriptional cascades related to cell death and may thus indicate pathways for further investigations on congenital splay leg in piglets.
This study was supported in part by the German Federal Ministry of Education and Research (BMBF), Program FUGATO - Functional Genome Analysis in Animal Organisms (Project HeDiPig - Hereditary Diseases in Pig; grant no. 0313392D). The helpful comments of three anonymous reviewers are acknowledged.
Conflict of interests
The authors declare that no conflict of interest exists.
1. Partlow GD, Fisher KR, Page PD. et al. Prevalence and types of birth defects in Ontario swine determined by mail survey. Can J Vet Res. 1993;57:67-73
2. Thurley DC, Gilbert FR, Done JT. Congenital splayleg of piglets: myofibrillar hypoplasia. Vet Rec. 1967;80:302-304
3. Dobson KJ. Congenital splayleg of piglets. Aust Vet J. 1968;44:26-28
4. Ward PS, Bradley R. The light microscopical morphology of the skeletal muscles of normal pigs and pigs with splayleg from birth to one week of age. J Comp Pathol. 1980;90:421-431
5. Jirmanova I. The splayleg disease: a form of congenital glucocorticoid myopathy?. Vet Res Commun. 1983;6:91-101
6. Ooi PT, da Costa N, Edgar J, Chang KC. Porcine congenital splayleg is characterised by muscle fibre atrophy associated with relative rise in MAFbx and fall in P311 expression. BMC Vet Res. 2006;2:23
7. Boettcher D, Paul S, Bennewitz J. et al. Exclusion of NFYB as candidate gene for congenital splay leg in piglets and radiation hybrid mapping of further five homologous porcine genes from human chromosome 12 (HSA12). Cytogenet Genome Res. 2007;118:67-71
8. Boettcher D, Schmidt R, Rehfeldt C. et al. Evaluation of MAFbx expression as a marker for congenital splay leg in piglets. Dev Biol (Basel). 2008;132:301-306
9. Björklund NE, Svendsen J, Svendsen LS. Histomorphological studies of the perinatal pig: comparison of five mortality groups with unaffected pigs. Acta Vet Scand. 1987;28:105-116
10. Curvers P, Ducatelle R, Vandekerckhove P. et al. Morphometric evaluation of myofibrillar hypoplasia in splayleg piglets. Dtsch Tierarztl Wochenschr. 1989;96:189-191
11. Tuggle CK, Wang Y, Couture O. Advances in swine transcriptomics. Int J Biol Sci. 2007;3:132-152
12. Ponsuksili S, Jonas E, Murani E. et al. Trait correlated expression combined with expression QTL analysis reveals biological pathways and candidate genes affecting water holding capacity of muscle. BMC Genomics. 2008;9:367
13. Tsai S, Cassady JP, Freking BA. et al. Annotation of the Affymetrix porcine genome microarray. Anim Genet. 2006;37:423-424
14. Bolstad BM, Irizarry RA, Åstrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on bias and variance. Bioinformatics. 2003;19:185-193
15. Bolstad BM, Collin F, Brettschneider J. et al. Quality Assessment of Affymetrix GeneChip Data in Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Gentleman R, Carey V, Huber W, Irizarry R, Dudoit S, eds. Statistics for Biology and Health, Heidelberg: Springer. 2005:13-47
16. Wu Z, Irizarry RA, Gentleman R. et al. A Model-Based Background Adjustment for Oligonucleotide Expression Arrays. J Am Stat Ass. 2004;99:909-917
17. Storey JD. A direct approach to false discovery rates under dependence. J R Stat Soc Ser B. 2002;64:479-498
18. SAS Institute Inc. SAS Version 9.1. Cary, NC, USA: SAS Institute Inc. 2003
19. Dennis GJr, Sherman BT, Hosack DA. et al. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 2003;4:P3
20. Applied Biosystems (User Bulletin # 2). http://www3.appliedbiosystems.com/cms/groups/mcb_support/documents/generaldocuments/cms_040980.pdf
21. Bye A, Høydal MA, Catalucci D. et al. Gene expression profiling of skeletal muscle in exercise-trained and sedentary rats with inborn high and low VO2max. Physiol Genomics. 2008;35:213-221
22. Gangwani L, Mikrut M, Theroux S. et al. Spinal muscular atrophy disrupts the interaction of ZPR1 with the SMN protein. Nat Cell Biol. 2001;3:376-383
23. Duggan DJ, Gorospe JR, Fanin M. et al. Mutations in the sarcoglycan genes in patients with myopathy. N Engl J Med. 1997;336:618-624
24. Csibi A, Leibovitch MP, Cornille K. et al. MAFbx/Atrogin-1 controls the activity of the initiation factor eIF3-f in skeletal muscle atrophy by targeting multiple C-terminal lysines. J Biol Chem. 2009;284:4413-4421
25. Ogawa T, Furochi H, Mameoka M. et al. Ubiquitin ligase gene expression in healthy volunteers with 20-day bedrest. Muscle Nerve. 2006;34:463-469
26. Léger B, Cartoni R, Praz M. et al. Akt signalling through GSK-3beta, mTOR and Foxo1 is involved in human skeletal muscle hypertrophy and atrophy. J Physiol. 2006;576:923-933
27. Tidball JG, Spencer MJ. Expression of a calpastatin transgene slows muscle wasting and obviates changes in myosin isoform expression during murine muscle disuse. J Physiol. 2002;545:819-828
28. Han ES, Muller FL, Pérez VI. et al. The in vivo gene expression signature of oxidative stress. Physiol Genomics. 2008;34:112-126
29. Lehnert SA, Byrne KA, Reverter A. et al. Gene expression profiling of bovine skeletal muscle in response to and during recovery from chronic and severe undernutrition. J Anim Sci. 2006;84:3239-3250
30. Zeng SX, Dai MS, Keller DM, Lu H. SSRP1 functions as a co-activator of the transcriptional activator p63. EMBO J. 2002;21:5487-5497
31. Blank V, Andrews NC. The Maf transcription factors: regulators of differentiation. Trends Biochem Sci. 1997;22:437-441
32. Hale TK, Myers C, Maitra R. et al. Maf transcriptionally activates the mouse p53 promoter and causes a p53-dependent cell death. J Biol Chem. 2000;275:17991-17999
33. Urbanelli L, Massini C, Emiliani C. et al. Characterization of human Enah gene. Biochim Biophys Acta. 2006;1759:99-107
34. Lan J, Lei MG, Zhang YB. et al. Characterization of the porcine differentially expressed PDK4 gene and association with meat quality. Mol Biol Rep. 2008 [Epub ahead of print]
35. Arikawa E, Sun Y, Wang J. et al. Cross-platform comparison of SYBR Green real-time PCR with TaqMan PCR, microarrays and other gene expression measurement technologies evaluated in the MicroArray Quality Control (MAQC) study. BMC Genomics. 2008;9:328
Correspondence to: Dr. Steffen Maak, Tel: +49-38208-68850; Fax: +49-38208-68852; E-mail: maak