  • ISSN 1674-8301
  • CN 32-1810/R
Honggang Yi, Hongmei Wo, Yang Zhao, Ruyang Zhang, Junchen Dai, Guangfu Jin, Hongxia Ma, Tangchun Wu, Zhibin Hu, Dongxin Lin, Hongbing Shen, Feng Chen. Comparison of dimension reduction-based logistic regression models for case-control genome-wide association study: principal components analysis vs. partial least squares[J]. The Journal of Biomedical Research, 2015, 29(4): 298-307. DOI: 10.7555/JBR.29.20140043

Comparison of dimension reduction-based logistic regression models for case-control genome-wide association study: principal components analysis vs. partial least squares

  • With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie complex human diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR applicable to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR, which is corrected by the Bonferroni method, was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and they both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
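
As a minimal sketch of the Bonferroni-corrected single-locus strategy mentioned in the abstract, the following Python snippet turns per-SNP p-values from one SNP set into a single set-level p-value. The function name `bonferroni_set_pvalue` and the example p-values are hypothetical and are not taken from the paper; the authors' exact procedure may differ.

```python
def bonferroni_set_pvalue(snp_pvalues):
    """Set-level p-value: smallest single-SNP p-value times the number of SNPs, capped at 1."""
    if not snp_pvalues:
        raise ValueError("need at least one p-value")
    m = len(snp_pvalues)  # number of SNPs tested in the set
    return min(1.0, m * min(snp_pvalues))


# Made-up p-values for illustration only (not data from the study).
example_pvalues = [0.012, 0.20, 0.47, 0.03]
set_p = bonferroni_set_pvalue(example_pvalues)
print(f"set-level p-value: {set_p:.3f}")
```

Because this correction multiplies by the full number of SNPs even when they are strongly correlated through linkage disequilibrium, the resulting test tends to be conservative, which is consistent with the behavior of LR reported above.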