CopySeq is a small, simple software tool that analyzes the depth of coverage of high-throughput DNA sequencing reads to infer locus copy-number genotypes. It can also integrate CNV-analysis approaches based on paired-end and breakpoint-junction analysis.
CopySeq revealed that olfactory receptor (OR) loci display an extensive range of locus copy-numbers across individuals, with zero to two copies at some OR loci and two to nine copies at others.
Among genetic variants affecting OR loci, we identified deleterious variants, including CNVs and SNPs, that affect ~15% and ~20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread.
Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying that the OR repertoire needs to be revised for future functional studies.
CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
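The depth-of-coverage idea described above can be illustrated with a minimal sketch. This is not CopySeq's actual algorithm, only a simplified illustration: it assumes that read depth scales linearly with copy number, that the depths have already been corrected for biases such as GC content, and that a set of reference windows known to be diploid is available. The function name `estimate_copy_number` and its parameters are hypothetical and do not come from CopySeq.

```python
from statistics import median


def estimate_copy_number(locus_depths, reference_depths, ploidy=2):
    """Estimate an integer copy number for one locus from read depth.

    Hypothetical helper for illustration only (not part of CopySeq).
    locus_depths: per-window read depths inside the locus of interest.
    reference_depths: per-window read depths from regions assumed to be
        present at the normal copy number (`ploidy`, 2 for autosomes).
    """
    if not locus_depths or not reference_depths:
        raise ValueError("depth lists must be non-empty")

    # Median is used instead of mean so single outlier windows matter less.
    baseline = median(reference_depths)
    if baseline <= 0:
        raise ValueError("baseline depth must be positive")

    # Depth ratio relative to the diploid baseline, scaled to copy number.
    ratio = median(locus_depths) / baseline
    return max(0, round(ploidy * ratio))


if __name__ == "__main__":
    reference = [30, 29, 31, 30, 30]
    print(estimate_copy_number([0, 0, 1], reference))     # homozygous deletion
    print(estimate_copy_number([30, 31, 29], reference))  # normal diploid locus
    print(estimate_copy_number([60, 62, 58], reference))  # duplicated locus
```

In this toy setting a locus sequenced at roughly twice the baseline depth is called at four copies, and a locus with essentially no reads is called at zero copies. A real analysis would also need to handle mappability, GC bias, and uncertainty in the calls.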
Requirements:
· Java 1.6 or later
· R 2.11 or later
· Jim Kent's twoBitToFa program
· Human genome (hg18) in .2bit file format