Variant Calling

Last updated: Nov 15th, 2021

Alignment

Codes
  • Run single-end

    bwa mem <ref_genome.fasta> <sample.R1.fq> > <alignment.sam>

  • Run paired-end

    bwa mem <ref_genome.fasta> <sample.R1.fq> <sample.R2.fq> > <alignment.sam>

  • References: BWA
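
The bwa commands above write SAM, while HaplotypeCaller reads a coordinate-sorted, indexed BAM. A minimal sketch of that conversion follows, assuming samtools is installed. The function name, the `run` helper, and the `DRY_RUN` switch are hypothetical names introduced only for this illustration; file names are placeholders.

```shell
#!/bin/sh
# Sketch: convert bwa's SAM output into a coordinate-sorted, indexed BAM.
# `run`, `sam_to_sorted_bam` and DRY_RUN are illustrative names, not part of any tool.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

sam_to_sorted_bam() {
    sam=$1
    bam=$2
    run samtools sort -O BAM -o "$bam" "$sam"   # coordinate sort, write BAM
    run samtools index "$bam"                   # write the BAM index next to the BAM
}
```

Calling `DRY_RUN=1 sam_to_sorted_bam alignment.sam alignment.sorted.bam` only prints the two samtools commands, which is a cheap way to check the file names before running them for real.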

Variant Calling

Codes
  • Call variants with HaplotypeCaller (the input is a coordinate-sorted, indexed BAM)

    gatk HaplotypeCaller -R <ref_genome.fasta> -I <sample.sorted.bam> -O <output.gatk.vcf>

  • References: GATK
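
HaplotypeCaller also expects a FASTA index (`.fai`) and a sequence dictionary (`.dict`) next to the reference. The sketch below reports which of the two is missing and prints the command that would create it. It only prints commands and never runs them. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: list the commands needed to create missing reference index files.
# `missing_ref_indexes` is an illustrative name; nothing is executed, only printed.

missing_ref_indexes() {
    ref=$1
    dict="${ref%.*}.dict"    # ref.fa -> ref.dict, ref.fasta -> ref.dict
    [ -e "$ref.fai" ] || echo "samtools faidx $ref"
    [ -e "$dict" ]    || echo "gatk CreateSequenceDictionary -R $ref -O $dict"
}
```

An empty result from `missing_ref_indexes ref.fa` means both files are already in place.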

Variant Annotation

Codes
  • Annotate variants with SnpEff, a Java program (<ref_genome> is the name of a SnpEff genome database, not a FASTA file)

    java -Xmx8g -jar snpEff.jar -v <ref_genome> <sample_variant.vcf> > <sample_variant_annotation.vcf>

  • References: SnpEff
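
SnpEff can only annotate against a genome database that is installed locally, so a download step usually comes before the annotation. The sketch below prints both commands for a given database name and VCF. The function name and the `.ann.vcf` output naming are assumptions made for this illustration.

```shell
#!/bin/sh
# Sketch: print the SnpEff download and annotation commands for one VCF.
# `snpeff_cmds` and the ".ann.vcf" suffix are illustrative choices, not SnpEff conventions.

snpeff_cmds() {
    genome=$1                   # SnpEff database name, as listed by: java -jar snpEff.jar databases
    vcf=$2
    out="${vcf%.vcf}.ann.vcf"   # sample.gatk.vcf -> sample.gatk.ann.vcf
    echo "java -jar snpEff.jar download $genome"
    echo "java -Xmx8g -jar snpEff.jar -v $genome $vcf > $out"
}
```

The printed lines can be reviewed and then pasted into a terminal.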

Examples

    IMB Bioinformatics Core Pipeline: bbsc_portal
    fastp -w 16 -h sample.fastp.html -j sample.fastp.json -i <sample.R1.fq> -o clean_read.fastq
    bwa mem -t 16 path_to_bwa_index/ref clean_read.fastq | samtools sort -@ 16 -O BAM -o sample.sorted.bam -
    java -jar picard.jar MarkDuplicates -I sample.sorted.bam -O sample.mark.bam -M sample.metrics.txt -ASSUME_SORTED -VALIDATION_STRINGENCY LENIENT
    java -jar picard.jar AddOrReplaceReadGroups -I sample.mark.bam -O sample.rg.bam -RGID 1 -RGLB lib1 -RGPL illumina -RGPU unit1 -RGSM MyRead
    java -jar picard.jar BuildBamIndex -I sample.rg.bam
    gatk HaplotypeCaller --native-pair-hmm-threads 16 -R ref.fa -I sample.rg.bam -O sample.gatk.vcf
    java -Xmx8g -jar snpEff.jar ref.fa sample.gatk.vcf > sample.gatk.vcf.SnpEff
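
The pipeline above depends on several external programs, and a missing one only shows up when its step is reached. A small preflight check, sketched below with a hypothetical function name, reports every missing tool up front.

```shell
#!/bin/sh
# Sketch: fail early if a tool used by the pipeline is not on PATH.
# `check_tools` is an illustrative name.

check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"    # 0 if every tool was found, 1 otherwise
}

# Usage, before the first pipeline step:
#   check_tools fastp bwa samtools java gatk || exit 1
```

The check covers executables only. The picard.jar and snpEff.jar paths still need to be verified separately.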