Post by Danny Antaki on Dec 13, 2019 1:53:42 GMT
./truffle/truffle --vcf tmp.vcf --segments --nofiltering
This is a sample of my pruned VCF file
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=20191212
##source=PLINKv1.90
##contig=<ID=1,length=248945710>
##contig=<ID=2,length=242183319>
##contig=<ID=3,length=198191229>
##contig=<ID=4,length=190204446>
##contig=<ID=5,length=181476003>
##contig=<ID=6,length=170744274>
##contig=<ID=7,length=159335484>
##contig=<ID=8,length=145077617>
##contig=<ID=9,length=138252007>
##contig=<ID=10,length=133787095>
##contig=<ID=11,length=135075817>
##contig=<ID=12,length=133264677>
##contig=<ID=13,length=114353979>
##contig=<ID=14,length=106882741>
##contig=<ID=15,length=101980909>
##contig=<ID=16,length=90226010>
##contig=<ID=17,length=83243353>
##contig=<ID=18,length=80262556>
##contig=<ID=19,length=58607417>
##contig=<ID=20,length=64334014>
##contig=<ID=21,length=46699705>
##contig=<ID=22,length=50807922>
##contig=<ID=23,length=156029657>
##contig=<ID=24,length=56886150>
##contig=<ID=26,length=16292>
##INFO=<ID=PR,Number=0,Type=Flag,Description="Provisional reference allele, may not be based on real reference genome">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##bcftools_annotateVersion=1.9-259-gbd769ac-dirty+htslib-1.9-425-g565560e
##bcftools_annotateCommand=annotate -x ID -I +%CHROM:%POS0:%END:%REF:%ALT pruned.vcf; Date=Thu Dec 12 17:39:47 2019
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 1635_1635 1641_1641 1661_1661
1 14464 1:14463:14464:A:T A T . . PR GT 0/1 0/1 0/0
1 14653 1:14652:14653:C:T C T . . PR GT 0/0 0/0 0/1
1 14677 1:14676:14677:G:A G A . . PR GT 0/0 0/0 0/1
1 14907 1:14906:14907:A:G A G . . PR GT 0/1 0/1 0/1
1 14930 1:14929:14930:A:G A G . . PR GT 0/1 0/1 0/0
1 15118 1:15117:15118:A:G A G . . PR GT 0/1 0/1 0/0
The segments output and the IBD file are empty.
ID1 ID2 NMARK NCOMMON IBD0 IBD1_MAX IBD1_NSEGS IBD1 IBD2_MAX IBD2_NSEGS IBD2 SEX
1635_1635 1661_1661 5882723 0 1.000000 3492 0 0.000000 1183 0 0.000000 00
1635_1635 1641_1641 5882723 0 1.000000 2823 0 0.000000 1284 0 0.000000 00
1641_1641 1661_1661 5882723 0 1.000000 5497 0 0.000000 2319 0 0.000000 00
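In case it helps anyone reproduce the check, here is a minimal sketch for reading the pairwise table above and listing the pairs that report no IBD1 or IBD2 segments. The column names are taken from the header row shown above; the function names `parse_pairs` and `pairs_with_no_segments` are hypothetical helpers of mine, not part of truffle, and the rows are pasted inline rather than read from truffle.ibd.

```python
# Hypothetical helpers; column names come from the header row of the table above.
def parse_pairs(text):
    """Split a whitespace-delimited table into one dict per data row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def pairs_with_no_segments(rows):
    """Return (ID1, ID2) for rows whose IBD1 and IBD2 segment counts are both 0."""
    return [
        (r["ID1"], r["ID2"])
        for r in rows
        if int(r["IBD1_NSEGS"]) == 0 and int(r["IBD2_NSEGS"]) == 0
    ]


# The three rows from this post, pasted inline.
SAMPLE = """\
ID1 ID2 NMARK NCOMMON IBD0 IBD1_MAX IBD1_NSEGS IBD1 IBD2_MAX IBD2_NSEGS IBD2 SEX
1635_1635 1661_1661 5882723 0 1.000000 3492 0 0.000000 1183 0 0.000000 00
1635_1635 1641_1641 5882723 0 1.000000 2823 0 0.000000 1284 0 0.000000 00
1641_1641 1661_1661 5882723 0 1.000000 5497 0 0.000000 2319 0 0.000000 00
"""

rows = parse_pairs(SAMPLE)
for id1, id2 in pairs_with_no_segments(rows):
    print(id1, id2, "no IBD1/IBD2 segments reported")
```

With the rows above, all three pairs are listed, which matches the empty segments output.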
[*] Options in effect:
- Input file: tmp.vcf
- Number of CPUs: 4
- Reporting threshold: all pairs
- Segment reporting: YES
- Input file name: tmp.vcf
- Opening output file truffle.ibd
- Opening output file truffle.segments
- Number of samples: 3
- Allocation genotype vector npeople=19 nvars=200000
- GenotypeMatrix: allocating 3 MB of memory
- Excluding variants with missing rate > 0.020 (0 samples)
- Excluding variants with allele frequency < -100.000
- Reading chromosome 1 (pos=0)
- Allocation genotype vector npeople=19 nvars=400000
- GenotypeMatrix: allocating 7 MB of memory
- Allocation genotype vector npeople=19 nvars=800000
- GenotypeMatrix: allocating 14 MB of memory
- Reading chromosome 2 (pos=442217)
- Allocation genotype vector npeople=19 nvars=1600000
- GenotypeMatrix: allocating 28 MB of memory
- Reading chromosome 3 (pos=942887)
- Reading chromosome 4 (pos=1347126)
- Allocation genotype vector npeople=19 nvars=3200000
- GenotypeMatrix: allocating 57 MB of memory
- Reading chromosome 5 (pos=1768116)
- Reading chromosome 6 (pos=2142478)
- Reading chromosome 7 (pos=2526643)
- Reading chromosome 8 (pos=2865423)
- Reading chromosome 9 (pos=3183857)
- Allocation genotype vector npeople=19 nvars=6400000
- GenotypeMatrix: allocating 115 MB of memory
- Reading chromosome 10 (pos=3441790)
- Reading chromosome 11 (pos=3739283)
- Reading chromosome 12 (pos=4037772)
- Reading chromosome 13 (pos=4309506)
- Reading chromosome 14 (pos=4540944)
- Reading chromosome 15 (pos=4732534)
- Reading chromosome 16 (pos=4909416)
- 5113144 variants read - 5027804 variants kept - 311 MB - speed: 30.29 MB/sec
- Reading chromosome 17 (pos=5089858)
- Reading chromosome 18 (pos=5250158)
- Reading chromosome 19 (pos=5420553)
- Reading chromosome 20 (pos=5556621)
- Reading chromosome 21 (pos=5696921)
- Reading chromosome 22 (pos=5786828)
- Chromosome 1 Markers 442217 Cumulative 442217
- Chromosome 2 Markers 500670 Cumulative 942887
- Chromosome 3 Markers 404239 Cumulative 1347126
- Chromosome 4 Markers 420990 Cumulative 1768116
- Chromosome 5 Markers 374362 Cumulative 2142478
- Chromosome 6 Markers 384165 Cumulative 2526643
- Chromosome 7 Markers 338780 Cumulative 2865423
- Chromosome 8 Markers 318434 Cumulative 3183857
- Chromosome 9 Markers 257933 Cumulative 3441790
- Chromosome 10 Markers 297493 Cumulative 3739283
- Chromosome 11 Markers 298489 Cumulative 4037772
- Chromosome 12 Markers 271734 Cumulative 4309506
- Chromosome 13 Markers 231438 Cumulative 4540944
- Chromosome 14 Markers 191590 Cumulative 4732534
- Chromosome 15 Markers 176882 Cumulative 4909416
- Chromosome 16 Markers 180442 Cumulative 5089858
- Chromosome 17 Markers 160300 Cumulative 5250158
- Chromosome 18 Markers 170395 Cumulative 5420553
- Chromosome 19 Markers 136068 Cumulative 5556621
- Chromosome 20 Markers 140300 Cumulative 5696921
- Chromosome 21 Markers 89907 Cumulative 5786828
- Chromosome 22 Markers 95894 Cumulative 5882722
[*] Finished reading input files: 11744.93 ms
- Expected IBS for unrelated pairs: p.IBS0 = 0.0662 p.IBS1 = 0.3971 p.IBS2 = 0.5366
- X-chromosome: pIBS2(X/X) = 0.6690 pIBS2(X/XX) = 0.1655 Mean MAF = 0.2501
- (ibs_est) 547 p=0.933764 perr=0.000000
- (ibs_est) 61 p=0.536638 perr=0.000000
- (ibs_est) 34 p=0.331034 perr=0.000000
- (ibs_est) 207 p=0.834483 perr=0.000000
- (ibs_est) Thresholds for defining regions as IBD1 = 547, IBD2 = 61
- (ibs_est) Inflation at edges IBS1 = 28.2, IBS2 = 1.7
- Genome size: 2765514.816 kb
- gs_threshold=23281.893794
- (ibs_est) Adjust segment size to 23281 markers (from 547)
- IBS1 threshold : 0.40 % of genome (23281 markers)
- IBS2 threshold : 0.07 % of genome (3880 markers)
[*] Starting IBD segment analysis
[*] Estimated heterozygosity rate = 0.33025
- Allocation genotype vector transposed npeople=4 nvars=5882723
- GenotypeMatrixTransposed: allocating 22 MB of memory
[*] Reading sex information file: (null)
- No sex file specified
- Warning: Sample 1635_1635 has no available sex information.
- Warning: Sample 1641_1641 has no available sex information.
- Warning: Sample 1661_1661 has no available sex information.
[*] Genotype pre-processing duration: 12784.79 ms
- Compute IBD by IBS: (cpu=4/4) Nind = 3 Nvar = 5882723
- Compute IBD by IBS: (cpu=1/4) Nind = 3 Nvar = 5882723
- Compute IBD by IBS: (cpu=3/4) Nind = 3 Nvar = 5882723
- Compute IBD by IBS: (cpu=2/4) Nind = 3 Nvar = 5882723
[*] Finished processing
- Total time for analysis was 0.21 minutes (12.9 seconds)