The pould package (pronounced “pooled”) calculates four linkage disequilibrium (LD) statistics – D’, Wn and the conditional asymmetric LD (cALD) measures, WA/B and WB/A, for genotype data from pairs of genetic loci, and can treat these data as either phased or unphased for these calculations. The package includes a wrapper function that parses either column-formatted genotypes or multi-locus haplotypes in the 17th International HLA and Immunogenetics Workshop (IHIW) Family Haplotype Project’s HaplObserve output format. This wrapper function generates output files in a user-defined directory.
The package includes a function that applies a sign test to LD values
for phased and unphased haplotypes generated by the wrapper function for
haplotype- or genotype-formatted datasets, and a function that generates
heat-maps for each LD measure. For more information, see:
Osoegawa et al. Hum Immunol. 2019;80(9):633-643.
Osoegawa et al. Hum Immunol. 2019;80(9):644-660.
For information about these LD measures, see:
Hedrick PW. Genetics. 1987;117:331-41.
Cramér H. Mathematical Methods of Statistics. 1946, Princeton University Press, Princeton NJ.
Thomson G, Single RM. Genetics. 2014;198(1):321-31.
Pould accepts genotype and haplotype data for individual subjects as input data. To calculate the cALD measures using haplotype frequency data, try the asymLD package. See: Single et al. Hum Immunol. 2016;77(3):288-294 for more information about asymLD.
The cALD() function calculates the LD measures from genotype data for pairs of loci. The drb1.dqb1.demo dataset, shown below, illustrates the input format. The function accepts a 4-column data frame or tab-delimited text file with data for the first locus in columns 1 and 2, and data for the second locus in columns 3 and 4.
> drb1.dqb1.demo[1:6,]
## DRB1 DRB1 DQB1 DQB1
## 1 15:01 07:01 06:02 03:03
## 2 04:05 13:01 03:02 06:03
## 3 04:01 13:02 03:02 06:09
## 4 16:02 04:05 05:02 03:02
## 5 15:01 07:01 06:02 02:02
## 6 04:01 04:01 03:02 03:02
The locus names should be used as column headers, with one allele/variant in each column. Each row represents a single subject. The headers for columns 1 and 2 must be identical, as must the headers for columns 3 and 4. If phase is known, cALD() assumes that columns 1 and 3 represent one haplotype, and that columns 2 and 4 represent the second haplotype.
While HLA data are shown above, cALD() can accept any genetic data. The example below combines data for HLA-DQA1 and rs7743506.
HLA-DQA1 HLA-DQA1 rs7743506 rs7743506
02:01 01:02 C C
04:01 01:02 A C
04:01 05:01 A C
01:01 01:02 C C
04:01 01:01 A C
05:01 01:02 C C
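As a rough sketch of how a table in this layout might be assembled in base R (the object names geno, geno_file and geno_in are illustrative and not part of pould), repeated locus headers can be assigned with names(), and check.names=FALSE keeps them intact when a tab-delimited copy is read back:

```r
# Illustrative only: a four-column genotype table with repeated locus headers.
geno <- data.frame(
  a1 = c("15:01", "04:05", "04:01"),
  a2 = c("07:01", "13:01", "13:02"),
  b1 = c("06:02", "03:02", "03:02"),
  b2 = c("03:03", "06:03", "06:09"),
  stringsAsFactors = FALSE
)
# Columns 1-2 and columns 3-4 must carry identical headers.
names(geno) <- c("DRB1", "DRB1", "DQB1", "DQB1")

# Write a tab-delimited copy and read it back without renaming duplicate headers.
geno_file <- tempfile(fileext = ".txt")
write.table(geno, geno_file, sep = "\t", quote = FALSE, row.names = FALSE)
geno_in <- read.table(geno_file, header = TRUE, sep = "\t",
                      check.names = FALSE, colClasses = "character")
```

Either the data frame or the path to the tab-delimited file could then be supplied as the dataSet argument of cALD().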
The LDWrap() function parses either genotype data formatted in a two-column per locus format, or haplotype data formatted using the 17th International HLA and Immunogenetics Workshop Family Haplotype Project’s GL String-based format. The function accepts a data frame, a tab-delimited (.txt or .tsv) columnar genotype file, or a comma-separated value (.csv) GL String haplotype file, and passes genotype data for all pairs of loci in that dataset to cALD() for LD analysis.
A minimal LDWrap() haplotype data file or data frame contains two columns named “Relation” and “Gl String”. Other columns are allowed, but are ignored. The hla.hap.demo dataset (shown below in edited form) illustrates the input format. Each row contains data for a single subject. The “Relation” column can contain any text string; however, values such as “mother”, “father” and “child” are standard for the Family HLA Data Project. LDWrap() will ignore all rows in which Relation=child; rows with any other value in the “Relation” column will be processed.
Relation Gl String
Subject HLA-A*02:01~HLA-C*07:02~HLA-B*07:02+HLA-A*01:01~HLA-C*06:02~HLA-B*57:01
Subject HLA-A*03:01~HLA-C*07:01~HLA-B*49:01+HLA-A*01:01~HLA-C*07:01~HLA-B*08:01
Subject HLA-A*11:01~HLA-C*04:01~HLA-B*15:01+HLA-A*03:01~HLA-C*08:02~HLA-B*14:02
Subject HLA-A*68:01~HLA-C*15:02~HLA-B*40:06+HLA-A*68:01~HLA-C*06:02~HLA-B*45:01
The “Gl String” column contains a GL String formatted multi-locus HLA haplotype. In GL String format, the ~ operator denotes phase, and the + operator denotes copies of genes (in this case diploidy). While GL Strings can be used to describe ambiguous alleles and genotypes using the / and | operators, ambiguous data cannot be included in an LDWrap() data file. LDWrap() requires that alleles be described as LOCUS*VARIANT (e.g., HLA-DRB1*01:01); locus prefixes (e.g., HLA-) are not required, but if locus prefixes are included, all loci must be described using the same prefix. Allele data described without a locus (e.g., 01:01) are not allowed. Unusual allele names (HLA-A*NULL, HLA-DRB1*NoMatch, HLA-DPB1*NT) and truncated versions of allele names (HLA-A*01, HLA-A*01:01, HLA-A*01:01:01, etc.) will be analyzed as distinct alleles, and may skew analytic results. LDWrap() includes an option to truncate colon-delimited allele names to specific numbers of fields for analysis.
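The structure of a GL String haplotype pair can be illustrated with a few lines of base R. This is only a sketch of the format, not the parsing code used by LDWrap(); the data frame fam and the exact-match filter on “child” are assumptions made for illustration.

```r
# Illustrative only: a tiny data frame in the "Relation" / "Gl String" layout.
fam <- data.frame(
  Relation = c("mother", "child"),
  gl = c(
    "HLA-A*02:01~HLA-C*07:02~HLA-B*07:02+HLA-A*01:01~HLA-C*06:02~HLA-B*57:01",
    "HLA-A*02:01~HLA-C*07:02~HLA-B*07:02+HLA-A*03:01~HLA-C*07:01~HLA-B*08:01"
  ),
  stringsAsFactors = FALSE
)
names(fam)[2] <- "Gl String"

# Drop rows for children (exact, case-sensitive matching is an assumption).
parents <- fam[fam$Relation != "child", ]

# "+" separates the two haplotypes; "~" separates the phased alleles within each.
haps <- strsplit(parents[["Gl String"]][1], "+", fixed = TRUE)[[1]]
alleles <- strsplit(haps, "~", fixed = TRUE)

# The locus is everything before "*" in LOCUS*VARIANT.
loci <- sub("\\*.*$", "", alleles[[1]])
```

Here alleles is a list with one character vector per haplotype, and loci holds the locus names in the order they appear in the GL String.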
A minimal LDWrap() genotype data file or data frame contains two columns per locus, with one allele in each column, as for cALD(), but accommodating more than two loci. Columns for the same locus must be adjacent and can have identical names, or can be suffixed with “_1” and “_2”. Columns named “SampleID” and “Disease” are permitted, but not required. No other columns are allowed. Allele names in each column can include a locus name (e.g., locus*allele); if locus names are not included, the locus name in the header will be associated with each allele in that column.
The order in which the loci appear, either in columns or in the GL String haplotype, affects the identification of haplotype locus pairs for analysis. For example, if HLA loci are organized alphabetically, haplotypes of the HLA-B and HLA-C loci will be analyzed as B~C haplotypes; if they are organized by map order, those haplotypes will be analyzed as C~B haplotypes. B~C and C~B haplotypes will not be recognized as the same by LD.heat.map(). To avoid this, it is recommended to use the same organization of loci for all analyses, and to use map order for HLA or KIR loci.
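The effect of locus order on pair labels can be sketched with combn() from base R. The helper label_pairs() is hypothetical, and it assumes that each pair is labeled in the order in which the two loci appear in the input:

```r
# Assumption: locus pairs are labeled in the order the loci appear in the data.
label_pairs <- function(loci) combn(loci, 2, FUN = paste, collapse = "~")

alphabetical <- label_pairs(c("A", "B", "C"))  # "A~B" "A~C" "B~C"
map_order    <- label_pairs(c("A", "C", "B"))  # "A~C" "A~B" "C~B"

# Labels that appear under one ordering but not the other.
setdiff(alphabetical, map_order)
```

Under this assumption the B and C loci are labeled B~C in one dataset and C~B in the other, which is why a consistent locus order is recommended.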
The LD.sign.test() function applies the R Stats Package’s binom.test() function to pairs of LD values (D’, Wn, WLoc1/Loc2, WLoc2/Loc1), as well as the number of haplotypes, for phased and unphased haplotypes. The *_LD_results.csv* files generated by LDWrap() are the input files for this function, and generally, LDWrap() must be used before this function can be applied. See the LDWrap() Outputs section below for an example.
The LD.heat.map() function generates heat-map plots of the LD values generated for each LD measure (D’, Wn, WLoc1/Loc2, and WLoc2/Loc1) for phased and unphased haplotypes. If LD values for only phased or unphased haplotypes are available, half-matrix heat-maps will be generated. The *_LD_results.csv* files generated by LDWrap() are the input files for this function, and generally, LDWrap() must be used before this function can be applied. See the LDWrap() Outputs section below for an example.
By default, cALD() operates in “verbose” mode, and will write five lines of output to the console describing the phase-status of the LD analysis (phased or unphased), the loci and number of haplotypes analyzed, and the four LD measures calculated, as shown below.
> cALD(drb1.dqb1.demo)
## Calculating D', Wn and conditional ALD for 53 unphased genotypes at the DRB1 and DQB1 loci.
## D' for DRB1~DQB1 haplotypes: 0.95892767844544 (0.9589)
## Wn for DRB1~DQB1 haplotypes: 0.811250972337927 (0.8113)
## Variation of DQB1 conditioned on DRB1 (WDQB1/DRB1) = 0.904035615838528 (0.904)
## Variation of DRB1 conditioned on DQB1 (WDRB1/DQB1) = 0.778712696009626 (0.7787)
When verbose=FALSE, cALD() returns a vector of D’, Wn, WB/A, WA/B and the number of haplotypes, as below.
> cALDres <- cALD(drb1.dqb1.demo, verbose=FALSE)
> cALDres
## [1] "0.958463650196244" "0.811184752436694" "0.903300938910147" "0.778712697633606" "53"
In addition, when saveVector=TRUE, cALD() will write a text file, containing a vector of all haplotypes, their frequencies and counts for the analyzed locus pair, to a user-specified directory. Unless specified via the vecDir parameter, this file is written to the directory specified by tempdir(). This vector file also includes information on the dataset and phase status applied to the genotype data for the analysis. An example generated for the drb1.dqb1.demo dataset is shown below.
> cALD(drb1.dqb1.demo,saveVector = TRUE)
Haplotype vector file contents:
Dataset Phase DRB1~DQB1 Frequency Count
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 01:01~02:01 0 0
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 01:02~02:01 0 0
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 01:03~02:01 0 0
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 03:01~02:01 0.094272076372315 79
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 04:01~02:01 0 0
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 04:02~02:01 0 0
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 04:03~02:01 0 0
DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 FALSE 04:04~02:01 0 0
.
.
.
LDWrap() sends data to cALD(), captures the vector of LD results returned by cALD(), directs cALD() to write vectors of haplotypes for each locus pair into a user-specified directory, and writes a single table of LD results for all locus pairs to that same directory. As a single haplotype vector file is written for each locus pair, LDWrap() directs cALD() to write n(n-1)/2 haplotype vector files, where n is the number of loci in a haplotype. The only information LDWrap() returns to the console is a notification that the analysis has completed (LD Analysis Complete) and notifications that the provided dataset is missing the required columns. If the user specifies no destination for these files, they are written to the directory specified by tempdir().
When all locus pairs have been analyzed by cALD(), LDWrap() writes a six-column CSV file (*_LD_results.csv) of aggregated LD result vectors collected from cALD() to the specified directory. The column headers in this file are Loc1~Loc2 (identifying the locus pair), D’, Wn, WLoc1/Loc2, WLoc2/Loc1 and N_Haplotypes. An example of this file is shown below.
> LDWrap(hla.hap.demo)
LD Analysis Complete
LD results file contents:
Loc1~Loc2,D',Wn,WLoc1/Loc2,WLoc2/Loc1,N_Haplotypes
A~C,0.469024805013898,0.362566555750013,0.366359427624652,0.384413960789992,191
A~B,0.540780240853345,0.446662593270748,0.36839918931955,0.471334711300434,241
A~DRB1,0.400002012804198,0.335434108343871,0.27413544158564,0.320399398449896,233
A~DRB3,Not Calculated,Subject Threshold=10,Complete subjects=8,.,
.
.
.
LDWrap() attempts to perform these LD calculations for all pairs of loci in the LDWrap() data file. If a haplotype dataset includes locus pairs for which the number of subjects is below the threshold value (see “Parameters”, below), the *_LD_results.csv file will include rows for locus pairs for which no LD calculations were performed. As shown above, those rows contain data similar to that shown for the A~DRB3 haplotype: Not Calculated, Subject Threshold=10, Complete subjects=8, ., and ’ ’.
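Because the placeholder rows put text into the measure columns, a results file read into R needs those rows separated before the LD values can be treated as numbers. The following base-R sketch works on the example rows shown above, pasted in as text rather than read from a real file; the object names are illustrative.

```r
# Lines copied from the example results file shown above.
ld_lines <- c(
  "Loc1~Loc2,D',Wn,WLoc1/Loc2,WLoc2/Loc1,N_Haplotypes",
  "A~C,0.469024805013898,0.362566555750013,0.366359427624652,0.384413960789992,191",
  "A~B,0.540780240853345,0.446662593270748,0.36839918931955,0.471334711300434,241",
  "A~DRB1,0.400002012804198,0.335434108343871,0.27413544158564,0.320399398449896,233",
  "A~DRB3,Not Calculated,Subject Threshold=10,Complete subjects=8,.,"
)
# check.names = FALSE keeps headers such as D' and Loc1~Loc2 unchanged.
ld <- read.csv(text = ld_lines, check.names = FALSE, stringsAsFactors = FALSE)

# Rows without LD values carry "Not Calculated" in the D' column.
skipped <- ld[["D'"]] == "Not Calculated"
calculated <- ld[!skipped, ]

# The measure columns were read as text because of the placeholder rows.
measures <- c("D'", "Wn", "WLoc1/Loc2", "WLoc2/Loc1")
calculated[measures] <- lapply(calculated[measures], as.numeric)
```

For a file on disk, the same read.csv() call would take the file path in place of the text argument.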
As shown below, in cases where at least one locus in a pair is monomorphic, no LD calculations are performed and the pertinent rows in the *_LD_results.csv file will contain Not Calculated, Subject Threshold=10, Complete subjects=0, "locusName" is monomorphic., and ’ ’.
Loc1~Loc2,D',Wn,WLoc1/Loc2,WLoc2/Loc1,N_Haplotypes
B~DRB3,Not Calculated,Subject Threshold=10,Complete subjects=130,DRB3 is monomorphic.,
DRB3~DRB4,Not Calculated,Subject Threshold=10,Complete subjects=130,DRB3 is monomorphic. DRB4 is monomorphic.,
DRB3~DQA1,Not Calculated,Subject Threshold=10,Complete subjects=93,DRB3 is monomorphic.,
For each LD measure and the number of haplotypes for phased and unphased versions of the same genotype data, LD.sign.test() reports the p-value of the sign test, comparing the number of locus pairs for which the value of the measure is higher in unphased haplotypes than phased haplotypes to the number of locus pairs for which that value is lower or equal. The function also reports the total number of locus pairs evaluated, and the number of locus pairs with equal values. These data can be reported in three ways: as a returned data frame, as a table written to the console, and as a CSV file written to a user-specified directory. All three report formats are illustrated below. Note that only the significance of the sign test is reported; when a significant trend is indicated, the directionality of the trend is not reported.
> LD.res <- LD.sign.test("hla-family-data")
> View(LD.res)
Returned Data Frame
D' Wn WLoc1/Loc2 WLoc2/Loc1 N_Haplotypes
#unphased > phased 1.500000e+01 1.400000e+01 1.500000e+01 1.500000e+01 0.000000e+00
#unphased = phased 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
#locus pairs 1.500000e+01 1.500000e+01 1.500000e+01 1.500000e+01 1.500000e+01
p-values 6.103516e-05 9.765625e-04 6.103516e-05 6.103516e-05 6.103516e-05
> LD.sign.test("hla-family-data", returnFrame = FALSE)
Console Table
Sign Test results for the hla-family-data dataset for 15 locus pairs.
Measure #U > P #U = P p-value
D' 15 0 6.104e-05
Wn 14 0 0.0009766
WLoc1/Loc2 15 0 6.104e-05
WLoc2/Loc1 15 0 6.104e-05
# Haplotypes 0 0 6.104e-05
CSV File
,D',Wn,WLoc1/Loc2,WLoc2/Loc1,N_Haplotypes
#unphased > phased,15,14,15,15,0
#unphased = phased,0,0,0,0,0
#locus pairs,15,15,15,15,15
p-values,6.10351562500001e-05,0.0009765625,6.10351562500001e-05,6.10351562500001e-05,6.10351562500001e-05
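The sign test described here can be sketched directly with binom.test(). The helper sign_test_p() and the LD values below are hypothetical. The sketch assumes a two-sided test against p = 0.5 in which ties are counted with the “lower” group, which is consistent with the p-values reported above for 15 locus pairs.

```r
# Hypothetical LD values for five locus pairs (not taken from any dataset).
phased   <- c(0.41, 0.52, 0.38, 0.61, 0.47)
unphased <- c(0.47, 0.55, 0.40, 0.66, 0.45)

# Count the pairs where the unphased value is higher, then test against 0.5.
sign_test_p <- function(unphased, phased) {
  higher <- sum(unphased > phased)
  binom.test(higher, length(phased), p = 0.5)$p.value
}
p_example <- sign_test_p(unphased, phased)

# Counts taken from the tables above: 15 of 15 and 14 of 15 locus pairs.
p_15_of_15 <- binom.test(15, 15)$p.value  # 6.103516e-05
p_14_of_15 <- binom.test(14, 15)$p.value  # 0.0009765625
```

Because the test is two-sided, 0 of 15 and 15 of 15 give the same p-value, which is why the direction of a trend has to be read from the counts rather than from the p-value.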
The LD.heat.map() function generates heat-map plots for each LD measure, visualizing the range of LD values for each locus pair analyzed using LDWrap(). LD values for phased data are presented in the upper half of the plot matrix, while values for unphased data are presented in the lower half of the plot matrix. Color (blue to white to red) or greyscale (dark grey to light grey) plots can be specified. When writePlot=TRUE, PNG-formatted LD plots will be written to a directory identified using the writeDir parameter; this parameter defaults to the directory specified by tempdir(). When the dataName parameter is provided, heat-map plots are written with the name <dataName>_<LD measure>_heatmap.png. When the phasedData and unphasedData parameters are provided, the heat-map plot for each LD measure is named <phasedData>-<unphasedData>_<LD measure>_heatmap.png.
When no LDWrap() results are available, or when the loci differ between the datasets specified by phasedData and unphasedData, a notification is written to the console. There are no other outputs.
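The upper-half/lower-half layout can be pictured as a square locus-by-locus matrix. The sketch below fills such a matrix in base R with hypothetical D’ values; it illustrates the arrangement only and is not the plotting code used by LD.heat.map().

```r
# Hypothetical D' values for three locus pairs, in the order A~C, A~B, C~B.
loci <- c("A", "C", "B")
phased_dprime   <- c(0.47, 0.54, 0.88)
unphased_dprime <- c(0.52, 0.60, 0.91)

# Square matrix with one row and one column per locus; the diagonal stays NA.
ld_matrix <- matrix(NA_real_, nrow = 3, ncol = 3, dimnames = list(loci, loci))
ld_matrix[upper.tri(ld_matrix)] <- phased_dprime    # upper half: phased
ld_matrix[lower.tri(ld_matrix)] <- unphased_dprime  # lower half: unphased
```

With this arrangement, ld_matrix["A", "C"] holds the phased value for the A~C pair and ld_matrix["C", "A"] holds the unphased value for the same pair.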
> LD.heat.map("family-data")
Console Notification
Files family-data_Phased_LD_results.csv and family-data_Unphased_LD_results.csv are not found.
> LD.heat.map(phasedData="my-data_phased.csv",unphasedData="my-data_unphased.csv")
Console Notification
The specified phased and unphased datafiles are not found.
> LD.heat.map(phasedData="my-ABC-data_phased.csv",unphasedData="my-ABDRB-data-unphased.csv")
Console Notification
Different loci included in my-ABC-data_phased.csv and my-ABDRB-data-unphased.csv. Cannot produce heatmaps.
cALD(dataSet, inPhase = FALSE, verbose = TRUE, saveVector = FALSE, vectorName = "", vectorPrefix = "", vecDir = tempdir())
dataSet
Class: Character. Required. No Default.
e.g., dataSet="foo.txt" or dataSet=drb1.dqb1.demo
Identifies the genotype data to be analyzed. dataSet should identify a four-column data frame or tab-delimited text file. Columns 1 and 2 contain genotype data for one genetic locus, with one allele in each column. Columns 3 and 4 contain genotype data for the second locus. The headers for columns 1 and 2 identify the first locus, and must be identical. Similarly, the headers for columns 3 and 4 identify the second locus, and must be identical.
inPhase
Class: Logical. Required. Default=FALSE.
Identifies how the genotype data should be analyzed. When inPhase=FALSE, the expectation-maximization (EM) algorithm is applied to the genotype data to generate estimated haplotypes. LD values are calculated for those EM haplotypes. When inPhase=TRUE, the genotype data are treated as phased, with the alleles in column 1 in phase with those in column 3, and the alleles in column 2 in phase with those in column 4. LD values are calculated for those phased haplotypes.
verbose
Class: Logical. Required. Default=TRUE.
Identifies how LD values should be reported. When verbose=TRUE, values for D’, Wn, WB/A, WA/B and the number of haplotypes evaluated are written to the console in a human readable form. When verbose=FALSE, those values are reported in a vector of length 5 and mode of “character”.
saveVector
Class: Logical. Required. Default=FALSE.
Specifies if a file containing the haplotype vector for the analyzed locus pair should be exported. If saveVector=FALSE, no vector is written. If saveVector=TRUE, a tab-delimited text file consisting of five columns is written to the directory specified by vecDir. The file includes the columns “Dataset”, “Phase”, the name of the analyzed locus pair, “Frequency”, and “Count”; the latter two describe the frequency and count data for all possible haplotypes.
vectorName
Class: Character. Optional. No Default. (saveVector=TRUE specific).
Provides a name for the tab-delimited haplotype vector file written when saveVector=TRUE. When vectorName="foo" and saveVector=TRUE, the name of the haplotype vector file will be “foo.txt”. When vectorName="" and saveVector=TRUE, the name of the haplotype vector file will include the loci analyzed and a time-stamp, formatted as “Locus1~Locus2_haplotype_Vector_yyyy-MM-dd_HH-mm-ss.txt”.
vectorPrefix
Class: Character. Optional. No Default. (saveVector=TRUE specific).
Applies a prefix that includes the applied phase-status to the name of the tab-delimited haplotype vector file written when saveVector=TRUE and vectorName="". If any text string is provided for vectorName, this parameter is ignored. E.g., when saveVector=TRUE, vectorName="", vectorPrefix="foo" and inPhase=FALSE, the haplotype vector file will be named “foo_unphased_Locus1~Locus2_haplotype_Vector_yyyy-MM-dd_HH-mm-ss.txt”.
When the LDWrap() function directs cALD() to write files containing haplotype vectors for each locus pair, LDWrap() provides the file information to cALD() via the vectorPrefix parameter. The resulting files will contain the name of the family haplotype dataset processed by LDWrap(), the phase-status, haplotype pair-name, and a timestamp. If a positive trunc parameter was provided to LDWrap(), the truncation level will appear in the names of these haplotype vector files.
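The naming rules for vectorName and vectorPrefix can be summarized in a short base-R sketch. The function vector_file_name() is hypothetical and is not part of pould; the time-stamp pattern follows the example file name DRB1~DQB1_haplotype_Vector_2018-05-02_16-53-12 shown earlier, and the truncation level that LDWrap() may add is not modeled.

```r
# Hypothetical helper that mirrors the documented naming rules; not a pould function.
vector_file_name <- function(locus1, locus2, name = "", prefix = "",
                             in_phase = FALSE, when = Sys.time()) {
  # An explicit name wins, and any prefix is ignored.
  if (nzchar(name)) return(paste0(name, ".txt"))
  stamp <- format(when, "%Y-%m-%d_%H-%M-%S")
  base <- paste0(locus1, "~", locus2, "_haplotype_Vector_", stamp, ".txt")
  # A prefix is followed by the phase status applied to the data.
  if (nzchar(prefix)) {
    phase <- if (in_phase) "phased" else "unphased"
    base <- paste0(prefix, "_", phase, "_", base)
  }
  base
}

when <- as.POSIXct("2018-05-02 16:53:12", tz = "UTC")
vector_file_name("DRB1", "DQB1", when = when)
vector_file_name("DRB1", "DQB1", prefix = "foo", when = when)
vector_file_name("DRB1", "DQB1", name = "foo", prefix = "bar", when = when)
```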
vecDir
Class: Character. Optional. Default=tempdir(). (saveVector=TRUE specific).
Specifies the directory into which the haplotype vector file should be written.
LDWrap(famData, threshold = 10, phased = TRUE, frameName = "hla-family-data")
famData
Class: Character. Required. No Default.
e.g., famData="foo.csv" or famData=hla.hap.demo
Identifies the haplotype or genotype dataset to be analyzed. For haplotype data, famData should identify a data frame or CSV file. This dataset must include columns with the headers “Relation” and “Gl String”. If either (or both) of these column headers is not found in the dataset, or if the data file is not a CSV file, LDWrap() will halt the analysis with a notification about the missing header(s). For genotype data, famData should identify a data frame or tab-delimited text file with two columns of allele data for each locus. See the Functions and Input Formats section above for additional details about these dataset formats.
threshold
Class: Numeric. Required. Default=10.
Identifies the minimum number of subjects with haplotype data for a given locus pair required for the analysis of that locus pair. Analysis for that locus pair is not performed if the threshold is not met, and the LD results file will identify the threshold value and the number of subjects with data for that locus pair. If threshold is set to less than 1, it is automatically set to 1.
phased
Class: Logical. Required. Default=TRUE.
Specifies whether the haplotype data should be treated as phased (phased=TRUE) or unphased (phased=FALSE) for analysis.
frameName
Class: Character. Optional. Default=“hla-family-data”.
Provides a name that will be included in the names of the result files if famData specifies a data frame. The value of frameName is passed to cALD() as the vectorPrefix parameter. If famData specifies a file, frameName is ignored.
trunc
Class: Numeric. Required. Default=0.
Specifies the number of fields to which colon-delimited allele names in famData should be truncated. The default value of 0 indicates no truncation. A value higher than the number of fields in the supplied allele data will result in no truncation. When a positive value of trunc is provided, the names of the output files will include the specified truncation level.
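Field truncation of this kind can be sketched in base R as follows. The helper truncate_alleles() is hypothetical and is not the pould implementation; it simply keeps the first trunc colon-delimited fields of each allele name.

```r
# Hypothetical helper: keep the first `fields` colon-delimited fields of each
# allele name. fields = 0 means no truncation.
truncate_alleles <- function(alleles, fields = 0) {
  if (fields <= 0) return(alleles)
  vapply(strsplit(alleles, ":", fixed = TRUE),
         function(parts) paste(head(parts, fields), collapse = ":"),
         character(1))
}

alleles <- c("HLA-A*01:01:01", "HLA-A*01:01", "HLA-A*01")
two_field <- truncate_alleles(alleles, 2)
one_field <- truncate_alleles(alleles, 1)
```

In this example, truncating to one field makes the three names identical, while truncating to two fields leaves HLA-A*01 distinct because it has fewer fields than requested.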
writeTo
Class: Character. Optional. Default=tempdir().
Specifies the directory into which LDWrap() should write files.
LD.sign.test(dataName, verbose = TRUE, returnFrame = TRUE, resultDir = tempdir())
dataName
Class: Character. Required. No Default.
e.g., dataName="foo"
The “base” name of the “_LD_results.csv” files generated by LDWrap(), with the “_Phased_LD_results.csv” or “_Unphased_LD_results.csv” suffixes removed. This corresponds to the value of the LDWrap() frameName parameter when the LDWrap() famData parameter does not specify a file; e.g., when specifying the “_LD_results.csv” files generated by LDWrap() for the hla.hap.demo data included with this package, dataName="hla-family-data"
. If the
corresponding “
verbose
Class: Logical. Required. Default=TRUE.
Identifies if messages about function progress and results should be displayed in the console (verbose=TRUE) or not (verbose=FALSE). The default is verbose=TRUE. In addition to the table of results and messages about missing input files, messages regarding locus pairs for which no LD values are available, and about discrepancies between locus pairs with available data in the phased and unphased datasets are also suppressed with verbose=FALSE.
returnFrame
Class: Logical. Required. Default=TRUE.
Identifies if a data frame of results should be returned
(returnFrame=TRUE). If returnFrame=FALSE
, a CSV file of
results named “resultDir
parameter.
resultDir
Class: Character. Optional. Default=tempdir().
Specifies the directory into which LD.sign.test() should write the CSV file of results.
LD.heat.map(dataName)
dataName
Class: Character. Optional. Default="".
e.g., dataName="foo"
The “base” name of the “_LD_results.csv” files generated by LDWrap(), with the “_Phased_LD_results.csv” or “_Unphased_LD_results.csv” suffixes removed. This corresponds to the value of the LDWrap() frameName parameter when the LDWrap() famData parameter does not specify a file; e.g., when specifying the “_LD_results.csv” files generated by LDWrap() for the hla.hap.demo data included with this package, dataName="hla-family-data"
. If the
corresponding “phasedData
and/or
unphasedData
must be provided.
phasedData
Class: Character. Optional. Default="".
The complete name of a file of phased LD results generated by LDWrap(). Provide this filename if no “base” name is provided for dataName and you want to generate heat-maps for a specific set of phased LD values.
e.g., phasedData="phased-data.csv"
unphasedData
Class: Character. Optional. Default="".
The complete name of a file of unphased LD results generated by LDWrap(). Provide this filename if no “base” name is provided for dataName and you want to generate heat-maps for a specific set of unphased LD values.
e.g., unphasedData="unphased-data.csv"
phasedLabel
Class: Character. Required. Default=“Phased”.
e.g., phasedLabel="Pedigree Phased"
Specifies the label that should appear on the heat-map plots for the upper (presumed phased) half of the plot.
unphasedLabel
Class: Character. Required. Default=“EM-estimated”.
e.g., unphasedLabel="EM Haplotypes"
Specifies the label that should appear on the heat-map plots for the lower (presumed unphased) half of the plot.
color
Class: Logical. Required. Default=TRUE.
e.g., color=FALSE
Identifies if the heat-map plots should be generated in color (color=TRUE) or greyscale (color=FALSE). The default is color=TRUE. Color heat-map plots will range from blue (low LD values) to white (LD values of 0.5) to red (high LD values). Greyscale heat-map plots will range from dark grey (low LD values) to light grey (high LD values).
writePlot
Class: Logical. Required. Default=FALSE.
Identifies if the heat-map plots should be automatically saved after they are generated.
writeDir
Class: Character. Optional. Default=tempdir().
The directory into which the heat-map plots should be saved when writePlot=TRUE. The default is the directory specified by tempdir().
###cALD()
# Analyzing the included HLA-DRB1 HLA-DQB1 genotype data and reporting results in the console
cALD(drb1.dqb1.demo)
# Alternatively returning a vector of LD results, with nothing reported in the console
LDvec <- cALD(drb1.dqb1.demo,verbose=FALSE)
LDvec
# Enforcing phase between columns 1 and 3 and between columns 2 and 4 for analysis
cALD(drb1.dqb1.demo,inPhase=TRUE)
# Writing the haplotype vector to a file in the temporary directory that will
# have the time-stamped name DRB1~DQB1_haplotype_Vector_yyyy-MM-dd_HH-mm-ss.txt
cALD(drb1.dqb1.demo,saveVector = TRUE)
# Writing the haplotype vector to a file named "foo.txt" in the temporary directory
cALD(drb1.dqb1.demo,saveVector = TRUE,vectorName = "foo")
# Writing the haplotype vector to a file in the temporary directory with the prefix "foo_"
cALD(drb1.dqb1.demo,saveVector = TRUE,vectorPrefix = "foo")
# Writing the haplotype vector to a file in the working directory
cALD(drb1.dqb1.demo,saveVector = TRUE,vecDir = getwd())
###LDWrap()
# Analyzing the included HLA haplotype data
# This will create 15 haplotype vector files and one LD results file in the temporary directory
LDWrap(hla.hap.demo)
# Specifying the prefix "foo_Phased" for the LD results file, and "foo_phased" for the haplotype vector files
LDWrap(hla.hap.demo,frameName = "foo")
# Truncating the alleles in hla.hap.demo to 1 field for analysis.
LDWrap(hla.hap.demo,frameName = "foo", trunc=1)
# Analyzing the included HLA genotype data
LDWrap(drb1.dqb1.demo,frameName="hla-genotype-data")
# Writing the resulting files to the working directory
LDWrap(hla.hap.demo,writeTo = getwd())
###LD.sign.test()
# Generating LDWrap() results files for the example data included with this package in the temporary directory
LDWrap(hla.hap.demo)
LDWrap(hla.hap.demo,phased=FALSE)
# Analyzing the results files generated by LDWrap(), with a CSV of the results written to the temporary directory.
LDdata <- paste(tempdir(),"hla-family-data",sep=.Platform$file.sep)
LD.sign.test(LDdata, returnFrame=FALSE)
# Returning only a data frame for the same analysis.
LD.res <- LD.sign.test(LDdata,verbose=FALSE)
View(LD.res)
# Writing the CSV file to the working directory
LD.sign.test(LDdata,returnFrame = FALSE,resultDir = getwd())
###LD.heat.map()
# Generating LDWrap() results files for the example data included with this package in the working directory
LDWrap(hla.hap.demo, writeTo=getwd())
LDWrap(hla.hap.demo,phased=FALSE, writeTo=getwd())
# Generating color heat-map plots based on the LD result files in the working directory
LD.heat.map("hla-family-data")
# Generating greyscale heat-map plots based on the LD result files in the working directory
LD.heat.map("hla-family-data",color=FALSE)
# Generating heat-map plots for phased data alone based on the LD result files in the working directory
LD.heat.map(phasedData="hla-family-data_Phased_LD_results.csv",unphasedLabel="")
# Writing color PNG-formatted heat-map plots to the working directory, using the LD files in the working directory
LD.heat.map("hla-family-data",writePlot = TRUE,writeDir = getwd())
End of vignette.