Projects and Data Formats
Everything you do with Flapjack is stored within a project file; imported data, sort orders, trait information, colour schemes, etc. A Flapjack project is active at all times when using the application - even at startup, when a default new project is already created and waiting for data to be imported into it.
The structure of a project file is as follows:
At the top level is the project itself. A project can store zero or more data sets.
Data sets
A data set contains an amalgamation of an imported map file and genotype file.
The map file should contain information on the markers, the chromosome they are on, and their position within that chromosome. The markers do not need to be in any particular order as Flapjack will group and sort them by chromosome and distance once they are loaded. A short example is shown below:
# fjFile = MAP Marker1 1H 32 Marker2 1H 45 Marker3 2H 23
The genotype file should contain a list of variety lines, with allele data per marker for that line. It also requires a header line specifying the marker information for each column.
# fjFile = GENOTYPE
Marker1 Marker2 Marker3
Line1 A G G
Line2 A - G/T
Line3 T A C
Both the map file and the genotype file must be in plain-text, tab-delimited format.
Views
Flapjack stores the lines and markers internally in a structure and form that can never be modified. A default view upon this data is created whenever an import is successful, and any subsequent operations upon the lines or markers will happen to the view, not to the data set.
Each view (and you can create as many as you like) will hold the set of chromosomes for that data set. Each chromosome is displayed independently, but the lines are obviously common to all chromosomes and any modification to the order or display of lines on one chromosome will be reflected across all the others too.
Colour scheme information is generally specific to a view although some settings will be chromosome-specific, such as colouring by marker.
Traits
A data set can optionally also store information on one or more traits that are associated with the lines. Trait information is imported from a file with the following tab-delimited format:
# fjFile = PHENOTYPE
Trait1 Trait1 Trait2
Experiment1 Experiment2 Experiment1
Line1 50 High Short
Line2 2.3 High Medium
Line3 99.3 Low Long
Trait data for a single trait can be either numerical or categorical. The line containing experiment information for each trait is optional.
QTLs
A data set can also optionally store information on one or more QTLs that are associated with the map. QTL information is imported from a file with the following tab-delimited format:
# fjFile = QTL Name Chromosome Position Pos-Min Pos-Max Trait Experiment [optional_1] .. [optional_n] QTL1 1H 10 8 12 Height Exp1 25.5 high QTL2 1H 20 19 26 Height Exp1 34.8 low QTL3 2H 10 8 13.5 Temp Exp1 99.2 low
The "Name" to "Experiment" columns are required and must be included and listed in the order shown. After that, each QTL may have zero or more optional columns of numerical or textual data that can be included too.
Graphs
A data set can also optionally store information on one or more Graphs that are associated with the map. Graph information is imported from a file with the following tab-delimited format:
# fjFile = GRAPH SIGNIFICANCE_THRESHOLD Graph1 5.1 SIGNIFICANCE_THRESHOLD Graph2 7.5 Marker1 Graph1 1.3 Marker1 Graph2 4.3 ... Marker2 Graph1 1.8 Marker2 Graph2 3.9
Any number of graphs can be stored in a single file with data points per marker. The SIGNIFICANCE_THRESHOLD entry is optional (per graph) but defines the significance threshold for that graph if included which will be drawn on Flapjack's display.