GEMS Genotyping Score Examples

From ICISWiki

Example 1. Diversity dataset using SSRs run on a capillary sequencer

A diversity dataset using SSRs, where one diploid germplasm is measured per SSR marker, run on a capillary sequencer. All germplasm was tested using all markers (complete matrix).

This example dataset is from 2 germplasm and 2 markers.

The first germplasm (456) is homozygous for the first marker (34), with allele (35). For the second marker (37) the germplasm is heterozygous, with alleles (38,39).

The second germplasm (457) is heterozygous for the first marker (34), with alleles (35,36). For the second marker no allele was detected due to an unknown problem.

Example 1a.

In this example the data were extracted from the DMS where no samples were recorded in the IMS. The sample_id, individual and gdvariantno fields are not used, and the incidence field is redundant.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     234       235       34          null       null        456  34            32       123/123          0            35            35            1
2     236       null      34          null       null        456  35            33       204/207          0            38            39            1
3     237       238       34          null       null        457  34            32       126/129          0            35            36            1
4     239       null      34          null       null        457  35            33       X                0            null          null          -1

Example 1b.

Same as Example 1a, except that sample ids were stored in the DMS. The individual and gdvariantno fields are not used, and the incidence field is redundant.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     234       235       34          -45        null        456  34                                      0            35            35            1
2     236       null      34          -45        null        456  35                                      0            38            39            1
3     237       238       34          -46        null        457  34                                      0            35            36            1
4     239       null      34          -46        null        457  35                                      0            null          null          -1

Example 1c.

Same as Example 1b, except that the data were not stored in the DMS but uploaded directly. The ounitid1, ounitid2, individual and gdvariantno fields are not used, and the incidence field is redundant.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     234       235       34          -45        null        456  34                                      0            35            35            1
2     236       null      34          -45        null        456  35                                      0            38            39            1
3     237       238       34          -46        null        457  34                                      0            35            36            1
4     239       null      34          -46        null        457  35                                      0            null          null          -1
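
The genotype calls described for Example 1 can be read back from the score rows. The following is a minimal sketch under stated assumptions: the `ScoreRow` type and the `call_genotype` helper are hypothetical names introduced for illustration (they are not part of GEMS), and an incidence of -1 is taken to mean "no allele detected", as the tables above suggest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScoreRow:
    """Hypothetical record with the score-table fields needed for a diploid call."""
    gid: int                      # germplasm id
    gfdetectorid: int             # marker id
    gdvariantid1: Optional[int]   # first allele id, None where the table shows null
    gdvariantid2: Optional[int]   # second allele id, None where the table shows null
    incidence: float              # assumed: -1 marks a failed score


def call_genotype(row: ScoreRow) -> str:
    """Classify one diploid score row as homozygous, heterozygous or missing."""
    if row.incidence == -1 or row.gdvariantid1 is None or row.gdvariantid2 is None:
        return "missing"
    if row.gdvariantid1 == row.gdvariantid2:
        return "homozygous"
    return "heterozygous"


# The four rows of Example 1, with marker ids exactly as the tables list them.
rows = [
    ScoreRow(456, 34, 35, 35, 1),
    ScoreRow(456, 35, 38, 39, 1),
    ScoreRow(457, 34, 35, 36, 1),
    ScoreRow(457, 35, None, None, -1),
]

calls = [call_genotype(r) for r in rows]
for row, call in zip(rows, calls):
    print(row.gid, row.gfdetectorid, call)
```

Because both allele ids sit on one row here, the incidence value adds no information beyond flagging the failed score, which is consistent with the field being described as redundant.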

Example 2. A diversity dataset using SSRs with pooled individuals

A diversity dataset using SSRs, where a mixture of 15 plants pooled from a population is measured per SSR marker, run on a capillary sequencer. The ratio of alleles for each marker is determined by peak size. All germplasm was tested using all markers (complete matrix).

This example dataset is from 2 germplasm and 4 markers.

The first population (205) is homozygous for the first marker (89), with only one allele (90). For the second marker (92) the population has two alleles (93,94). For the third marker (95) there are three alleles (96,97,98). For the fourth marker (99) the sample is homozygous, with only one allele (100).

The second population (206) has two alleles (90,91) for the first marker (89). For the second marker (92) no allele was detected due to an unknown problem. For the third marker (95) there are two alleles (96,97). For the fourth marker (99) three alleles (100,101,102) are present.

Example 2a.

In this example the data were extracted from the DMS where no samples were recorded in the IMS. The ounitid1, ounitid2, sample_id, individual and gdvariantid2 fields are not used.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     1004      null      583         null       null        205  89                                      1            90            null          1
2     1005      null      583         null       null        205  92                                      1            93            null          0.5
3     1006      null      583         null       null        205  92                                      2            94            null          0.5
4     1007      null      583         null       null        205  95                                      1            96            null          0.33
5     1008      null      583         null       null        205  95                                      2            97            null          0.33
6     1009      null      583         null       null        205  95                                      3            98            null          0.33
7     1010      null      583         null       null        205  99                                      1            100           null          1
8     1011      null      583         null       null        206  89                                      1            90            null          0.5
9     1012      null      583         null       null        206  89                                      2            91            null          0.5
10    1013      null      583         null       null        206  92                                      0            null          null          -1
11    1014      null      583         null       null        206  95                                      1            96            null          0.5
12    1015      null      583         null       null        206  95                                      2            97            null          0.5
13    1016      null      583         null       null        206  99                                      1            100           null          0.33
14    1017      null      583         null       null        206  99                                      2            101           null          0.33
15    1018      null      583         null       null        206  99                                      3            102           null          0.33

Example 2b.

Same as Example 2a, except that sample ids were stored in the DMS. The ounitid2, individual and gdvariantid2 fields are not used.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     1004      null      583         -56        null        205  89                                      1            90            null          1
2     1005      null      583         -56        null        205  92                                      1            93            null          0.5
3     1006      null      583         -56        null        205  92                                      2            94            null          0.5
4     1007      null      583         -56        null        205  95                                      1            96            null          0.33
5     1008      null      583         -56        null        205  95                                      2            97            null          0.33
6     1009      null      583         -56        null        205  95                                      3            98            null          0.33
7     1010      null      583         -56        null        205  99                                      1            100           null          1
8     1011      null      583         -57        null        206  89                                      1            90            null          0.5
9     1012      null      583         -57        null        206  89                                      2            91            null          0.5
10    1013      null      583         -57        null        206  92                                      0            null          null          -1
11    1014      null      583         -57        null        206  95                                      1            96            null          0.5
12    1015      null      583         -57        null        206  95                                      2            97            null          0.5
13    1016      null      583         -57        null        206  99                                      1            100           null          0.33
14    1017      null      583         -57        null        206  99                                      2            101           null          0.33
15    1018      null      583         -57        null        206  99                                      3            102           null          0.33

Example 2c.

Same as Example 2b, except that the data were not stored in the DMS but uploaded directly. The ounitid1, individual and gdvariantid2 fields are not used.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     null      null      583         -56        null        205  89                                      1            90            null          1
2     null      null      583         -56        null        205  92                                      1            93            null          0.5
3     null      null      583         -56        null        205  92                                      2            94            null          0.5
4     null      null      583         -56        null        205  95                                      1            96            null          0.33
5     null      null      583         -56        null        205  95                                      2            97            null          0.33
6     null      null      583         -56        null        205  95                                      3            98            null          0.33
7     null      null      583         -56        null        205  99                                      1            100           null          1
8     null      null      583         -57        null        206  89                                      1            90            null          0.5
9     null      null      583         -57        null        206  89                                      2            91            null          0.5
10    null      null      583         -57        null        206  92                                      0            null          null          -1
11    null      null      583         -57        null        206  95                                      1            96            null          0.5
12    null      null      583         -57        null        206  95                                      2            97            null          0.5
13    null      null      583         -57        null        206  99                                      1            100           null          0.33
14    null      null      583         -57        null        206  99                                      2            101           null          0.33
15    null      null      583         -57        null        206  99                                      3            102           null          0.33
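
In the pooled examples each allele gets its own row, and the incidence column carries the allele ratio. The sketch below groups the Example 2 rows by population and marker and checks that the ratios of each pool add up to roughly 1. The function names and the 0.02 rounding tolerance are assumptions made for illustration, not something GEMS defines.

```python
# (gid, gfdetectorid, gdvariantid1, incidence) copied from the Example 2 rows.
# None stands for the null shown in the table.
rows = [
    (205, 89, 90, 1), (205, 92, 93, 0.5), (205, 92, 94, 0.5),
    (205, 95, 96, 0.33), (205, 95, 97, 0.33), (205, 95, 98, 0.33),
    (205, 99, 100, 1),
    (206, 89, 90, 0.5), (206, 89, 91, 0.5),
    (206, 92, None, -1),
    (206, 95, 96, 0.5), (206, 95, 97, 0.5),
    (206, 99, 100, 0.33), (206, 99, 101, 0.33), (206, 99, 102, 0.33),
]


def allele_ratios(rows):
    """Map (gid, marker) to {allele: ratio}; a failed score gives an empty dict."""
    pooled = {}
    for gid, marker, allele, incidence in rows:
        ratios = pooled.setdefault((gid, marker), {})
        if incidence != -1 and allele is not None:
            ratios[allele] = incidence
    return pooled


def ratios_are_complete(ratios, tolerance=0.02):
    """True when the ratios of one pool sum to 1 within the assumed tolerance."""
    return abs(sum(ratios.values()) - 1.0) <= tolerance


pooled = allele_ratios(rows)
for key in sorted(pooled):
    ratios = pooled[key]
    if not ratios:
        status = "missing"
    elif ratios_are_complete(ratios):
        status = "ok"
    else:
        status = "check"
    print(key, ratios, status)
```

The tolerance is there because three alleles recorded as 0.33 each sum to 0.99 rather than exactly 1.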



Example 3. F2 population dataset

A dataset taken from an F2 population, where several individuals from each population are measured per marker using a mixture of dominant and co-dominant PCR-based markers, run on an agarose gel. Not all germplasm was tested using all markers (sparse matrix), and a locus may have more than 2 alleles.

This example dataset is from 3 germplasm and 2 markers, where 2 individuals were tested for each germplasm.

For the first germplasm (333), the first individual is homozygous for the first marker (56), with allele (57), and the second individual is heterozygous, with alleles (57,58). For the second marker (59) the first individual is heterozygous, with alleles (60,61), and the second individual is homozygous, with allele (61).

For the second germplasm (334), the first individual is heterozygous for the first marker (56), with alleles (57,58), and the second individual is homozygous, with allele (58). For the second marker (59), three alleles (bands) were detected in the first individual (60,61,62), but no alleles were detected in the second individual due to an unknown problem.

For the third germplasm (335), the first marker was not used. The second marker was homozygous with allele (61) for the first individual and homozygous with allele (60) for the second individual.

Example 3a.

In this example the data were extracted from the DMS where no samples were recorded in the IMS. The sample_id, ounitid2 and gdvariantid2 fields are not used.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     498       null      89          null       1           333  56                                      1            57            null          1
2     499       null      89          null       2           333  56                                      1            57            null          1
3     500       null      89          null       2           333  56                                      2            58            null          1
4     501       null      89          null       1           333  59                                      1            60            null          1
5     502       null      89          null       1           333  59                                      2            61            null          1
6     503       null      89          null       2           333  59                                      1            61            null          1
7     504       null      89          null       1           334  56                                      1            57            null          1
8     505       null      89          null       1           334  56                                      2            58            null          1
9     506       null      89          null       2           334  56                                      1            58            null          1
10    507       null      89          null       1           334  59                                      1            60            null          1
11    508       null      89          null       1           334  59                                      2            61            null          1
12    509       null      89          null       1           334  59                                      3            62            null          1
13    510       null      89          null       2           334  59                                      0            null          null          -1
14    511       null      89          null       1           335  59                                      1            61            null          1
15    512       null      89          null       2           335  59                                      1            60            null          1

Example 3b.

Same as Example 3a, except that sample ids were stored in the DMS. The ounitid2 and gdvariantid2 fields are not used, and the incidence field is redundant.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     498       null      89          -46        1           333  56                                      1            57            null          1
2     499       null      89          -47        2           333  56                                      1            57            null          1
3     500       null      89          -47        2           333  56                                      2            58            null          1
4     501       null      89          -46        1           333  59                                      1            60            null          1
5     502       null      89          -46        1           333  59                                      2            61            null          1
6     503       null      89          -47        2           333  59                                      1            61            null          1
7     504       null      89          -48        1           334  56                                      1            57            null          1
8     505       null      89          -48        1           334  56                                      2            58            null          1
9     506       null      89          -49        2           334  56                                      1            58            null          1
10    507       null      89          -48        1           334  59                                      1            60            null          1
11    508       null      89          -48        1           334  59                                      2            61            null          1
12    509       null      89          -48        1           334  59                                      3            62            null          1
13    510       null      89          -49        2           334  59                                      0            null          null          -1
14    511       null      89          -50        1           335  59                                      1            61            null          1
15    512       null      89          -51        2           335  59                                      1            60            null          1

Example 3c.

Same as Example 3b, except that the data were not stored in the DMS but uploaded directly. The ounitid1, ounitid2, gdvariantid2 and gdvariantno fields are not used, and the incidence field is redundant.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     null      null      89          -46        1           333  56                                      1            57            null          1
2     null      null      89          -47        2           333  56                                      1            57            null          1
3     null      null      89          -47        2           333  56                                      2            58            null          1
4     null      null      89          -46        1           333  59                                      1            60            null          1
5     null      null      89          -46        1           333  59                                      2            61            null          1
6     null      null      89          -47        2           333  59                                      1            61            null          1
7     null      null      89          -48        1           334  56                                      1            57            null          1
8     null      null      89          -48        1           334  56                                      2            58            null          1
9     null      null      89          -49        2           334  56                                      1            58            null          1
10    null      null      89          -48        1           334  59                                      1            60            null          1
11    null      null      89          -48        1           334  59                                      2            61            null          1
12    null      null      89          -48        1           334  59                                      3            62            null          1
13    null      null      89          -49        2           334  59                                      0            null          null          -1
14    null      null      89          -50        1           335  59                                      1            61            null          1
15    null      null      89          -51        2           335  59                                      1            60            null          1
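
Example 3 differs from the earlier examples in two ways: alleles are listed one per row for each individual, and a germplasm and marker combination may be absent altogether (sparse matrix). The sketch below collects the alleles per individual and separates "not tested" from "tested but failed". The helper names are hypothetical, and reading a single recorded allele as homozygous simply follows the description of this example; for a dominant marker that reading would not necessarily hold.

```python
# (gid, individual, gfdetectorid, gdvariantid1, incidence) from the Example 3 rows.
# None stands for the null shown in the table.
rows = [
    (333, 1, 56, 57, 1), (333, 2, 56, 57, 1), (333, 2, 56, 58, 1),
    (333, 1, 59, 60, 1), (333, 1, 59, 61, 1), (333, 2, 59, 61, 1),
    (334, 1, 56, 57, 1), (334, 1, 56, 58, 1), (334, 2, 56, 58, 1),
    (334, 1, 59, 60, 1), (334, 1, 59, 61, 1), (334, 1, 59, 62, 1),
    (334, 2, 59, None, -1),
    (335, 1, 59, 61, 1), (335, 2, 59, 60, 1),
]


def alleles_per_individual(rows):
    """Map (gid, individual, marker) to the set of alleles scored for it."""
    scores = {}
    for gid, individual, marker, allele, incidence in rows:
        alleles = scores.setdefault((gid, individual, marker), set())
        if incidence != -1 and allele is not None:
            alleles.add(allele)
    return scores


def describe(scores, gid, individual, marker):
    """Summarise one cell of the sparse germplasm x individual x marker matrix."""
    key = (gid, individual, marker)
    if key not in scores:
        return "not tested"            # no row at all: the sparse case
    count = len(scores[key])
    if count == 0:
        return "missing"               # a row exists but no allele was detected
    if count == 1:
        return "homozygous"
    if count == 2:
        return "heterozygous"
    return "more than two alleles"


scores = alleles_per_individual(rows)
for gid in (333, 334, 335):
    for individual in (1, 2):
        for marker in (56, 59):
            print(gid, individual, marker, describe(scores, gid, individual, marker))
```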

Example 4. A diversity dataset using DArT

A diversity dataset using DArT, where one diploid germplasm is measured per marker (complete matrix).

This example dataset is from 4 germplasm and 4 markers.

In the first germplasm (355) the first (34) and third (38) markers are present, and the second (36) and fourth (40) markers are absent.

In the second germplasm (356) the first (34) and second (36) markers are present, and the third (38) and fourth (40) markers are absent.

In the third germplasm (357) the first (34), second (36) and third (38) markers are present, and the fourth (40) marker is absent.

In the fourth germplasm (358) the first (34) and fourth (40) markers are present, and the second (36) and third (38) markers are absent.

Example 4a.

In this example the data were entered directly into the table, where samples were recorded in the IMS and one germplasm is measured per marker.

dpid  ounitid1  ounitid2  dataset_id  sample_id  individual  gid  gfdetectorid  gfassay  gdvariant_label  gdvariantno  gdvariantid1  gdvariantid2  incidence
1     45        null      12          -78        null        355  34                                      1            35            null          1
2     46        null      12          -78        null        355  36                                      1            37            null          0
3     47        null      12          -78        null        355  38                                      1            39            null          1
4     48        null      12          -78        null        355  40                                      1            41            null          0
5     49        null      12          -79        null        356  34                                      1            35            null          1
6     50        null      12          -79        null        356  36                                      1            37            null          1
7     51        null      12          -79        null        356  38                                      1            39            null          0
8     52        null      12          -79        null        356  40                                      1            41            null          0
9     53        null      12          -60        null        357  34                                      1            35            null          1
10    54        null      12          -60        null        357  36                                      1            37            null          1
11    55        null      12          -60        null        357  38                                      1            39            null          1
12    56        null      12          -60        null        357  40                                      1            41            null          0
13    57        null      12          -78        null        358  34                                      1            35            null          1
14    58        null      12          -78        null        358  36                                      1            37            null          0
15    59        null      12          -78        null        358  38                                      1            39            null          0
16    60        null      12          -78        null        358  40                                      1            41            null          1
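
For a dominant marker system such as DArT, incidence is the score itself: 1 for present and 0 for absent. The sketch below pivots the Example 4 rows into a germplasm by marker presence/absence matrix. The function name is hypothetical, and the marker ids follow the description of Example 4 (markers 34, 36, 38 and 40 for every germplasm).

```python
# (gid, gfdetectorid, incidence); marker ids follow the Example 4 description.
markers = [34, 36, 38, 40]
rows = [
    (355, 34, 1), (355, 36, 0), (355, 38, 1), (355, 40, 0),
    (356, 34, 1), (356, 36, 1), (356, 38, 0), (356, 40, 0),
    (357, 34, 1), (357, 36, 1), (357, 38, 1), (357, 40, 0),
    (358, 34, 1), (358, 36, 0), (358, 38, 0), (358, 40, 1),
]


def presence_matrix(rows, markers):
    """Pivot score rows into {gid: {marker: incidence}}; unscored cells stay None."""
    matrix = {}
    for gid, marker, incidence in rows:
        row = matrix.setdefault(gid, {m: None for m in markers})
        row[marker] = incidence
    return matrix


matrix = presence_matrix(rows, markers)

# Print one line per germplasm, one column per marker.
print("gid", " ".join(f"{m:>4}" for m in markers))
for gid in sorted(matrix):
    print(gid, " ".join(f"{str(matrix[gid][m]):>4}" for m in markers))
```

Cells are initialised to None so that a marker with no row at all would remain distinguishable from a recorded absence (0), although in this complete matrix every cell is filled.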

Example 5. A diversity dataset using SNPs

A diversity dataset using SNPs (complete matrix).

This example dataset is from 2 germplasm and 3 markers.

The first germplasm (12345) is heterozygous for the first marker (57), with alleles (58,59). For the second marker (60) the germplasm is homozygous, with allele (61). For the third marker (63) the germplasm is heterozygous, with alleles (64,65).

The second germplasm (12346) is heterozygous for the first marker (57), with alleles (58,59). For the second marker (60) no allele was detected due to an unknown problem. For the third marker (63) the germplasm is homozygous, with allele (64).

Example 5a.

In this example the data were entered directly into the table, where samples were recorded in the IMS and one germplasm is measured per marker. The ounitid1, ounitid2, individual and gdvariantno fields are not used, and the incidence field is redundant.

dpid ounitid1 ounitid2 dataset_id sample_id individual gid gfdetectorid gdvariantno gdvariantid1 gdvariantid2 incidence
1 null null 4 -356 null 12345 57 0 58 59 1
2 null null 4 -356 null 12345 60 0 61 61 1
3 null null 4 -356 null 12345 63 0 64 65 1
4 null null 4 -357 null 12346 57 0 58 59 1
5 null null 4 -357 null 12346 60 0 null null -1
6 null null 4 -357 null 12346 63 0 64 64 1
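
Because every cell of the Example 5a table is filled, its rows can be split on whitespace and matched to the header directly. The sketch below does that, converts `null` to `None`, and counts scored, missing and heterozygous markers per germplasm. `parse_rows` and `summarise` are hypothetical helpers written for this page, and treating an incidence of -1 as a failed score is an assumption based on the examples.

```python
HEADER = (
    "dpid ounitid1 ounitid2 dataset_id sample_id individual gid "
    "gfdetectorid gdvariantno gdvariantid1 gdvariantid2 incidence"
).split()

TABLE = """\
1 null null 4 -356 null 12345 57 0 58 59 1
2 null null 4 -356 null 12345 60 0 61 61 1
3 null null 4 -356 null 12345 63 0 64 65 1
4 null null 4 -357 null 12346 57 0 58 59 1
5 null null 4 -357 null 12346 60 0 null null -1
6 null null 4 -357 null 12346 63 0 64 64 1
"""


def parse_rows(text):
    """Turn whitespace-separated score rows into dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        values = [None if v == "null" else int(v) for v in line.split()]
        if len(values) != len(HEADER):
            raise ValueError(f"expected {len(HEADER)} fields: {line!r}")
        rows.append(dict(zip(HEADER, values)))
    return rows


def summarise(rows):
    """Count scored, missing and heterozygous markers for each germplasm."""
    summary = {}
    for row in rows:
        counts = summary.setdefault(
            row["gid"], {"scored": 0, "missing": 0, "heterozygous": 0}
        )
        a1, a2 = row["gdvariantid1"], row["gdvariantid2"]
        if row["incidence"] == -1 or a1 is None or a2 is None:
            counts["missing"] += 1
            continue
        counts["scored"] += 1
        if a1 != a2:
            counts["heterozygous"] += 1
    return summary


summary = summarise(parse_rows(TABLE))
for gid, counts in sorted(summary.items()):
    print(gid, counts)
```

Splitting on whitespace only works when no cell is empty; a table with blank gfassay or gdvariant_label cells would need a delimiter that preserves empty fields.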