GEMS Genotyping Score Examples
From ICISWiki
Example 1. Diversity dataset using SSRs run on a capillary sequencer
A diversity dataset using SSRs in which one diploid germplasm is measured per SSR marker, run on a capillary sequencer. All germplasm was tested with all markers (complete matrix).
This example dataset is from 2 germplasm and 2 markers.
The first germplasm (456) is homozygous for the first marker (34), with allele (35). For the second marker (37) the germplasm is heterozygous, with alleles (38,39).
The second germplasm (457) is heterozygous for the first marker (34), with alleles (35,36). For the second marker no allele was detected due to an unknown problem.
Example 1a.
In this example the data were extracted from the DMS and no samples were recorded in the IMS. The sample_id, individual and gdvariantno fields are not used, and the incidence field is redundant.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | 234 | 235 | 34 | null | null | 456 | 34 | 32 | 123/123 | 0 | 35 | 35 | 1 |
| 2 | 236 | null | 34 | null | null | 456 | 35 | 33 | 204/207 | 0 | 38 | 39 | 1 |
| 3 | 237 | 238 | 34 | null | null | 457 | 34 | 32 | 126/129 | 0 | 35 | 36 | 1 |
| 4 | 239 | null | 34 | null | null | 457 | 35 | 33 | X | 0 | null | null | -1 |
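The rows of Example 1a can be turned back into genotype calls by comparing gdvariantid1 with gdvariantid2. The following is a minimal sketch, not part of GEMS: the helper name `call_genotype` and the tuple layout are assumptions made for illustration, and it assumes that an incidence of -1 marks a score where no allele was detected, as in row 4.

```python
from typing import Optional

# Rows from Example 1a, reduced to the fields needed here:
# (gid, gfdetectorid, gdvariantid1, gdvariantid2, incidence)
ROWS = [
    (456, 34, 35, 35, 1),
    (456, 35, 38, 39, 1),
    (457, 34, 35, 36, 1),
    (457, 35, None, None, -1),
]


def call_genotype(variant1: Optional[int], variant2: Optional[int], incidence: float) -> str:
    """Classify one diploid score row (hypothetical helper, not a GEMS function)."""
    # Assumption: incidence -1 or a null variant id means no allele was detected.
    if incidence == -1 or variant1 is None or variant2 is None:
        return "missing"
    return "homozygous" if variant1 == variant2 else "heterozygous"


# One call per (gid, gfdetectorid) pair.
calls = {(gid, marker): call_genotype(v1, v2, inc) for gid, marker, v1, v2, inc in ROWS}

for key, call in calls.items():
    print(key, call)
```

Because both allele ids sit on the same row, one row is enough to make the call, which is why the incidence field carries no extra information for this layout.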
Example 1b.
Same as Example 1a, except that sample ids were stored in the DMS. The individual and gdvariantno fields are not used, and the incidence field is redundant.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | 234 | 235 | 34 | -45 | null | 456 | 34 | | | 0 | 35 | 35 | 1 |
| 2 | 236 | null | 34 | -45 | null | 456 | 35 | | | 0 | 38 | 39 | 1 |
| 3 | 237 | 238 | 34 | -46 | null | 457 | 34 | | | 0 | 35 | 36 | 1 |
| 4 | 239 | null | 34 | -46 | null | 457 | 35 | | | 0 | null | null | -1 |
Example 1c.
Same as Example 1b, except that the data were not stored in the DMS but uploaded directly. The ounitid1, ounitid2, individual and gdvariantno fields are not used, and the incidence field is redundant.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | null | null | 34 | -45 | null | 456 | 34 | | | 0 | 35 | 35 | 1 |
| 2 | null | null | 34 | -45 | null | 456 | 35 | | | 0 | 38 | 39 | 1 |
| 3 | null | null | 34 | -46 | null | 457 | 34 | | | 0 | 35 | 36 | 1 |
| 4 | null | null | 34 | -46 | null | 457 | 35 | | | 0 | null | null | -1 |
Example 2. A diversity dataset using SSRs using pooled individuals
A diversity dataset using SSRs in which a mixture of 15 plants pooled from a population is measured per SSR marker, run on a capillary sequencer, and the ratio of markers is determined by peak size. All germplasm was tested with all markers (complete matrix).
This example dataset is from 2 germplasm and 4 markers.
The first population (205) is homozygous for the first marker (89), with only one allele (90). For the second marker (92) the population has two alleles (93,94). For the third marker (95) there are three alleles (96,97,98). For the fourth marker (99) the sample is homozygous, with only one allele (100).
The second population (206) has alleles (90,91) for the first marker (89). For the second marker (92) no allele was detected due to an unknown problem. For the third marker (95) there are two alleles (96,97). For the fourth marker (99) three alleles (100,101,102) are present.
Example 2a.
In this example the data were extracted from the DMS and no samples were recorded in the IMS. The ounitid2, sample_id, individual and gdvariantid2 fields are not used.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | 1004 | null | 583 | null | null | 205 | 89 | | | 1 | 90 | null | 1 |
| 2 | 1005 | null | 583 | null | null | 205 | 92 | | | 1 | 93 | null | 0.5 |
| 3 | 1006 | null | 583 | null | null | 205 | 92 | | | 2 | 94 | null | 0.5 |
| 4 | 1007 | null | 583 | null | null | 205 | 95 | | | 1 | 96 | null | 0.33 |
| 5 | 1008 | null | 583 | null | null | 205 | 95 | | | 2 | 97 | null | 0.33 |
| 6 | 1009 | null | 583 | null | null | 205 | 95 | | | 3 | 98 | null | 0.33 |
| 7 | 1010 | null | 583 | null | null | 205 | 99 | | | 1 | 100 | null | 1 |
| 8 | 1011 | null | 583 | null | null | 206 | 89 | | | 1 | 90 | null | 0.5 |
| 9 | 1012 | null | 583 | null | null | 206 | 89 | | | 2 | 91 | null | 0.5 |
| 10 | 1013 | null | 583 | null | null | 206 | 92 | | | 0 | null | null | -1 |
| 11 | 1014 | null | 583 | null | null | 206 | 95 | | | 1 | 96 | null | 0.5 |
| 12 | 1015 | null | 583 | null | null | 206 | 95 | | | 2 | 97 | null | 0.5 |
| 13 | 1016 | null | 583 | null | null | 206 | 99 | | | 1 | 100 | null | 0.33 |
| 14 | 1017 | null | 583 | null | null | 206 | 99 | | | 2 | 101 | null | 0.33 |
| 15 | 1018 | null | 583 | null | null | 206 | 99 | | | 3 | 102 | null | 0.33 |
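In the pooled layout each allele has its own row, so the alleles of one population and marker have to be gathered before they can be interpreted. The sketch below is an illustration only: the function names and the 0.02 tolerance are assumptions, chosen so that three shares rounded to 0.33 still count as adding up to 1.

```python
from collections import defaultdict

# Rows from Example 2a, reduced to:
# (gid, gfdetectorid, gdvariantno, gdvariantid1, incidence)
ROWS = [
    (205, 89, 1, 90, 1),
    (205, 92, 1, 93, 0.5), (205, 92, 2, 94, 0.5),
    (205, 95, 1, 96, 0.33), (205, 95, 2, 97, 0.33), (205, 95, 3, 98, 0.33),
    (205, 99, 1, 100, 1),
    (206, 89, 1, 90, 0.5), (206, 89, 2, 91, 0.5),
    (206, 92, 0, None, -1),
    (206, 95, 1, 96, 0.5), (206, 95, 2, 97, 0.5),
    (206, 99, 1, 100, 0.33), (206, 99, 2, 101, 0.33), (206, 99, 3, 102, 0.33),
]


def allele_shares(rows):
    """Group rows into {(gid, marker): {allele id: incidence}}."""
    shares = defaultdict(dict)
    for gid, marker, _variant_no, allele, incidence in rows:
        bucket = shares[(gid, marker)]  # creates an empty entry for failed assays too
        if incidence != -1 and allele is not None:
            bucket[allele] = incidence
    return dict(shares)


def shares_are_consistent(alleles, tolerance=0.02):
    """True when the shares add up to about 1, or when the score is missing."""
    return not alleles or abs(sum(alleles.values()) - 1.0) <= tolerance


shares = allele_shares(ROWS)
for key, alleles in shares.items():
    print(key, alleles, shares_are_consistent(alleles))
```

A missing score, such as population 206 with marker 92, ends up as an empty dictionary rather than being dropped, so it stays distinguishable from a marker that was never run.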
Example 2b.
Same as Example 2a, except that sample ids were stored in the DMS. The ounitid2, individual and gdvariantid2 fields are not used.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | 1004 | null | 583 | -56 | null | 205 | 89 | | | 1 | 90 | null | 1 |
| 2 | 1005 | null | 583 | -56 | null | 205 | 92 | | | 1 | 93 | null | 0.5 |
| 3 | 1006 | null | 583 | -56 | null | 205 | 92 | | | 2 | 94 | null | 0.5 |
| 4 | 1007 | null | 583 | -56 | null | 205 | 95 | | | 1 | 96 | null | 0.33 |
| 5 | 1008 | null | 583 | -56 | null | 205 | 95 | | | 2 | 97 | null | 0.33 |
| 6 | 1009 | null | 583 | -56 | null | 205 | 95 | | | 3 | 98 | null | 0.33 |
| 7 | 1010 | null | 583 | -56 | null | 205 | 99 | | | 1 | 100 | null | 1 |
| 8 | 1011 | null | 583 | -57 | null | 206 | 89 | | | 1 | 90 | null | 0.5 |
| 9 | 1012 | null | 583 | -57 | null | 206 | 89 | | | 2 | 91 | null | 0.5 |
| 10 | 1013 | null | 583 | -57 | null | 206 | 92 | | | 0 | null | null | -1 |
| 11 | 1014 | null | 583 | -57 | null | 206 | 95 | | | 1 | 96 | null | 0.5 |
| 12 | 1015 | null | 583 | -57 | null | 206 | 95 | | | 2 | 97 | null | 0.5 |
| 13 | 1016 | null | 583 | -57 | null | 206 | 99 | | | 1 | 100 | null | 0.33 |
| 14 | 1017 | null | 583 | -57 | null | 206 | 99 | | | 2 | 101 | null | 0.33 |
| 15 | 1018 | null | 583 | -57 | null | 206 | 99 | | | 3 | 102 | null | 0.33 |
Example 2c.
Same as Example 2b, except that the data were not stored in the DMS but uploaded directly. The ounitid1, ounitid2, individual and gdvariantid2 fields are not used.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | null | null | 583 | -56 | null | 205 | 89 | | | 1 | 90 | null | 1 |
| 2 | null | null | 583 | -56 | null | 205 | 92 | | | 1 | 93 | null | 0.5 |
| 3 | null | null | 583 | -56 | null | 205 | 92 | | | 2 | 94 | null | 0.5 |
| 4 | null | null | 583 | -56 | null | 205 | 95 | | | 1 | 96 | null | 0.33 |
| 5 | null | null | 583 | -56 | null | 205 | 95 | | | 2 | 97 | null | 0.33 |
| 6 | null | null | 583 | -56 | null | 205 | 95 | | | 3 | 98 | null | 0.33 |
| 7 | null | null | 583 | -56 | null | 205 | 99 | | | 1 | 100 | null | 1 |
| 8 | null | null | 583 | -57 | null | 206 | 89 | | | 1 | 90 | null | 0.5 |
| 9 | null | null | 583 | -57 | null | 206 | 89 | | | 2 | 91 | null | 0.5 |
| 10 | null | null | 583 | -57 | null | 206 | 92 | | | 0 | null | null | -1 |
| 11 | null | null | 583 | -57 | null | 206 | 95 | | | 1 | 96 | null | 0.5 |
| 12 | null | null | 583 | -57 | null | 206 | 95 | | | 2 | 97 | null | 0.5 |
| 13 | null | null | 583 | -57 | null | 206 | 99 | | | 1 | 100 | null | 0.33 |
| 14 | null | null | 583 | -57 | null | 206 | 99 | | | 2 | 101 | null | 0.33 |
| 15 | null | null | 583 | -57 | null | 206 | 99 | | | 3 | 102 | null | 0.33 |
Example 3. F2 population dataset
A dataset taken from an F2 population in which several individuals from each population are measured per marker using a mixture of dominant and co-dominant PCR-based markers, run on an agarose gel. Not all germplasm was tested with all markers (sparse matrix), and a locus may have more than 2 alleles.
This example dataset is from 3 germplasm and 2 markers, where 2 individuals were tested for each germplasm.
For the first germplasm (333), the first individual is homozygous for the first marker (56), with allele (57), and the second individual is heterozygous, with alleles (57,58). For the second marker (59) the first individual is heterozygous, with alleles (60,61), and the second individual is homozygous, with allele (61).
For the second germplasm (334), the first individual is heterozygous for the first marker (56), with alleles (57,58), and the second individual is homozygous, with allele (58). For the second marker (59), three alleles (bands) were detected in the first individual (60,61,62), but no alleles were detected in the second individual due to an unknown problem.
For the third germplasm (335), the first marker was not used. For the second marker (59) the first individual is homozygous, with allele (61), and the second individual is homozygous, with allele (60).
Example 3a.
In this example the data were extracted from the DMS where no samples were recorded in the IMS. The sample_id, ounitid2 and gdvariantid2 fields are not used.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | 498 | null | 89 | null | 1 | 333 | 56 | | | 1 | 57 | null | 1 |
| 2 | 499 | null | 89 | null | 2 | 333 | 56 | | | 1 | 57 | null | 1 |
| 3 | 500 | null | 89 | null | 2 | 333 | 56 | | | 2 | 58 | null | 1 |
| 4 | 501 | null | 89 | null | 1 | 333 | 59 | | | 1 | 60 | null | 1 |
| 5 | 502 | null | 89 | null | 1 | 333 | 59 | | | 2 | 61 | null | 1 |
| 6 | 503 | null | 89 | null | 2 | 333 | 59 | | | 1 | 61 | null | 1 |
| 7 | 504 | null | 89 | null | 1 | 334 | 56 | | | 1 | 57 | null | 1 |
| 8 | 505 | null | 89 | null | 1 | 334 | 56 | | | 2 | 58 | null | 1 |
| 9 | 506 | null | 89 | null | 2 | 334 | 56 | | | 1 | 58 | null | 1 |
| 10 | 507 | null | 89 | null | 1 | 334 | 59 | | | 1 | 60 | null | 1 |
| 11 | 508 | null | 89 | null | 1 | 334 | 59 | | | 2 | 61 | null | 1 |
| 12 | 509 | null | 89 | null | 1 | 334 | 59 | | | 3 | 62 | null | 1 |
| 13 | 510 | null | 89 | null | 2 | 334 | 59 | | | 0 | null | null | -1 |
| 14 | 511 | null | 89 | null | 1 | 335 | 59 | | | 1 | 61 | null | 1 |
| 15 | 512 | null | 89 | null | 2 | 335 | 59 | | | 1 | 60 | null | 1 |
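With one row per allele and a sparse matrix, there are two different kinds of gap: a marker that was run but gave no allele (incidence -1) and a marker that was never run on a germplasm (no row at all). The sketch below separates the two cases; the function names and the labels are assumptions made for illustration, not GEMS terminology.

```python
# Rows from Example 3a, reduced to:
# (gid, individual, gfdetectorid, gdvariantid1, incidence)
ROWS = [
    (333, 1, 56, 57, 1), (333, 2, 56, 57, 1), (333, 2, 56, 58, 1),
    (333, 1, 59, 60, 1), (333, 1, 59, 61, 1), (333, 2, 59, 61, 1),
    (334, 1, 56, 57, 1), (334, 1, 56, 58, 1), (334, 2, 56, 58, 1),
    (334, 1, 59, 60, 1), (334, 1, 59, 61, 1), (334, 1, 59, 62, 1),
    (334, 2, 59, None, -1),
    (335, 1, 59, 61, 1), (335, 2, 59, 60, 1),
]


def collect_alleles(rows):
    """Gather the detected allele ids per (gid, individual, marker)."""
    alleles = {}
    for gid, individual, marker, allele, incidence in rows:
        # setdefault keeps failed assays as an empty set instead of dropping them
        bucket = alleles.setdefault((gid, individual, marker), set())
        if incidence != -1 and allele is not None:
            bucket.add(allele)
    return alleles


def describe(alleles, key):
    """Label one (gid, individual, marker) combination."""
    if key not in alleles:
        return "not tested"  # sparse matrix: no row exists
    count = len(alleles[key])
    if count == 0:
        return "missing"  # assay run, no allele detected
    if count == 1:
        return "homozygous"
    if count == 2:
        return "heterozygous"
    return "more than two alleles"


alleles = collect_alleles(ROWS)
for key in [(333, 1, 56), (333, 2, 56), (334, 1, 59), (334, 2, 59), (335, 1, 56)]:
    print(key, describe(alleles, key))
```

The labels follow the wording of the example description; for a dominant marker a single band does not by itself establish homozygosity, so the "homozygous" label should be read with that caveat.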
Example 3b.
Same as Example 3a, except that sample ids were stored in the DMS. The ounitid2 and gdvariantid2 fields are not used, and the incidence field is redundant.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | 498 | null | 89 | -46 | 1 | 333 | 56 | | | 1 | 57 | null | 1 |
| 2 | 499 | null | 89 | -47 | 2 | 333 | 56 | | | 1 | 57 | null | 1 |
| 3 | 500 | null | 89 | -47 | 2 | 333 | 56 | | | 2 | 58 | null | 1 |
| 4 | 501 | null | 89 | -46 | 1 | 333 | 59 | | | 1 | 60 | null | 1 |
| 5 | 502 | null | 89 | -46 | 1 | 333 | 59 | | | 2 | 61 | null | 1 |
| 6 | 503 | null | 89 | -47 | 2 | 333 | 59 | | | 1 | 61 | null | 1 |
| 7 | 504 | null | 89 | -48 | 1 | 334 | 56 | | | 1 | 57 | null | 1 |
| 8 | 505 | null | 89 | -48 | 1 | 334 | 56 | | | 2 | 58 | null | 1 |
| 9 | 506 | null | 89 | -49 | 2 | 334 | 56 | | | 1 | 58 | null | 1 |
| 10 | 507 | null | 89 | -48 | 1 | 334 | 59 | | | 1 | 60 | null | 1 |
| 11 | 508 | null | 89 | -48 | 1 | 334 | 59 | | | 2 | 61 | null | 1 |
| 12 | 509 | null | 89 | -48 | 1 | 334 | 59 | | | 3 | 62 | null | 1 |
| 13 | 510 | null | 89 | -49 | 2 | 334 | 59 | | | 0 | null | null | -1 |
| 14 | 511 | null | 89 | -50 | 1 | 335 | 59 | | | 1 | 61 | null | 1 |
| 15 | 512 | null | 89 | -51 | 2 | 335 | 59 | | | 1 | 60 | null | 1 |
Example 3c.
Same as Example 3b, except that the data were not stored in the DMS but uploaded directly. The ounitid1, ounitid2, gdvariantid2 and gdvariantno fields are not used, and the incidence field is redundant.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | null | null | 89 | -46 | 1 | 333 | 56 | | | 1 | 57 | null | 1 |
| 2 | null | null | 89 | -47 | 2 | 333 | 56 | | | 1 | 57 | null | 1 |
| 3 | null | null | 89 | -47 | 2 | 333 | 56 | | | 2 | 58 | null | 1 |
| 4 | null | null | 89 | -46 | 1 | 333 | 59 | | | 1 | 60 | null | 1 |
| 5 | null | null | 89 | -46 | 1 | 333 | 59 | | | 2 | 61 | null | 1 |
| 6 | null | null | 89 | -47 | 2 | 333 | 59 | | | 1 | 61 | null | 1 |
| 7 | null | null | 89 | -48 | 1 | 334 | 56 | | | 1 | 57 | null | 1 |
| 8 | null | null | 89 | -48 | 1 | 334 | 56 | | | 2 | 58 | null | 1 |
| 9 | null | null | 89 | -49 | 2 | 334 | 56 | | | 1 | 58 | null | 1 |
| 10 | null | null | 89 | -48 | 1 | 334 | 59 | | | 1 | 60 | null | 1 |
| 11 | null | null | 89 | -48 | 1 | 334 | 59 | | | 2 | 61 | null | 1 |
| 12 | null | null | 89 | -48 | 1 | 334 | 59 | | | 3 | 62 | null | 1 |
| 13 | null | null | 89 | -49 | 2 | 334 | 59 | | | 0 | null | null | -1 |
| 14 | null | null | 89 | -50 | 1 | 335 | 59 | | | 1 | 61 | null | 1 |
| 15 | null | null | 89 | -51 | 2 | 335 | 59 | | | 1 | 60 | null | 1 |
Example 4. A diversity dataset using DArT
A diversity dataset using DArT in which one diploid germplasm is measured per marker (complete matrix).
This example dataset is from 4 germplasm and 4 markers.
In the first germplasm (355) the first (34) and third (38) markers are present and the second (36) and fourth (40) markers are absent.
In the second germplasm (356) the first (34) and second (36) markers are present and the third (38) and fourth (40) markers are absent.
In the third germplasm (357) the first (34), second (36) and third (38) markers are present and the fourth (40) marker is absent.
In the fourth germplasm (358) the first (34) and fourth (40) markers are present and the second (36) and third (38) markers are absent.
Example 4a.
In this example the data were entered directly into the table, samples were recorded in the IMS, and one germplasm is measured per marker.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gfassay | gdvariant_label | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | 45 | null | 12 | -78 | null | 355 | 34 | | | 1 | 35 | null | 1 |
| 2 | 46 | null | 12 | -78 | null | 355 | 36 | | | 1 | 37 | null | 0 |
| 3 | 47 | null | 12 | -78 | null | 355 | 38 | | | 1 | 39 | null | 1 |
| 4 | 48 | null | 12 | -78 | null | 355 | 40 | | | 1 | 41 | null | 0 |
| 5 | 49 | null | 12 | -79 | null | 356 | 34 | | | 1 | 35 | null | 1 |
| 6 | 50 | null | 12 | -79 | null | 356 | 36 | | | 1 | 37 | null | 1 |
| 7 | 51 | null | 12 | -79 | null | 356 | 38 | | | 1 | 39 | null | 0 |
| 8 | 52 | null | 12 | -79 | null | 356 | 40 | | | 1 | 41 | null | 0 |
| 9 | 53 | null | 12 | -60 | null | 357 | 34 | | | 1 | 35 | null | 1 |
| 10 | 54 | null | 12 | -60 | null | 357 | 36 | | | 1 | 37 | null | 1 |
| 11 | 55 | null | 12 | -60 | null | 357 | 38 | | | 1 | 39 | null | 1 |
| 12 | 56 | null | 12 | -60 | null | 357 | 40 | | | 1 | 41 | null | 0 |
| 13 | 57 | null | 12 | -78 | null | 358 | 34 | | | 1 | 35 | null | 1 |
| 14 | 58 | null | 12 | -78 | null | 358 | 36 | | | 1 | 37 | null | 0 |
| 15 | 59 | null | 12 | -78 | null | 358 | 38 | | | 1 | 39 | null | 0 |
| 16 | 60 | null | 12 | -78 | null | 358 | 40 | | | 1 | 41 | null | 1 |
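For a dominant marker system such as DArT, incidence alone carries the score (1 for present, 0 for absent), so the long table can be pivoted into a germplasm-by-marker matrix. The sketch below uses the presence and absence pattern given in the description of Example 4; the variable names are assumptions made for illustration.

```python
# Scores from the description of Example 4: (gid, gfdetectorid, incidence)
ROWS = [
    (355, 34, 1), (355, 36, 0), (355, 38, 1), (355, 40, 0),
    (356, 34, 1), (356, 36, 1), (356, 38, 0), (356, 40, 0),
    (357, 34, 1), (357, 36, 1), (357, 38, 1), (357, 40, 0),
    (358, 34, 1), (358, 36, 0), (358, 38, 0), (358, 40, 1),
]

gids = sorted({gid for gid, _marker, _inc in ROWS})
markers = sorted({marker for _gid, marker, _inc in ROWS})
lookup = {(gid, marker): inc for gid, marker, inc in ROWS}

# One matrix row per germplasm, one column per marker.
# Indexing with [] raises KeyError if the matrix is not complete.
matrix = [[lookup[(gid, marker)] for marker in markers] for gid in gids]

print("gid  " + " ".join(f"{marker:>3}" for marker in markers))
for gid, scores in zip(gids, matrix):
    print(f"{gid:<4} " + " ".join(f"{score:>3}" for score in scores))
```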
Example 5. A diversity dataset using SNPs
A diversity dataset using SNPs (complete matrix).
This example dataset is from 2 germplasm and 3 markers.
The first germplasm (12345) is heterozygous for the first marker (57), with alleles (58,59). For the second marker (60) the germplasm is homozygous, with allele (61). For the third marker (63) the germplasm is heterozygous, with alleles (64,65).
The second germplasm (12346) is heterozygous for the first marker (57), with alleles (58,59). For the second marker (60) no allele was detected due to an unknown problem. For the third marker (63) the germplasm is homozygous, with allele (64).
Example 5a
In this example the data were entered directly into the table, samples were recorded in the IMS, and one germplasm is measured per SNP marker. The ounitid1, ounitid2, individual and gdvariantno fields are not used. Incidence is redundant.
| dpid | ounitid1 | ounitid2 | dataset_id | sample_id | individual | gid | gfdetectorid | gdvariantno | gdvariantid1 | gdvariantid2 | incidence |
| 1 | null | null | 4 | -356 | null | 12345 | 57 | 0 | 58 | 59 | 1 |
| 2 | null | null | 4 | -356 | null | 12345 | 60 | 0 | 61 | 61 | 1 |
| 3 | null | null | 4 | -356 | null | 12345 | 63 | 0 | 64 | 65 | 1 |
| 4 | null | null | 4 | -357 | null | 12346 | 57 | 0 | 58 | 59 | 1 |
| 5 | null | null | 4 | -357 | null | 12346 | 60 | 0 | null | null | -1 |
| 6 | null | null | 4 | -357 | null | 12346 | 63 | 0 | 64 | 64 | 1 |
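A "complete matrix" means that every germplasm has a row for every marker, even when the score itself failed. The sketch below checks that property for Example 5a and reports the share of rows with a usable call; both function names are assumptions made for illustration and are not GEMS functions.

```python
from itertools import product

# Rows from Example 5a, reduced to:
# (gid, gfdetectorid, gdvariantid1, gdvariantid2, incidence)
ROWS = [
    (12345, 57, 58, 59, 1),
    (12345, 60, 61, 61, 1),
    (12345, 63, 64, 65, 1),
    (12346, 57, 58, 59, 1),
    (12346, 60, None, None, -1),
    (12346, 63, 64, 64, 1),
]


def untested_pairs(rows):
    """Return the (gid, marker) combinations that have no row at all."""
    gids = sorted({row[0] for row in rows})
    markers = sorted({row[1] for row in rows})
    seen = {(row[0], row[1]) for row in rows}
    return [pair for pair in product(gids, markers) if pair not in seen]


def call_rate(rows):
    """Fraction of rows where an allele was detected (incidence is not -1)."""
    scored = sum(1 for row in rows if row[4] != -1)
    return scored / len(rows)


print("untested pairs:", untested_pairs(ROWS))
print("call rate:", round(call_rate(ROWS), 2))
```

The failed score for germplasm 12346 and marker 60 lowers the call rate but does not make the matrix incomplete, because the row is still present.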

