Download table for the SNP/variant datasets and the alignment files of the 3K RG accessions mapped to 16 reference genomes.
| Reference Genome* | Acronym | Internal genome numbering | Assembly (NCBI ID) | 3KRG SNPs** (PLINK2.0 format) | PLINK2.0 file size (Gb) | Realignment/alignment files*** (CRAM format) | CRAM total file size (Tb) |
|---|---|---|---|---|---|---|---|
| GJ-temp: IRGSP | IRGSP | genome1 | GCF_001433935.1 | | 4.49 | | 2.89 ; 5.74 |
| GJ-subtrp: CHAO MEO | CM | genome15 | GCA_009831315.1 | | 4.84 | | 2.89 |
| GJ-trop1: Azucena | AZ | genome5 | GCA_009830595.1 | | 4.64 | | 2.84 |
| GJ-trop2: KETAN NANGKA | KN | genome14 | GCA_009831275.1 | | 4.88 | | 2.92 |
| cB: ARC 10497 | ARC | genome7 | GCA_009831255.1 | | 4.92 | | 2.96 |
| XI-1A: ZhenShan97RS3 | ZS97 | genome28 | GCA_001623345.2 | | 4.78 | | 2.38 |
| XI-1B1: IR 64 | IR64 | genome6 | GCA_009914875.1 | | 4.75 | | 2.40 ; 6.03 |
| XI-1B2: PR 106 | PR106 | genome13 | GCA_009831045.1 | | 4.78 | | 2.45 |
| XI-2A: GOBOL SAIL | GS | genome16 | GCA_009831025.1 | | 4.83 | | 2.59 |
| XI-2B: LARHA MUGAD | LM | genome8 | GCA_009831355.1 | | 4.95 | | 2.62 |
| XI-3A: LIMA | LIMA | genome11 | GCA_009829395.1 | | 4.72 | | 2.53 |
| XI-3B1: KHAO YAI GUANG | KYG | genome10 | GCA_009831295.1 | | 4.76 | | 2.52 |
| XI-3B2: LIU XU | LX | genome9 | GCA_009829375.1 | | 4.89 | | 2.72 |
| XI-adm: MH63RS3 | MH63 | genome27 | GCA_001623365.2 | | 4.79 | | 2.41 |
| cA1: N22 | N22 | genome4 | GCA_001952365.3 | | 4.74 | | 3.02 |
| cA2: NATEL BORO | NABO | genome12 | GCA_009831335.1 | | 5.22 | | 3.21 |
* The annotation files and transcriptome data are available on the page: https://yongzhou2019.github.io/Rice-Population-Reference-Panel/data/
** The SNPs were called from the 3K Rice Genomes Project data against the rice population reference panel (i.e., the 16 rice reference genomes) using the GATK-HPC pipeline, which is documented on the wiki page:
https://github.com/IBEXCluster/Rice-Variant-Calling/wiki/ (Zhou Y, Kathiresan N, Yu Z, et al. HPC-based genome variant calling workflow (HPC-GVCW)[J]. bioRxiv, 2023: 2023.06.25.546420).
Generating the files requires basic bioinformatics skills and tools, e.g., PLINK 2.0 (https://www.cog-genomics.org/plink/2.0/).
The SNP files in VCF and PLINK2.0 format are also available at the KAUST repository with DOI: https://doi.org/10.25781/KAUST-12WKO.
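As a rough illustration of how the PLINK2.0 files might be used after download, the Python sketch below assembles a `plink2` command that extracts one genomic region and exports it as a bgzipped VCF. The file prefix, chromosome name, coordinates, and output name are hypothetical placeholders. It assumes the download has been unpacked into a standard `.pgen`/`.pvar`/`.psam` file set sharing one prefix. Chromosome naming can differ between assemblies, so check the `.pvar` file before choosing the `--chr` value.

```python
import shlex


def plink2_region_cmd(pfile_prefix, chrom, start_bp, end_bp, out_prefix):
    """Build a plink2 command that exports one region as a bgzipped VCF.

    All arguments are illustrative; none of the names come from the download page.
    """
    if start_bp > end_bp:
        raise ValueError("start_bp must not exceed end_bp")
    return [
        "plink2",
        "--pfile", pfile_prefix,      # prefix of the .pgen/.pvar/.psam files
        "--allow-extra-chr",          # tolerate non-standard chromosome names
        "--chr", str(chrom),
        "--from-bp", str(start_bp),
        "--to-bp", str(end_bp),
        "--export", "vcf", "bgz",     # write <out_prefix>.vcf.gz
        "--out", out_prefix,
    ]


# Hypothetical example: first 1 Mb of chromosome 1 from a file set named "genome1_snps".
cmd = plink2_region_cmd("genome1_snps", "1", 1, 1_000_000, "genome1_chr1_1Mb")
print(shlex.join(cmd))
```

The command is returned as a list so it can be passed directly to `subprocess.run(cmd, check=True)` without shell quoting issues.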
*** Working with these files requires basic bioinformatics skills and tools, e.g.,
samtools (http://www.htslib.org/), IGV (https://igv.org/), Geneious (https://www.geneious.com/),
or Persephone (https://persephonesoft.com/), for further processing and visualization.
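As one possible way to work with the CRAM files using samtools, the Python sketch below builds a `samtools view` command that extracts a single region from a CRAM file and writes it as BAM. The file names and the region string are hypothetical placeholders. It assumes the CRAM has an index (`.crai`) next to it and that the FASTA passed with `-T` is the same reference genome the reads were aligned to, since CRAM decoding depends on that reference.

```python
import shlex


def samtools_region_cmd(cram_path, reference_fasta, region, out_bam):
    """Build a samtools command that extracts one region from a CRAM as BAM.

    All arguments are illustrative; none of the names come from the download page.
    """
    return [
        "samtools", "view",
        "-b",                     # write BAM output
        "-T", reference_fasta,    # reference FASTA needed to decode the CRAM
        "-o", out_bam,
        cram_path,
        region,                   # e.g. "chr01:1-100000"; requires a .crai index
    ]


# Hypothetical example: a 100 kb window from one accession's CRAM.
view_cmd = samtools_region_cmd(
    "accession.cram", "reference.fa", "chr01:1-100000", "accession_chr01.bam"
)
print(shlex.join(view_cmd))
```

The resulting BAM can then be indexed with `samtools index` and loaded into a viewer such as IGV.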
Citations:
1. 3,000 Rice Genomes Project. The 3,000 rice genomes project[J]. GigaScience, 2014, 3(1): 2047-217X-3-7.
2. Wang W, Mauleon R, Hu Z, et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice[J]. Nature, 2018, 557(7703): 43-49.
3. Zhou Y, Chebotarov D, Kudrna D, et al. A platinum standard pan-genome resource that represents the population structure of Asian rice[J]. Scientific Data, 2020, 7(1): 113.
4. Zhou Y, Yu Z, Chebotarov D, et al. Pan-genome inversion index reveals evolutionary insights into the subpopulation structure of Asian rice[J]. Nature Communications, 2023, 14(1): 1567.
5. Yu Z, Chen Y, Zhou Y, et al. Rice Gene Index: A comprehensive pan-genome database for comparative and functional genomics of Asian rice[J]. Molecular Plant, 2023, 16(5): 798-801.