Download table for the SNP/variant datasets and alignment files of the 3K RG accessions against 16 reference genomes.

 

 

| Reference Genome* | Acronym | Internal genome numbering | Assembly (NCBI ID) | 3KRG SNPs** (PLINK2.0 format) | PLINK2.0 file size (Gb) | Realignment/alignment files*** (CRAM format) | CRAM total file size (Tb) |
|---|---|---|---|---|---|---|---|
| GJ-temp: IRGSP | IRGSP | genome1 | GCF_001433935.1 | Nipponbare IRGSP SNPs and variants | 4.49 | Nipponbare IRGSP CRAM; sorted CRAM | 2.89; 5.74 |
| GJ-subtrp: CHAO MEO | CM | genome15 | GCA_009831315.1 | Chao Meo SNPs and variants | 4.84 | Chao Meo CRAM | 2.89 |
| GJ-trop1: Azucena | AZ | genome5 | GCA_009830595.1 | Azucena SNPs and variants | 4.64 | Azucena CRAM | 2.84 |
| GJ-trop2: KETAN NANGKA | KN | genome14 | GCA_009831275.1 | Ketan Nangka SNPs and variants | 4.88 | Ketan Nangka CRAM | 2.92 |
| cB: ARC 10497 | ARC | genome7 | GCA_009831255.1 | ARC SNPs and variants | 4.92 | ARC CRAM | 2.96 |
| XI-1A: ZhenShan97RS3 | ZS97 | genome28 | GCA_001623345.2 | Zhen Shan 97 SNPs and variants | 4.78 | Zhen Shan 97 CRAM | 2.38 |
| XI-1B1: IR 64 | IR64 | genome6 | GCA_009914875.1 | IR 64 SNPs and variants | 4.75 | IR 64 CRAM; sorted CRAM | 2.40; 6.03 |
| XI-1B2: PR 106 | PR106 | genome13 | GCA_009831045.1 | PR 106 SNPs and variants | 4.78 | PR 106 CRAM | 2.45 |
| XI-2A: GOBOL SAIL | GS | genome16 | GCA_009831025.1 | Gobol Sail SNPs and variants | 4.83 | Gobol Sail CRAM | 2.59 |
| XI-2B: LARHA MUGAD | LM | genome8 | GCA_009831355.1 | Larha Mugad SNPs and variants | 4.95 | Larha Mugad CRAM | 2.62 |
| XI-3A: LIMA | LIMA | genome11 | GCA_009829395.1 | Lima SNPs and variants | 4.72 | Lima CRAM | 2.53 |
| XI-3B1: KHAO YAI GUANG | KYG | genome10 | GCA_009831295.1 | Khao Yai Guang SNPs and variants | 4.76 | Khao Yai Guang CRAM | 2.52 |
| XI-3B2: LIU XU | LX | genome9 | GCA_009829375.1 | Liu Xu SNPs and variants | 4.89 | Liu Xu CRAM | 2.72 |
| XI-adm: MH63RS3 | MH63 | genome27 | GCA_001623365.2 | MH 63 SNPs and variants | 4.79 | MH 63 CRAM | 2.41 |
| cA1: N22 | N22 | genome4 | GCA_001952365.3 | N 22 SNPs and variants | 4.74 | N 22 CRAM | 3.02 |
| cA2: NATEL BORO | NABO | genome12 | GCA_009831335.1 | Natel Boro SNPs and variants | 5.22 | Natel Boro CRAM | 3.21 |
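Because the CRAM sets run to several terabytes each, it can help to estimate storage before downloading. The following is a minimal sketch that sums the sizes listed in the table for a chosen set of references. The dictionary and function names are illustrative and not part of the resource; the numbers are copied from the table and use the units of its column headers.

```python
# Sizes copied from the download table, keyed by reference acronym.
# Units follow the table headers: Gb for PLINK2.0 files, Tb for CRAM sets.
PLINK_SIZE = {
    "IRGSP": 4.49, "CM": 4.84, "AZ": 4.64, "KN": 4.88,
    "ARC": 4.92, "ZS97": 4.78, "IR64": 4.75, "PR106": 4.78,
    "GS": 4.83, "LM": 4.95, "LIMA": 4.72, "KYG": 4.76,
    "LX": 4.89, "MH63": 4.79, "N22": 4.74, "NABO": 5.22,
}
CRAM_SIZE = {
    "IRGSP": 2.89, "CM": 2.89, "AZ": 2.84, "KN": 2.92,
    "ARC": 2.96, "ZS97": 2.38, "IR64": 2.40, "PR106": 2.45,
    "GS": 2.59, "LM": 2.62, "LIMA": 2.53, "KYG": 2.52,
    "LX": 2.72, "MH63": 2.41, "N22": 3.02, "NABO": 3.21,
}
# Only two references list an additional sorted CRAM set.
SORTED_CRAM_SIZE = {"IRGSP": 5.74, "IR64": 6.03}


def storage_needed(acronyms, include_sorted=False):
    """Return (PLINK2.0 total, CRAM total) for the given reference acronyms."""
    acronyms = list(acronyms)
    plink_total = sum(PLINK_SIZE[a] for a in acronyms)
    cram_total = sum(CRAM_SIZE[a] for a in acronyms)
    if include_sorted:
        # Add the sorted CRAM sets where the table lists one.
        cram_total += sum(SORTED_CRAM_SIZE.get(a, 0.0) for a in acronyms)
    return round(plink_total, 2), round(cram_total, 2)


if __name__ == "__main__":
    plink_all, cram_all = storage_needed(PLINK_SIZE)
    print(f"All 16 references: PLINK2.0 {plink_all} Gb, CRAM {cram_all} Tb")
```

Passing a subset, for example `storage_needed(["IRGSP", "IR64"], include_sorted=True)`, gives the totals for just those references, including their sorted CRAM sets.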

 

* The annotation files and transcriptome data are available at https://yongzhou2019.github.io/Rice-Population-Reference-Panel/data/

** The SNPs were called from the 3K rice genome project data against the rice population reference panel, i.e., 16 rice reference genomes, using the GATK-HPC pipeline, which is available on the wiki page https://github.com/IBEXCluster/Rice-Variant-Calling/wiki/ (Zhou Y, Kathiresan N, Yu Z, et al. HPC-based genome variant calling workflow (HPC-GVCW)[J]. bioRxiv, 2023: 2023.06.25.546420).

Generating the files required basic bioinformatics skills and tools, e.g., PLINK 2 (https://www.cog-genomics.org/plink/2.0/).

The SNP files in VCF and PLINK2.0 format are also available at the KAUST repository, DOI: https://doi.org/10.25781/KAUST-12WKO.
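As an illustration of working with the PLINK2.0 files, the sketch below builds a `plink2` command that exports one genomic window to VCF. The file prefixes, chromosome name, and coordinates are hypothetical placeholders, and the downloaded file set may need extra options (for example, a modifier for compressed `.pvar` files), so treat it as a starting point, not as the project's own procedure.

```python
import shlex


def plink2_region_command(pfile_prefix, chrom, start_bp, end_bp, out_prefix):
    """Build a plink2 command that exports one region of a .pgen/.pvar/.psam set to VCF."""
    return [
        "plink2",
        "--pfile", pfile_prefix,   # prefix shared by the .pgen/.pvar/.psam files
        "--allow-extra-chr",       # assumption: contig names may be non-standard
        "--chr", str(chrom),
        "--from-bp", str(start_bp),
        "--to-bp", str(end_bp),
        "--export", "vcf",
        "--out", out_prefix,
    ]


# Hypothetical file prefixes and region, for illustration only.
cmd = plink2_region_command("genome1_3krg", "1", 1_000_000, 2_000_000, "genome1_chr1_1-2Mb")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where `plink2` is installed.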

*** Further operation on and visualization of these files require basic bioinformatics skills and tools, e.g., samtools (http://www.htslib.org/), IGV (https://igv.org/), Geneious (https://www.geneious.com/), or Persephone (https://persephonesoft.com/).
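For example, a single region can be pulled out of a CRAM file with `samtools view` before loading it into a viewer such as IGV. The sketch below only assembles the command. The file names and the region string are hypothetical, and it assumes a coordinate-sorted CRAM with an index (`.crai`) and the matching reference FASTA available locally.

```python
import shlex


def samtools_region_command(cram_path, reference_fasta, region, out_bam):
    """Build a samtools command that extracts one region of a CRAM file as BAM."""
    return [
        "samtools", "view",
        "-b",                    # write BAM output
        "-T", reference_fasta,   # reference FASTA used to decode the CRAM
        "-o", out_bam,
        cram_path,
        region,                  # region queries need a sorted, indexed CRAM
    ]


# Hypothetical file names and region, for illustration only.
view_cmd = samtools_region_command(
    "accession.genome1.sorted.cram", "genome1.fa", "Chr1:1000000-2000000", "accession.chr1.bam"
)
print(shlex.join(view_cmd))
```

Running the printed command (or `subprocess.run(view_cmd, check=True)`) requires samtools to be installed; the resulting BAM is small enough to index and open directly in a genome browser.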

Citations:

1. 3,000 Rice Genomes Project. The 3,000 rice genomes project[J]. GigaScience, 2014, 3(1): 2047-217X-3-7.

2. Wang W, Mauleon R, Hu Z, et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice[J]. Nature, 2018, 557(7703): 43-49.

3. Zhou Y, Chebotarov D, Kudrna D, et al. A platinum standard pan-genome resource that represents the population structure of Asian rice[J]. Scientific Data, 2020, 7(1): 113.

4. Zhou Y, Yu Z, Chebotarov D, et al. Pan-genome inversion index reveals evolutionary insights into the subpopulation structure of Asian rice[J]. Nature Communications, 2023, 14(1): 1567.

5. Yu Z, Chen Y, Zhou Y, et al. Rice Gene Index: A comprehensive pan-genome database for comparative and functional genomics of Asian rice[J]. Molecular Plant, 2023, 16(5): 798-801.