19 genomes of Arabidopsis thaliana

This page contains resources relating to the 19 genomes project, in which the genome sequences, transcriptomes and protein annotations of 19 accessions of the plant Arabidopsis thaliana are described. These genomes are the founders of the MAGIC genetic reference population of recombinant inbred lines, and contribute to the 1001 Arabidopsis genomes project.

These genomes are described in our paper, "Multiple reference genomes and transcriptomes for Arabidopsis thaliana" (Nature, 2011).

The IMR-DENOM software used to assemble the genomes is now available. The newest version is 0.4.1.

Genetic differences between Arabidopsis thaliana accessions underlie the plant’s extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.

Details of Accessions Sequenced

The accessions sequenced are:

Accession  Origin                    AIMS Stock Centre #
Bur-0      Ireland                   CS6643
Can-0      Canary Isles              CS6660
Ct-1       Italy                     CS6674
Edi-0      Scotland                  CS6688
Hi-0       Netherlands               CS6736
Kn-0       Lithuania                 CS6762
Ler-0      Poland, formerly Germany  CS20
Mt-0       Libya                     CS1380
No-0       Germany                   CS6805
Oy-0       Norway                    CS6824
Po-0       Germany                   CS6839
Rsch-4     Russia                    CS6850
Sf-2       Spain                     CS6857
Tsu-0      Japan                     CS6874
Wil-2      Russia                    CS6889
Ws-0       Russia                    CS6891
Wu-0       Germany                   CS6897
Zu-0       Germany                   CS6902
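
For scripted use, the accession table above can be held as a small lookup structure. The following is a minimal sketch in Python. The names ACCESSIONS, stock_number and by_origin are illustrative assumptions and are not part of the project's software or data files. The values are copied from the table on this page.

```python
# Accession table from this page: name -> (origin, stock centre number).
# The structure and helper names are hypothetical, for illustration only.
ACCESSIONS = {
    "Bur-0": ("Ireland", "CS6643"),
    "Can-0": ("Canary Isles", "CS6660"),
    "Ct-1": ("Italy", "CS6674"),
    "Edi-0": ("Scotland", "CS6688"),
    "Hi-0": ("Netherlands", "CS6736"),
    "Kn-0": ("Lithuania", "CS6762"),
    "Ler-0": ("Poland, formerly Germany", "CS20"),
    "Mt-0": ("Libya", "CS1380"),
    "No-0": ("Germany", "CS6805"),
    "Oy-0": ("Norway", "CS6824"),
    "Po-0": ("Germany", "CS6839"),
    "Rsch-4": ("Russia", "CS6850"),
    "Sf-2": ("Spain", "CS6857"),
    "Tsu-0": ("Japan", "CS6874"),
    "Wil-2": ("Russia", "CS6889"),
    "Ws-0": ("Russia", "CS6891"),
    "Wu-0": ("Germany", "CS6897"),
    "Zu-0": ("Germany", "CS6902"),
}


def stock_number(accession):
    """Return the stock centre number for an accession listed in the table."""
    try:
        return ACCESSIONS[accession][1]
    except KeyError:
        raise KeyError(f"{accession!r} is not in the accession table") from None


def by_origin(origin):
    """Return the accessions whose origin matches exactly, sorted by name."""
    return sorted(name for name, (place, _) in ACCESSIONS.items() if place == origin)
```

For example, stock_number("Ler-0") looks up the stock number for Ler-0, and by_origin("Germany") lists the accessions whose origin is given as exactly "Germany". The match is on the full origin string, so Ler-0 ("Poland, formerly Germany") is not included in that list.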

More details of the sequencing (libraries, read lengths, yields, coverage etc) are available here.


The project is a collaboration between

Data Available for Download

NOTE: These data were revised and expanded on 5th September 2011. If you downloaded any annotation data prior to that date please check if there is a more recent version. The genome sequences and lists of variants are unaffected.