In-depth genome characterization of a Brazilian common bean core collection using DArTseq high-density SNP genotyping

Background Common bean is a legume of social and nutritional importance as a food crop, cultivated worldwide especially in developing countries, accounting for an important source of income for small farmers. The availability of the complete sequences of the two common bean genomes has dramatically accelerated and has enabled new experimental strategies to be applied for genetic research. DArTseq has been widely used as a method of SNP genotyping allowing comprehensive genome coverage with genetic applications in common bean breeding programs. Results Using this technology, 6286 SNPs (1 SNP/86.5 Kbp) were genotyped in genic (43.3%) and non-genic regions (56.7%). Genetic subdivision associated to the common bean gene pools (K = 2) and related to grain types (K = 3 and K = 5) were reported. A total of 83% and 91% of all SNPs were polymorphic within the Andean and Mesoamerican gene pools, respectively, and 26% were able to differentiate the gene pools. Genetic diversity analysis revealed an average HE of 0.442 for the whole collection, 0.102 for Andean and 0.168 for Mesoamerican gene pools (FST = 0.747 between gene pools), 0.440 for the group of cultivars and lines, and 0.448 for the group of landrace accessions (FST = 0.002 between cultivar/line and landrace groups). The SNP effects were predicted with predominance of impact on non-coding regions (77.8%). SNPs under selection were identified within gene pools comparing landrace and cultivar/line germplasm groups (Andean: 18; Mesoamerican: 69) and between the gene pools (59 SNPs), predominantly on chromosomes 1 and 9. The LD extension estimate corrected for population structure and relatedness (r²SV) was ~ 88 kbp, while for the Andean gene pool was ~ 395 kbp, and for the Mesoamerican was ~ 130 kbp. Conclusions For common bean, DArTseq provides an efficient and cost-effective strategy of generating SNPs for large-scale genome-wide studies. The DArTseq resulted in an operational panel of 560 polymorphic SNPs in linkage equilibrium, providing high genome coverage. This SNP set could be used in genotyping platforms with many applications, such as population genetics, phylogeny relation between common bean varieties and support to molecular breeding approaches. Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3805-4) contains supplementary material, which is available to authorized users.