Data from: Genome-wide association study of behavioral, physiological and gene expression traits in outbred CFW mice

  • Clarissa C. Parker (Creator)
  • Shyam Gopalakrishnan (Creator)
  • Peter Carbonetto (Creator)
  • Natalia M. Gonzales (Creator)
  • Emily Leung (Creator)
  • Yeonhee J Park (Creator)
  • Emmanuel Aryee (Creator)
  • Joe Davis (Creator)
  • David A. Blizard (Creator)
  • Cheryl L. Ackert-Bicknell (Creator)
  • Arimantas Lionikas (Creator)

Dataset

Description

Although mice are the most widely used mammalian model organism, genetic studies have suffered from limited mapping resolution due to extensive linkage disequilibrium (LD) that is characteristic of crosses among inbred strains. Carworth Farms White (CFW) mice are a commercially available outbred mouse population that exhibits rapid LD decay in comparison to other available mouse populations. We performed a genome-wide association study (GWAS) of behavioral, physiological and gene expression phenotypes using 1,200 male CFW mice. We used genotyping by sequencing (GBS) to obtain genotypes at 92,734 SNPs. We also measured gene expression using RNA sequencing in three brain regions. Our study identified numerous behavioral, physiological and expression quantitative trait loci (QTLs). We integrated the behavioral QTL and eQTL results to implicate specific genes, including Azi2 in sensitivity to methamphetamine and Zmynd11 in anxiety-like behavior. The combination of CFW mice, GBS and RNA sequencing constitutes a powerful approach to GWAS in mice.

Data type

Physiological and behavioral trait data: Physiological and behavioral phenotype data on 1,219 mice from the Carworth Farms White (CFW) outbred mouse stock. The data are stored in comma-delimited ("csv") format, with one line per sample.
pheno.csv
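
A minimal Python sketch of reading a comma-delimited phenotype table such as pheno.csv, with one line per sample. It assumes that the first line is a header row and that missing values are coded as "NA" or left empty; neither detail is stated above, and the function name and its `missing` parameter are illustrative only.

```python
import csv


def read_phenotypes(lines, missing=("NA", "")):
    """Return one dict per sample, keyed by the column names in the header row.

    `lines` is any iterable of text lines (for example an open file).
    Values listed in `missing` are replaced by None; this coding of
    missing data is an assumption, not something the record documents.
    """
    reader = csv.DictReader(lines)
    samples = []
    for row in reader:
        samples.append(
            {name: (None if value in missing else value) for name, value in row.items()}
        )
    return samples
```

On the real file this could be called as `read_phenotypes(open("pheno.csv", newline=""))`; values are returned as strings and would still need converting to numbers where appropriate.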

SNPs genotyped in CFW mice: Information about 92,734 single nucleotide polymorphisms (SNPs) on chromosomes 1-19 genotyped in the CFW outbred mouse cohort. The table is stored in space-delimited columns, with one row per SNP. All genomic positions are based on Mouse Genome Assembly 38 from the NCBI database (mm10, December 2011).
map.txt

Genotype data for 1,161 mice at 92,734 SNPs: Table with space-delimited columns containing genotype data for 1,161 mice at 92,734 SNPs. The first column ("id") is the sample id, and the second column ("discard") indicates whether the sample should be discarded. Some flowcell samples were mislabeled, so the identity of those samples cannot be confirmed.
geno.txt.gz
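
The "discard" column described above can be applied while streaming the table, as in the following Python sketch. It assumes that the first line is a header containing the names "id" and "discard", and that a flagged sample carries the value "yes"; the actual coding of the flag is not stated here, so `discard_values` is a hypothetical parameter that should be checked against the file.

```python
def iter_kept_samples(lines, discard_values=("yes",)):
    """Yield (sample_id, genotype_fields) for samples not flagged for discard.

    `lines` is any iterable of whitespace-delimited text lines whose first
    line is a header. `discard_values` lists the flag values that mean
    "discard this sample" (an assumption; verify against the real file).
    """
    lines = iter(lines)
    header = next(lines).split()
    id_col = header.index("id")
    discard_col = header.index("discard")
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[discard_col] in discard_values:
            continue  # sample identity is uncertain, so drop it
        genotypes = [
            value
            for position, value in enumerate(fields)
            if position not in (id_col, discard_col)
        ]
        yield fields[id_col], genotypes
```

For the compressed file, the lines can be supplied with `gzip.open("geno.txt.gz", "rt")` from the Python standard library, which avoids decompressing the table to disk first.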

PLINK sample information file: CFW genotypes stored in PLINK .bed/.bim/.fam format.
cfw.fam

PLINK SNP information file: CFW genotypes stored in PLINK .bed/.bim/.fam format.
cfw.bim

PLINK binary genotype file: CFW genotypes stored in PLINK .bed/.bim/.fam format.
cfw.bed
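
The three PLINK files are meant to be used together, and a quick consistency check can catch an incomplete download. The Python sketch below relies on the standard PLINK 1 binary layout: one line per sample in the .fam file, one line per SNP in the .bim file, and a .bed file that begins with the three bytes 0x6c 0x1b 0x01 followed by each SNP's genotypes packed four samples per byte. The function names are illustrative, and the check assumes the files are in the default SNP-major mode.

```python
import math
import os

# First three bytes of a SNP-major PLINK 1 .bed file.
PLINK_BED_MAGIC = bytes([0x6C, 0x1B, 0x01])


def expected_bed_size(n_samples, n_snps):
    """Size in bytes of a SNP-major .bed file: header plus 2 bits per genotype."""
    bytes_per_snp = math.ceil(n_samples / 4)  # each SNP block is padded to a whole byte
    return len(PLINK_BED_MAGIC) + n_snps * bytes_per_snp


def count_lines(path):
    """Count the lines in a text file without loading it into memory."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def check_fileset(prefix):
    """Return True if prefix.bed matches the sample and SNP counts in .fam and .bim."""
    n_samples = count_lines(prefix + ".fam")
    n_snps = count_lines(prefix + ".bim")
    with open(prefix + ".bed", "rb") as handle:
        magic_ok = handle.read(3) == PLINK_BED_MAGIC
    size_ok = os.path.getsize(prefix + ".bed") == expected_bed_size(n_samples, n_snps)
    return magic_ok and size_ok
```

With the three files in the working directory, `check_fileset("cfw")` would perform the check; it only verifies file sizes and the header, not the genotype values themselves.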

Genotype calls from hippocampus RNA-seq data: VCF file containing the genotypes at SNPs identified from RNA-seq data in the hippocampus. Genotypes and allele counts in this VCF file are used for analyzing allele-specific expression in the hippocampus.
rna_HIP.vcf

Genotype calls from prefrontal cortex RNA-seq data: VCF file containing the genotypes at SNPs identified from RNA-seq data in the prefrontal cortex. Genotypes and allele counts in this VCF file are used for analyzing allele-specific expression in the prefrontal cortex.
rna_PFC.vcf

Genotype calls from striatum RNA-seq data: VCF file containing the genotypes at SNPs identified from RNA-seq data in the striatum. Genotypes and allele counts in this VCF file are used for analyzing allele-specific expression in the striatum.
rna_STR.vcf

Normalized RNA-seq fpkm values: Tab-delimited file containing fpkm values from the RNA-seq experiments in 3 brain tissues, normalized across samples. The fpkm values for the 3 brain tissues have been combined into a single file. Each row corresponds to a sample, and each column corresponds to a gene. The brain tissue for each measurement is indicated by the column header: a suffix of .HIP is added to the gene names for hippocampus measurements, .STR for the striatum, and .PFC for the prefrontal cortex. This file only includes data for genes with mean fpkm > 1 prior to normalization.
combined.thr.norm.fpkm
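
Because the three tissues share one table, a first step in most analyses is to separate the columns by their suffix. The Python sketch below groups header names by the .HIP, .STR and .PFC suffixes described above. It assumes the header names are available as a list of strings; any column without one of the three suffixes (for example a sample id column, if the file has one) is ignored. The function name is illustrative.

```python
TISSUE_SUFFIXES = (".HIP", ".STR", ".PFC")


def group_columns_by_tissue(column_names):
    """Map each tissue code to a list of (column_index, gene_name) pairs.

    The gene name is the column header with the tissue suffix removed.
    Columns that carry none of the three suffixes are skipped.
    """
    groups = {suffix[1:]: [] for suffix in TISSUE_SUFFIXES}
    for index, name in enumerate(column_names):
        for suffix in TISSUE_SUFFIXES:
            if name.endswith(suffix):
                gene = name[: -len(suffix)]
                groups[suffix[1:]].append((index, gene))
                break
    return groups
```

The header names could be obtained by splitting the first line of the file on tabs, assuming that line holds the column headers; the returned indices can then be used to pull one tissue's values out of each sample row.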

Copyright and Open Data Licencing

This work is licensed under a CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license.
Date made available: 9 Jun 2017
Publisher: Dryad Digital Repository

Keywords

  • behavioral genetics
  • Complex traits
  • genome-wide association studies
  • genotyping-by-sequencing
  • mouse genetics
