Genome-wide association meta-analysis identifies 48 risk variants and highlights the role of the stria vascularis in hearing loss

Summary Hearing loss is one of the top contributors to years lived with disability and is a risk factor for dementia. Molecular evidence on the cellular origins of hearing loss in humans is growing. Here, we performed a genome-wide association meta-analysis of clinically diagnosed and self-reported hearing impairment on 723,266 individuals and identified 48 significant loci, 10 of which are novel. A large proportion of associations comprised missense variants, half of which lie within known familial hearing loss loci. We used single-cell RNA-sequencing data from mouse cochlea and brain and mapped common-variant genomic results to spindle, root, and basal cells from the stria vascularis, a structure in the cochlea necessary for normal hearing. Our findings indicate the importance of the stria vascularis in the mechanism of hearing impairment, providing future paths for developing targets for therapeutic intervention in hearing loss.


Introduction
The number of people with mild-to-complete hearing impairment is projected to increase to an estimated 2.45 billion worldwide by 2050, principally driven by age-related hearing impairment (ARHI). 1 Hearing impairment is ranked third for causes of global years lived with disability (YLDs) across all ages and the leading cause of YLDs in those older than 70, as compared with all other disease categories. 1 The overall global cost of unaddressed hearing loss exceeds $981 billion annually. 2 ARHI has been associated with social withdrawal, depression, anxiety, as well as cognitive decline and dementia. 3 There is no preventive treatment for hearing decline and therapeutics are currently available only in the form of hearing aids or cochlear implants. Moreover, the impact of untreated hearing loss remains underestimated as governmental and industry incentives are still very low in comparison to other diseases of equal prevalence. 4 Hearing thresholds tend to deteriorate gradually with age and ARHI is typically more pronounced in the higher frequencies. Knowledge of the pathophysiological mechanisms of hearing loss derives primarily from animal studies (particularly mouse models 5 ), as well as clinical research on specific families with hearing loss. 6 Hearing loss is moderately heritable, with recent studies attributing 36%-70% of the variation in the heritability of hearing impairment to additive genetic effects. 7,8 Large genome-wide association studies (GWASs) have been recently conducted: UK Biobank (n ¼ 87,056 individuals with self-reported hearing difficulty) revealed 44 independent loci associated with self-reported hearing difficulty and confirmed that hearing loss is a complex polygenic disorder. 9 A combined Icelandic cohort and UK Biobank (n ¼ 121,934 individuals identified through pure-tone audiograms and self-reported hearing difficulty) yielded another 21 novel associations of which 13 were rare variants. 10 Kalra et al. performed a multi-trait analysis of GWASs (MTAG) 11 using UKB data from up to 337,000 participants with different hearing phenotypes and identified 8 novel hits supported with transcription data. 12 However, many loci were not replicated, which may be explained by differences in phenotyping (ICD diagnoses, self-report, hearing thresholds assessed by audiometry), imbalanced sample size with UK Biobank predominating, statistical power, or ancestral differences between samples. While early-onset genetic hearing loss is determined by monogenic factors, ARHI appearing in late adulthood develops from the interaction of environmental and polygenic factors. 13 In order to gain fundamental knowledge on the genetic basis of hearing loss, we conducted a meta-analysis of 17 hearing loss GWASs using both ICD diagnoses and self-reported hearing loss. The latter has been demonstrated to be a good proxy for formal hearing assessment. 14 The study comprised 147,997 affected individuals and 575,269 control subjects including 60,941 affected individuals that were not in our previously published GWAS meta-analysis. 9 We compiled a dataset comprising multiple different European and US population-based cohorts ( Figure 1A, Table S1).

Study design and phenotyping
Adult male and female participants were included from the following 17 population-based cohort studies: Age, Genes/ Environment Susceptibility -Reykjavik (AGES; n ¼ 3,134), the Danish Twin Registry (DTR; n ¼ 1,314), the Estonian Genome Center at the University of Tartu (EGCUT; n ¼ 55,523), FinnGen (n ¼ 212,544), Framingham Heart Study (FHS; n ¼ 2,536), Health Aging and Body Composition (HABC; n ¼ 1,288), Italian Network of Genetic Isolates -Friuli Venezia Giulia (INGI-FVG; n ¼ 339), the Rotterdam Study (RS, cohorts 1-3; n ¼ 6,157), the Salus in Apulia study (SA; n ¼ 1,780; formerly known as Great Age study), Screening Across the Lifespan Twin (SALT; n ¼ 9,565, and SALTY -young; n ¼ 5,133), Screening Twin Adults: Genes and Environment (STAGE; n ¼ 8,345), TwinsUK (n ¼ 5,125), UK Biobank (UKBB; n ¼ 392,143), and the Women's Genome Health Study (WGHS; n ¼ 18,340). All participants provided written informed consent; ethical approval was obtained locally. The declaration of Helsinki was adhered to. Phenotype was based on ICD9 & 10 diagnoses of hearing loss (EGCUT and FinnGen) or questions on hearing loss (all other cohorts). Mean age overall was 59.6. A detailed description of phenotype definition case control distribution and age range for each study can be found in Table S1. Study details and cohort descriptions are available in supplemental methods. An umbrella ethics license was granted by the local ethics committee, Regionala etikprövningsnämnden in Stockholm (2015/ 2129-31/1). Informed consent was obtained from all participants. Individual ethical license from all contributing cohorts are included in supplemental methods.

Genome-wide association studies and meta-analysis
GWASs have been carried out for each cohort locally and summary statistics were collected for each study. Standardized quality con-trol was performed using EasyQC software, 15 followed by metaanalysis using METAL software. 16 Briefly, the quality control steps with EasyQC were excluding monomorphic SNPs, SNP missingness <0.05, filtering out duplicate SNPs and SNPs with imputation score <0.5. After harmonizing allele coding and marker names, uniformed summary statistics were produced. LD score regression 17 was applied to estimate the impact of population stratification and other confounders on test-statistic inflation for cohorts with sample size n > 5,000 for QC-ed and harmonized GWAS data. The genome build was hg19.
Meta-analysis of GWAS summary statistics was conducted using an inverse-variance-weighted fixed-effect model in METAL. 16 To control for population stratification and other confounders, individual cohorts were adjusted for either genomic control (cohort with sample size >5,000) or intercept. A genome-wide significance threshold was defined as p < 5 3 10 À8 . Conditional and joint A B Figure 1. GWAS meta-analysis for ARHI (n ¼ 723,266) (A) Origin of the datasets used for the meta-analysis of the ARHI GWAS: 8 European countries and the United States. (B) Manhattan plot displays all associations per variant ordered according to their genomic position on the x axis and showing the strength of the association with the Àlog 10 transformed p values on the y axis. The threshold for genome-wide significance (p < 5 3 10 À8 ) is indicated by the red line, while the blue line represents the suggestive threshold (p < 1 3 10 À5 ). association analysis (COJO) was carried out to reveal independent lead SNPs for genome-wide significant loci using GCTA software. 18 We randomly selected 50,000 individuals of European ancestry from UK Biobank as a reference sample for COJO. We examined whether the SNPs have been previously associated with hearing loss in a large-scale GWAS. A locus was designated ''new'' when LD with previous associated variants was <0.6. In the case of missense SNPs, the Combined Annotation Dependent Depletion (CADD) score (GRCh37-v1.6) was used to estimate the deleterious effect of the SNP. 19 Gene prioritization and pathway analysis For gene prioritization in the genome-wide significant loci, we used MAGMA 20 (significance threshold p < 2.66 3 10 À6 ) implemented in FUMA 21 and VEGAS2 software 22 (Bonferroni corrected p < 0.05). We used the offline version of VEGAS2 and analyzed the most associated 10 SNPs flanking 10 kb upstream and downstream the genes. The list of genes was obtained from the VEGAS2 website (https://vegas2.qimrberghofer.edu.au/glist-hg19) and included 26,056 genes. Given that the number of genes in the output would depend on the analysis parameters, we chose to correct for multiple testing assuming 20,000 independent tests by Bonferroni approach.
Genes within 500 Mb from the top SNP were checked for any known association with hearing loss in either humans or mice. For that purpose, existing literature and the website of the International Mouse Phenotyping Consortium (www.mousephenotype. org) were consulted. 23 The Shared Inner Ear Laboratory Database (SHIELD) 24 was used to examine whether candidate genes were expressed in inner and outer hair cells of the cochlea in adult mice (P25-P30), designated positive when expression levels (measured by fluorescent intensity readings) exceeded 10.9, as described in the referenced paper. 25 We used VEGAS2 for pathway analysis based on the results of gene prioritization as described above and the list of pathways provided as part of VEGAS2 distribution (https://vegas2.qimrberghofer. edu.au/biosystems20160324.vegas2pathSYM). 26

Variant analysis
All variants identified in genes of interest in the gnomAD v.2.1.1 were downloaded, and minor allele frequencies (MAFs) were filtered to select for variants that were common (MAF R1%) in the non-Finnish European (EUR) population. Transcripts were selected based on consensus with the Deafness Variation Database transcript catalog, 27 where possible, or otherwise we used the longest transcript according to Ensembl. Variant effects were analyzed using an array of partially orthogonal computational prediction algorithms, PolyPhen-2, 28 CADD, 19 DANN, 29 PROVEAN, 30 REVEL, 31 VEST3, 32 and Eigen, 33 that consider genetic, evolutionary, structural, and biochemical information to infer variant pathogenicity and deleteriousness. The individual algorithmic assessments were aggregated into a consensus ensemble score normalized to the range from zero (variant unanimously predicted to be deleterious) to one (unanimously predicted to be benign). The secondary structure of GJB2 was obtained from the Protein Data Bank (PDB: 2ZW3) and the structural consequences of GJB2 variants were modeled using PyMOL v..1.1.
Fine-mapping was carried out using CAUSALdb-finemapping-pip pipeline (https://github.com/mulinlab/CAUSALdb-finemappingpip). We analyzed 1 Mb regions surrounding lead SNPs in each genome-wide significant locus. The output includes credible sets and posterior probabilities for SNPs to be causal as per PAINTOR, CAVIARBF, and FINEMAP algorithms.

Genetic correlations
In LD Hub 34 we used LD score regression to estimate the genetic correlation between hearing loss and a range of other disorders and traits, to evaluate the extent of shared genetic architecture based on common gene variants and hypothesize about association with potential risk factors. After excluding all UK Biobankbased phenotypes, the list comprised 256 phenotypes. Significance was thus set at p < 2 3 10 À04 after Bonferroni correction.
In the internal genome-wide association library (Omnibus data) collated by the Psychiatric Genomics Consortium, we used LDSC to get SNP-based genetic correlation between hearing loss and psychiatric and anthropometric traits.

Expression data sets
We obtained expression specificity data in 37 GTEx v8 35 human tissues processed by previous research. 36 Briefly, the following tissue filters were applied: (1) tissues with fewer than 100 donors, (2) non-natural tissues (e.g., cancer tissue and cell lines), and (3) testis tissues (expression outlier). Since GTEx does not contain cochlear data, we sought expression data from 36,616 cells originating from 2 datasets from adult mouse cochlea (post-natal day 60, CBA/CaJ) published by Milon et al. 37 (one from the stria vascularis and the other from spiral ganglion neuron [SGN]) summing to 15 different cell types (Table S2). Since these two datasets were genotyped using exactly the same technique in the same technical infrastructure, we merged them first by aggregating the count per gene per cell type and normalized to 1 TPM per cell type to account for the variation of cell counts per cell type in each dataset, while preserving the relative expression pattern per cell type (Table S2). Monocytes and neutrophils were found in both the stria vascularis and in the SGN data, and the correlations of normalized expression between the two datasets were high (0.90 for monocytes and 0.98 for neutrophils based on 15,798 genes), supporting the appropriateness of the merging step. Additionally, since the data from Milon et al. did not contain data from the organ of Corti (e.g., hair cells and and Deiters' cells), we relied on single-cell data extracted by Ranum et al. 38 (post-natal day 15, C3HeB/FeJ). We note that the murine cochlea is functionally mature at P14. Finally, we also used expression data from Zeisel et al. 39 consisting in 160,796 cells of 39 broad cell types sampled from 19 regions in the entire mouse neural system (post-natal day P12-P30, as well as 6 and 8 weeks old, CD-1 and Swiss) that were processed to get cell type expression specificity 36 (Table S2). Only genes with 1:1 orthology between human and mouse were preserved for calculating the expression specificity.

Calculation of cell-type expression specificity
We processed and calculated the expression specificity as previously described. 36 Briefly, in each tissue expression dataset (i.e., organ of Corti, 38 stria vascularis/SGNs, 37 and neural tissue 39 ), we first aggregated the count per gene per cell type and excluded genes that are (1) not expressed in any cell type, (2) with duplicated identifier, or (3) not 1:1 orthologous between mouse and human. We then normalized the expression to 1 TPM (transcripts per million) per cell type. Next, gene expression specificity was calculated per gene per cell type as:

Specificity ¼
Normalized expression in the cell type Sum of normalized expression in all cell types   Specificity ranges from 0 to 1; a higher value indicates that the gene is more specific to the corresponding cell types compared to its expression profiles across all included cell types. We selected genes with the top 10% specificity values in each cell type as the gene list for the cell type that was used for the heritability enrichment analyses.

SNP-heritability enrichment
We used MAGMA (v.1.08) 20 and partitioned LDSC 40 to evaluate whether the top 10% specifically expressed genes per tissue/cell type were enriched of the SNP-based h 2 of hearing loss.
MAGMA evaluated whether the cell-type-specific genes were enriched in hearing loss gene-level associations in two steps. In the first step (SNP-wise gene analysis), we filtered out SNPs with minor allele frequency <1% and poor imputation quality (INFO < 0.6) from the ARHI summary statistics and calculated the p value for per gene association with hearing loss using SNP p values (35 kb upstream and 10 kb downstream window per gene). 36 In the second step (gene-set analysis), the p values were converted to Z-scores, and one-sided tests were performed to compare whether the Z-scores in the gene set (i.e., cell-type-specific genes) were higher than those not in the gene set, which indicated enrichment of SNP-heritability in the gene set. 20,41 We then applied partitioned LDSC adjusting for the baseline annotations. 42 Partitioned LDSC evaluates whether the per-SNP heritability is higher in the SNPs in an SNP list (in our study the SNPs within 5100 kb of the cell-type-specific genes) compared to the other SNPs. 40 We calculated p values from one-sided Z score coefficient for tissue/cell-type-specific genes.

Results
We curated summary statistics from 17 independent cohorts, a total of 723,266 individuals of European descent comprising 147,997 hearing loss cases (20.5%) and 575,269 controls (79.5%). Affected individuals were defined by either clinical diagnosis of hearing loss (ICD9 and 10; FinnGen and EGCUT, 37% of participants) or self-reported hearing impairment (all other cohorts, 63% of participants). We completed a genome-wide meta-anal-ysis of 8,244,938 imputed SNPs that passed quality control (QC) and identified 48 significant loci (p < 5 3 10 À8 ). There was no evidence of residual population stratification in the results of meta-analysis (l GC ¼ 1.2764, l GC scaled to 1,000 cases and 1,000 controls ¼ 1.001173). LD score regression intercept is 1.0039 (0.0095), indicating that the inflation of the GWAS test statistics was due to polygenicity ( Figure S1). SNP heritability (h 2 ) on the observed scale was h 2 ¼ 0.0252 (SE ¼ 0.0013) and on the liability scale was estimated to be between 0.033 (SE ¼ 0.002) and 0.061 (SE ¼ 0.003) given the case/control ratio in our sample (20%) and populational prevalence in the range of 5%-40%. Next, we employed conditional and joint analysis (COJO) to identify lead independent signals in the loci (Table S3). Of 48 lead significant SNPs, 10 were considered novel associations (Table 1), defined as LD < 0.6. LDSC genetic correlations attributable to genome-wide SNPs (rg) were estimated across all hearing loss cohorts (Table S4). Regional locus zoom plots and forest plots of significant loci are presented in Figures S2 and  S3, respectively.

Gene prioritization and pathway analysis
We used MAGMA v.1.08 and VEGAS2 for gene-set analysis and prioritization of genes at associated loci ( Figure S4). Genes were examined for their relationship with hearing loss in human or mice. Seventeen loci were in or near genes with known associations to hearing loss (Table S5). Pathway analysis using VEGAS2 revealed strong enrichment in sensory perception of mechanical stimulus, sensory perception of sound, actin binding, and negative regulation of actin filament polymerization (Table S6). Interestingly, sensory perception pathways included KCNQ4, OTOF, POU4F3, PDH15, and GRIN2B, genes known to play a role in many different aspects of hearing function. Additional fine-mapping analysis identified credible sets of SNPs for each locus with 95% probability of being causal (total of 5,605 SNPs, Table S7).

Missense SNPs
Eight SNPs encoded missense mutations ( Table 2). The proportion of missense SNPs overall was 17%, which is significantly higher than the average of 5  Abbreviations: Chr, chromosome; Pos, genomic position; MAF, minor allele frequency in the current study; AA change, amino acid change; AD, autosomal dominant; AR, autosomal recessive; bp, base pair; EUR, European (non-Finnish); MAF, minor allele frequency; SNP, single-nucleotide polymorphism; DM, disease causing mutation. Pathogenicity score is estimated from an aggregated score detailed in Table S6. Aggregated pathogenicity score is normalized from 0 (variant predicted to be deleterious) to 1 (predicted to be benign).  We conducted an additional in silico functional analysis on these 8 missense variants and the results provide strong evidence that variants in FSCN2 and COL9A3 are highly deleterious and likely impact gene function (Table S8). High-resolution protein structures were only available for GJB2; however, the utilized algorithms were not unanimous regarding the functional consequences of the identified variant GJB2 p.Met34Thr (rs35887622) ( Figure S5A). The GJB2 transporter, also named connexin-26 (Cx26), is a hexamer with the altered amino acid being located at the core of the channel ( Figure S5A). The amino acid exchange results in a substitution of hydrophilic arginine for hydrophobic methionine, which alters the surface energy of the channel and likely affects substrate translocation ( Figure S5B).

Genetic correlations
A significant positive genetic correlation was found between hearing loss and insomnia, depressive symptoms, neuroticism, and obesity (LD Hub, Figure S6, Table S9). A significant negative genetic correlation was found with subjective well-being. No significant correlation was established between hearing loss and several neurological disorders or medical conditions. Additionally, we implemented the same approach to investigate SNP-based correlation between hearing loss and other traits using recent results from the Psychiatric Genomics Consortium and found significant genetic correlation with major depressive disorder, autism spectrum disorder, alcohol dependence, neuroticism, attention deficit hyperactivity disorder, and smoking initiation/smoking ( Figure S7, Table S10).
GTEx tissue enrichment analysis and cell-type specificity of hearing loss genetic associations We next attempted to identify the specific somatic tissues implicated by our GWAS and define a spatial topography of hearing loss heritability-associated gene expression. Expression datasets for multiple human tissues GTEx v.8 do not contain inner ear tissue and, unsurprisingly, LDSC and MAGMA revealed no significant enrichment of SNP-h 2 in GTEx tissues ( Figure S8; Table S11). We relied on mouse cochlear 37 and brain 39 cell-specific expression profiles to determine cell types matching the common variants identified. This approach enabled us to prioritize cells that are fundamental to the etiology of hearing loss. Mouse cochlea scRNA-seq originated from Milon et al., 37 which included the spiral ganglion region and the stria vascularis (total of 36,616 cells; Figure 2A, Table S2), but since this dataset did not include scRNA-seq data from cells of the organ of Corti (that harbor hair cells and Deiters' cells), we included an additional dataset from Ranum et al. containing Deiters' cells and inner and outer hair cells from the mouse cochleae 38 (total of 3,189 cells; Figure 2A; Table S2). Since these two cochlea studies used different methods, these were analyzed separately. Nervous system scRNA-seq included 39 broad cell types from 19 regions of the mouse central, peripheral, and enteric nervous system (total of 160,796 cells; Figure 2B). 39 If hearing loss is associated with a particular cell type, we would expect more of the genome-wide association signal to be concentrated in genes with greater specificity for that cell type. To show evidence connecting hearing loss GWASs to cell type, we used two different methods accounting for gene size and linkage disequilibrium: LDSC, 40 assessing the enrichment of the common SNP heritability of hearing loss in the most cell-type-specific genes and MAGMA, 20 evaluating whether gene-level genetic association with hearing loss linearly increases with cell-type expression specificity. We found no enrichment in cells from the organ of Corti (Deiters' cells, inner and outer hair cells; Figure 2C, Table S12). Arguably, the lack of results in this enrichment analysis could be due to the fact that specificity is a relative measure between these three cell types. This assumes that the ''effective'' genes have similar expression pattern in all three cell types and therefore would not be captured by the specificity measure. However, this reasoning is not supported by current findings since Deiters' cells and hair cells differ considerably in gene expression signature. 38 When assessing the enrichment in SGN and cells from the cochlear lateral wall (stria vascularis), LDSC analysis revealed the involvement of spindle cells of the stria vascularis and root cells of the outer sulcus, whereas MAGMA analysis highlighted the involvement of basal cells of the stria vascularis in hearing loss ( Figure 2D, Table S13). Here, spindle and root cells could not be distinguished molecularly one from another, which is why they were labeled ''spindle root cells.'' In contrast, no enrichment was found in any cell type from the mouse nervous system ( Figure 2E, Table S14). These findings strongly support a prominent role of the stria vascularis in hearing loss.
To further gain insights into the potential molecular mechanisms involved in strial dysfunction, we investigated the top 10% specifically expressed genes in basal (342 genes, Table S15) and spindle root (380 genes, Table S16) cells. In basal cells, 10 genes were associated with SNPs that were GWAS significant, but none of them immune cells (monocytes, neutrophils, and B cells), these are not shown on the schematic. The red line shows the Bonferroni significance threshold (Àlog 10 p value 2.42) (E). Mouse nervous system cell type enrichment showing no significant enrichment. The red line shows the Bonferroni significance threshold (Àlog 10 p value 2.89). Images from (A) and (B) were reproduced from previous work [47][48][49] with permission from Nature Springer.
were found in the significant pathways associated with hearing loss and listed in Table S6. Among these genes, the evidence for an involvement in hearing loss is sparse. For instance, NID2, which has been associated in humans with the Landau-Kleffner syndrome, a rare language disorder with suspicions of hearing loss. 50 PC encodes a pyruvate carboxylase that requires biotin and ATP for catalyzing gluconeogenesis. Pyruvate carboxylase deficiency is a rare severe metabolic disease that in some cases can be manifested with hearing loss. 51 CCS encodes a copper ion binding protein and its mutations are associated with disfunctions of copper metabolism resulting in Wilson disease, a rare inherited disorder that causes excess accumulation of copper in several organs. 52 Individuals with Wilson disease display abnormal auditory brainstem responses. 53 Similarly, AHDC1 is most probably involved in DNA binding, and loss-of-function mutations result in Xia-Gibbs syndrome-a neurodevelopmental disorder with rare presentation of hearing loss. 54 In spindle root cells, EYA4 and HOMER2 were identified in pathways that were found significant in VEGAS2-sensory perception of sound and sensory perception of mechanical stimulus, respectively. The contribution of EYA4 variants to hearing loss has been well established. 55 HOMER2 is involved in intracellular homeostasis of calcium and cytoskeletal organization and has been previously associated with hearing loss, 56 but its function within the stria vascularis remains unknown. TMPRSS9 encodes a membrane-bound serine polyprotease involved in the proliferation of inner ear progenitor cells in the mouse cochlea. 57 GAS2 encodes an actin filament binding protein that plays a role in cell shape and regulating microfilament rearrangements. In mice, Gas2 is expressed in supporting cells but also in the stria vascularis from the post-natal cochlea and its disruption causes hearing loss in mice and human. 58 Taken together, these findings suggest that dysfunctions in the stria vascularis involve a large range of molecular mechanisms, globally impacting strial function.

Discussion
The present genome-wide meta-analysis is among the largest conducted in hearing genetics to date and provides an association catalog that helps to refine the fundamental basis of hearing loss. We find evidence of association for 48 common genetic loci, of which 10 were novel and highlight the role of genes expressed in cochlear lateral wall, consisting of the spiral ligament and stria vascularis as an important contributor to hearing loss. We employed a pragmatic, clinically informed approach by including cohorts that met empirical criteria for sufficient genetic and phenotypic similarity, based on both self-report and medical registries. We previously verified a high genetic correlation between objective measures of hearing loss and questionnaires. 14 Second, our findings point to multiple genes that have been reported to cause Mendelian forms of hearing loss previously, including EYA4, CDH23, TRIOBP, and GJB2, the latter being the most commonly reported gene in autosomal-recessive non-syndromic hearing loss. 59 This study is part of an expanding number of hearing loss GWASs with ever increasing sample size and the emerging functional and cellular bioinformatics tools enable us to unravel more of its pathophysiological pathways. Our study confirms as do other recent GWASs that hearing loss is driven by multiple common variants in known hearing genes.
Our results allow us to draw several broad conclusions. Of importance, a large proportion of potentially disruptive missense variants were found in contrast to other diseaserelated GWASs (Fischer's exact test; p ¼ 0.005; Figure S9), suggesting that a burden of common and rare yet impactful variants may drive the risk of hearing loss.
Second, our results do not point toward a large involvement of the brain within hearing loss. Although this is not entirely unexpected, proficient hearing acuity requires the functional integration of signals that are captured at the level of the cochlea, which are transduced to provide signal down the VIII th cranial nerve and further propagate via the brainstem toward the thalamus and the auditory cortex. Abnormal CNS streaming of signal to noise has been implicated in ARHI 60 and is supported by the association with cognitive decline, but our GTEx and scRNA-seq analyses revealed no enrichment of GWA signals in the brain, nor in its regions or its major cell types.
Third, we found significant positive genetic correlations with depressive symptoms, obesity, and smoking, but not with Alzheimer disease. The latter is interesting as hearing impairment is an established risk factor for cognitive decline and dementia, 61 suggesting that hearing impairment rather than shared underlying genetic factors contribute to the development of dementia. A more complete analysis of the drivers of this relationship is needed, but the inference is that shared environmental factors contributing to hearing loss and dementia will predominate, rather than shared genetic factors.
Fourth, our findings in hearing loss pathway analyses implicate the processes involved in cytoskeleton organization and actin binding, two broad features of the mechano-transduction apparatus of the sensory hair cells. 62 Indeed, these findings are consistent with a recent GWAS performed on the UKBB that localized a number of lead SNPs in cells from the post-natal mouse cochleae using scRNA-seq data 12 or in the human cochlea (mainly in type I SGN, or hair cells) using immunohistochemistry on samples collected from individuals with life-threatening posterior cranial fossa meningioma compressing the brain stem. 63 These included EYA4, LMX1A, PTK2/ FAK, UBE3B, MMP2, SYNJ2, GRM5, TRIOBP, LMO-7, and NOX4. Consistent with their findings, we also identified SPTBN1, a mouse ortholog Spectrin expressed in the cuticular plate at the base of the stereocilia, deletion of which causes profound deafness. 64 Interestingly, synaptic plasticity genes were also found such CTBP2, which is an important marker of the pre-synaptic machinery-namely the synaptic ribbon-gathering the glutamate-filled vesicles prior to their release. Alterations in ribbon abundance has been associated with cochlear synaptopathy in mouse models of noise-induced hearing loss. [65][66][67][68] In these models a decrease in ribbon abundance in the absence of hearing loss (the so-called hidden hearing loss) is thought to be associated with problems in speech in noise recognition and tinnitus. 69,70 Fifth, although lead association SNPs are related to sensory hair cell function and their involvement in hearing loss is well established, our cell-specific enrichment analysis revealed hearing loss being also driven, at least in part, by basal cells and spindle cells in the stria vascularis and root cells in the outer sulcus. Indeed, the lead SNPs mainly related to hair cell and auditory neuron function represent the tip of the iceberg (e.g., STBPN1, CLRN2, EYA4, SYNJ2, CDH23, CTBP2, LMO7, FSCN2, ANKRD24, TRIOBP, and BAIAP2L2) influencing hearing loss with the greatest probability, while the whole iceberg is pictured by the contribution of all GWAS signals, pointing to genes expressed in cells from the lateral wall, namely basal cells and spindle cells of the stria vascularis and root cells in the outer sulcus (e.g., EYA4, MMP2, GJB2, and GJB6). These three cell types are primarily involved in endolymph ion homeostasis. 47,[71][72][73] The basal cells are coupled to each other by tight junctions to prevent leakage of ions 74 and to keep the stria vascularis separate from the spiral ligament. In addition, the spindle cells have recently been shown through gene regulatory networks to have a role in responses to inflammation. 47 However, we were unable to differentiate the spindle cells from the root cells unlike Gu et al. 47 who used single-nucleus RNA-seq and could identify a differential expression between these two cell types.
Our findings are in opposition to a recent study suggesting that outer and inner hair cell loss is the main contributor to ARHI and that strial tissue loss does not correlate with audiologic patterns of ARHI. 75 The human otopathologic analysis of this study focused on cellular loss of the stria vascularis and may have captured loss of basal cells that cover the full extension of stria. However, cellular loss of the other two cell types is not covered by this analysis. The spindle cells reside at the edges of the stria and the root cells are outside of the stria in the outer sulcus region. Furthermore, genetically caused functional loss cannot be recognized at the histopathologic level and may precede cellular loss. The precise contribution of the sensory cell and stria vascularis mechanisms to the development of hearing loss needs to be further elucidated using molecular techniques and at a time before cell death is apparent.
The lack of cochlear tissue in the GTEx biobank is indeed a major limitation to all genetic studies of hearing and communication disorders that needs to be addressed. Having access to eQTL data from human cochlear tissue would also provide significant advances in understanding the biology of hearing loss. These limitations were partially ad-dressed in our study by gathering a unique combination of datasets of mouse cochlear scRNA-seq. Given the complexity of the organ and its ossification, the number of compartments, and the variety of constituent cell types, such comprehensive knowledge was not available until recently. Using expression data from rodents to infer on human auditory physiology may be seen as a limitation, since mouse expression data are not fully representative of human cells. However, from a total of 48 loci we identified, 18 harbor genes that have been associated with hearing loss. From these 18 loci, 16 are related to genes disruption of which causes hearing loss in mice. Thus, these findings strongly argue in favor of the translational reliability of the present findings.
We also note that part of the scRNA-seq mouse data used here was generated from 10X Genomics, which has a sequencing resolution that may have yielded insufficient number of genes to reveal enrichments (e.g., when compared to new methodologies such as Smart-Seq2). For instance, using 1,100 proprioceptive neurons and Smart-seq2, Wu et al. detected 11,000 genes per cell and identified 8 cell types, 76 while previous studies using 10X Genomics favoring a higher number of cells but with lower coverage could not differentiate proprioceptive neurons in any subtypes. 77 In the study from Milon et al., only one type of fibrocyte was identified, whereas there are 5 known types of fibrocytes (type I-V) present in both man and experimental animals, but that have quite different function and molecular expression. 78,79 Thus, new sequencing technologies may offer increased resolution and statistical power to reveal more accurate predictions of the involvement of more specific cells in hearing loss.
Hearing loss is a heterogeneous disorder with many contributing factors during life. A potential implication for future genetic studies is the elucidation of the bulk of common variants using a cost-effective shortcut involving on-line self-reports combined with auditory tests, such as automated speech and noise test. 80 The use of online assessment would allow for a comprehensive recording of phenotypes and environmental exposures in millions of individuals. Conversely, current clinical audiologic phenotyping is mostly limited to pure tone thresholds and by far not exhaustive. Carefully phenotyped individuals with hearing loss in combination with next-generation sequencing may increase the resolution of the genetic coverage and reduce the sample size to thousands, something that has been shown highly effective in the context of schizophrenia. 81

Conclusions
This study of hearing loss identified 48 associated loci, including 10 novel associations and 8 missense SNPs. Our work highlights the role of the cochlear lateral wall including the stria vascularis and the outer sulcus as a contributor to hearing loss. The results provide a valuable resource for the selection of promising genes for further functional validation in pre-clinical models and define targets for screening purposes, drug development, gene therapy, or stratification approaches. We believe such experiments will serve as a solid foundation for ultimately improving therapies against hearing loss.