Release history
KidneyGPS 2.3.1 [2023-09] - update of documentation page
KidneyGPS 2.3.0 [2023-08] - integration of eGFR association statistics separated by diabetes status for all credible set variants.
KidneyGPS 2.2.0 [2023-08] - re-organisation of the GPS tab with extended filter options.
KidneyGPS 2.1.0 [2023-07] - separated drug information by indication for kidney diseases according to the ICD-11 codes.
KidneyGPS 2.0.1 [2023-07] - enable variant-search for chromosome and position (based on GRCh37).
KidneyGPS 2.0 [2023-07] - reanalysis of the 424 loci regarding stability of independent signals. 594 stable independent signals were identified using stepwise conditioning with GCTA. Credible sets were re-calculated for these signals resulting in 35,885 credible set variants.
KidneyGPS 1.3.1 [2023-02] - additional summary numbers in "GPS tab" below GPS table; minor fixes
KidneyGPS 1.3.0 [2022-11] - integration of drug target and interaction information for all genes from Therapeutic Target Database
KidneyGPS 1.2.0 [2022-11] - integration of drug target and interaction information for all genes from Therapeutic Target Database
KidneyGPS 1.2.0 [2022-10] - integration of ADTKD genes, DM-status interaction and eGFRcrea decline association; loosening of the CADD-Phred cutoff for protein-altering variants with a clear functional consequence
KidneyGPS 1.1.2 [2022-06] - small layout changes
KidneyGPS 1.1.1 [2022-04] - minor fixes
KidneyGPS 1.1.0 [2022-03] - integration of eQTL data from Susztaklab
first publication KidneyGPS 1.0 [2022-03]
Gene Prioritisation (GPS) – How to Generate a List of Genes with Defined Properties:
The GPS tab in KidneyGPS comprises three filter sections that can be used to restrict the displayed GPS table to genes and signals with specific properties:
Filtering sections:
-
'Signal-filtering:'
The first section lets you restrict the GPS table to signals with specific properties.
-
Filter for the strength of statistical support: You can restrict to signals based on the posterior probability of association (PPA) of a signal’s credible set variants.
Each credible set is thought to contain the variant causal for the association with a 99% probability.
For instance, selecting PPA>99% will show signals containing only one credible set variant.
PPA>50% indicates a high probability of one variant being causal, and PPA>10% excludes signals with only low probability variants.
Alternatively, you can filter for signals with a small credible set of five or fewer variants.
If both the PPA filter and the small-set filter are selected, signals that meet at least one of these criteria are shown in the displayed GPS table. The 'no filter' option displays all signals.
-
Further filter options:
-
You can restrict to signals that are validated for kidney function relevance based on eGFRcys or BUN associations.
-
You can restrict to signals with different associations in individuals with vs. without diabetes, or signals associated with eGFR decline.
-
'Variant-to-gene mapping:'
The second section deals with variant-to-gene mapping. It allows you to restrict the GPS-view to genes that are mapped by a credible set variant with specific properties.
-
Feature selection: Choose a blue or orange feature to restrict the GPS-view to genes mapped by a credible set variant with that specific property. When multiple options are selected, genes with non-zero entries in any of the respective GPS columns will be shown.
-
Additional mapping restrictions:
If you choose any blue or orange feature, you can further restrict mapped credible set variants based on the strength of statistical support of the mapped variant itself.
As in the first section, you can select PPA and small set criteria. If both PPA and small set criteria are chosen, the mapped variant must satisfy at least one of them;
otherwise, the gene mapped by the variant is omitted from the GPS-view.
-
To display all genes, regardless of variant-to-gene mapping, select the 'no filter' option. Note: These genes may not directly relate to kidney function and could simply be located in an eGFRcrea locus by chance.
-
'Gene-to-phenotype mapping:'
The third section deals with gene-to-phenotype mapping. It allows you to restrict the GPS-view to genes with specific properties.
-
Kidney phenotypes: Choose from two options to restrict the GPS-view to genes with either a kidney phenotype in mouse models or genes known to cause human genetic diseases with kidney phenotypes.
If both options are selected, the gene must satisfy at least one of them.
-
Drug information options: These options refer to drugs listed in the Therapeutic Target Database (TTD), separated by indications.
One option filters genes targeted by drugs for any kidney disease, and the other filters genes targeted by drugs for other diseases or with unspecified indications.
-
If multiple options are selected, genes need to satisfy only one of the chosen criteria.
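The OR logic of the signal filter described above can be sketched as follows. This is only an illustration, not the application's code: the `Signal` class, the function name, and the reading that a signal passes the PPA filter when at least one of its credible set variants exceeds the threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signal:
    """Hypothetical container: one association signal and the PPAs of its credible set variants."""
    name: str
    ppas: List[float]


def passes_signal_filter(
    signal: Signal,
    ppa_threshold: Optional[float] = None,
    small_set: bool = False,
    small_set_max: int = 5,
) -> bool:
    """Return True if the signal would stay in the table under the chosen filters."""
    if ppa_threshold is None and not small_set:
        return True  # corresponds to the 'no filter' option
    # Assumption: the PPA filter is met if any credible set variant exceeds the threshold.
    by_ppa = ppa_threshold is not None and max(signal.ppas) > ppa_threshold
    # Small-set filter: credible set of five or fewer variants.
    by_size = small_set and len(signal.ppas) <= small_set_max
    # If both filters are selected, meeting one of them is enough.
    return by_ppa or by_size


signals = [
    Signal("A", [0.995]),                  # single high-PPA variant
    Signal("B", [0.30, 0.30, 0.20, 0.19]), # small set, no high-PPA variant
    Signal("C", [0.05] * 20),              # large set, only low-PPA variants
]
kept = [s.name for s in signals if passes_signal_filter(s, ppa_threshold=0.5, small_set=True)]
print(kept)
```

With both filters active, signal A is kept because of its PPA, signal B because of its small credible set, and signal C is dropped.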
The GPS Table:
By default, the displayed GPS Table is a full and comprehensive summary of the annotation for all 5,906 genes and 594 eGFRcrea signals.
After applying “Signal-Filtering”, “Variant-to-gene mapping” or “Gene-to-phenotype mapping”, the displayed GPS Table shows the results of the respective data filtering.
Blue and orange columns contain numbers representing the count of credible set variants targeting the respective gene through that specific feature.
If a gene was located in a locus with multiple independent signals or was mapped by credible set variants from different association signals, multiple rows for that gene may be included.
The number of genes meeting the filter criteria is displayed below the GPS Table.
Step-by-step guide
Find here a step-by-step guide with three examples:
step-by-step_guide_for_app.pdf
Gene Search:
The 'Genes' tab lets you search for your gene(s) of interest. Information is available for the 5,906 genes overlapping any eGFRcrea locus.
Search panel
The left panel on the 'Genes' tab is the search panel. There are three options to start your search request:
-
The option 'Gene Name' lets you enter one gene name. KidneyGPS uses the official HGNC gene names and supports only a few synonyms.
-
The option 'Paste a list of genes' lets you enter several gene names to search for at once. Allowed separators are spaces and all non-letter symbols except for hyphens ('-'), as those can be part of a gene name.
-
The option 'Choose txt file with a list of genes' lets you upload a txt file with a list of gene names to search for. After selecting a file via the 'Browse..' button, the separator must be selected. This can be comma, semicolon, or tab.
The 'reset' button deletes the uploaded file.
The last clicked input field is automatically selected and only gene names in this field will be included in the search.
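A pasted gene list can be pre-checked locally before submitting it. The sketch below splits on every character that is not a letter, digit, or hyphen. Keeping digits inside tokens is an assumption made here because gene names such as SHROOM3 contain them; the function name is hypothetical and this is not the application's own parser.

```python
import re
from typing import List


def split_gene_list(text: str) -> List[str]:
    """Split a pasted string into gene names.

    Assumption: letters, digits and hyphens belong to a gene name;
    every other character (space, comma, semicolon, tab, ...) separates names.
    """
    tokens = re.split(r"[^A-Za-z0-9-]+", text)
    return [t for t in tokens if t]  # drop empty strings from leading/trailing separators


print(split_gene_list("UMOD, SHROOM3;HLA-DRB1\tPDILT"))
```

The hyphen in HLA-DRB1 is preserved, while the comma, semicolon, and tab all act as separators.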
Specification panel
Besides searching for all available information on a gene in KidneyGPS, the search results can be restricted or extended using the options in the 'Specification' panel. The colored options are selected by default, so that all information supporting a gene's relevance for kidney function is shown. Unselect options that should not be shown in the results.
Additionally, you can restrict the shown results to genes that map to credible set variants above a user-defined PPA threshold. The higher a variant's PPA, the more likely it is the causal variant for the association signal.
Search results
After clicking 'GO' or pressing 'Enter', the search is performed. First, an excerpt of the GPS Table is shown for the searched genes that are located in an eGFR locus. A message states how many of the searched genes are included in KidneyGPS.
The GPS Table shows how many functionally or regulatorily relevant credible set variants map to each searched gene. Details on these variants can be found below the GPS Table.
If a PPA filter was selected, only credible set variants meeting this criterion are counted in the GPS Table.
Variant Search:
Search panel
The left panel on the 'Variants' tab is the search panel. There are three options to start your search request. KidneyGPS supports RS identifiers and genomic positions (format chr:position) based on genome build hg19 (GRCh37).
-
The option 'Single SNP search' lets you enter one variant.
-
The option 'Paste a list of RSIDs' lets you enter several variants to search for at once. Allowed separators are spaces and all non-letter symbols.
-
The option 'Choose txt file with a list of RSIDs' lets you upload a txt file with a list of variants to search for. After selecting a file via the 'Browse..' button, the separator must be selected. This can be comma, semicolon, or tab.
The 'reset' button deletes the uploaded file.
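The two supported query formats can be told apart with a small helper before submitting a search. The sketch below is illustrative only: the function name, the optional "chr" prefix, and the accepted chromosome labels are assumptions, and the example identifiers are placeholders rather than real KidneyGPS variants.

```python
import re
from typing import Tuple, Union

RSID = re.compile(r"^rs(\d+)$", re.IGNORECASE)
# Assumption: chromosome (with optional 'chr' prefix), a colon, then a base-pair position.
POSITION = re.compile(r"^(?:chr)?(\d{1,2}|X|Y):(\d+)$", re.IGNORECASE)


def parse_variant_query(query: str) -> Union[Tuple[str, str], Tuple[str, str, int]]:
    """Classify a single query as an RS identifier or a chr:position pair (hg19)."""
    q = query.strip()
    match = RSID.match(q)
    if match:
        return ("rsid", "rs" + match.group(1))
    match = POSITION.match(q)
    if match:
        return ("position", match.group(1).upper(), int(match.group(2)))
    raise ValueError(f"Unrecognised variant format: {query!r}")


print(parse_variant_query("rs12345"))      # placeholder identifier
print(parse_variant_query("1:1000000"))    # placeholder position
```

Anything that matches neither pattern raises a `ValueError`, which makes malformed entries in a longer list easy to spot.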
Search options
Similar to the 'Gene search', search results can be restricted or extended using the search options. You can search for eGFR association statistics in 'all ancestries', which are displayed when a variant is genome-wide significant in the all-ancestry meta-analysis.
In general, variants are searched for overlap with 99% credible set variants (selected from the European-only meta-analysis; these do not necessarily meet the genome-wide significance criterion). Additional functional information is available for these variants (coloured options), as well as association statistics separated by diabetes status and a regional association diagram for the locus containing the searched variant(s).
Search results
After clicking 'GO' or pressing 'Enter', the search is performed. If the default options are selected and the searched variants are genome-wide significantly associated or are credible set variants, the first two tables display the association statistics. If a searched variant has functional consequences for a gene, this is displayed below.
Region Search:
Search panel
The 'Region' tab lets you search a region for overlapping eGFR association signals. The region is defined by selecting the chromosome and entering the start and end position of the region.
Search results
After clicking 'GO' or pressing 'Enter', the search is performed. If the searched region overlaps an eGFR locus at least partly, two result tables are displayed. The first table shows the independent signals included in the overlapping eGFR locus. The second table is the GPS Table excerpt for those signals, including all genes in the eGFR locus.
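The notion of a region that overlaps a locus "at least partly" corresponds to a standard interval-overlap test. The following sketch shows that test. The `Locus` type, the function name, and the coordinates are hypothetical, and closed intervals are assumed.

```python
from typing import List, NamedTuple


class Locus(NamedTuple):
    """Hypothetical locus record with inclusive start and end positions."""
    name: str
    chrom: str
    start: int
    end: int


def overlapping_loci(chrom: str, start: int, end: int, loci: List[Locus]) -> List[Locus]:
    """Return all loci on the chromosome that share at least one base with [start, end]."""
    if start > end:
        raise ValueError("start must not be larger than end")
    return [
        locus
        for locus in loci
        if locus.chrom == chrom and start <= locus.end and locus.start <= end
    ]


# Made-up coordinates for illustration only.
loci = [
    Locus("locus_1", "1", 1_000, 2_000),
    Locus("locus_2", "1", 5_000, 6_000),
    Locus("locus_3", "2", 1_000, 2_000),
]
print([l.name for l in overlapping_loci("1", 1_900, 5_100, loci)])
```

Two intervals overlap exactly when each one starts before the other ends, so a query that only touches the edge of a locus is still reported.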
Data Sources:
Association with eGFRcrea:
Genetic loci and genes within these loci are based on a GWAS meta-analysis for eGFRcrea of UK Biobank data and CKDGen consortium data (n=1,201,909).
Detailed information on the selection process can be found here.
A GWAS meta-analysis restricted to individuals of European ancestry (n=1,004,040) was used to identify independent association signals and to calculate posterior probabilities of association (PPA) for all variants in each signal. The 99% credible set of variants contains the causal variant with 99% probability, under the assumption that there is one causal variant per association signal and that this variant is included in the analysis.
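A common way to build a 99% credible set from per-variant PPAs is to rank the variants by PPA and add them until the cumulative PPA reaches 0.99. The sketch below illustrates this construction with made-up values. It is not taken from the KidneyGPS pipeline, and the function and variant names are hypothetical.

```python
from typing import Dict, List


def credible_set(ppa: Dict[str, float], coverage: float = 0.99) -> List[str]:
    """Return the smallest set of top-ranked variants whose summed PPA reaches `coverage`."""
    ranked = sorted(ppa.items(), key=lambda item: item[1], reverse=True)
    selected: List[str] = []
    cumulative = 0.0
    for variant, probability in ranked:
        selected.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return selected


# Made-up PPAs for one signal (they sum to 1 across the variants of the signal).
example = {"v1": 0.90, "v2": 0.08, "v3": 0.015, "v4": 0.005}
print(credible_set(example))
```

In this example the first two variants reach only 0.98, so a third variant is needed to cross the 0.99 threshold. A signal with a single variant above 0.99 yields a credible set of size one.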
Association with other phenotypes:
eGFRcys & BUN
GWAS meta-analyses were also performed for eGFR estimated from serum cystatin C (eGFRcys, n=460,826) and blood urea nitrogen (BUN, n=852,678). KidneyGPS indicates whether the locus lead variant (the variant with the smallest association p-value in a locus) is nominally significantly associated with eGFRcys or BUN with concordant effect directions.
Summary statistics of these analyses can be downloaded here.
Interaction with diabetes status
Diabetes mellitus (DM) is a risk factor for kidney failure. A GWAS meta-analysis for eGFRcrea conducted separately for individuals with and without DM (n_DM = 178,691, n_noDM = 1,296,113) by Winkler et al. identified 7 loci with a significant DM/noDM difference.
Five of these loci showed a more pronounced effect on eGFR in DM versus noDM (DM>noDM), one locus had a DM-only effect and one locus a noDM-only effect. Further information on the impact of diabetes status on the genetic eGFRcrea effect sizes can be found in the original publication: Winkler et al. Commun. Biol. 2022. Variants identified by this study were mapped to eGFRcrea signals in KidneyGPS via overlap or strong correlation with the signal index variant identified by Stanzick et al.
Association with eGFRcrea decline
Progressive eGFR decline can lead to kidney failure, necessitating dialysis or transplantation. Hence, Gorski et al. [Kidney Int. 2022] searched for genetic associations with annual eGFR decline using 62 longitudinal studies in 343,339 individuals. Associated variants were identified by three approaches. First, a genome-wide screen on eGFR decline unadjusted for baseline eGFR revealed two significantly (P_decline < 5 × 10⁻⁸) associated variants within the UMOD-PDILT locus. Second, a candidate approach among the 263 lead variants for eGFRcrea from Wuttke et al. [Nat. Genet. 2019] identified two associated variants (Bonferroni-corrected: P_decline < 0.05/263 = 1.90 × 10⁻⁴). Third, a genome-wide screen for association with eGFR decline adjusted for eGFRcrea at baseline revealed five variants that were also associated (Bonferroni-corrected: P_decline < 0.05/12 = 4.17 × 10⁻³) with unadjusted eGFRcrea decline. The identified C15orf54 signal maps to a second signal in this locus and is thus not included in our GPS.
We integrated these variants when they resided in an eGFRcrea signal or showed strong correlation with the signal index variant identified by Stanzick et al.
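The two Bonferroni-corrected thresholds quoted above follow from dividing the significance level 0.05 by the number of tests (263 and 12). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Candidate approach: 263 lead variants.
print(f"{bonferroni_threshold(0.05, 263):.2e}")
# Follow-up of 12 variants from the baseline-adjusted screen.
print(f"{bonferroni_threshold(0.05, 12):.2e}")
```

Formatted to three significant digits, the results match the thresholds given in the text.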
CADD:
The Combined Annotation Dependent Depletion (CADD) score is a measure of the deleteriousness of a genetic variant [Rentzsch et al. 2018]. By integrating multiple annotations, it contrasts variants that survived natural selection with simulated mutations.
CADD evaluated ~8.6 billion SNPs, and the CADD-Phred score used on this website represents the rank of a variant compared to all annotated variants. Variants with the coding and non-coding consequences "stop-gained", "stop-lost", "missense", "canonical splice", "noncoding change", "synonymous" or "splice-site" are not restricted regarding their CADD-Phred score.
Variants with "other" consequences are filtered for a CADD-Phred score ≥ 15, which restricts our analysis to the 3.2% most deleterious variants.
Further, the analysis is restricted to variants within the affected gene, as overlap with eQTLs and sQTLs should be minimized to avoid overscoring particular genes and variants. For additional information regarding CADD, please visit the CADD website.
Used version: v1.6 [2020-03-23]
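The link between the CADD-Phred cutoff of 15 and the "3.2% most deleterious variants" follows from the Phred scaling, in which a score of 10 marks the top 10%, 20 the top 1%, and 30 the top 0.1% of ranked variants. A short sketch of the conversion, with hypothetical function names:

```python
import math


def phred_to_top_fraction(phred: float) -> float:
    """Fraction of ranked variants at or above a Phred-scaled score: 10^(-phred/10)."""
    return 10 ** (-phred / 10)


def top_fraction_to_phred(fraction: float) -> float:
    """Inverse conversion: Phred-scaled score for a given top fraction."""
    return -10 * math.log10(fraction)


print(f"{phred_to_top_fraction(15):.1%}")
```

A cutoff of 15 therefore sits halfway (on the log scale) between the top 10% and the top 1% of variants.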
eQTL and sQTL data:
All credible set variants were searched in expression quantitative trait loci (eQTL) databases. Three sources for eQTL data were used:
NEPTUNE
eQTL data from the NEPTUNE study [Gillies et al. 2018] includes cis-eQTLs, which are variants that influence expression of genes within a 1 Mb region centred around the variant. The association between a variant and the expression of a gene was deemed significant if the false discovery rate (FDR) was <0.05.
This eQTL data was obtained from glomerular and tubulo-interstitial tissue. Further information about the NEPTUNE study can be found on the webpage of the study and on the NephQTL browser.
Version from [2017-09-25]
Susztaklab (Sheng et al.)
The Susztaklab also provides comprehensive kidney omics data. We integrated the eQTL data from glomerular and tubulo-interstitial tissue published by Sheng et al. (Sheng, X. et al., Nature Genetics, 2021).
GTEx
In contrast to the other two eQTL sources, the GTEx project is not restricted to kidney tissue. Furthermore, additional splicing-altering variants (sQTLs) were investigated. Thus, the GTEx data integrated here includes cis-eQTL and cis-sQTL information from 48 different tissues with a mapping window of 1 Mb up- and downstream of the transcription start site [Aguet et al. 2021].
Further information about GTEx can be found here.
Used version: GTEx Release v7 [2017-09-05]
Mouse phenotypes:
Information on genes with kidney-relevant phenotypes in mice originates from the Mouse Genome Informatics database (MGI, [Bult et al. 2018]). This includes all phenotypes subordinate to "abnormal kidney morphology" (MP:0002135) and "abnormal kidney physiology" (MP:0002136).
Further information on how this data was collected can be found on the MGI webpage.
Version from [2020-06-03]
Human phenotypes:
We used three sources to identify genes causing genetic disorders with kidney phenotypes in humans:
OMIM
The Online Mendelian Inheritance in Man (OMIM) database [Hamosh et al. 2000] was queried for phenotype entries subordinate to the clinical synopsis class "kidney". Diseases with "kidney"-phenotype entries being: "normal kidneys", "normal renal ultrasound at ages 4 and 7 (in two family)", "no kidney disease", "no renal disease; normal renal function", "normal renal function; no kidney disease" and "no renal findings" were manually excluded.
Be aware that OMIM entries missing a clinical synopsis entry are not included in KidneyGPS regardless of a potential kidney involvement. Further information on the diseases can be found at the OMIM webpage.
Version from [2020-08-07]
Groopman et al.
A list of 625 genes associated with Mendelian forms of kidney and genitourinary disease was published by Groopman et al. in 2019 in the New England Journal of Medicine. The original article "Diagnostic Utility of Exome Sequencing for Kidney Disease" can be found here.
Please note that not all 625 genes are located in an eGFRcrea locus; genes outside these loci cannot be found in KidneyGPS.
Wopperer et al.
Autosomal Dominant Tubulointerstitial Kidney Disease (ADTKD) is a hereditary kidney disease normally caused by mutations in at least one of five genes (UMOD, MUC1, REN, HNF1B, SEC61A1) and leads to kidney failure in mid-adulthood. However, Wopperer et al. (2022) identified 27 putative novel ADTKD genes, of which 9 are located within an eGFRcrea-associated locus.
The disease type of known ADTKD genes is stated as "confirmed ADTKD" in the "Kidney phenotypes in human" section and as "putative ADTKD" for the novel genes. The original publication can be found here.
Drug information:
Information on whether a gene, its mRNA or the respective protein is a known drug target or interacts with a drug (e.g. as a transporter) was downloaded from the Therapeutic Target Database (TTD). "Highest drug status" and "disease/indication" refer to the drug and not necessarily to the shown target-drug pair.
Additional information on TTD can be found in the related publication from Ying Zhou et al. 2022.
Drugs referring to kidney-related indications were obtained using ICD-11 codes GB4'X' to GB9'X'.
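The ICD-11 range GB4'X' to GB9'X' can be read as "codes that start with GB followed by a digit from 4 to 9". The sketch below checks a code against that reading. The function name is hypothetical, and the way TTD indications were actually matched is not described here, so this is only an illustration of the code range.

```python
def is_kidney_indication(icd11_code: str) -> bool:
    """True if the code falls into the ICD-11 range GB4x to GB9x (prefix check only)."""
    code = icd11_code.strip().upper()
    return len(code) >= 3 and code.startswith("GB") and code[2] in "456789"


for code in ["GB40", "GB61.Z", "GB90.4", "GB20", "5A11"]:
    print(code, is_kidney_indication(code))
```

Only the first three characters are inspected, so extension codes after the dot do not affect the result.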
Privacy, data security and License
We take the privacy and security of your data very seriously. Your input data is only used within the application to provide the intended functionality; your input data is not stored.
Data Collection: We do not collect any user data, including IP addresses, searched genes, or variants within the RShiny application.
Data Logging: We do not log any data queries or actions performed by users within the application.
Data Sharing: We do not share any user data or information with third parties.
Data Access: Only authorized personnel have access to the application's underlying infrastructure.
This work is licensed under a Creative Commons Attribution 4.0 International License.
This RShiny web-server is operated by the University of Regensburg. The full privacy policy can be viewed here: https://www.uni-regensburg.de/datenschutz/
Citation
Did you use KidneyGPS for a publication? We would appreciate it if you cited us: "KidneyGPS: an easily accessible web application to prioritize kidney function genes and variants based on evidence from genome-wide association studies", and cited the original data sources appropriately.