DGLinker is a tool for predicting novel human disease-gene associations from a set of genes known to be associated with the target human phenotype(s). In brief, the tool generates a knowledge graph from a set of databases of biological and phenotypic information. An enrichment test is then used to identify features that are predictive of the genes known to be associated with the target phenotype(s). For every gene, the total adjacency with all predictors of each type is calculated from the graph, giving an adjacency matrix with one column per predictor type. The adjacency matrix is then scaled and weighted to produce a final similarity score for every gene. Predictions are made by applying a threshold to this score: all genes above the threshold are predicted as candidate genes. The optimum weighting and score threshold are learned from the known set of associated genes.
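The scoring steps described above (adjacency per predictor type, scaling, weighting, thresholding) can be illustrated with a small sketch. This is not DGLinker's implementation: the function names, the toy graph, the max-based column scaling, and the fixed weights and threshold are all assumptions made for illustration. In the tool itself, the weights and threshold are learned from the known genes.

```python
from typing import Dict, List, Set

Matrix = Dict[str, Dict[str, float]]


def adjacency(graph: Dict[str, Set[str]], predictors: Dict[str, Set[str]]) -> Matrix:
    """Count, for each gene, its edges to predictive nodes of each type."""
    return {
        gene: {ptype: float(len(neighbours & nodes)) for ptype, nodes in predictors.items()}
        for gene, neighbours in graph.items()
    }


def scale_columns(matrix: Matrix) -> Matrix:
    """Divide every column by its maximum (assumed scaling scheme)."""
    ptypes = {p for row in matrix.values() for p in row}
    col_max = {p: max(row[p] for row in matrix.values()) for p in ptypes}
    return {
        gene: {p: (v / col_max[p] if col_max[p] else 0.0) for p, v in row.items()}
        for gene, row in matrix.items()
    }


def score(scaled: Matrix, weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted sum across predictor types gives one score per gene."""
    return {gene: sum(weights[p] * v for p, v in row.items()) for gene, row in scaled.items()}


def predict(scores: Dict[str, float], threshold: float) -> List[str]:
    """Genes scoring above the threshold are returned as candidates."""
    return sorted(g for g, s in scores.items() if s > threshold)


# Hypothetical toy graph: gene -> neighbouring feature nodes.
graph = {
    "GENE_A": {"pathway:P1", "go:G1", "go:G2"},
    "GENE_B": {"pathway:P1"},
    "GENE_C": {"go:G3"},
}
# Hypothetical predictive features, grouped by type (the matrix columns).
predictors = {"pathway": {"pathway:P1"}, "go": {"go:G1", "go:G2"}}
weights = {"pathway": 0.5, "go": 0.5}  # assumed, not learned here

scores = score(scale_columns(adjacency(graph, predictors)), weights)
candidates = predict(scores, threshold=0.6)
```

With these toy inputs, only GENE_A is connected to predictors of both types, so it is the only gene that clears the assumed threshold of 0.6.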



Job Details

Email address:

i (Optional) We will send job status notifications and details to this address


Input options:

 Select phenotype(s) 

i Genes associated with the selected phenotype(s) from DisGeNet will be used

 Select phenotype(s) associated genes 

i This option allows you to define your own set of known gene-disease associations

 Select genes 

i Please provide a set of genes associated with the target phenotype. Use this option only if you could not find the target phenotype using the other options above.

 Retrieve related phenotypes 

i Phenotypes matching the input keywords will be automatically selected

 Ontology-based Search 

i Use the Human Phenotype Ontology (HPO) to query diseases

Select phenotype(s)

Please enter the names of the phenotypes you want to predict, separated by commas. Phenotypes recognised: 0/0
Phenotypes not recognised:
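As a rough illustration of how comma-separated input could be split into recognised and unrecognised phenotypes, consider the sketch below. The function name `match_phenotypes`, the example vocabulary, and the case-insensitive matching are assumptions; the page does not state how DGLinker matches names.

```python
from typing import Iterable, List, Tuple


def match_phenotypes(raw: str, vocabulary: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split comma-separated input and match names against a vocabulary.

    Matching is case-insensitive here, which is an assumption.
    """
    names = [part.strip() for part in raw.split(",") if part.strip()]
    lookup = {term.lower(): term for term in vocabulary}
    recognised = [lookup[n.lower()] for n in names if n.lower() in lookup]
    unrecognised = [n for n in names if n.lower() not in lookup]
    return recognised, unrecognised


# Hypothetical vocabulary and user input.
vocabulary = ["Amyotrophic lateral sclerosis", "Parkinson disease"]
recognised, unrecognised = match_phenotypes(
    "Amyotrophic Lateral Sclerosis, not a phenotype, ", vocabulary
)
summary = f"Phenotypes recognised: {len(recognised)}/{len(recognised) + len(unrecognised)}"
```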


Available databases:

i Please select the databases to use in the generation of the knowledge graph. At least one "Gene-Disease Association" database must be selected.

Gene Pathways
Reactome v2021-06
KEGG v2021-03

Gene-Disease Association
DisGeNet v7.0
OMIM v2021-04
HPO v2021-04
ClinVar v2021-09

Publications
NCBI PubMed v2021-04

Expression
Human Protein Atlas (HPA) v20.1
ArrayExpress Atlas (experiment E-MTAB-513 Illumina body map) v2021-02
GTEx (tissue-specific gene expression and eQTLs) v8

Gene Function (GOterms)
Gene Ontology v2021-02

i GO annotation only, contains three parts: Biological Process, Cellular Component, Molecular Function.

Gene Ontology (Biological Process) v2021-02

i GO annotation only

Gene Ontology (Biological Process) (expanded) v2021-02

i The ontology structure will be used

Gene Ontology (Cellular Component) v2021-02

i GO annotation only

Gene Ontology (Cellular Component) (expanded) v2021-02

i The ontology structure will be used

Gene Ontology (Molecular Function) v2021-02

i GO annotation only

Gene Ontology (Molecular Function) (expanded) v2021-02

i The ontology structure will be used

Protein-Protein Interaction
IntAct v2021-04
BioGRID v4.4
InnateDB v2021-02
Mentha v2021-02
MINT v2021-02
IMEx v2021-02
MatrixDB v2021-02
UniProt v2021-02

i A subset of the IntAct database
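The requirement that at least one "Gene-Disease Association" database be selected could be checked before submission along the following lines. This is a sketch only: `check_selection` is a hypothetical helper, and the dictionary lists just a few of the categories shown above.

```python
from typing import Dict, Iterable, List

# Partial copy of the categories listed on this page (names only, no versions).
DATABASES: Dict[str, List[str]] = {
    "Gene Pathways": ["Reactome", "KEGG"],
    "Gene-Disease Association": ["DisGeNet", "OMIM", "HPO", "ClinVar"],
    "Publications": ["NCBI PubMed"],
}


def check_selection(selected: Iterable[str]) -> List[str]:
    """Return the sorted selection, or raise ValueError if it is invalid."""
    chosen = set(selected)
    known = {name for names in DATABASES.values() for name in names}
    unknown = sorted(chosen - known)
    if unknown:
        raise ValueError(f"Unknown database(s): {', '.join(unknown)}")
    if not chosen & set(DATABASES["Gene-Disease Association"]):
        raise ValueError("Select at least one Gene-Disease Association database")
    return sorted(chosen)


valid = check_selection(["Reactome", "DisGeNet"])

try:
    check_selection(["Reactome", "KEGG"])
    pathways_only_ok = True
except ValueError:
    pathways_only_ok = False
```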


Upload database(s): 

i Please see the tutorial for the format specifications

Model validation:

Apply k-fold cross validation (k=

i Apply k-fold cross-validation to the model. k must be 2, 3, 4 or 5. Validation increases the job processing time by approximately a factor of k.
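For readers unfamiliar with k-fold cross-validation, the sketch below shows how a set of known genes could be partitioned into k folds, with each fold held out once while the rest is used for training. The function name, the seeded shuffle, and the gene labels are assumptions, not DGLinker's code. Because the model is trained once per fold, the roughly k-fold increase in processing time follows directly.

```python
import random
from typing import Iterable, List, Tuple


def k_fold_splits(genes: Iterable[str], k: int, seed: int = 0) -> List[Tuple[List[str], List[str]]]:
    """Return k (train, held_out) pairs; every gene is held out exactly once."""
    if k not in (2, 3, 4, 5):
        raise ValueError("k must be 2, 3, 4 or 5")
    shuffled = sorted(set(genes))
    if len(shuffled) < k:
        raise ValueError("need at least k genes")
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    folds = [shuffled[i::k] for i in range(k)]
    splits = []
    for i, held_out in enumerate(folds):
        train = [g for j, fold in enumerate(folds) if j != i for g in fold]
        splits.append((sorted(train), sorted(held_out)))
    return splits


# Hypothetical known genes.
known_genes = ["G1", "G2", "G3", "G4", "G5", "G6", "G7"]
splits = k_fold_splits(known_genes, k=3)
```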

Enrichment analysis:

Select the threshold cut off =  

i Apply a threshold cut-off to the number of predicted genes included in the analyses. If not checked, the threshold defaults to 100.
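The cut-off described here limits how many predicted genes go into the enrichment analysis. A minimal sketch, assuming the genes are ranked by score and the top ones are kept (the ranking rule, the tie-breaking by name, and the function name `top_predicted` are assumptions):

```python
from typing import Dict, List, Optional

DEFAULT_CUTOFF = 100  # default stated on this page


def top_predicted(scores: Dict[str, float], cutoff: Optional[int] = None) -> List[str]:
    """Keep the highest-scoring genes, up to the cut-off (default 100)."""
    n = DEFAULT_CUTOFF if cutoff is None else cutoff
    if n < 1:
        raise ValueError("cut-off must be a positive integer")
    # Sort by descending score; ties are broken alphabetically (assumed).
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:n]]


# Hypothetical scores for four predicted genes.
example_scores = {"G1": 0.9, "G2": 0.4, "G3": 0.9, "G4": 0.7}
top_two = top_predicted(example_scores, cutoff=2)
all_genes = top_predicted(example_scores)
```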








Retrieve Job

Find your job: 

i Insert your job ID in the box and click "Find" to retrieve the results. You were given a job ID upon submission and it was emailed to you if a valid email address was provided.

  




References

[1] Bean, D.M., Wu, H., Iqbal, E. et al. Knowledge graph prediction of unknown adverse drug reactions and validation in electronic health records. Scientific Reports, 2017, 7(1):16416.

[2] Bean, D.M., Al-Chalabi, A., Dobson, R.J.B., Iacoangeli, A. A Knowledge-Based Machine Learning Approach to Gene Prioritisation in Amyotrophic Lateral Sclerosis. Genes, 2020, 11(6):668.

[3] Hu, J., Lepore, R., Dobson, R.J.B., Al-Chalabi, A., Bean, D.M., Iacoangeli, A. DGLinker: flexible knowledge-graph prediction of disease–gene associations. Nucleic Acids Research, 2021, gkab449.