WikiORA integrates Wikidata, Gene Ontology, PanglaoDB, and Wikipedia for a seamless, encyclopedic enrichment analysis.
WikiORA is a tool designed to simplify the process of gene set over-representation analysis by integrating data from Wikidata and Wikipedia. Our goal is to provide researchers with an intuitive platform to identify significantly enriched gene sets in their data using curated information from various sources.
WikiORA follows these steps to perform over-representation analysis:
Here are the definitions of the metrics generated by WikiORA. Only some metrics are shown in the dashboard, but all of them are included in the downloadable TSV file.
odds_ratio = (overlap_count * (total_genes - gene_set_size - input_gene_list_size + overlap_count)) / max((gene_set_size - overlap_count) * (input_gene_list_size - overlap_count), 1)
combined_score = -log10(p_value) * odds_ratio
gene_ratio = overlap_count / gene_set_size
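As an illustration, the three formulas above can be sketched in Python. This is a minimal sketch, not WikiORA's actual code: the function name `ora_metrics` is hypothetical, and `p_value` is taken as an input because this page does not state which statistical test produces it.

```python
import math


def ora_metrics(overlap_count, gene_set_size, input_gene_list_size,
                total_genes, p_value):
    """Compute odds_ratio, combined_score and gene_ratio as defined above.

    Hypothetical helper for illustration; p_value must be computed elsewhere.
    """
    # Cells of the 2x2 contingency table
    in_both = overlap_count
    only_in_set = gene_set_size - overlap_count
    only_in_input = input_gene_list_size - overlap_count
    in_neither = (total_genes - gene_set_size
                  - input_gene_list_size + overlap_count)

    # max(..., 1) avoids division by zero when one off-diagonal cell is empty
    odds_ratio = (in_both * in_neither) / max(only_in_set * only_in_input, 1)
    combined_score = -math.log10(p_value) * odds_ratio
    gene_ratio = overlap_count / gene_set_size

    return {
        "odds_ratio": odds_ratio,
        "combined_score": combined_score,
        "gene_ratio": gene_ratio,
    }


# Made-up numbers, for illustration only
metrics = ora_metrics(
    overlap_count=5,
    gene_set_size=50,
    input_gene_list_size=100,
    total_genes=20000,
    p_value=0.001,
)
print(metrics)
```

Note that the `max(..., 1)` guard in the denominator means the odds ratio stays finite even when every gene in the set, or every gene in the input list, falls in the overlap.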
WikiORA uses Wikidata as the sole data source for the gene sets, selecting only terms with linked Wikipedia pages. Wikidata combines community curation with data imports. For cell type markers, the major source of information is PanglaoDB. For biological processes, molecular functions, and cellular components, the major sources are the Gene Ontology Annotation (GOA) Database and the Gene Ontology Resource.
The gene sets available for this version can be retrieved from the Download page. To ensure reproducibility, older versions of the tool are archived on GitHub.
Manuscript in preparation.
Lubiana, T., & Nakaya, H. (2024). WikiORA (version 0.4.0). Retrieved from https://wikiora.sysbio.tools
WikiORA is developed in Brazil by a team of bioinformaticians passionate about open knowledge. The project is led by Tiago Lubiana at the Computational Systems Biology Laboratory, headed by Prof. Helder Nakaya.
If you have any questions, feedback, or suggestions, please contact us via GitHub.
If you like our project, give us a ⭐ on GitHub!