With the rapid expansion of genome-wide association studies (GWAS), researchers increasingly face the challenge of translating large-scale genetic data into meaningful biological insights. Powerful analytical approaches such as Mendelian randomization (MR), polygenic risk score (PRS) calculation, and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses are widely used, but implementing them often requires familiarity with multiple software packages and complex parameter settings. This technical barrier can be especially daunting for beginners entering the field of bioinformatics.
To address this challenge, Professor Yong Cui from the Department of Dermatology, China-Japan Friendship Hospital, along with a team of researchers, developed MPGK, a user-friendly, command-line-based bioinformatics tool that integrates MR, PRS, GO, and KEGG analyses into a single streamlined workflow. Their study was made available online on March 09, 2026, in the Chinese Medical Journal.
MPGK unifies several widely used bioinformatics packages within a structured pipeline. For MR analysis, it calls the TwoSampleMR package and automatically performs heterogeneity testing, pleiotropy assessment, and visualization. The PRS module integrates PLINK for genotype processing and PRSice for score calculation, generating both numerical outputs and probability distribution plots for intuitive interpretation. Meanwhile, the GO and KEGG modules utilize clusterProfiler to conduct functional enrichment analyses and produce publication-ready visualizations such as dot plots, bar plots, directed acyclic graphs, and network plots.
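To give a sense of what a PRS module computes at its core, the following is a minimal sketch of an additive score: each individual's effect-allele dosage is multiplied by a per-variant weight and the products are summed. It is not MPGK's or PRSice's implementation; the function name, variant IDs, and numbers are hypothetical, and real tools add steps such as clumping and P-value thresholding that are omitted here.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Additive PRS: sum of (effect-allele dosage x weight) over shared variants.

    Hypothetical helper for illustration only; variants missing from either
    input are simply skipped.
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[snp] * weights[snp] for snp in shared)


# Hypothetical per-variant effect sizes (for example, log odds ratios from a GWAS).
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125}

# Hypothetical effect-allele counts (0, 1, or 2) for one individual.
sample = {"rs1": 2, "rs2": 1, "rs3": 0}

score = polygenic_risk_score(sample, weights)
print(score)  # 2*0.5 + 1*(-0.25) + 0*0.125 = 0.75
```

Scores computed this way for cases and controls can then be compared as distributions, which is the kind of plot the PRS module is described as producing.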
Unlike conventional workflows that require switching between multiple scripts and environments, MPGK allows researchers to execute different analytical modules independently or combine them within a single automated process. Its flag-based command-line interface ensures reproducibility while remaining concise and easy to operate across Linux and macOS systems.
To demonstrate its capabilities, the team conducted three representative analyses using publicly available GWAS datasets for diabetes and psoriasis, as well as institutional sequencing data. In MR analysis, MPGK identified a statistically significant causal relationship between diabetes and psoriasis using the inverse variance weighted method (P = 3.75 × 10⁻²⁵), consistent with previous epidemiological and genetic findings. In PRS analysis, MPGK successfully generated individual-level PRS for diabetes and visualized the probability distributions across case and control groups. Functional enrichment analyses further revealed that psoriasis-associated genes were enriched in immune-related pathways, including antigen processing and presentation, Epstein–Barr virus infection, and T helper cell differentiation pathways. These results aligned with established literature on psoriasis pathogenesis and comorbidities.
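For readers unfamiliar with the inverse variance weighted (IVW) method mentioned above, here is a minimal sketch of its textbook fixed-effect form, which combines per-variant effects on the exposure and the outcome into one causal estimate. It is an illustration under stated assumptions, not the code MPGK or TwoSampleMR runs: the function name and the toy numbers are hypothetical, and packaged implementations may apply additional adjustments (for example, random-effects corrections) that change the standard error.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exp: Sequence[float],
    beta_out: Sequence[float],
    se_out: Sequence[float],
) -> Tuple[float, float, float]:
    """Fixed-effect IVW estimate, its standard error, and a two-sided P value.

    Hypothetical helper: equivalent to a weighted regression of outcome
    effects on exposure effects through the origin, with weights 1 / se_out^2.
    """
    numer = sum(bx * by / (se * se) for bx, by, se in zip(beta_exp, beta_out, se_out))
    denom = sum(bx * bx / (se * se) for bx, se in zip(beta_exp, se_out))
    estimate = numer / denom
    std_err = math.sqrt(1.0 / denom)
    z = estimate / std_err
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, std_err, p_value


# Hypothetical summary statistics for three variants (not from the study).
beta_exp = [0.1, 0.2, 0.4]     # variant effects on the exposure
beta_out = [0.05, 0.1, 0.2]    # variant effects on the outcome
se_out = [0.01, 0.02, 0.04]    # standard errors of the outcome effects

estimate, std_err, p_value = ivw_estimate(beta_exp, beta_out, se_out)
print(estimate, std_err, p_value)
```

In this toy input every variant has the same outcome-to-exposure ratio, so the pooled estimate equals that ratio; with real data the ratios differ, which is why heterogeneity and pleiotropy checks accompany the IVW result.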
Beyond its analytical functions, MPGK emphasizes accessibility and reproducibility. By standardizing input formats and automating key steps, it reduces manual errors and improves efficiency. Importantly, it does not require high-performance computing resources, making advanced genomic analyses more feasible for smaller research teams and clinical investigators.
The developers acknowledge that MPGK currently focuses primarily on genomics-based analyses. However, future updates aim to expand into multi-omics integration and incorporate machine learning capabilities, potentially leveraging Python-based frameworks to enhance analytical depth.
As genomic data generation continues to accelerate, tools that bridge methodological complexity and practical usability are increasingly essential. MPGK represents a step toward democratizing advanced bioinformatics analysis, empowering both beginners and experienced researchers to derive meaningful insights from genetic data.