TY - JOUR
T1 - Pangenome and genomic signatures linked to the dominance of the lineage-4 of Mycobacterium tuberculosis isolated from extrapulmonary tuberculosis patients in western Ethiopia
AU - Chekesa, Basha
AU - Singh, Harinder
AU - Gonzalez-Juarbe, Norberto
AU - Vashee, Sanjay
AU - Wiscovitch-Russo, Rosana
AU - Dupont, Christopher L.
AU - Girma, Musse
AU - Kerro, Oudessa
AU - Gumi, Balako
AU - Ameni, Gobena
N1 - Publisher Copyright:
Copyright: This is an open access article,
PY - 2024/7
Y1 - 2024/7
N2 - Background The lineage 4 (L4) of Mycobacterium tuberculosis (MTB) is not only globally prevalent but also locally dominant, surpassing other lineages, with lineage 2 (L2) following in prevalence. Despite its widespread occurrence, factors influencing the expansion of L4 and its sub-lineages remain poorly understood both at local and global levels. Therefore, this study aimed to conduct a pan-genome and identify genomic signatures linked to the elevated prevalence of L4 sublineages among extrapulmonary TB (EPTB) patients in western Ethiopia. Methods A cross-sectional study was conducted at an institutional level involving confirmed cases of extrapulmonary tuberculosis (EPTB) patients from August 5, 2018, to December 30, 2019. A total of 75 MTB genomes, classified under lineage 4 (L4), were used for conducting pan-genome and genome-wide association study (GWAS) analyses. After a quality check, variants were identified using MTBseq, and genomes were de novo assembled using SPAdes. Gene prediction and annotation were performed using Prokka. The pangenome was constructed using GET_HOMOLOGUES, and its functional analysis was carried out with the Bacterial Pan-Genome Analysis tool (BPGA). For GWAS analysis, Scoary was employed with Benjamini-Hochberg correction, with a significance threshold set at p-value 0.05. Results The analysis revealed a total of 3,270 core genes, predominantly associated with orthologous groups (COG) functions, notably in the categories of ‘[R] General function prediction only’ and ‘[I] Lipid transport and metabolism’. Conversely, functions related to ‘[N] Cell motility’ and ‘[Q] Secondary metabolites biosynthesis, transport, and catabolism’ were primarily linked to unique and accessory genes. The pan-genome of MTB L4 was found to be open. Furthermore, the GWAS study identified genomic signatures linked to the prevalence of sublineages L4.6.3 and L4.2.2.2. Conclusions Apart from host and environmental factors, the sublineage of L4 employs distinct virulence factors for successful dissemination in western Ethiopia. Given that the functions of these newly identified genes are not well understood, it is advisable to experimentally validate their roles, particularly in the successful transmission of specific L4 sublineages over others.
AB - Background The lineage 4 (L4) of Mycobacterium tuberculosis (MTB) is not only globally prevalent but also locally dominant, surpassing other lineages, with lineage 2 (L2) following in prevalence. Despite its widespread occurrence, factors influencing the expansion of L4 and its sub-lineages remain poorly understood both at local and global levels. Therefore, this study aimed to conduct a pan-genome and identify genomic signatures linked to the elevated prevalence of L4 sublineages among extrapulmonary TB (EPTB) patients in western Ethiopia. Methods A cross-sectional study was conducted at an institutional level involving confirmed cases of extrapulmonary tuberculosis (EPTB) patients from August 5, 2018, to December 30, 2019. A total of 75 MTB genomes, classified under lineage 4 (L4), were used for conducting pan-genome and genome-wide association study (GWAS) analyses. After a quality check, variants were identified using MTBseq, and genomes were de novo assembled using SPAdes. Gene prediction and annotation were performed using Prokka. The pangenome was constructed using GET_HOMOLOGUES, and its functional analysis was carried out with the Bacterial Pan-Genome Analysis tool (BPGA). For GWAS analysis, Scoary was employed with Benjamini-Hochberg correction, with a significance threshold set at p-value 0.05. Results The analysis revealed a total of 3,270 core genes, predominantly associated with orthologous groups (COG) functions, notably in the categories of ‘[R] General function prediction only’ and ‘[I] Lipid transport and metabolism’. Conversely, functions related to ‘[N] Cell motility’ and ‘[Q] Secondary metabolites biosynthesis, transport, and catabolism’ were primarily linked to unique and accessory genes. The pan-genome of MTB L4 was found to be open. Furthermore, the GWAS study identified genomic signatures linked to the prevalence of sublineages L4.6.3 and L4.2.2.2. Conclusions Apart from host and environmental factors, the sublineage of L4 employs distinct virulence factors for successful dissemination in western Ethiopia. Given that the functions of these newly identified genes are not well understood, it is advisable to experimentally validate their roles, particularly in the successful transmission of specific L4 sublineages over others.
UR - http://www.scopus.com/inward/record.url?scp=85199669560&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85199669560&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0304060
DO - 10.1371/journal.pone.0304060
M3 - Article
C2 - 39052555
AN - SCOPUS:85199669560
SN - 1932-6203
VL - 19
JO - PLoS ONE
JF - PLoS ONE
IS - 7 July
M1 - e0304060
ER -