A Python package for working with the HPO ontology
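A Python library to work with, analyze, filter and inspect the Human Phenotype Ontology (HPO).
Visit the PyHPO documentation for a more detailed overview of all the functionality.
Main features
Identify patient cohorts based on clinical features
Cluster patients or other clinical information for GWAS
Phenotype to genotype studies
HPO similarity analysis
Graph-based analysis of phenotypes, genes and diseases
PyHPO allows working on individual terms (HPOTerm), a set of terms (HPOSet) and the full Ontology.
The library is helpful for the discovery of novel gene-disease associations and for GWAS data analysis studies. It can also be used to organize the clinical information of patients in research or diagnostic settings.
Internally, the ontology is represented as a branched linked list: every term contains pointers to its parent and child terms. This allows fast tree traversal.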
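The linked structure described above can be illustrated with a small, self-contained sketch. `ToyTerm`, `add_parent` and `ancestors` are hypothetical names made up for this illustration and are not part of PyHPO; the four terms are taken from the Scoliosis examples in this document.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class ToyTerm:
    """Hypothetical stand-in for an ontology term (not PyHPO's HPOTerm)."""

    term_id: str
    name: str
    parents: list[ToyTerm] = field(default_factory=list)
    children: list[ToyTerm] = field(default_factory=list)

    def add_parent(self, parent: ToyTerm) -> None:
        # Link in both directions so the tree can be walked up and down.
        self.parents.append(parent)
        parent.children.append(self)

    def ancestors(self) -> list[ToyTerm]:
        # Follow the parent pointers upwards, visiting each term once.
        seen: set[str] = set()
        stack = list(self.parents)
        result = []
        while stack:
            term = stack.pop()
            if term.term_id not in seen:
                seen.add(term.term_id)
                result.append(term)
                stack.extend(term.parents)
        return result


axial = ToyTerm("HP:0009121", "Abnormal axial skeleton morphology")
column = ToyTerm("HP:0000925", "Abnormality of the vertebral column")
curvature = ToyTerm(
    "HP:0010674", "Abnormality of the curvature of the vertebral column"
)
scoliosis = ToyTerm("HP:0002650", "Scoliosis")

column.add_parent(axial)
curvature.add_parent(column)
scoliosis.add_parent(curvature)

ancestor_ids = [t.term_id for t in scoliosis.ancestors()]
print(ancestor_ids)
```

Because every term holds direct references to its neighbours, walking from a term to the root needs no lookups in a global table, which is one plausible reason such a layout makes traversal fast.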
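It provides an interface to create Pandas DataFrames from its data, which allows integration into existing data analysis tools.
Examples
How similar are the phenotypes of two patients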
from pyhpo import Ontology, HPOSet
# initialize the Ontology
_ = Ontology()
# Declare the clinical information of the patients
patient_1 = HPOSet.from_queries([
'HP:0002943',
'HP:0008458',
'HP:0100884',
'HP:0002944',
'HP:0002751'
])
patient_2 = HPOSet.from_queries([
'HP:0002650',
'HP:0010674',
'HP:0000925',
'HP:0009121'
])
# and compare their similarity
patient_1.similarity(patient_2)
#> 0.7594183905785477
How close are two HPO terms
from pyhpo import Ontology
# initialize the Ontology
_ = Ontology()
term_1 = Ontology.get_hpo_object('Scoliosis')
term_2 = Ontology.get_hpo_object('Abnormal axial skeleton morphology')
path = term_1.path_to_other(term_2)
for t in path[1]:
    print(t)
"""
HP:0002650 | Scoliosis
HP:0010674 | Abnormality of the curvature of the vertebral column
HP:0000925 | Abnormality of the vertebral column
HP:0009121 | Abnormal axial skeleton morphology
"""
Getting started
The easiest way to install PyHPO is via pip
pip install pyhpo
Alternatively, you can install optional packages to get extra functionality
# Include pandas during install
pip install pyhpo[pandas]
# Include scipy
pip install pyhpo[scipy]
# Include all dependencies
pip install pyhpo[all]
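Usage examples
HPOTerm
An HPOTerm contains various metadata about the term, as well as pointers to its parents and children. You can access its information content, calculate similarity scores with other terms, find the shortest or longest connection between two terms, and list all associated genes or diseases.
Examples:
Basic functionalities of an HPO-Term
from pyhpo import Ontology
# initialize the Ontology
_ = Ontology()
# Retrieve a term e.g. via its name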
term = Ontology.get_hpo_object('Scoliosis')
print(term)
#> HP:0002650 | Scoliosis
# Get information content from Term <--> Omim associations
term.information_content['omim']
#> 2.39
# Show how many genes are associated to the term
# (Note that this includes indirect associations, associations
# from children terms to genes.)
len(term.genes)
#> 947
# Show how many Omim Diseases are associated to the term
# (Note that this includes indirect associations, associations
# from children terms to diseases.)
len(term.omim_diseases)
#> 730
# Get a list of all parent terms
for p in term.parents:
    print(p)
#> HP:0010674 | Abnormality of the curvature of the vertebral column
# Get a list of all children terms
for p in term.children:
    print(p)
"""
HP:0002943 | Thoracic scoliosis
HP:0008458 | Progressive congenital scoliosis
HP:0100884 | Compensatory scoliosis
HP:0002944 | Thoracolumbar scoliosis
HP:0002751 | Kyphoscoliosis
"""
(This script is complete, it should run "as is")
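As background for the information content value shown above, here is a minimal sketch of one common definition: the negative natural logarithm of the fraction of annotated items (for example OMIM diseases) that are associated with a term. Whether PyHPO uses exactly this formula and this way of counting is an assumption here. The function name and the counts are made up for illustration.

```python
import math


def information_content(n_annotated: int, n_total: int) -> float:
    """Negative log of the annotation frequency (illustrative definition)."""
    if n_annotated == 0:
        # No annotations: report 0.0 instead of taking log(0).
        return 0.0
    return -math.log(n_annotated / n_total)


# Made-up counts: a term linked to 100 out of 1000 diseases.
ic_rare = information_content(100, 1000)
# A term linked to every disease carries no information.
ic_common = information_content(1000, 1000)
print(ic_rare, ic_common)
```

Under this definition, terms close to the root of the ontology get low values, because they are associated with almost everything, while specific terms get high values.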
Some additional functionality, working with more than one term
from pyhpo import Ontology
_ = Ontology()
term = Ontology.get_hpo_object('Scoliosis')
# Let's get a second term, this time using its HPO-ID
term_2 = Ontology.get_hpo_object('HP:0009121')
print(term_2)
#> HP:0009121 | Abnormal axial skeleton morphology
# Check if Scoliosis is a direct or indirect child
# of Abnormal axial skeleton morphology
term.child_of(term_2)
#> True
# or vice versa
term_2.parent_of(term)
#> True
# show all nodes between two terms:
path = term.path_to_other(term_2)
for t in path[1]:
    print(t)
"""
HP:0002650 | Scoliosis
HP:0010674 | Abnormality of the curvature of the vertebral column
HP:0000925 | Abnormality of the vertebral column
HP:0009121 | Abnormal axial skeleton morphology
"""
print(f'Steps from Term 1 to Term 2: {path[0]}')
#> Steps from Term 1 to Term 2: 3
# Calculate the similarity between two terms
term.similarity_score(term_2)
#> 0.442
(This script is complete, it should run "as is")
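To show what finding the shortest connection between two terms involves, the following sketch runs a breadth-first search over a tiny hand-written parent map. It is not PyHPO's implementation. `PARENTS`, `neighbours` and `shortest_path` are hypothetical names, and the parent links are only those visible in the example output in this document.

```python
from collections import deque

# child -> parents, restricted to links shown in the examples above
PARENTS = {
    "HP:0002943": ["HP:0002650"],  # Thoracic scoliosis
    "HP:0002650": ["HP:0010674"],  # Scoliosis
    "HP:0010674": ["HP:0000925"],
    "HP:0000925": ["HP:0009121"],
    "HP:0009121": [],
}


def neighbours(term_id):
    # A path may go up (to parents) or down (to children).
    parents = PARENTS.get(term_id, [])
    children = [c for c, ps in PARENTS.items() if term_id in ps]
    return parents + children


def shortest_path(start, goal):
    """Return (number of steps, list of term ids) using breadth-first search."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return len(path) - 1, path
        for nxt in neighbours(path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    raise ValueError(f"no connection between {start} and {goal}")


steps, path = shortest_path("HP:0002650", "HP:0009121")
print(steps, path)
```

On this toy graph the search finds the same four terms and three steps as the `path_to_other` output above.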
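Ontology
The Ontology contains all HPO terms, their connections to each other, and their associations with genes and diseases. It also provides helper functions for searching HPOTerm objects.
Examples
from pyhpo import Ontology, HPOSet
# initialize the Ontology (this must be done only once)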
_ = Ontology()
# Get a term based on its name
term = Ontology.get_hpo_object('Scoliosis')
print(term)
#> HP:0002650 | Scoliosis
# ...or based on HPO-ID
term = Ontology.get_hpo_object('HP:0002650')
print(term)
#> HP:0002650 | Scoliosis
# ...or based on its index
term = Ontology.get_hpo_object(2650)
print(term)
#> HP:0002650 | Scoliosis
# shortcut to retrieve a term based on its index
term = Ontology[2650]
print(term)
#> HP:0002650 | Scoliosis
# Search for term
for term in Ontology.search('olios'):
    print(term)
"""
HP:0002211 | White forelock
HP:0002290 | Poliosis
HP:0002650 | Scoliosis
HP:0002751 | Kyphoscoliosis
HP:0002943 | Thoracic scoliosis
HP:0002944 | Thoracolumbar scoliosis
HP:0003423 | Thoracolumbar kyphoscoliosis
HP:0004619 | Lumbar kyphoscoliosis
HP:0004626 | Lumbar scoliosis
HP:0005659 | Thoracic kyphoscoliosis
HP:0008453 | Congenital kyphoscoliosis
HP:0008458 | Progressive congenital scoliosis
HP:0100884 | Compensatory scoliosis
"""
(This script is complete, it should run "as is")
The Ontology is a Singleton and should only be initiated once. It can be reused across several modules, e.g.:
main.py
from pyhpo import Ontology, HPOSet
import module2
# initialize the Ontology
_ = Ontology()
if __name__ == '__main__':
    module2.find_term('Compensatory scoliosis')
module2.py
from pyhpo import Ontology
def find_term(term):
    return Ontology.get_hpo_object(term)
HPOSet
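An HPOSet is a collection of HPOTerm objects and can be used to represent, for example, a patient's clinical information. It provides APIs for filtering, for comparison with other HPOSet objects, and for term/gene/disease enrichment.
Examples:
from pyhpo import Ontology, HPOSet
# initialize the Ontology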
_ = Ontology()
# create HPOSets, corresponding to
# e.g. the clinical information of a patient
# You can initiate an HPOSet using either
# - HPO-ID: 'HP:0002943'
# - HPO-Name: 'Scoliosis'
# - HPO-ID (int): 2943
ci_1 = HPOSet.from_queries([
'HP:0002943',
'HP:0008458',
'HP:0100884',
'HP:0002944',
'HP:0002751'
])
ci_2 = HPOSet.from_queries([
'HP:0002650',
'HP:0010674',
'HP:0000925',
'HP:0009121'
])
# Compare the similarity
ci_1.similarity(ci_2)
#> 0.7593552670152157
# Remove all non-leaf nodes from a set
ci_leaf = ci_2.child_nodes()
len(ci_2)
#> 4
len(ci_leaf)
#> 1
ci_2
#> HPOSet.from_serialized("925+2650+9121+10674")
ci_leaf
#> HPOSet.from_serialized("2650")
# Check the information content of an HPOSet
ci_1.information_content()
"""
{
'mean': 6.571224974009769,
'total': 32.856124870048845,
'max': 8.97979449089521,
'all': [5.98406221734122, 8.286647310335265, 8.97979449089521, 5.5458072864100645, 4.059813565067086]
}
"""
(This script is complete, it should run "as is")
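The effect of `child_nodes()` in the example above, where four terms are reduced to one, can be reproduced on a toy parent map: drop every term that is an ancestor of another term in the same set. This sketch only illustrates the idea and is not PyHPO's implementation. `PARENTS`, `ancestors` and `leaf_terms` are hypothetical names.

```python
# child -> parents, using the four terms of ci_2 from the example above
PARENTS = {
    "HP:0002650": ["HP:0010674"],
    "HP:0010674": ["HP:0000925"],
    "HP:0000925": ["HP:0009121"],
    "HP:0009121": [],
}


def ancestors(term_id):
    """Collect all direct and indirect parents of a term."""
    found = set()
    stack = list(PARENTS.get(term_id, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS.get(current, []))
    return found


def leaf_terms(term_ids):
    """Keep only the terms that are not an ancestor of another term in the set."""
    term_ids = set(term_ids)
    covered = set()
    for term_id in term_ids:
        covered |= ancestors(term_id)
    return term_ids - covered


ci_2_ids = {"HP:0002650", "HP:0010674", "HP:0000925", "HP:0009121"}
leaves = leaf_terms(ci_2_ids)
print(leaves)
```

All four terms lie on a single line of descent, so only the most specific one, Scoliosis, survives, matching `ci_leaf` above.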
Get genes enriched in an HPOSet
Examples:
from pyhpo import Ontology, HPOSet
from pyhpo.stats import EnrichmentModel
# initialize the Ontology
_ = Ontology()
ci = HPOSet.from_queries([
'HP:0002943',
'HP:0008458',
'HP:0100884',
'HP:0002944',
'HP:0002751'
])
gene_model = EnrichmentModel('gene')
genes = gene_model.enrichment(method='hypergeom', hposet=ci)
print(genes[0]['item'])
#> PAPSS2
(This script is complete, it should run "as is")
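The `method='hypergeom'` argument above refers to the hypergeometric test. As a rough, library-independent sketch of that statistic, the function below computes the probability of seeing at least `k` hits when drawing without replacement. How PyHPO defines the population and the counts internally is not stated here, so the mapping in the comments is an assumption and the numbers are made up.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric distribution.

    Assumed mapping for a gene enrichment (illustrative only):
      population: all HPO terms considered
      successes:  terms associated with the gene
      draws:      terms in the HPOSet
      k:          terms in the HPOSet associated with the gene
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Made-up numbers: 10 terms overall, 4 linked to the gene,
# a set of 3 terms of which 2 are linked to the gene.
p_value = hypergeom_sf(2, population=10, successes=4, draws=3)
print(p_value)
```

A small value means that the overlap between the set and the gene's annotations is unlikely to have arisen by chance, which is the general idea behind ranking genes by enrichment.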
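For a more detailed description of how to use PyHPO, visit the PyHPO documentation.
Contributing
Yes, please do so. We appreciate any help, suggestions for improvement or other feedback. Just create a pull request or open an issue.
License
PyHPO is released under the MIT license.
PyHPO uses the Human Phenotype Ontology. Find out more at http://www.human-phenotype-ontology.org
Sebastian Köhler, Leigh Carmody, Nicole Vasilevsky, Julius O B Jacobsen, et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Research. (2018) doi: 10.1093/nar/gky1105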