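A Python package to work with the Human Phenotype Ontology (HPO)

Project description

A Python library to work with, analyze, filter and inspect the Human Phenotype Ontology

Visit the PyHPO Documentation for a more detailed overview of all the functionality.

Main features

  • Identify patient cohorts based on clinical features

  • Cluster patients or other clinical information for GWAS

  • Phenotype to genotype studies

  • HPO similarity analysis

  • Graph-based analysis of phenotypes, genes and diseases

PyHPO allows working with individual terms (HPOTerm), sets of terms (HPOSet) and the full Ontology.

The library helps with discovering novel gene-disease associations and with GWAS data analysis. It can also be used to organize patients' clinical information in research or diagnostic settings.

Internally, the ontology is represented as a branched linked list: every term holds pointers to its parent and child terms. This allows fast tree traversal.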
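The pointer structure described above can be pictured with a small, self-contained sketch. The `Term` class and the `all_ancestors` helper are hypothetical names made up for illustration; they are not PyHPO's internals. The four terms are the ones used in the examples in this document.

```python
class Term:
    """Hypothetical stand-in for an ontology term (not PyHPO's own class)."""

    def __init__(self, term_id, name):
        self.id = term_id
        self.name = name
        self.parents = []   # pointers "up" the ontology
        self.children = []  # pointers "down" the ontology

    def add_parent(self, parent):
        # Store the link in both directions so traversal works up and down.
        self.parents.append(parent)
        parent.children.append(self)


def all_ancestors(term):
    """Follow parent pointers upwards and collect every ancestor."""
    seen = set()
    stack = list(term.parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(current.parents)
    return seen


axial = Term("HP:0009121", "Abnormal axial skeleton morphology")
column = Term("HP:0000925", "Abnormality of the vertebral column")
curvature = Term("HP:0010674", "Abnormality of the curvature of the vertebral column")
scoliosis = Term("HP:0002650", "Scoliosis")

column.add_parent(axial)
curvature.add_parent(column)
scoliosis.add_parent(curvature)

ancestor_ids = sorted(t.id for t in all_ancestors(scoliosis))
print(ancestor_ids)
```

Because every term keeps direct references to its neighbours, walking up or down the hierarchy needs no lookup in a central table.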

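PyHPO provides an interface to create a Pandas DataFrame from its data, allowing integration into existing data analysis tools.

Examples

How similar are the phenotypes of two patients?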

from pyhpo import Ontology, HPOSet

# Initialize the Ontology
_ = Ontology()

# Declare the clinical information of the patients
patient_1 = HPOSet.from_queries([
    'HP:0002943',
    'HP:0008458',
    'HP:0100884',
    'HP:0002944',
    'HP:0002751'
])

patient_2 = HPOSet.from_queries([
    'HP:0002650',
    'HP:0010674',
    'HP:0000925',
    'HP:0009121'
])

# and compare their similarity
patient_1.similarity(patient_2)
#> 0.7594183905785477
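A set-to-set score like the one above has to be built from pairwise term scores. One common way to combine them is a best-match average, sketched below. This is an illustration of the general technique only: which combination method and term similarity PyHPO applies by default is not stated in this document, and the function name, the term labels and the toy scores are all invented.

```python
from statistics import mean


def best_match_average(set_a, set_b, term_similarity):
    """Average the best match of every term, in both directions."""
    best_a = [max(term_similarity(a, b) for b in set_b) for a in set_a]
    best_b = [max(term_similarity(a, b) for a in set_a) for b in set_b]
    return (mean(best_a) + mean(best_b)) / 2


# Toy pairwise scores, invented for illustration only.
toy_scores = {
    ("A", "X"): 0.9, ("A", "Y"): 0.2,
    ("B", "X"): 0.4, ("B", "Y"): 0.6,
}


def toy_similarity(a, b):
    return toy_scores[(a, b)]


set_score = best_match_average(["A", "B"], ["X", "Y"], toy_similarity)
print(round(set_score, 3))
```

Averaging in both directions keeps the score symmetric, so comparing patient 1 with patient 2 gives the same value as the reverse comparison.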

How close are two HPO terms?

from pyhpo import Ontology

# Initialize the Ontology
_ = Ontology()

term_1 = Ontology.get_hpo_object('Scoliosis')
term_2 = Ontology.get_hpo_object('Abnormal axial skeleton morphology')

path = term_1.path_to_other(term_2)
for t in path[1]:
    print(t)

"""
HP:0002650 | Scoliosis
HP:0010674 | Abnormality of the curvature of the vertebral column
HP:0000925 | Abnormality of the vertebral column
HP:0009121 | Abnormal axial skeleton morphology
"""
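To make the idea of a path between two terms concrete, here is a minimal breadth-first search over parent and child links. It is a sketch under assumptions: the `PARENTS` dict and the `shortest_path` function are hypothetical, and only the four parent links visible in the output above are included. The return value mimics the first two elements used above (step count, then the list of terms).

```python
from collections import deque

# Parent links taken from the path printed above; the dict layout is illustrative.
PARENTS = {
    "HP:0002650": ["HP:0010674"],
    "HP:0010674": ["HP:0000925"],
    "HP:0000925": ["HP:0009121"],
    "HP:0009121": [],
}


def shortest_path(start, goal, parents):
    """Breadth-first search that may walk both up (parents) and down (children)."""
    # Build an undirected neighbour map from the parent links.
    neighbours = {term: set(links) for term, links in parents.items()}
    for child, links in parents.items():
        for parent in links:
            neighbours.setdefault(parent, set()).add(child)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return len(path) - 1, path
        for nxt in sorted(neighbours[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # the two terms are not connected


steps, path = shortest_path("HP:0002650", "HP:0009121", PARENTS)
print(steps, path)
```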

Getting started

The easiest way to install PyHPO is via pip:

pip install pyhpo

Alternatively, you can install optional packages for extra functionality:

# Include pandas during install
pip install pyhpo[pandas]

# Include scipy
pip install pyhpo[scipy]

# Include all dependencies
pip install pyhpo[all]
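After installing, it can be handy to check which of the optional packages are importable in the current environment. The helper below is a hypothetical convenience function using only the standard library; it is not part of PyHPO.

```python
from importlib.util import find_spec


def available_extras(modules=("pandas", "scipy")):
    """Map each optional module name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in modules}


extras = available_extras()
print(extras)
```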

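Usage examples

HPOTerm

An HPOTerm contains various metadata about the term, as well as pointers to its parents and children. You can access its information content, calculate similarity scores to other terms, find the shortest or longest connection between two terms, list all associated genes or diseases, and more.

Examples:

Basic functionality of an HPOTerm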

from pyhpo import Ontology

# Initialize the Ontology
_ = Ontology()

# Retrieve a term, e.g. via its name
term = Ontology.get_hpo_object('Scoliosis')

print(term)
#> HP:0002650 | Scoliosis

# Get information content from Term <--> Omim associations
term.information_content['omim']
#> 2.39

# Show how many genes are associated to the term
# (Note that this includes indirect associations, associations
# from children terms to genes.)
len(term.genes)
#> 947

# Show how many Omim Diseases are associated to the term
# (Note that this includes indirect associations, associations
# from children terms to diseases.)
len(term.omim_diseases)
#> 730

# Get a list of all parent terms
for p in term.parents:
    print(p)
#> HP:0010674 | Abnormality of the curvature of the vertebral column

# Get a list of all children terms
for p in term.children:
    print(p)
"""
HP:0002943 | Thoracic scoliosis
HP:0008458 | Progressive congenital scoliosis
HP:0100884 | Compensatory scoliosis
HP:0002944 | Thoracolumbar scoliosis
HP:0002751 | Kyphoscoliosis
"""

(This script is complete; it should run "as is".)
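Information content is usually defined as the negative logarithm of how often a term is annotated: rare terms carry more information than common ones. The sketch below shows that standard definition with invented numbers. Whether PyHPO uses exactly this formula (for example the natural logarithm) is an assumption here, and `information_content` is a hypothetical standalone function, not the library's attribute.

```python
import math


def information_content(n_annotated, n_total):
    """IC = -ln(p), where p is the fraction of items annotated with the term."""
    if n_annotated == 0:
        return 0.0  # convention chosen for this sketch
    return -math.log(n_annotated / n_total)


# Toy numbers, invented for illustration only.
ic_common = information_content(500, 1000)  # term annotated to half of all diseases
ic_rare = information_content(10, 1000)     # term annotated to 1% of all diseases
print(round(ic_common, 2), round(ic_rare, 2))
```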

Some additional functionality for working with more than one term

from pyhpo import Ontology
_ = Ontology()
term = Ontology.get_hpo_object('Scoliosis')

# Let's get a second term, this time using its HPO-ID
term_2 = Ontology.get_hpo_object('HP:0009121')

print(term_2)
#> HP:0009121 | Abnormal axial skeleton morphology

# Check if Scoliosis is a direct or indirect child
# of Abnormal axial skeleton morphology

term.child_of(term_2)
#> True

# or vice versa
term_2.parent_of(term)
#> True

# Show all nodes between two terms:
path = term.path_to_other(term_2)
for t in path[1]:
    print(t)

"""
HP:0002650 | Scoliosis
HP:0010674 | Abnormality of the curvature of the vertebral column
HP:0000925 | Abnormality of the vertebral column
HP:0009121 | Abnormal axial skeleton morphology
"""

print(f'Steps from Term 1 to Term 2: {path[0]}')
#> Steps from Term 1 to Term 2: 3


# Calculate the similarity between two terms
term.similarity_score(term_2)
#> 0.442

(This script is complete; it should run "as is".)
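Term-to-term similarity measures typically build on shared ancestors. As one widely used example, Resnik's measure takes the information content of the most informative common ancestor. The sketch below illustrates that idea on a toy ontology; it is not a statement about which measure PyHPO uses by default, and all term names and IC values are invented.

```python
# Toy ontology: each term lists its parents. All names and IC values are invented.
PARENTS = {
    "root": [],
    "skeleton": ["root"],
    "spine": ["skeleton"],
    "scoliosis": ["spine"],
    "kyphosis": ["spine"],
    "limb": ["skeleton"],
}
IC = {
    "root": 0.0, "skeleton": 1.0, "spine": 2.0,
    "scoliosis": 3.5, "kyphosis": 3.2, "limb": 2.5,
}


def ancestors_and_self(term):
    """Return the term together with all of its direct and indirect parents."""
    result = {term}
    for parent in PARENTS[term]:
        result |= ancestors_and_self(parent)
    return result


def resnik(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors_and_self(term_a) & ancestors_and_self(term_b)
    return max(IC[t] for t in common)


sibling_score = resnik("scoliosis", "kyphosis")  # most informative shared term: "spine"
distant_score = resnik("scoliosis", "limb")      # most informative shared term: "skeleton"
print(sibling_score, distant_score)
```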

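Ontology

The Ontology contains all HPO terms, their connections to each other, and their associations with genes and diseases. It also provides helper functions for searching HPOTerms.

Examples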

from pyhpo import Ontology, HPOSet

# Initialize the Ontology (this must be done only once)
_ = Ontology()

# Get a term based on its name
term = Ontology.get_hpo_object('Scoliosis')
print(term)
#> HP:0002650 | Scoliosis

# ...or based on HPO-ID
term = Ontology.get_hpo_object('HP:0002650')
print(term)
#> HP:0002650 | Scoliosis

# ...or based on its index
term = Ontology.get_hpo_object(2650)
print(term)
#> HP:0002650 | Scoliosis

# shortcut to retrieve a term based on its index
term = Ontology[2650]
print(term)
#> HP:0002650 | Scoliosis

# Search for term
for term in Ontology.search('olios'):
    print(term)

"""
HP:0002211 | White forelock
HP:0002290 | Poliosis
HP:0002650 | Scoliosis
HP:0002751 | Kyphoscoliosis
HP:0002943 | Thoracic scoliosis
HP:0002944 | Thoracolumbar scoliosis
HP:0003423 | Thoracolumbar kyphoscoliosis
HP:0004619 | Lumbar kyphoscoliosis
HP:0004626 | Lumbar scoliosis
HP:0005659 | Thoracic kyphoscoliosis
HP:0008453 | Congenital kyphoscoliosis
HP:0008458 | Progressive congenital scoliosis
HP:0100884 | Compensatory scoliosis
"""

(This script is complete; it should run "as is".)
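The search output above includes "White forelock", whose name does not contain "olios". A plausible explanation is that the search also looks at synonyms. The sketch below shows a case-insensitive substring search over names and synonyms; the `TERMS` index, the `search` function and the synonym entry are assumptions made for this illustration, not data or code taken from PyHPO.

```python
# Tiny hand-made index; the synonym entry is an assumption made for this sketch.
TERMS = {
    "HP:0002211": {"name": "White forelock", "synonyms": ["Poliosis"]},
    "HP:0002650": {"name": "Scoliosis", "synonyms": []},
    "HP:0000925": {"name": "Abnormality of the vertebral column", "synonyms": []},
}


def search(query):
    """Yield terms whose name or any synonym contains the query (case-insensitive)."""
    needle = query.lower()
    for term_id, entry in sorted(TERMS.items()):
        labels = [entry["name"], *entry["synonyms"]]
        if any(needle in label.lower() for label in labels):
            yield f'{term_id} | {entry["name"]}'


hits = list(search("olios"))
for hit in hits:
    print(hit)
```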

The Ontology is a singleton and should be initialized only once. It can be reused across several modules, e.g.:

main.py

from pyhpo import Ontology, HPOSet

import module2

# Initialize the Ontology
_ = Ontology()

if __name__ == '__main__':
    module2.find_term('Compensatory scoliosis')

module2.py

from pyhpo import Ontology

def find_term(term):
    return Ontology.get_hpo_object(term)
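The reason initialization in one module is visible in another is the singleton pattern: every access resolves to the same object. A minimal sketch of that pattern follows. `SharedOntology` is a hypothetical class written for illustration and says nothing about how PyHPO implements its own singleton.

```python
class SharedOntology:
    """Hypothetical singleton: constructing it twice returns the same object."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            instance = super().__new__(cls)
            # Stand-in for the expensive loading step, which runs only once.
            instance.terms = {"HP:0100884": "Compensatory scoliosis"}
            cls._instance = instance
        return cls._instance


first = SharedOntology()   # e.g. created in main.py
second = SharedOntology()  # e.g. accessed later in module2.py
print(first is second)
```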

HPOSet

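An HPOSet is a collection of HPOTerms and can be used to represent, for example, a patient's clinical information. It provides APIs for filtering, for comparison with other HPOSets, and for term/gene/disease enrichment.

Examples: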

from pyhpo import Ontology, HPOSet

# Initialize the Ontology
_ = Ontology()

# create HPOSets, corresponding to
# e.g. the clinical information of a patient
# You can initiate an HPOSet using either
# - HPO-ID: 'HP:0002943'
# - HPO-Name: 'Scoliosis'
# - HPO-ID (int): 2943

ci_1 = HPOSet.from_queries([
    'HP:0002943',
    'HP:0008458',
    'HP:0100884',
    'HP:0002944',
    'HP:0002751'
])

ci_2 = HPOSet.from_queries([
    'HP:0002650',
    'HP:0010674',
    'HP:0000925',
    'HP:0009121'
])

# Compare the similarity
ci_1.similarity(ci_2)
#> 0.7593552670152157

# Remove all non-leaf nodes from a set
ci_leaf = ci_2.child_nodes()
len(ci_2)
#> 4
len(ci_leaf)
#> 1
ci_2
#> HPOSet.from_serialized("925+2650+9121+10674")
ci_leaf
#> HPOSet.from_serialized("2650")

# Check the information content of an HPOSet
ci_1.information_content()
"""
{
    'mean': 6.571224974009769,
    'total': 32.856124870048845,
    'max': 8.97979449089521,
    'all': [5.98406221734122, 8.286647310335265, 8.97979449089521, 5.5458072864100645, 4.059813565067086]
}
"""

(This script is complete; it should run "as is".)
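Two of the operations above are easy to reproduce by hand. Removing non-leaf nodes means dropping every term that is an ancestor of another term in the same set, and the information-content summary is plain aggregation over per-term values. The sketch below is an illustration under assumptions: `PARENTS`, `leaf_terms` and `summarize` are hypothetical names, the parent links are the four shown in the path output earlier in this document, and the per-term values are copied from the 'all' list above.

```python
from statistics import mean

# Parent links for the four terms in ci_2 (layout is illustrative).
PARENTS = {
    "HP:0002650": {"HP:0010674"},
    "HP:0010674": {"HP:0000925"},
    "HP:0000925": {"HP:0009121"},
    "HP:0009121": set(),
}


def ancestors(term):
    """Collect all direct and indirect parents of a term."""
    found = set()
    for parent in PARENTS.get(term, ()):
        found.add(parent)
        found |= ancestors(parent)
    return found


def leaf_terms(terms):
    """Keep only terms that are not an ancestor of another term in the set."""
    covered = set()
    for term in terms:
        covered |= ancestors(term)
    return set(terms) - covered


def summarize(values):
    """Aggregate per-term information content into a summary dict."""
    return {"mean": mean(values), "total": sum(values), "max": max(values), "all": list(values)}


leaves = leaf_terms(PARENTS.keys())
summary = summarize([
    5.98406221734122, 8.286647310335265, 8.97979449089521,
    5.5458072864100645, 4.059813565067086,
])
print(leaves, summary["mean"])
```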

Get genes enriched in an HPOSet

Examples:

from pyhpo import Ontology, HPOSet
from pyhpo.stats import EnrichmentModel

# Initialize the Ontology
_ = Ontology()

ci = HPOSet.from_queries([
    'HP:0002943',
    'HP:0008458',
    'HP:0100884',
    'HP:0002944',
    'HP:0002751'
])

gene_model = EnrichmentModel('gene')
genes = gene_model.enrichment(method='hypergeom', hposet=ci)

print(genes[0]['item'])
#> PAPSS2

(This script is complete; it should run "as is".)
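The 'hypergeom' method name refers to the hypergeometric test, which asks how surprising it is to see at least k annotated items in a sample drawn without replacement. The standard-library sketch below computes that tail probability. How PyHPO defines the population and the sample for its enrichment is not described in this document, so the function name, its parameters and the numbers are illustrative assumptions.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement.

    `successes` of the `population` items carry the annotation of interest.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Toy numbers: 4 of 10 items are annotated, we draw 3 and observe at least 2.
p_value = hypergeom_sf(k=2, population=10, successes=4, draws=3)
print(round(p_value, 4))
```

A small tail probability means the observed overlap is unlikely to arise by chance, which is why enriched items tend to be ranked by it.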

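For a more detailed description of how to use PyHPO, visit the PyHPO Documentation.

Contributing

Yes, please do so. We appreciate any help, suggestions for improvement or other feedback. Just create a pull request or open an issue.

License

PyHPO is released under the MIT license.

PyHPO uses the Human Phenotype Ontology. Find out more at http://www.human-phenotype-ontology.org

Sebastian Köhler, Leigh Carmody, Nicole Vasilevsky, Julius O B Jacobsen, et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Research. (2018) doi: 10.1093/nar/gky1105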
