A Python module to process data for frame semantic parsing

Project description

pyfn

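Welcome to pyfn, a Python module to process FrameNet annotation.

pyfn can be used to:

convert data to and from FRAMENET XML, SEMEVAL XML, SEMAFOR CoNLL, BIOS and CoNLL-X
preprocess FrameNet data using a standardized state-of-the-art pipeline
run the SEMAFOR, OPEN-SESAME and SIMPLEFRAMEID frame semantic parsers for frame and/or argument identification on the FrameNet 1.5, 1.6 and 1.7 datasets
build your own frame semantic parser using a standard set of Python models to marshall/unmarshall FrameNet XML data

This repository also accompanies the (Kabbach et al., 2018) paper: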

@InProceedings{C18-1267,
  author = 	"Kabbach, Alexandre
		and Ribeyre, Corentin
		and Herbelot, Aur{\'e}lie",
  title = 	"Butterfly Effects in Frame Semantic Parsing: impact of data processing on model ranking",
  booktitle = 	"Proceedings of the 27th International Conference on Computational Linguistics",
  year = 	"2018",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"3158--3169",
  location = 	"Santa Fe, New Mexico, USA",
  url = 	"http://aclweb.org/anthology/C18-1267"
}

Dependencies

On Unix, you may need to install the following packages:

libxml2 libxml2-dev libxslt1-dev python-3.x-dev

Installation

pip3 install pyfn

Use

When using pyfn, your FrameNet splits directory structure should look like this:

.
|-- fndata-1.x-with-dev
|   |-- train
|   |   |-- fulltext
|   |   |-- lu
|   |-- dev
|   |   |-- fulltext
|   |   |-- lu
|   |-- test
|   |   |-- fulltext
|   |   |-- lu
|   |-- frame
|   |-- frRelation.xml
|   |-- semTypes.xml
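
Before converting anything, it can help to verify that a splits directory matches the tree above. The following is a minimal sketch using only the Python standard library; the function name `missing_entries` is hypothetical and not part of pyfn, and the check only covers the entries listed in the tree.

```python
from pathlib import Path

# Entries taken from the directory tree above: 'fulltext' and 'lu' are
# expected under each of the train/dev/test splits.
SPLITS = ("train", "dev", "test")
SPLIT_SUBDIRS = ("fulltext", "lu")
TOP_LEVEL = ("frame", "frRelation.xml", "semTypes.xml")


def missing_entries(root):
    """Return the relative paths of the expected layout that are absent."""
    root = Path(root)
    expected = [Path(split) / sub for split in SPLITS for sub in SPLIT_SUBDIRS]
    expected += [Path(name) for name in TOP_LEVEL]
    return [rel.as_posix() for rel in expected if not (root / rel).exists()]
```

An empty list means every listed entry was found; anything else names what is missing, for example `['dev/lu', 'semTypes.xml']`.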

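Conversion

pyfn can be used to convert data to:

FRAMENET XML: the format of the released FrameNet XML data
SEMEVAL XML: the format of the SEMEVAL 2007 shared task 19 on frame semantic structure extraction
SEMAFOR CoNLL: the format used by the SEMAFOR parser
BIOS: the format used by the OPEN-SESAME parser
CoNLL-X: the format used by various state-of-the-art POS taggers and dependency parsers (see the preprocessing considerations for frame semantic parsing below)

as well as to generate the .csv hierarchy files used by both the SEMAFOR and OPEN-SESAME parsers to integrate the hierarchy feature (see (Kshirsagar et al., 2015) for details).

For an exhaustive description of all formats, check out FORMAT.md.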
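
As an illustration of the idea behind a BIOS-style encoding, here is a minimal sketch that tags each token as Beginning, Inside, Outside or Singleton. It assumes spans are given as inclusive token indices and uses a hypothetical `TAG-label` notation; the actual column layout of the .bios files is the one described in FORMAT.md and is not reproduced here.

```python
def spans_to_bios(num_tokens, spans):
    """Encode labelled token spans as B/I/O/S tags, one tag per token.

    spans: iterable of (start, end, label) with inclusive token indices.
    Raises ValueError if two spans share a token, because a single tag per
    token cannot represent overlapping spans.
    """
    tags = ["O"] * num_tokens
    for start, end, label in spans:
        if not 0 <= start <= end < num_tokens:
            raise ValueError("span out of range: {}".format((start, end)))
        if any(tag != "O" for tag in tags[start:end + 1]):
            raise ValueError("overlapping span: {}".format((start, end, label)))
        if start == end:
            tags[start] = "S-" + label  # single-token span
        else:
            tags[start] = "B-" + label  # first token of the span
            for i in range(start + 1, end + 1):
                tags[i] = "I-" + label  # remaining tokens of the span
    return tags
```

For a five-token sentence, `spans_to_bios(5, [(0, 1, "Agent"), (3, 3, "Theme")])` yields one tag per token, with the two-token span marked B/I and the one-token span marked S.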

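HowTo

The following sections provide examples of commands to convert FN data to different formats. All commands can make use of the following options:

--splits: specify which splits should be converted. --splits train will generate all train/dev/test splits, according to data found under the fndata-1.x/{train/dev/test} directories. --splits dev will generate the dev and test splits according to data found under the fndata-1.x/{dev/test} directories. This option will skip the train splits but generate the same dev/test splits that would have been generated with --splits train. --splits test will generate the test splits according to data found under the fndata-1.x/test directory, and skip the train/dev splits. Using --splits test will generate the same test splits that would have been generated using --splits train or --splits dev. Defaults to --splits test.
--output_sentences: if specified, will output a .sentences file in the process, containing all raw annotated sentences, one sentence per line.
--with_exemplars: if specified, will process the exemplars (data under the lu directory) in addition to fulltext.
--filter: specify data filtering options (see details below).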
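
The cascading behaviour of --splits described above can be summarized in a few lines. This is only a sketch of the documented behaviour, not pyfn's implementation, and `generated_splits` is a hypothetical name.

```python
# Order of the splits as laid out in the fndata-1.x directory.
ALL_SPLITS = ("train", "dev", "test")


def generated_splits(splits_option="test"):
    """Return the splits produced for a given --splits value."""
    if splits_option not in ALL_SPLITS:
        raise ValueError("unknown splits option: {}".format(splits_option))
    # Each option generates itself plus every split that follows it.
    return list(ALL_SPLITS[ALL_SPLITS.index(splits_option):])
```

So `generated_splits("dev")` gives `['dev', 'test']`, and the default gives `['test']` only.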

For details on pyfn usage, do:

pyfn --help
pyfn generate --help
pyfn convert --help

From FN XML to BIOS

To convert data from FrameNet XML format to BIOS format, do:

pyfn convert \
  --from fnxml \
  --to bios \
  --source /abs/path/to/fndata-1.x \
  --target /abs/path/to/xp/data/output/dir \
  --splits train \
  --output_sentences \
  --filter overlap_fes

Using --filter overlap_fes will skip all annotation sets with overlapping frame elements, as those cases are not supported by the BIOS format.
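
To make the notion of "overlapping frame elements" concrete, the sketch below checks whether any two spans of an annotation set share a position and drops the sets that do. The `(identifier, spans)` pairs are a hypothetical, simplified stand-in for real annotation sets, and this is not the filter pyfn itself applies.

```python
def has_overlapping_spans(spans):
    """Return True if any two inclusive (start, end) spans share a position."""
    ordered = sorted(spans)
    # After sorting by start, any overlap shows up between neighbours.
    return any(prev[1] >= cur[0] for prev, cur in zip(ordered, ordered[1:]))


def drop_overlapping(annosets):
    """Keep only the items whose frame element spans do not overlap.

    annosets: iterable of (identifier, spans) pairs (hypothetical layout).
    """
    return [(ident, spans) for ident, spans in annosets
            if not has_overlapping_spans(spans)]
```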

From FN XML to SEMAFOR CoNLL

To generate the train.frame.elements file used to train SEMAFOR and the {dev,test}.frames files used for decoding, do:

pyfn convert \
  --from fnxml \
  --to semafor \
  --source /abs/path/to/fndata-1.x \
  --target /abs/path/to/xp/data/output/dir \
  --splits train \
  --output_sentences

From FN XML to SEMEVAL XML

To generate the {dev,test}.gold.xml gold files in SEMEVAL format for scoring, do:

pyfn convert \
  --from fnxml \
  --to semeval \
  --source /abs/path/to/fndata-1.x \
  --target /abs/path/to/xp/data/output/dir \
  --splits {dev,test}

From BIOS to SEMEVAL XML

To convert the decoded {dev,test}.bios.semeval.decoded BIOS files of OPEN-SESAME to SEMEVAL XML format for scoring, do:

pyfn convert \
  --from bios \
  --to semeval \
  --source /abs/path/to/{dev,test}.bios.semeval.decoded \
  --target /abs/path/to/output/{dev,test}.predicted.xml \
  --sent /abs/path/to/{dev,test}.sentences

From SEMAFOR CoNLL to SEMEVAL XML

To convert the decoded {dev,test}.frame.elements files of SEMAFOR to SEMEVAL XML format for scoring, do:

pyfn convert \
  --from semafor \
  --to semeval \
  --source /abs/path/to/{dev,test}.frame.elements \
  --target /abs/path/to/output/{dev,test}.predicted.xml \
  --sent /abs/path/to/{dev,test}.sentences
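
The {dev,test} notation in the two commands above stands for one invocation per split. A small helper can build both invocations from Python. This is a sketch that only uses the flags shown above; `convert_command` is a hypothetical name, and it assumes the decoded file and the .sentences file sit in the same directory, as in the examples.

```python
import shlex

# File suffixes of the decoded outputs, as named in the commands above.
DECODED_SUFFIX = {"bios": "bios.semeval.decoded", "semafor": "frame.elements"}


def convert_command(source_format, split, decoded_dir, output_dir):
    """Build the 'pyfn convert' argument list for one decoded split."""
    suffix = DECODED_SUFFIX[source_format]
    return [
        "pyfn", "convert",
        "--from", source_format,
        "--to", "semeval",
        "--source", "{}/{}.{}".format(decoded_dir, split, suffix),
        "--target", "{}/{}.predicted.xml".format(output_dir, split),
        "--sent", "{}/{}.sentences".format(decoded_dir, split),
    ]


if __name__ == "__main__":
    for split in ("dev", "test"):
        cmd = convert_command("semafor", split, "/abs/path/to", "/abs/path/to/output")
        # Print a copy-pastable command line instead of running it.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually run a command, the list can be passed to `subprocess.run(cmd, check=True)`.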

Generate the hierarchy `.csv` files

pyfn generate \
  --source /abs/path/to/fndata-1.x \
  --target /abs/path/to/xp/data/output/dir

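To also process exemplars, add the --with_exemplars option.

Preprocessing and frame semantic parsing

pyfn ships with a set of bash scripts to preprocess FrameNet data with various POS taggers and dependency parsers, as well as to perform frame semantic parsing with a variety of open-source parsers.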

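Currently supported POS taggers include:

MXPOST (Ratnaparkhi, 1996)
NLP4J (Choi, 2016)

Currently supported dependency parsers include:

MST (McDonald et al., 2006)
BIST BARCH (Kiperwasser and Goldberg, 2016)
BIST BMST (Kiperwasser and Goldberg, 2016)

Currently supported frame semantic parsers include:

SIMPLEFRAMEID (Hartmann et al., 2017) for frame identification
SEMAFOR (Kshirsagar et al., 2015) for argument identification
OPEN-SESAME (Swayamdipta et al., 2017) for argument identification

To request support for a POS tagger, a dependency parser or a frame semantic parser, please create an issue on Github/Gitlab.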

Download

To run the preprocessing and frame semantic parsing scripts, first download:

data.7z containing all the FrameNet splits for FN 1.5 and FN 1.7

wget backup.3azouz.net/pyfn/data.7z

lib.7z containing all the different external software (taggers, parsers, etc.)

wget backup.3azouz.net/pyfn/lib.7z

resources.7z containing all the required resources

wget backup.3azouz.net/pyfn/resources.7z

scripts.7z containing the set of bash scripts to call the different parsers and preprocessing toolkits

wget backup.3azouz.net/pyfn/scripts.7z

Extract the content of all the archives under the pyfn directory. Your pyfn folder structure should look like this:

.
|-- pyfn
|   |-- data
|   |   |-- fndata-1.5-with-dev
|   |   |-- fndata-1.7-with-dev
|   |-- lib
|   |   |-- bistparser
|   |   |-- jmx
|   |   |-- mstparser
|   |   |-- nlp4j
|   |   |-- open-sesame
|   |   |-- semafor
|   |   |-- semeval
|   |-- resources
|   |   |-- bestarchybrid.model
|   |   |-- bestarchybrid.params
|   |   |-- bestfirstorder.model
|   |   |-- bestfirstorder.params
|   |   |-- config-decode-pos.xml
|   |   |-- nlp4j.plemma.model.all.xz
|   |   |-- sskip.100.vectors
|   |   |-- wsj.model
|   |-- scripts
|   |   |-- CoNLLizer.py
|   |   |-- deparse.sh
|   |   |-- flatten.sh
|   |   |-- ...

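Please strictly follow this directory structure to avoid unexpected errors. pyfn relies on a lot of relative path resolutions to keep script calls short, and changing this directory structure can break everything.

Setup NLP4J for POS tagging

To use NLP4J for POS tagging, modify the resources/config-decode-pos.xml file by replacing the models.pos absolute path with the absolute path to your resources/nlp4j.plemma.model.all.xz file: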

<configuration>
	...
	<models>
		<pos>/absolute/path/to/pyfn/resources/nlp4j.plemma.model.all.xz</pos>
	</models>
</configuration>
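
The same edit can be scripted. The sketch below uses the standard library `xml.etree.ElementTree` to rewrite the `<pos>` element under `<models>`; `set_pos_model_path` is a hypothetical helper, not part of pyfn. Note that ElementTree rewrites the whole file and drops XML comments by default, so editing the file by hand remains the safer option if the config contains comments you want to keep.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_pos_model_path(config_path, model_path):
    """Point <models><pos> in the NLP4J config at an absolute model path."""
    absolute = str(Path(model_path).resolve())
    tree = ET.parse(str(config_path))
    pos = tree.getroot().find("./models/pos")
    if pos is None:
        raise ValueError("no <models><pos> element in {}".format(config_path))
    pos.text = absolute
    # Overwrite the config in place with the updated path.
    tree.write(str(config_path), encoding="utf-8")
    return absolute
```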

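Setup DyNET for BIST or OPEN-SESAME

If you plan on using the BIST parser for dependency parsing or OPEN-SESAME for frame semantic parsing, you will need to install DyNET 2.0.2 via:

pip install dynet==2.0.2

If you experience problems installing DyNET via pip, follow: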

https://dynet.readthedocs.io/en/2.0.2/python.html

Setup SEMAFOR

To use the SEMAFOR frame semantic parser, modify the scripts/setup.sh file:

# SEMAFOR options to be changed according to your env
export JAVA_HOME_BIN="/abs/path/to/java/jdk/bin"
export num_threads=2 # number of threads to use
export min_ram=4g # min RAM allocated to the JVM in GB. Corresponds to the -Xms argument
export max_ram=8g # max RAM allocated to the JVM in GB. Corresponds to the -Xmx argument

# SEMAFOR hyperparameters
export kbest=1 # keep k-best parse
export lambda=0.000001 # hyperparameter for argument identification. Refer to Kshirsagar et al. (2015) for details.
export batch_size=4000 # number of batches processed at once for argument identification.
export save_every_k_batches=400 # for argument identification
export num_models_to_save=60 # for argument identification
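
Since min_ram and max_ram correspond to the JVM -Xms and -Xmx arguments, a quick sanity check before launching long runs is to confirm that the minimum does not exceed the maximum. The sketch below is a hypothetical helper, not part of the pyfn scripts; how setup.sh actually passes these values to Java is not shown here.

```python
# Multipliers for the size suffixes used in values such as '4g' or '512m'.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value):
    """Convert a JVM-style size such as '4g' or '512m' to a number of bytes."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)


def jvm_memory_flags(min_ram, max_ram):
    """Return the -Xms/-Xmx flags, refusing a minimum above the maximum."""
    if parse_size(min_ram) > parse_size(max_ram):
        raise ValueError("min_ram {} exceeds max_ram {}".format(min_ram, max_ram))
    return ["-Xms" + min_ram, "-Xmx" + max_ram]
```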

Setup SIMPLEFRAMEID

If you plan on using SIMPLEFRAMEID for frame identification, you will need to install the following packages (on python 2.7):

pip install keras==2.0.6 lightfm==1.13 sklearn numpy==1.13.1 networkx==1.11 tensorflow==1.3.0

Using the SEMEVAL PERL evaluation scripts

If you plan on using the SEMEVAL perl evaluation scripts, make sure to have the App::cpanminus and XML::Parser modules installed:

cpan App::cpanminus
cpanm XML::Parser

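Using bash scripts

Each script comes with a helper: check it out with --help!

Careful! Most scripts expect the data output by pyfn convert ... to be located under pyfn/experiments/xp_XYZ/data, where XYZ stands for the experiment number and is specified using the -x XYZ argument, and where the experiments directory is located at the same level as the scripts directory. This opinionated choice has proven very useful when launching scripts by batch on a large set of experiments, as it avoids having to input the full path each time.

Make sure to use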

pyfn convert \
  --from ... \
  --to ... \
  --source ... \
  --target /abs/path/to/pyfn/experiments/xp_XYZ/data \
  --splits ...

before calling preprocess.sh, prepare.sh, semafor.sh or open-sesame.sh
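
The --target path that the scripts expect can be derived from the experiment number. A minimal sketch, where `xp_data_dir` is a hypothetical helper and the pyfn root is wherever the archives were extracted:

```python
from pathlib import Path


def xp_data_dir(pyfn_root, xp_num):
    """Return <pyfn_root>/experiments/xp_XYZ/data for an experiment number."""
    # Experiment numbers are written as 3 digits (e.g. 001).
    xp_id = "{:03d}".format(int(xp_num))
    return Path(pyfn_root) / "experiments" / "xp_{}".format(xp_id) / "data"
```

For example, `xp_data_dir("/abs/path/to/pyfn", 1)` points at `/abs/path/to/pyfn/experiments/xp_001/data`, which is the value to pass as --target.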

preprocess.sh

Use preprocess.sh to POS-tag and dependency-parse the splits generated with pyfn convert .... The helper should display:

Usage: ${0##*/} [-h] -x XP_NUM -t {mxpost,nlp4j} -p {semafor,open-sesame} [-d {mst,bmst,barch}] [-v]
Preprocess FrameNet train/dev/test splits.

  -h, --help                           display this help and exit
  -x, --xp      XP_NUM                 xp number written as 3 digits (e.g. 001)
  -t, --tagger  {mxpost,nlp4j}         pos tagger to be used: 'mxpost' or 'nlp4j'
  -p, --parser  {semafor,open-sesame}  frame semantic parser to be used: 'semafor' or 'open-sesame'
  -d, --dep     {mst,bmst,barch}       dependency parser to be used: 'mst', 'bmst' or 'barch'
  -v, --dev                            if set, script will also preprocess dev splits

Suppose you generated FrameNet splits for SEMAFOR using:

pyfn convert \
  --from fnxml \
  --to semafor \
  --source /path/to/fndata-1.7-with-dev \
  --target /path/to/experiments/xp_001/data \
  --splits train \
  --output_sentences

You can preprocess those splits with NLP4J and BMST using:

./preprocess.sh -x 001 -t nlp4j -d bmst -p semafor

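prepare.sh

Use prepare.sh to automatically generate misc. data required by the frame semantic parsing pipeline, such as the gold SEMEVAL XML files for scoring, the framenet.frame.element.map and hierarchy .csv files used by SEMAFOR, or the frames.xml and frRelations.xml files used by both SEMAFOR and OPEN-SESAME. The helper should display: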

Usage: ${0##*/} [-h] -x XP_NUM -p {semafor,open-sesame} -s {dev,test} -f FN_DATA_DIR [-u] [-e]
Prepare misc. data for frame semantic parsing.

  -h, --help                                   display this help and exit
  -x, --xp              XP_NUM                 xp number written as 3 digits (e.g. 001)
  -p, --parser          {semafor,open-sesame}  frame semantic parser to be used: 'semafor' or 'open-sesame'
  -s, --splits          {dev,test}             which splits to score: dev or test
  -f, --fn              FN_DATA_DIR            absolute path to FrameNet data directory
  -u, --with_hierarchy                         if specified, will use the hierarchy feature
  -e, --with_exemplars                         if specified, will use the exemplars

Suppose you generated FrameNet splits for SEMAFOR using:

pyfn convert \
  --from fnxml \
  --to semafor \
  --source /path/to/fndata-1.7-with-dev \
  --target /path/to/experiments/xp_001/data \
  --splits train \
  --output_sentences

You can prepare SEMAFOR data using:

./prepare.sh -x 001 -p semafor -s test -f /path/to/fndata-1.7-with-dev

frameid.sh

Use frameid.sh to perform frame identification using SIMPLEFRAMEID. The helper should display:

Usage: ${0##*/} [-h] -m {train,decode} -x XP_NUM [-p {semafor,open-sesame}]
Perform frame identification.

  -h, --help                            display this help and exit
  -m, --mode                            train on all models or decode using a single model
  -x, --xp       XP_NUM                 xp number written as 3 digits (e.g. 001)
  -p, --parser   {semafor,open-sesame}  formalize decoded frames for specified parser

Suppose you generated FrameNet splits for SEMAFOR using:

pyfn convert \
  --from fnxml \
  --to semafor \
  --source /path/to/fndata-1.7-with-dev \
  --target /path/to/experiments/xp_101/data \
  --splits train \
  --output_sentences

After preprocessing, you can train the SIMPLEFRAMEID parser using:

./frameid.sh -m train -x 101

and decode (before decoding argument identification) using:

./frameid.sh -m decode -x 101 -p semafor

semafor.sh

Use semafor.sh to train the SEMAFOR parser or decode the test/dev splits. The helper should display:

Usage: ${0##*/} [-h] -m {train,decode} -x XP_NUM [-s {dev,test}] [-u]
Train or decode with the SEMAFOR parser.

  -h, --help                             display this help and exit
  -m, --mode            {train,decode}   semafor mode to use: train or decode
  -x, --xp              XP_NUM           xp number written as 3 digits (e.g. 001)
  -s, --splits          {dev,test}       which splits to use in decode mode: dev or test
  -u, --with_hierarchy                   if specified, parser will use the hierarchy feature

Suppose you generated FrameNet splits for SEMAFOR using:

pyfn convert \
  --from fnxml \
  --to semafor \
  --source /path/to/fndata-1.7-with-dev \
  --target /path/to/experiments/xp_001/data \
  --splits train \
  --output_sentences

After preprocessing and preparation, you can train the SEMAFOR parser using:

./semafor.sh -m train -x 001

and decode the test splits using:

./semafor.sh -m decode -x 001 -s test

open-sesame.sh

Use open-sesame.sh to train the OPEN-SESAME parser or decode the test/dev splits. The helper should display:

Usage: ${0##*/} [-h] -m {train,decode} -x XP_NUM [-s {dev,test}] [-d] [-u]
Train or decode with the OPEN-SESAME parser.

  -h, --help                              display this help and exit
  -m, --mode              {train,decode}  open-sesame mode to use: train or decode
  -x, --xp                XP_NUM          xp number written as 3 digits (e.g. 001)
  -s, --splits            {dev,test}      which splits to use in decode mode: dev or test
  -d, --with_dep_parses                   if specified, parser will use dependency parses
  -u, --with_hierarchy                    if specified, parser will use the hierarchy feature

Suppose you generated FrameNet splits for OPEN-SESAME using:

pyfn convert \
  --from fnxml \
  --to bios \
  --source /path/to/fndata-1.7-with-dev \
  --target /path/to/experiments/xp_002/data \
  --splits train \
  --output_sentences \
  --filter overlap_fes

After preprocessing and preparation, you can train the OPEN-SESAME parser using:

./open-sesame.sh -m train -x 002

and decode the test splits using:

./open-sesame.sh -m decode -x 002 -s test

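score.sh

Use score.sh to obtain P/R/F1 scores for frame semantic parsing on dev/test splits with the SEMEVAL scoring script, using gold or predicted frames. The helper should display: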

Usage: ${0##*/} [-h] -x XP_NUM -p {semafor,open-sesame} -s {dev,test} -f {gold,predicted}
Score frame semantic parsing with a modified version of the SEMEVAL scoring script.

  -h, --help                           display this help and exit
  -x, --xp      XP_NUM                 xp number written as 3 digits (e.g. 001)
  -p, --parser  {semafor,open-sesame}  frame semantic parser to be used: 'semafor' or 'open-sesame'
  -s, --splits  {dev,test}             which splits to score: dev or test
  -f, --frames  {gold,predicted}       score with gold or predicted frames

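Note that scoring is done with an updated version of the SEMEVAL perl scripts, in order to obtain more reliable scores. For a full account of the modifications, refer to (Kabbach et al., 2018) and to the contents of lib/semeval/.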
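
For readers unfamiliar with the P/R/F1 figures reported by score.sh, the standard definitions can be sketched as follows. This only illustrates the textbook formulas from raw counts; how the modified SEMEVAL script counts matches (for example any partial credit) is not reproduced here, and `precision_recall_f1` is a hypothetical name.

```python
def precision_recall_f1(num_correct, num_predicted, num_gold):
    """Compute precision, recall and F1 from raw counts."""
    precision = num_correct / num_predicted if num_predicted else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```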

To obtain scores for SEMAFOR using gold frames on the test splits, use:

./score.sh -x XYZ -p semafor -s test -f gold

To obtain scores for SEMAFOR using predicted frames on the test splits, use:

./score.sh -x XYZ -p semafor -s test -f predicted
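
Putting the scripts together, a SEMAFOR experiment with gold frames runs preprocess.sh, prepare.sh, semafor.sh (train, then decode) and score.sh in that order. The sketch below only assembles the command lines documented above; `semafor_pipeline` is a hypothetical helper, and it assumes pyfn convert has already written its output under experiments/xp_XYZ/data. Scoring with predicted frames would additionally require the frameid.sh steps, which are left out here.

```python
def semafor_pipeline(xp_num, fn_data_dir, tagger="nlp4j", dep_parser="bmst",
                     split="test", frames="gold"):
    """Return the ordered script invocations for one SEMAFOR experiment."""
    xp = "{:03d}".format(int(xp_num))  # xp number written as 3 digits
    return [
        ["./preprocess.sh", "-x", xp, "-t", tagger, "-d", dep_parser, "-p", "semafor"],
        ["./prepare.sh", "-x", xp, "-p", "semafor", "-s", split, "-f", fn_data_dir],
        ["./semafor.sh", "-m", "train", "-x", xp],
        ["./semafor.sh", "-m", "decode", "-x", xp, "-s", split],
        ["./score.sh", "-x", xp, "-p", "semafor", "-s", split, "-f", frames],
    ]
```

Each command can then be executed from the scripts directory, for instance with `subprocess.run(cmd, check=True, cwd="/abs/path/to/pyfn/scripts")`, stopping at the first failure.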

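Replication

The experiments directory provides a detailed set of instructions to replicate all results reported in (Kabbach et al., 2018) on experimental butterfly effects in frame semantic parsing. Those instructions can be used to compare the performance of different frame semantic parsers in various experimental setups.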

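Marshalling and unmarshalling FrameNet XML data

pyfn provides a set of Python models to process FrameNet XML data. Those can be used to help you build your own frame semantic parser.

The core of the pyfn models is the AnnotationSet, which corresponds to an XML <annotationSet> tag. It stores various information regarding a given set of FrameNet annotations for a given target in a given sentence. Notable innovations are the labelstore and the valenceunitstore, which store FrameNet labels (FE/PT/GF) in both their original format and custom formats, which may prove useful for frame semantic parsing.

Explore the various models under the pyfn.models directory of the pyfn package.

Unmarshalling FrameNet XML data

To convert a list of fulltext.xml files and/or lu.xml files to a generator of pyfn.AnnotationSet objects, with no overlap between train/dev/test splits, use: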

import pyfn.marshalling.unmarshallers.framenet as fn_unmarshaller

if __name__ == '__main__':
  splits_dirpath = '/abs/path/to/framenet-1.x-with-dev/'
  splits = 'train'
  with_exemplars = False
  annosets_dict = fn_unmarshaller.get_annosets_dict(splits_dirpath,
                                                    splits, with_exemplars)

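splits_dirpath should point to the directory containing the train/dev/test splits directories (see the structure detailed above).

get_annosets_dict will return a dictionary mapping strings to AnnotationSet generators. It will ensure that there is no overlap between train/dev/test splits.

Calling get_annosets_dict with splits='test' will return a dictionary with a single 'test' key. Calling get_annosets_dict with splits='dev' will return a dictionary with two keys: 'dev' and 'test'. Calling get_annosets_dict with splits='train' will return a dictionary with three keys: 'train', 'dev' and 'test'.

To iterate over the AnnotationSet objects of each key, you can do: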

for (splits, annosets) in annosets_dict.items():
  print('Iterating over annotationsets for splits: {}'.format(splits))
  for annoset in annosets:
    print('annoset with #id = {}'.format(annoset._id))

or simply, to iterate over the values of a specific key (e.g. the train annosets):

for annoset in annosets_dict['train']:
    print('annoset with #id = {}'.format(annoset._id))

Note that, for performance, annosets is not a list but a generator.
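
A practical consequence of working with generators is that they can only be consumed once. The sketch below demonstrates this with a hypothetical stand-in generator (plain dictionaries rather than real AnnotationSet objects), so it runs without any FrameNet data.

```python
def fake_annosets():
    """Hypothetical stand-in for a generator of annotation sets."""
    for annoset_id in (1, 2, 3):
        yield {"_id": annoset_id}


annosets = fake_annosets()
first_pass = [annoset["_id"] for annoset in annosets]
# The generator is now exhausted, so a second loop sees nothing.
second_pass = [annoset["_id"] for annoset in annosets]

# Materialize once if several passes are needed, at the cost of memory.
cached = list(fake_annosets())
```

If the same annotation sets are needed more than once, either rebuild the generator or keep a list as in the last line.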

Unmarshalling OPEN-SESAME BIOS data

To convert a .bios file and its corresponding .sentences file to a generator of pyfn.AnnotationSet objects, use:

import pyfn.marshalling.unmarshallers.bios as bios_unmarshaller

if __name__ == '__main__':
  bios_filepath = '/abs/path/to/.bios'
  sent_filepath = '/abs/path/to/.sentences'
  annosets = bios_unmarshaller.unmarshall_annosets(bios_filepath,
                                                   sent_filepath)
  for annoset in annosets:
    print('annoset with #id = {}'.format(annoset._id))

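Important! The .bios and .sentences files must have been generated with pyfn convert ... --to bios ... using the --filter overlap_fes parameter.

Unmarshalling SEMAFOR CoNLL data

To convert a .frame.elements file and its corresponding .sentences file to a generator of pyfn.AnnotationSet objects, use: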

import pyfn.marshalling.unmarshallers.semafor as semafor_unmarshaller

if __name__ == '__main__':
  semafor_filepath = '/abs/path/to/.frame.elements'
  sent_filepath = '/abs/path/to/.sentences'
  annosets = semafor_unmarshaller.unmarshall_annosets(semafor_filepath,
                                                      sent_filepath)
  for annoset in annosets:
    print('annoset with #id = {}'.format(annoset._id))

Unmarshalling SEMEVAL XML data

To convert a SEMEVAL .xml file and its corresponding .sentences file to a generator of pyfn.AnnotationSet objects, use:

import pyfn.marshalling.unmarshallers.semeval as semeval_unmarshaller

if __name__ == '__main__':
  xml_filepath = '/abs/path/to/semeval/.xml'
  annosetss = semeval_unmarshaller.unmarshall_annosets(xml_filepath)

By default, unmarshall_annosets for SEMEVAL will return a generator over embedded annotation sets. To iterate over the individual annotation sets, use:

for annosets in annosetss:
  for annoset in annosets:
    print('annoset with #id = {}'.format(annoset._id))

To return a "flat" list of annosets, pass in the flatten=True parameter:

import pyfn.marshalling.unmarshallers.semeval as semeval_unmarshaller

if __name__ == '__main__':
  xml_filepath = '/abs/path/to/semeval/.xml'
  annosets = semeval_unmarshaller.unmarshall_annosets(xml_filepath, flatten=True)
  for annoset in annosets:
    print('annoset with #id = {}'.format(annoset._id))
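
If you already hold the nested generator, the same flat view can be obtained lazily with the standard library. This is a sketch on a hypothetical stand-in for the nested structure; whether flatten=True is implemented this way inside pyfn is not confirmed.

```python
from itertools import chain


def flatten(nested):
    """Lazily flatten an iterable of iterables into a single iterator."""
    return chain.from_iterable(nested)


def fake_nested_annosets():
    """Hypothetical stand-in: one inner generator per group of annosets."""
    for group in ((1, 2), (3,), (4, 5)):
        yield (annoset_id for annoset_id in group)


flat_ids = list(flatten(fake_nested_annosets()))
```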

Marshalling to OPEN-SESAME BIOS

To convert a dict of splits to pyfn.AnnotationSet objects to OPEN-SESAME-style .bios, refer to pyfn.marshalling.marshallers.bios.marshall_annosets_dict

Marshalling to SEMAFOR CoNLL

To convert a dict of splits to pyfn.AnnotationSet objects to SEMAFOR-style .frame.elements, refer to pyfn.marshalling.marshallers.semafor.marshall_annosets_dict

Marshalling to SEMEVAL XML

To convert a list of pyfn.AnnotationSet objects to SEMEVAL-style .xml, refer to pyfn.marshalling.marshallers.semeval.marshall_annosets

Marshalling to .csv hierarchy

To convert a list of relations to a .csv file, refer to pyfn.marshalling.marshallers.hierarchy.marshall_relations

Citation

If you use pyfn, please cite:

@InProceedings{C18-1267,
  author = 	"Kabbach, Alexandre
		and Ribeyre, Corentin
		and Herbelot, Aur{\'e}lie",
  title = 	"Butterfly Effects in Frame Semantic Parsing: impact of data processing on model ranking",
  booktitle = 	"Proceedings of the 27th International Conference on Computational Linguistics",
  year = 	"2018",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"3158--3169",
  location = 	"Santa Fe, New Mexico, USA",
  url = 	"http://aclweb.org/anthology/C18-1267"
}

Project details

Release history

1.3.13 (this version): Nov 3, 2020
1.3.12: Nov 3, 2020
1.3.11: Nov 2, 2020
1.3.10: Nov 2, 2020
1.3.9: Nov 2, 2020
1.3.7: Jun 7, 2019
1.3.6: Jun 6, 2019
1.3.5: Jun 6, 2019
1.3.4: Jun 2, 2019
1.3.3: Jun 1, 2019
1.3.0: May 3, 2019
1.2.6: May 3, 2019
1.2.5: Jan 5, 2019
1.2.3: Sep 18, 2018
1.2.2: Sep 4, 2018
1.2.1: Aug 30, 2018
1.2.0
