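A Python library for disparate exposure in ranking (a learning to rank approach)

Project description

Fair search DELTR for Python

Installation

To install fairsearchdeltr, simply use pip (or pipenv):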

pip install fairsearchdeltr

And that's it!

Using it in your code

You need to import the class from the package first:

from fairsearchdeltr import Deltr

Train a model

You need to train a model before you can rank documents.

# import other helper libraries
import pandas as pd
from io import StringIO

# load some train data (this is just a sample - more is better)
train_data_raw = """q_id,doc_id,gender,score,judgment
    1,1,1,0.962650646167003,1
    1,2,0,0.940172822166108,0.98
    1,3,0,0.925288002880488,0.96
    1,4,1,0.896143226020877,0.94
    1,5,0,0.89180775633204,0.92
    1,6,0,0.838704766545679,0.9
    """
train_data = pd.read_csv(StringIO(train_data_raw))

# setup the DELTR object
protected_feature = "gender" # column name of the protected attribute (index after query and document id)
gamma = 1 # value of the gamma parameter
number_of_iterations = 10000 # number of iterations the training should run
standardize = True # let's apply standardization to the features

# create the Deltr object
dtr = Deltr(protected_feature, gamma, number_of_iterations, standardize=standardize)

# train the model
dtr.train(train_data)
>> array([0.02527054, 0.07692437])
# your run should give approximately the same results
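
The standardize flag above asks for the features to be standardized before training. As a rough illustration of what that usually means, here is a minimal sketch that assumes z-score standardization (subtract the mean, divide by the population standard deviation). This is an assumption: the library's internal implementation may differ. The helper name zscore is hypothetical and not part of fairsearchdeltr.

```python
from statistics import mean, pstdev


def zscore(values):
    """Return z-scores: (x - mean) / population standard deviation.

    Hypothetical helper for illustration only; it assumes that
    'standardization' means z-scoring, which may not match the library.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every z-score is defined as 0.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# The 'score' column of the sample training data shown above.
scores = [
    0.962650646167003,
    0.940172822166108,
    0.925288002880488,
    0.896143226020877,
    0.89180775633204,
    0.838704766545679,
]
z_scores = zscore(scores)
```

After this transformation the feature has mean 0 and standard deviation 1, which keeps features on different scales from dominating the learned weights.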

Use the model to rank

Now you can use the trained model to rank some data.

# load some test/prediction data
prediction_data_raw = """q_id,doc_id,gender,score
    1,7,0,0.9645
    1,8,0,0.9524
    1,9,0,0.9285
    1,10,0,0.8961
    1,11,1,0.8911
    1,12,1,0.8312
    """
prediction_data = pd.read_csv(StringIO(prediction_data_raw))

# use the model to rank the data  
dtr.rank(prediction_data)
>> doc_id  gender  judgement
4      11       1   0.074849
5      12       1   0.063770
0       7       0   0.063486
1       8       0   0.061248
2       9       0   0.056828
3      10       0   0.050836
# the result will be a re-ranked dataframe
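
To see what the re-ranking did, one simple check is the share of protected documents among the top k positions. The sketch below assumes the ranked result has been converted to a list of dicts (for example with pandas DataFrame.to_dict("records")). The helper protected_share_at_k is hypothetical and not part of fairsearchdeltr. The rows are copied from the output shown above.

```python
def protected_share_at_k(rows, protected_key, k):
    """Fraction of the first k rows whose protected attribute equals 1.

    Hypothetical helper for illustration; 'rows' must already be in
    ranked order (best first).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = rows[:k]
    if not top:
        return 0.0
    return sum(1 for row in top if row[protected_key] == 1) / len(top)


# Ranked output from the example above, in ranked order.
ranked_rows = [
    {"doc_id": 11, "gender": 1, "judgement": 0.074849},
    {"doc_id": 12, "gender": 1, "judgement": 0.063770},
    {"doc_id": 7, "gender": 0, "judgement": 0.063486},
    {"doc_id": 8, "gender": 0, "judgement": 0.061248},
    {"doc_id": 9, "gender": 0, "judgement": 0.056828},
    {"doc_id": 10, "gender": 0, "judgement": 0.050836},
]

share_top3 = protected_share_at_k(ranked_rows, "gender", 3)
share_all = protected_share_at_k(ranked_rows, "gender", len(ranked_rows))
```

Here two of the top three documents are protected, although protected documents make up only a third of the whole list and had the lowest input scores.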

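The library contains sufficient code documentation for each of its functions.

Checking the model a bit deeper

You can check how the training of the model progressed using a special property called log.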

dtr.log
>> [<TrainStep [1553844278383,[0.01926469 0.00976336],[[-0.00125304 -0.0014605 ]
  [-0.00125304 -0.0014605 ]
  [-0.00125304 -0.0014605 ]
  [-0.00125304 -0.0014605 ]
  [-0.00125304 -0.0014605 ]
  [-0.00125304 -0.0014605 ]],5.999620187652397,0.0]>,
 ...]

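The log property returns a list of objects of the fairsearchdeltr.models.TrainStep class. Each object represents the parameters at one step of the training. It contains timestamp, omega, omega_gradient, loss, loss_standard, loss_exposure.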
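
A typical use of the log is to follow the loss over the training steps. The sketch below does not call the library: it uses a stand-in namedtuple with the six field names listed above and made-up numbers, and it assumes those fields are readable as attributes on the real TrainStep objects. The names TrainStepStandIn, loss_curve and total_loss_drop are hypothetical.

```python
from collections import namedtuple

# Stand-in for fairsearchdeltr.models.TrainStep (assumption: the real
# class exposes these six names as attributes).
TrainStepStandIn = namedtuple(
    "TrainStepStandIn",
    ["timestamp", "omega", "omega_gradient", "loss", "loss_standard", "loss_exposure"],
)


def loss_curve(log):
    """Return (timestamp, loss) pairs, one per training step."""
    return [(step.timestamp, step.loss) for step in log]


def total_loss_drop(log):
    """Loss at the first step minus loss at the last step."""
    if not log:
        return 0.0
    return log[0].loss - log[-1].loss


# Made-up values, for illustration only (not real training output).
example_log = [
    TrainStepStandIn(1000, [0.0, 0.0], [[0.0, 0.0]], 6.0, 6.0, 0.0),
    TrainStepStandIn(1001, [0.01, 0.02], [[0.0, 0.0]], 5.5, 5.4, 0.1),
    TrainStepStandIn(1002, [0.02, 0.04], [[0.0, 0.0]], 5.25, 5.2, 0.05),
]

curve = loss_curve(example_log)
drop = total_loss_drop(example_log)
```

With a trained model, passing dtr.log instead of example_log should work the same way if the attribute assumption holds.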

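Development

Clone this repository: git clone https://github.com/fair-search/fairsearchdeltr-python
Change directory to the directory where you cloned the repository: cd WHERE_ITS_DOWNLOADED/fairsearchdeltr-python
Use any IDE to work with the code

Testing

Just run: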

python setup.py test

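Credits

The DELTR algorithm is described in this paper:

Zehlike, Meike, Gina-Theresa Diehn, and Carlos Castillo. "Reducing Disparate Exposure in Ranking: A Learning to Rank Approach." arXiv preprint arXiv:1805.08716 (2018).

This library was developed by Ivan Kitanovski based on the paper. See the license file for more information.

For any questions, contact Meike Zehlike.

See also

You can also see DELTR for ElasticSearch and the DELTR Java library.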

Project details

Release history

1.0.2 (this version)

Jun 15, 2019

1.0.1

Apr 10, 2019

1.0.0

Mar 31, 2019

0.0.3

Mar 26, 2019

Download files

Source distribution

fairsearchdeltr-1.0.2.tar.gz (9.6 kB)

Uploaded Jun 15, 2019 (source)

Hashes for fairsearchdeltr-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`111fb25121df415b3c808e42bdf7fa1f9005b3727fee0fd6cfc71b9a071e481b`
MD5	`1a72be0d5b01c2af9e3180872428e621`
BLAKE2b-256	`654596d0cb961d727f2052e529e46a9242b9b5078a3bd792c1a49b4b845f98e5`
