
PostgreSQL registry, plus online and offline stores for Feast

Project description

Feast PostgreSQL Support

This repo adds PostgreSQL offline and online stores to Feast.

Get started

Install Feast:

pip install feast

Install feast-postgres:

pip install feast-postgres

Create a feature repository:

feast init feature_repo
cd feature_repo

Online store:

To configure the online store, edit feature_store.yaml:

project: feature_repo
registry: data/registry.db
provider: local
online_store:
    type: feast_postgres.PostgreSQLOnlineStore # MUST be this value
    host: localhost
    port: 5432                  # Optional, default is 5432
    database: postgres
    db_schema: feature_store    # Optional, default is None
    user: username
    password: password
offline_store:
    ...

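When running feast apply, if db_schema is set, that value is used as the name of the schema to create; otherwise the schema name is the value of user. If the schema already exists, it is not created again, but the user must have privileges to create and drop tables and indexes.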
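The schema-name rule can be sketched in a few lines of Python. This only illustrates the fallback described here and is not feast-postgres's actual code; the function name resolve_schema_name is hypothetical.

```python
from typing import Optional


def resolve_schema_name(db_schema: Optional[str], user: str) -> str:
    """Return db_schema when it is set, otherwise fall back to the user name.

    Hypothetical helper that mirrors the rule described in the text.
    """
    return db_schema if db_schema else user


# db_schema set, as in the online store configuration:
print(resolve_schema_name("feature_store", "username"))
# db_schema left unset (None), so the user name becomes the schema name:
print(resolve_schema_name(None, "username"))
```

Under this rule, leaving db_schema unset with user: username would place the online store tables in a schema named username.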

Offline store:

To configure the offline store, edit feature_store.yaml:

project: feature_repo
registry: data/registry.db
provider: local
online_store:
    ...
offline_store:
    type: feast_postgres.PostgreSQLOfflineStore # MUST be this value
    host: localhost
    port: 5432              # Optional, default is 5432
    database: postgres
    db_schema: my_schema
    user: username
    password: password

The user needs privileges to create and drop tables in db_schema, because temporary tables are created when querying historical values.
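As a rough sketch, a database administrator could grant those privileges with a statement like the one built below. It assumes standard PostgreSQL semantics, where CREATE on a schema lets a role create tables in it and a role may drop the tables it owns. The helper names quote_ident and schema_grant_sql are hypothetical and not part of feast-postgres.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def schema_grant_sql(db_schema: str, user: str) -> str:
    """Build a GRANT statement giving `user` USAGE and CREATE on `db_schema`."""
    return (
        f"GRANT USAGE, CREATE ON SCHEMA {quote_ident(db_schema)} "
        f"TO {quote_ident(user)};"
    )


# Values taken from the offline store configuration.
print(schema_grant_sql("my_schema", "username"))
```

The resulting statement would be run once by a role that owns the schema or has sufficient rights on it.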

Registry store:

To configure the registry, edit feature_store.yaml:

registry:
    registry_store_type: feast_postgres.PostgreSQLRegistryStore
    path: feast_registry    # This will become the table name for the registry
    host: localhost
    port: 5432              # Optional, default is 5432
    database: postgres
    db_schema: my_schema
    user: username
    password: password

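If the schema does not exist, the user needs the privilege to create it. If the schema exists, the user only needs the privilege to create tables.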
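The two cases can be illustrated with a small helper that picks the grant an administrator might issue. This is a sketch under the assumption of standard PostgreSQL privileges (CREATE on the database to create a schema, CREATE on the schema to create tables in it); registry_grant_sql is a hypothetical name, and the identifiers are assumed to be plain names that need no escaping.

```python
def registry_grant_sql(schema_exists: bool, database: str, db_schema: str, user: str) -> str:
    """Return a GRANT statement matching the case described in the text."""
    if schema_exists:
        # Schema already present: the user only needs to create tables in it.
        return f'GRANT USAGE, CREATE ON SCHEMA "{db_schema}" TO "{user}";'
    # Schema missing: the user must be allowed to create it in the database.
    return f'GRANT CREATE ON DATABASE "{database}" TO "{user}";'


# Values taken from the registry configuration.
print(registry_grant_sql(True, "postgres", "my_schema", "username"))
print(registry_grant_sql(False, "postgres", "my_schema", "username"))
```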

Example

First set the values in feature_store.yaml. Then use copy_from_parquet_to_postgres.py to create a table and populate it with data from the parquet file that ships with Feast.

Then example.py can be used in the feature store:
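To give a sense of what the table-creation step involves, the sketch below builds a CREATE TABLE statement for driver_stats. It is not the content of copy_from_parquet_to_postgres.py. The column names come from the feature definition in this README, while the PostgreSQL column types and the helper name create_table_sql are assumptions made for illustration.

```python
# Assumed column types for the driver_stats sample data; the real script may
# derive them from the parquet file instead.
DRIVER_STATS_COLUMNS = {
    "event_timestamp": "timestamptz",
    "driver_id": "bigint",
    "conv_rate": "real",
    "acc_rate": "real",
    "avg_daily_trips": "bigint",
    "created": "timestamp",
}


def create_table_sql(table: str, columns: dict) -> str:
    """Build a CREATE TABLE statement from a column-name to type mapping."""
    cols = ",\n    ".join(f"{name} {pg_type}" for name, pg_type in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {table} (\n    {cols}\n);"


print(create_table_sql("driver_stats", DRIVER_STATS_COLUMNS))
```

Loading the parquet rows into the table is a separate step that this sketch does not cover.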

# This is an example feature definition file

from google.protobuf.duration_pb2 import Duration

from feast import Entity, Feature, FeatureView, ValueType

from feast_postgres import PostgreSQLSource

# Read data from the driver_stats table in PostgreSQL, which was populated from the
# parquet file that ships with Feast. See the Feast documentation for more info on
# data sources.
driver_hourly_stats = PostgreSQLSource(
    query="SELECT * FROM driver_stats",
    event_timestamp_column="event_timestamp",
    created_timestamp_column="created",
)

# Define an entity for the driver. You can think of entity as a primary key used to
# fetch features.
driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)

# Our parquet files contain sample data that includes a driver_id column, timestamps and
# three feature columns. Here we define a Feature View that will allow us to serve this
# data to our model online.
driver_hourly_stats_view = FeatureView(
    name="driver_hourly_stats",
    entities=["driver_id"],
    ttl=Duration(seconds=86400 * 1),
    features=[
        Feature(name="conv_rate", dtype=ValueType.FLOAT),
        Feature(name="acc_rate", dtype=ValueType.FLOAT),
        Feature(name="avg_daily_trips", dtype=ValueType.INT64),
    ],
    online=True,
    batch_source=driver_hourly_stats,
    tags={},
)

Then run:

feast apply
feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")

This creates the feature view table and populates it with data from the driver_stats table that we created in Postgres.
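The end timestamp passed to feast materialize-incremental is produced by the shell command date -u with the format %Y-%m-%dT%H:%M:%S. Where that command is not available, an equivalent string can be produced in Python; the sketch below uses only the standard library, and the function name materialize_end_timestamp is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def materialize_end_timestamp(now: Optional[datetime] = None) -> str:
    """Format a time in UTC the same way as `date -u +"%Y-%m-%dT%H:%M:%S"`."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC first so that aware datetimes in other zones format correctly.
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


# Prints the current UTC time, suitable as the CLI argument.
print(materialize_end_timestamp())
```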
