Skip to main content

MLCF - 用于加密货币预测的机器学习工具包

项目描述

MLCF - 用于加密货币预测的机器学习工具包

该库提供用于加密货币预测和交易决策的工具。
目前,该库仅提供数据工具,例如:

  • OHLCV 文件阅读器
  • 添加外部和内部指标的工具
  • 标记数据的工具
  • 按顺序在一组间隔上工作的工具
  • 数据窗口化工具
  • 构建、保存和读取数据集的工具
  • 数据标准化工具
  • 通过过滤某些窗口来预处理数据的工具

该库不提供模型或端到端交易机器人。

有关更多信息,请在此处找到文档:https ://guitheg.github.io/mlcf


安装

官方支持的操作系统:

  • Linux

官方支持的 Python 版本:

  • 3.7

  • 3.8

  • 3.9


Linux 安装 (python v3.7)

  • MLCF 封装
pip install mlcf

Linux 安装 (python v3.8, v3.9)

pip install mlcf --no-binary TA-LIB

MLCF 示例模块使用

在这一部分中,我们将介绍 MLCF 模块的一些示例用法。


文件阅读器模块

# -----------  read file ---------------------------------
from pathlib import Path
from mlcf.datatools.data_reader import (
    read_ohlcv_json_from_file,
    read_ohlcv_json_from_dir,
    read_json_file
)

# from a ohlcv json file
data = read_ohlcv_json_from_file(Path("tests/testdata/ETH_BUSD-15m.json"))

# from a directory, a pair, and a timeframe
pair = "ETH_BUSD"
tf = "15m"
data = read_ohlcv_json_from_dir(Path("tests/testdata/"), pair=pair, timeframe=tf)

# read a json file (but not necessary a OHLCV file)
data = read_json_file(Path("tests/testdata/meteo.json"), 'time', ["time", "Temperature"])

# -------------------------------------------------------

指示灯模块

# ------------------- Indicators module -----------------------------
from mlcf.indicators.add_indicators import add_intern_indicator

# you can add yoursel your own indicators or features
data["return"] = data["close"].pct_change(1)
data.dropna(inplace=True)  # make sure to drop nan values

# you can add intern indicator
data = add_intern_indicator(data, indice_name="adx")
# -------------------------------------------------------

标签工具

# ------------------- Labelize Tool -----------------------------
from mlcf.datatools.utils import labelize

# A good practice is to take the mean and the standard deviation of the value you want to
# labelize
mean = data["return"].mean()
std = data["return"].std()

# Here you give the value you want to labelize with column='return'. The new of the labels column
# will be the name give to 'label_col_name'
data = labelize(
    data,
    column="return",
    labels=5,
    bounds=(mean-std, mean+std),
    label_col_name="label"
)

数据区间模块、标准化工具和 WindowFilter 工具

# ------------------- Data Intervals Module and Standardization Tools -----------------------------
from mlcf.datatools.data_intervals import DataIntervals
from mlcf.datatools.standardize_fct import ClassicStd, MinMaxStd
from mlcf.datatools.windowing.filter import LabelBalanceFilter

# We define a dict which give us the information about what standardization apply to each columns.
std_by_features = {
    "close": ClassicStd(),
    "return": ClassicStd(with_mean=False),  # to avoid to shift we don't center
    "adx": MinMaxStd(minmax=(0, 100))  # the value observed in the adx are between 0 and 100 and we
                                       # want to set it between 0 and 1.
}
data_intervals = DataIntervals.create_data_intervals_obj(data, n_intervals=10)
data_intervals.standardize(std_by_features)

# We can apply a filter the dataset we want. Here we will filter the values in order to balance
# the histogram of return value. For this, we use the label previously process on return.
filter_by_set = {
    "train": LabelBalanceFilter("label")  # the column we will balance the data is 'label
                                          # the max count will be automatically process
}

# dict_train_val_test is a dict with the key 'train', 'val', 'test'. The value of the dict is a
# WTSeries (a windowed time series).
dict_train_val_test = data_intervals.windowing(
    window_width=30,
    window_step=1,
    selected_columns=["close", "return", "adx"],
    filter_by_dataset=filter_by_set
)
# -------------------------------------------------------

窗口迭代器工具

# -------------------- Window Iterator Tool --------------------

# If we don't want to use the Data Interval Module. We can simple use a WTSeries with our data.
from mlcf.datatools.windowing.tseries import WTSeriesLite

# To create a WTSeries from pandas.DataFrame
wtseries = WTSeriesLite.create_wtseries_lite(
    dataframe=data,
    window_width=30,
    window_step=1,
    selected_columns=["close", "return", "adx"],
    window_filter=LabelBalanceFilter("label")
)

# Or from a wtseries .h5 file:
wtseries = WTSeriesLite.read(Path("/tests/testdata/wtseries.h5"))

# We can save the wtseries as a file.
wtseries.write(Path("/tests/testdata", "wtseries"))

# we can iterate over the wtseries:
for window in wtseries:
    pass
    # Where window is a pd.Dataframe representing a window.

# -------------------------------------------------------

预测窗口迭代器工具

# -------------------- Forecast Window Iterator Tool --------------------

# This class allow us to iterate over a WTSeries but the iteration
# (__getitem__) give us a tuple of 2

from mlcf.datatools.windowing.forecast_iterator import WindowForecastIterator

data_train = WindowForecastIterator(
    wtseries,
    input_width=29,
    target_width=1,  # The sum of the input_width and target_width must not exceed the window width
                     # of the wtseries
    input_features=["close", "adx"],
    target_features=["return"]
)
for window in data_train:
    window_input, window_target = window
    pass
# -------------------------------------------------------

项目详情


下载文件

下载适用于您平台的文件。如果您不确定要选择哪个,请了解有关安装包的更多信息。

源分布

mlcf-2.2.6.tar.gz (49.1 kB 查看哈希

已上传 source

内置分布

mlcf-2.2.6-py3-none-any.whl (56.5 kB 查看哈希

已上传 py3