MLCF - Machine Learning Toolkit for Cryptocurrency Forecasting
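This library provides tools for cryptocurrency forecasting and trading decisions.
Currently, the library provides only data tools, such as:
- an OHLCV file reader
- tools for adding external and internal indicators
- a tool for labeling data
- tools for working sequentially over a set of intervals
- data windowing tools
- tools for building, saving, and reading datasets
- data standardization tools
- tools for preprocessing data by filtering out certain windows
The library does not provide models or an end-to-end trading bot.
For more information, see the documentation: https://guitheg.github.io/mlcf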
Installation
Officially supported operating systems:
- Linux
Officially supported Python versions:
- 3.7
- 3.8
- 3.9
Linux installation (Python 3.7)
- MLCF package
pip install mlcf
Linux installation (Python 3.8, 3.9)
pip install mlcf --no-binary TA-LIB
MLCF module usage examples
This section walks through example usage of several MLCF modules.
File reader module
# ----------- read file ---------------------------------
from pathlib import Path
from mlcf.datatools.data_reader import (
    read_ohlcv_json_from_file,
    read_ohlcv_json_from_dir,
    read_json_file
)
# Read from a single OHLCV JSON file
data = read_ohlcv_json_from_file(Path("tests/testdata/ETH_BUSD-15m.json"))
# Read from a directory, given a pair and a timeframe
pair = "ETH_BUSD"
tf = "15m"
data = read_ohlcv_json_from_dir(Path("tests/testdata/"), pair=pair, timeframe=tf)
# Read a JSON file that is not necessarily an OHLCV file
data = read_json_file(Path("tests/testdata/meteo.json"), 'time', ["time", "Temperature"])
# -------------------------------------------------------
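To make the reader step more concrete, here is a minimal standard-library sketch of what parsing an OHLCV JSON payload could look like. The file layout (a JSON array of `[timestamp_ms, open, high, low, close, volume]` rows), the column names, and the helper `parse_ohlcv_json` are assumptions made for illustration; they are not taken from MLCF, whose readers should be preferred in practice.

```python
import json

# Assumed layout (not confirmed by the MLCF documentation): a JSON array of
# [timestamp_ms, open, high, low, close, volume] rows.
COLUMNS = ["date", "open", "high", "low", "close", "volume"]


def parse_ohlcv_json(text):
    """Hypothetical helper: turn a JSON array of 6-field rows into a list of dicts."""
    rows = json.loads(text)
    candles = []
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        candles.append(dict(zip(COLUMNS, row)))
    # Sort by timestamp so later steps see the candles in chronological order.
    candles.sort(key=lambda candle: candle["date"])
    return candles


sample = (
    "[[1640995200000, 3.0, 4.0, 2.5, 3.5, 100.0],"
    " [1640994300000, 2.0, 3.2, 1.9, 3.0, 80.0]]"
)
candles = parse_ohlcv_json(sample)
```

Each element of `candles` is a plain dict such as `candles[0]["close"]`, which keeps the sketch independent of pandas.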
Indicators module
# ------------------- Indicators module -----------------------------
from mlcf.indicators.add_indicators import add_intern_indicator
# You can add your own indicators or features yourself
data["return"] = data["close"].pct_change(1)
data.dropna(inplace=True)  # make sure to drop NaN values
# You can also add an internal indicator
data = add_intern_indicator(data, indice_name="adx")
# -------------------------------------------------------
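The `return` feature above is a one-period percentage change of the closing price. As a rough illustration of that arithmetic, the following pure-Python sketch mirrors what `pct_change(1)` computes; the function here is a stand-in written for this example, not the pandas implementation.

```python
def pct_change(values, periods=1):
    """Relative change between each value and the one `periods` steps earlier.

    The first `periods` entries have no predecessor, so they are None here
    (pandas reports NaN there, which is why the snippet above drops them).
    """
    out = [None] * min(periods, len(values))
    for i in range(periods, len(values)):
        out.append(values[i] / values[i - periods] - 1.0)
    return out


closes = [100.0, 110.0, 99.0]
returns = pct_change(closes)  # first entry is None, then roughly +10% and -10%
```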
Labeling tool
# ------------------- Labelize Tool -----------------------------
from mlcf.datatools.utils import labelize
# A good practice is to take the mean and the standard deviation of the value you want to
# labelize
mean = data["return"].mean()
std = data["return"].std()
# Here you give the value you want to labelize with column='return'. The name of the new labels
# column is the name given to 'label_col_name'
data = labelize(
    data,
    column="return",
    labels=5,
    bounds=(mean-std, mean+std),
    label_col_name="label"
)
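One way to picture what a call with `labels=5` and `bounds=(mean-std, mean+std)` might do is equal-width binning with clipping. The sketch below implements that interpretation only; how `labelize` actually places its bin edges is not stated in this README, so the scheme and the helper `labelize_values` are assumptions.

```python
def labelize_values(values, n_labels, bounds):
    """Hypothetical helper: assign each value an integer label in [0, n_labels - 1].

    Assumed scheme: the range [low, high] is cut into n_labels equal-width
    bins, and values outside the range are clipped into the first or last bin.
    """
    low, high = bounds
    if n_labels < 1 or high <= low:
        raise ValueError("need n_labels >= 1 and low < high")
    width = (high - low) / n_labels
    labels = []
    for value in values:
        index = int((value - low) // width)
        labels.append(max(0, min(n_labels - 1, index)))
    return labels


returns = [-0.05, -0.01, 0.0, 0.01, 0.05]
labels = labelize_values(returns, n_labels=5, bounds=(-0.02, 0.02))
```

Under this assumption, centering the bounds on the mean keeps the middle label for near-average returns and pushes extreme moves into the outer labels.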
Data Intervals module, standardization tools, and WindowFilter tool
# ------------------- Data Intervals Module and Standardization Tools -----------------------------
from mlcf.datatools.data_intervals import DataIntervals
from mlcf.datatools.standardize_fct import ClassicStd, MinMaxStd
from mlcf.datatools.windowing.filter import LabelBalanceFilter
# We define a dict that tells which standardization to apply to each column.
std_by_features = {
    "close": ClassicStd(),
    "return": ClassicStd(with_mean=False),  # do not center, to avoid shifting the values
    # ADX values lie between 0 and 100, and we want to rescale them to between 0 and 1.
    "adx": MinMaxStd(minmax=(0, 100))
}
data_intervals = DataIntervals.create_data_intervals_obj(data, n_intervals=10)
data_intervals.standardize(std_by_features)
# We can apply a filter to the dataset of our choice. Here we filter the windows in order to
# balance the histogram of return values. For this, we use the label previously computed on return.
filter_by_set = {
    # Balance the data on the 'label' column; the max count is computed automatically.
    "train": LabelBalanceFilter("label")
}
# dict_train_val_test is a dict with the keys 'train', 'val', and 'test'. Each value of the dict
# is a WTSeries (a windowed time series).
dict_train_val_test = data_intervals.windowing(
    window_width=30,
    window_step=1,
    selected_columns=["close", "return", "adx"],
    filter_by_dataset=filter_by_set
)
# -------------------------------------------------------
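The two standardization choices above can be illustrated with plain arithmetic. The sketch below is an assumption about the general technique (scaling by the standard deviation with optional centering, and min-max scaling over a known range); the functions `classic_std` and `minmax_std` are written for this example and are not MLCF's `ClassicStd` or `MinMaxStd` classes.

```python
from statistics import mean, pstdev


def classic_std(values, with_mean=True):
    """Divide by the population standard deviation; subtract the mean only if with_mean."""
    center = mean(values) if with_mean else 0.0
    scale = pstdev(values)
    if scale == 0:
        raise ValueError("cannot scale a constant series")
    return [(value - center) / scale for value in values]


def minmax_std(values, minmax):
    """Map a known range (low, high) linearly onto [0, 1]."""
    low, high = minmax
    return [(value - low) / (high - low) for value in values]


returns = [-0.02, 0.0, 0.02]
# Without centering, a zero return stays at zero and signs are preserved.
scaled_returns = classic_std(returns, with_mean=False)
# ADX is bounded by 0 and 100, so the range is fixed rather than estimated.
adx_scaled = minmax_std([0.0, 25.0, 100.0], minmax=(0, 100))
```

Skipping the centering step for returns is a design choice: it keeps the distinction between gains and losses intact after scaling.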
Window iterator tool
# -------------------- Window Iterator Tool --------------------
# If we don't want to use the Data Intervals module, we can simply build a WTSeries from our data.
from mlcf.datatools.windowing.tseries import WTSeriesLite
# Create a WTSeries from a pandas.DataFrame
wtseries = WTSeriesLite.create_wtseries_lite(
    dataframe=data,
    window_width=30,
    window_step=1,
    selected_columns=["close", "return", "adx"],
    window_filter=LabelBalanceFilter("label")
)
# Or read one from a wtseries .h5 file:
wtseries = WTSeriesLite.read(Path("tests/testdata/wtseries.h5"))
# We can save the wtseries as a file.
wtseries.write(Path("tests/testdata", "wtseries"))
# We can iterate over the wtseries:
for window in wtseries:
    pass
# Here, window is a pd.DataFrame representing one window.
# -------------------------------------------------------
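The roles of `window_width` and `window_step` can be sketched with a small generator over a plain list. This is a generic sliding-window illustration under the assumption that windows are contiguous and only complete windows are kept; `sliding_windows` is a hypothetical helper, not part of MLCF.

```python
def sliding_windows(rows, window_width, window_step=1):
    """Yield contiguous slices of `rows`, each `window_width` long,
    starting every `window_step` rows. Incomplete trailing windows are skipped."""
    if window_width < 1 or window_step < 1:
        raise ValueError("window_width and window_step must be positive")
    for start in range(0, len(rows) - window_width + 1, window_step):
        yield rows[start:start + window_width]


rows = list(range(10))
windows = list(sliding_windows(rows, window_width=4, window_step=2))
# windows: [0..3], [2..5], [4..7], [6..9]
```

With `window_step=1`, as in the example above, consecutive windows overlap by all but one row, which maximizes the number of training samples at the cost of strong correlation between them.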
Forecast window iterator tool
# -------------------- Forecast Window Iterator Tool --------------------
# This class lets us iterate over a WTSeries, but each item
# (__getitem__) is a tuple of two windows: the input and the target.
from mlcf.datatools.windowing.forecast_iterator import WindowForecastIterator
data_train = WindowForecastIterator(
    wtseries,
    input_width=29,
    # The sum of input_width and target_width must not exceed the window width of the wtseries
    target_width=1,
    input_features=["close", "adx"],
    target_features=["return"]
)
for window in data_train:
    window_input, window_target = window
    pass
# -------------------------------------------------------
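The width constraint above can be made concrete with a small sketch that splits one window into an input part and a target part. It assumes the target rows immediately follow the input rows, which this README does not state explicitly; `split_window` is a hypothetical helper that operates on a list of dict rows rather than on a WTSeries.

```python
def split_window(window, input_width, target_width, input_features, target_features):
    """Split a window (a list of dict rows) into (input rows, target rows).

    Assumption: the target rows directly follow the input rows.
    """
    if input_width + target_width > len(window):
        raise ValueError("input_width + target_width exceeds the window width")
    inputs = [
        {name: row[name] for name in input_features}
        for row in window[:input_width]
    ]
    targets = [
        {name: row[name] for name in target_features}
        for row in window[input_width:input_width + target_width]
    ]
    return inputs, targets


window = [
    {"close": 1.0, "adx": 20.0, "return": 0.00},
    {"close": 1.1, "adx": 22.0, "return": 0.10},
    {"close": 1.2, "adx": 25.0, "return": 0.09},
]
window_input, window_target = split_window(
    window,
    input_width=2,
    target_width=1,
    input_features=["close", "adx"],
    target_features=["return"],
)
```

Selecting different columns for the two parts means the model sees prices and indicators as input while being trained to predict the return.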