Compares two DataFrames and returns column-wise differences and additional records
Project description
dataframe_diff
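dataframe_diff is a micro-library that takes two DataFrames as input, compares them on a set of key columns, and returns two DataFrames: one listing the column-wise differences for records present in both inputs, and one listing the additional records that appear in only one of them.
Installation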
pip install dataframe-diff
Example
>>> import pandas as pd
>>> df1=pd.read_csv('students_1.csv')
>>> df2=pd.read_csv('students_2.csv')
>>> from dataframe_diff import dataframe_diff
>>> df1.head()
Name Subjects Marks Grade
0 Leonard Eng 70 B
1 Leonard Math 80 B
2 Leonard Physics 90 A
3 Sheldon Eng 90 A
4 Sheldon Math 99 A
>>> df2.head()
Name Subjects Marks Grade
0 Leonard Eng 75 A
1 Leonard Math 85 A
2 Leonard Physics 90 A
3 Sheldon Eng 99 A
4 Sheldon Math 99 A
>>> d1_column,d2_additional=dataframe_diff(df1, df2, key=['Name','Subjects'])
>>> d1_column
Name Subjects value_x value_y column_name
0 Leonard Eng 70 75 Marks
1 Leonard Eng B A Grade
2 Leonard Math 80 85 Marks
3 Leonard Math B A Grade
4 Sheldon Eng 90 99 Marks
5 Penny Physics 65 75 Marks
6 Penny Physics C B Grade
>>> d2_additional
Name Subjects Marks Grade sets
0 Rajesh Math 93 A df_x
1 Howard Chemistry 83 B df_y
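The behavior shown in the example can be approximated with plain pandas merges. The following is a minimal sketch under stated assumptions, not the library's actual implementation: the function name `sketch_diff` and the sample frames are hypothetical, both inputs are assumed to share the same non-key columns, and missing values are not treated specially (two NaN values would be reported as a difference).

```python
import pandas as pd


def sketch_diff(df_x, df_y, key):
    """Illustrative only: NOT dataframe_diff's real implementation."""
    value_cols = [c for c in df_x.columns if c not in key]

    # Records whose key exists in both frames; non-key columns get _x/_y suffixes.
    both = df_x.merge(df_y, on=key, how="inner", suffixes=("_x", "_y"))

    # One output row per (record, column) pair whose values differ.
    rows = []
    for _, rec in both.iterrows():
        for col in value_cols:
            if rec[col + "_x"] != rec[col + "_y"]:
                row = {k: rec[k] for k in key}
                row.update(
                    value_x=rec[col + "_x"],
                    value_y=rec[col + "_y"],
                    column_name=col,
                )
                rows.append(row)
    column_diff = pd.DataFrame(
        rows, columns=list(key) + ["value_x", "value_y", "column_name"]
    )

    def only_in(left, right, label):
        # Left-merge against the other frame's keys; "left_only" marks
        # records that have no counterpart on the other side.
        marked = left.merge(
            right[key].drop_duplicates(), on=key, how="left", indicator=True
        )
        extra = marked[marked["_merge"] == "left_only"].drop(columns="_merge")
        return extra.assign(sets=label)

    additional = pd.concat(
        [only_in(df_x, df_y, "df_x"), only_in(df_y, df_x, "df_y")],
        ignore_index=True,
    )
    return column_diff, additional


# Hypothetical sample data, loosely modeled on the example above.
df_x = pd.DataFrame({
    "Name": ["Leonard", "Sheldon", "Rajesh"],
    "Subjects": ["Eng", "Eng", "Math"],
    "Marks": [70, 90, 93],
    "Grade": ["B", "A", "A"],
})
df_y = pd.DataFrame({
    "Name": ["Leonard", "Sheldon", "Howard"],
    "Subjects": ["Eng", "Eng", "Chemistry"],
    "Marks": [75, 99, 83],
    "Grade": ["A", "A", "B"],
})

column_diff, additional = sketch_diff(df_x, df_y, key=["Name", "Subjects"])
print(column_diff)
print(additional)
```

Row-wise iteration keeps the sketch easy to read; on large frames a vectorized approach (for example reshaping both frames with `melt` before merging) would likely be faster.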
Project details
Hashes for dataframe_diff-0.5.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | f9c069138a0337d2e16a1c00a5c6f18d1161182d08977e8a07622cf1af685259 |
| MD5 | d5a51a37e2ea94db625d477958f23c3a |
| BLAKE2-256 | 059e5c8439ec8aa92ff591a9af5837396eca290529d3a224c6ca9f9437e41ffc |
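
The SHA256 digest listed above can be used to check a downloaded archive. The following is a minimal sketch using the standard-library `hashlib` module; the helper name `sha256_of` is hypothetical, and the local file name is an assumption about where the archive was saved.

```python
import hashlib
import os

# SHA256 digest copied from the hash table above.
EXPECTED_SHA256 = "f9c069138a0337d2e16a1c00a5c6f18d1161182d08977e8a07622cf1af685259"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed local path; adjust to wherever the archive was downloaded.
ARCHIVE = "dataframe_diff-0.5.tar.gz"
if os.path.exists(ARCHIVE):
    print("match" if sha256_of(ARCHIVE) == EXPECTED_SHA256 else "MISMATCH")
```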