
Compare two DataFrames and return column differences and additional records

Project description

dataframe_diff

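dataframe_diff is a micro-library that takes two DataFrames as input, compares them, and returns two DataFrames: one with the column-by-column differences and one with the additional records (rows found in only one of the inputs).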

Installation

pip install dataframe-diff

Example

>>> import pandas as pd
>>> from dataframe_diff import dataframe_diff
>>> df1 = pd.read_csv('students_1.csv')
>>> df2 = pd.read_csv('students_2.csv')
>>> df1.head()
      Name Subjects  Marks Grade
0  Leonard      Eng     70     B
1  Leonard     Math     80     B
2  Leonard  Physics     90     A
3  Sheldon      Eng     90     A
4  Sheldon     Math     99     A
>>> df2.head()
      Name Subjects  Marks Grade
0  Leonard      Eng     75     A
1  Leonard     Math     85     A
2  Leonard  Physics     90     A
3  Sheldon      Eng     99     A
4  Sheldon     Math     99     A
>>> d1_column, d2_additional = dataframe_diff(df1, df2, key=['Name', 'Subjects'])
>>> d1_column
      Name Subjects value_x value_y column_name
0  Leonard      Eng      70      75       Marks
1  Leonard      Eng       B       A       Grade
2  Leonard     Math      80      85       Marks
3  Leonard     Math       B       A       Grade
4  Sheldon      Eng      90      99       Marks
5    Penny  Physics      65      75       Marks
6    Penny  Physics       C       B       Grade
>>> d2_additional
     Name   Subjects  Marks Grade  sets
0  Rajesh       Math     93     A  df_x
1  Howard  Chemistry     83     B  df_y
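
To show what the two returned frames contain, here is a minimal sketch of a similar comparison written with plain pandas. It is not the library's implementation: the function name sketch_diff and the sample rows are hypothetical, and it assumes both inputs share the same columns and that the key columns identify each row uniquely.

```python
import pandas as pd


def sketch_diff(df_x, df_y, key):
    """Hypothetical helper (not part of dataframe_diff).

    Returns (column_diff, additional) in a shape similar to the example above.
    Assumes both frames have the same columns and unique keys.
    """
    key = list(key)

    # Reshape both frames to long form: one row per (key, column_name, value).
    long_x = df_x.melt(id_vars=key, var_name="column_name", value_name="value_x")
    long_y = df_y.melt(id_vars=key, var_name="column_name", value_name="value_y")

    # Keep only keys present in both frames, then keep the cells that differ.
    # Caveat: NaN != NaN is True, so missing values on both sides are reported.
    both = long_x.merge(long_y, on=key + ["column_name"], how="inner")
    changed = both[both["value_x"] != both["value_y"]]
    column_diff = (
        changed.sort_values(key, kind="stable")
        .reset_index(drop=True)[key + ["value_x", "value_y", "column_name"]]
    )

    def only_in(left, right, label):
        # Rows of `left` whose key does not occur in `right`.
        right_keys = right[key].drop_duplicates()
        marked = left.merge(right_keys, on=key, how="left", indicator=True)
        rows = marked[marked["_merge"] == "left_only"].drop(columns="_merge")
        return rows.assign(sets=label)

    additional = pd.concat(
        [only_in(df_x, df_y, "df_x"), only_in(df_y, df_x, "df_y")],
        ignore_index=True,
    )
    return column_diff, additional


# Made-up sample data for illustration only.
df_x = pd.DataFrame({
    "Name": ["Leonard", "Leonard", "Rajesh"],
    "Subjects": ["Eng", "Math", "Math"],
    "Marks": [70, 80, 93],
    "Grade": ["B", "B", "A"],
})
df_y = pd.DataFrame({
    "Name": ["Leonard", "Leonard", "Howard"],
    "Subjects": ["Eng", "Math", "Chemistry"],
    "Marks": [75, 80, 83],
    "Grade": ["A", "B", "B"],
})

column_diff, additional = sketch_diff(df_x, df_y, key=["Name", "Subjects"])
print(column_diff)
print(additional)
```

In this sketch the sets column records which input a key-unmatched row came from, mirroring the df_x and df_y labels in the output above; the library itself may compute its results differently.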


Source distribution

dataframe_diff-0.5.tar.gz (2.8 kB)

Uploaded: source