pandas merge duplicates (How to use pandas.merge in Python?)

Time: 2025-08-05 12:25:14 | Category: IT & Technology | Views: 4725

Environment used in this tutorial: Windows 7, Python 3.9.1, DELL G3 computer.

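1. pandas.merge

pandas.merge is pandas' full-featured, high-performance in-memory join operation. In usage it closely resembles the joins of SQL-style relational databases.

It connects two DataFrames on a specific field (the key), much like a table join in SQL.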

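2. Default merge behavior

merge supports column-on-column, index-on-index, and index-on-column(s) joins. When no key is specified (no on, left_on/right_on, or left_index/right_index), it joins on the columns that the two DataFrames have in common and performs an inner join. Index-based merging has to be requested with left_index=True and/or right_index=True; it is DataFrame.join, not merge, that joins on the index by default.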
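A minimal sketch of what merge does when no key is given, and of how index-based merges are requested explicitly. The two small DataFrames are made up for illustration and are not part of the original tutorial.

```python
import pandas as pd

# Hypothetical sample data, not from the article
left = pd.DataFrame({"id": [1, 2, 3], "title": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "nowPrice": [20.0, 30.0, 40.0]})

# No key given: merge joins on the shared column "id" with how="inner"
by_default = pd.merge(left, right)

# Index-on-index: both sides use their index as the join key
by_index = pd.merge(
    left.set_index("id"),
    right.set_index("id"),
    left_index=True,
    right_index=True,
)

# Index-on-column: the left index is matched against the right "id" column
by_mixed = pd.merge(left.set_index("id"), right, left_index=True, right_on="id")
```

All three results keep only the keys present on both sides (2 and 3), because the default join type is inner.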

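3. Usage syntax

pandas.merge, shown with its most commonly used parameters:

pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), indicator=False, validate=None)

how selects the join type (inner, left, right, outer, or cross); on, left_on, and right_on name the key columns; left_index and right_index use the index as the key.

The tables to be merged can be loaded from a database with pandas.read_sql:

pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)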

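4. Parameters (pandas.read_sql)

sql: the SQL query string (or a table name);

con: the database connection or engine, usually created with a package such as SQLAlchemy or pymysql;

index_col: the column to use as the index;

coerce_float: when True (the default), attempts to convert non-string, non-numeric values such as decimal.Decimal to float;

parse_dates: converts the given column(s) of date strings to datetime;

columns: the columns to select (only used when reading a whole table by name);

chunksize: if an integer is given, a generator is returned that yields DataFrames of at most that many rows each.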
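The read_sql parameters above can be sketched as follows. To stay self-contained, this example assumes an in-memory SQLite database from the standard-library sqlite3 module instead of SQLAlchemy or pymysql; the prices table and its rows are invented for illustration.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a made-up table (an assumption, not the article's data)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE prices (id INTEGER, nowPrice REAL, updated TEXT)")
con.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [(1, 9.5, "2021-01-01"), (2, 19.0, "2021-01-02"), (3, 29.0, "2021-01-03")],
)
con.commit()

# index_col picks the index column; parse_dates converts date strings to datetime
prices = pd.read_sql(
    "SELECT * FROM prices", con, index_col="id", parse_dates=["updated"]
)

# With chunksize, read_sql yields DataFrames of at most that many rows
chunk_sizes = [
    len(chunk) for chunk in pd.read_sql("SELECT * FROM prices", con, chunksize=2)
]
con.close()
```

Reading in chunks keeps memory use bounded for large tables, at the cost of having to process or concatenate the pieces yourself.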

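5. Usage example

import pandas
from pandas import read_csv

# Read the items table (pipe-delimited); names= supplies the column names
items = read_csv(
    "E:\\pythonlearning\\datacode\\firstpart\\4\\4.12\\data1.csv",
    sep="|",
    names=["id", "comments", "title"],
)

# Read the prices table from the same folder
prices = read_csv(
    "E://pythonlearning//datacode//firstpart//4//4.12//data2.csv",
    sep="|",
    names=["id", "oldPrice", "nowPrice"],
)

# Merge the two DataFrames, using the id column as the key
itemPrices = pandas.merge(items, prices, left_on="id", right_on="id")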
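Since data1.csv and data2.csv are not included with the tutorial, here is a runnable variant of the same example. It assumes a few invented pipe-delimited rows, read from in-memory text via io.StringIO; the row values are hypothetical.

```python
import io

import pandas as pd

# Made-up pipe-delimited rows standing in for data1.csv and data2.csv
items_text = "1|good|Book A\n2|fine|Book B\n3|poor|Book C\n"
prices_text = "1|30.0|25.0\n2|45.0|40.0\n4|12.0|10.0\n"

items = pd.read_csv(
    io.StringIO(items_text), sep="|", names=["id", "comments", "title"]
)
prices = pd.read_csv(
    io.StringIO(prices_text), sep="|", names=["id", "oldPrice", "nowPrice"]
)

# Same call as in the example: an inner join on the id column
item_prices = pd.merge(items, prices, left_on="id", right_on="id")

# how="left" keeps every row of items; unmatched price columns become NaN
all_items = pd.merge(items, prices, on="id", how="left")
```

With the inner join only ids 1 and 2 survive, because id 3 has no price and id 4 has no item. The left join keeps all three items and leaves the price columns empty for id 3.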