Two Ways to Sort in Pandas (with Examples)

Pandas offers two ways to sort data: by label and by value.

Sorting by Label in Pandas

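Pandas provides the sort_index() method for sorting by label. It works much like sorting in Excel: the rows of a DataFrame are reordered according to its index (the row labels, which appear as the leftmost column when the frame is printed). The ascending parameter controls the direction: True sorts in ascending order and False in descending order. The default is True.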

[Example] Sort a DataFrame by label with sort_index().

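import numpy as np
import pandas as pd

# 10 rows of random values with shuffled row labels; no seed is set, so values differ on each run
unsorted_df = pd.DataFrame(np.random.randn(10, 2), index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7], columns=['col2', 'col1'])
sorted_df = unsorted_df.sort_index()
print("Rows sorted by label in ascending order:")
print(sorted_df)
# Pass a boolean to the ascending parameter to control the sort order
sorted_df = unsorted_df.sort_index(ascending=False)
print("Controlling the sort order:")
print(sorted_df)
# The axis parameter selects which labels are sorted: axis=0 (the default) sorts
# the row labels, axis=1 sorts the column labels
sorted_df = unsorted_df.sort_index(axis=1)
print("Sorted by column label:")
print(sorted_df)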

Running the program produces output like the following (the values are random, so they differ on each run):

Rows sorted by label in ascending order:
       col2       col1
0   0.005272   1.356294
1  -1.306930   0.794197
2  -0.222829  -0.807098
3   0.931546   0.933223
4   0.783025   0.695322
5  -0.532823   1.288073
6   0.735168   0.214876
7   1.039282  -0.799436
8  -0.794360   0.520308
9  -0.856378  -1.160793
Controlling the sort order:
       col2       col1
9  -0.856378  -1.160793
8  -0.794360   0.520308
7   1.039282  -0.799436
6   0.735168   0.214876
5  -0.532823   1.288073
4   0.783025   0.695322
3   0.931546   0.933223
2  -0.222829  -0.807098
1  -1.306930   0.794197
0   0.005272   1.356294
Sorted by column label:
       col1       col2
1   0.794197  -1.306930
4   0.695322   0.783025
6   0.214876   0.735168
2  -0.807098  -0.222829
3   0.933223   0.931546
5   1.288073  -0.532823
9  -1.160793  -0.856378
8   0.520308  -0.794360
0   1.356294   0.005272
7  -0.799436   1.039282
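
Because the example above uses random values, its output cannot be checked against a fixed result. As a minimal sketch with fixed data (the frame `df` and its values are made up for illustration), the following repeats the same three calls and also shows that sort_index() returns a new object and leaves the original unchanged:

```python
import pandas as pd

# Hypothetical data, chosen only so the result is reproducible
df = pd.DataFrame(
    {"col2": [10, 20, 30], "col1": [1, 2, 3]},
    index=[2, 0, 1],
)

by_row_asc = df.sort_index()                  # row labels ascending
by_row_desc = df.sort_index(ascending=False)  # row labels descending
by_col = df.sort_index(axis=1)                # column labels ascending

print(list(by_row_asc.index))   # [0, 1, 2]
print(list(by_row_desc.index))  # [2, 1, 0]
print(list(by_col.columns))     # ['col1', 'col2']
print(list(df.index))           # [2, 0, 1], the original is untouched
```

Each row keeps its own values when it moves, so the row labeled 0 still holds col2=20 after sorting.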

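Sorting by Value in Pandas

The sort_values() method in Pandas works much like ORDER BY in SQL: it sorts a dataset by the data in a given field. It can sort by the values in specified columns or by the values in specified rows.

The method has the following form: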

DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, na_position='last')

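by: the column name(s) to sort by when axis=0 or 'index', or the row label(s) to sort by when axis=1 or 'columns'.
axis: with axis=0 or 'index', rows are reordered by the values in the specified column(s); with axis=1 or 'columns', columns are reordered by the values in the specified row(s). The default is 0.
ascending: whether to sort in ascending order. The default is True.
inplace: whether to replace the original data with the sorted result. The default is False, which leaves the original unchanged and returns a new object.
na_position: where to place missing values, either 'first' or 'last'. The default is 'last'.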
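
These parameters can be combined. As a minimal sketch (the frame `scores` and its column names are invented for illustration), by can take a list of columns and ascending a list of booleans of the same length, which mirrors SQL's ORDER BY team ASC, points DESC:

```python
import pandas as pd

# Hypothetical data for illustration only
scores = pd.DataFrame({
    "team": ["b", "a", "b", "a"],
    "points": [3, 9, 7, 5],
})

# team ascending, then points descending within each team
ordered = scores.sort_values(by=["team", "points"], ascending=[True, False])
print(ordered)
print(list(ordered.index))  # [1, 3, 2, 0]
```

The original row labels travel with the rows, which is why the printed index is no longer in order.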

[Example] Sort data by value with sort_values().

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
    'col2': [3, 1, 10, 8, 7, 7],
    'col3': [0, 1, 8, 4, 2, 9]
})

print(df)

# Sort by the first column, placing missing values first
print(df.sort_values(by=['col1'], na_position='first'))

# Sort by the second column, then the third, both in descending order
print(df.sort_values(by=['col2', 'col3'], ascending=False))

# Sort by the first column in descending order and replace the original data
df.sort_values(by=['col1'], ascending=False, inplace=True, na_position='first')
print(df)

x = pd.DataFrame({
    'x1': [1, 3, 3, 5],
    'x2': [4, 3, 2, 1],
    'x3': [2, 3, 5, 1]
})
print(x)

# Reorder the columns by the values in the row labeled 0 (the first row), in descending order
print(x.sort_values(by=0, ascending=False, axis=1))

Running the program produces the following output:

    col1  col2  col3
0     A     3     0
1     A     1     1
2     B    10     8
3   NaN     8     4
4     D     7     2
5     C     7     9

    col1  col2  col3
3   NaN     8     4
0     A     3     0
1     A     1     1
2     B    10     8
5     C     7     9
4     D     7     2

    col1  col2  col3
2     B    10     8
3   NaN     8     4
5     C     7     9
4     D     7     2
0     A     3     0
1     A     1     1

    col1  col2  col3
3   NaN     8     4
4     D     7     2
5     C     7     9
2     B    10     8
0     A     3     0
1     A     1     1

    x1  x2  x3
0    1    4    2
1    3    3    3
2    3    2    5
3    5    1    1

    x2  x3  x1
0    4    2    1
1    3    3    3
2    2    5    3
3    1    1    5
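
One detail of the inplace parameter is easy to miss: with inplace=True, sort_values() modifies the frame itself and returns None, so the result should not be assigned back to a variable. A minimal sketch, using a made-up frame `df`:

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"col2": [3, 1, 2]})

# Default (inplace=False): a sorted copy is returned, df is unchanged
sorted_copy = df.sort_values(by="col2")
print(df["col2"].tolist())           # [3, 1, 2]
print(sorted_copy["col2"].tolist())  # [1, 2, 3]

# inplace=True: df itself is reordered and the call returns None
result = df.sort_values(by="col2", inplace=True)
print(result)               # None
print(df["col2"].tolist())  # [1, 2, 3]
```

Writing `df = df.sort_values(..., inplace=True)` would therefore replace the frame with None.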
