This article walks through how to use pd.Series() for data processing in Python: how to construct a Series, its most commonly used attributes, and the value_counts, sort_values, and rank methods.
pandas has two main data structures: 1. Series 2. DataFrame
A Series is a one-dimensional labeled array built on NumPy's ndarray.
Series([data, index, dtype, name, copy, …]) # One-dimensional ndarray with axis labels (including time series).
import pandas as pd
import numpy as np
The data argument can be a list. index is optional; if it is omitted, the index defaults to integers starting at 0.
obj = pd.Series([4, 7, -5, 3, 7, np.nan])
obj
Output:
0 4.0
1 7.0
2 -5.0
3 3.0
4 7.0
5 NaN
dtype: float64
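For comparison, here is a minimal sketch of the same data with an explicit index and a name. The labels 'a' to 'f' and the name 'score' are made up for illustration and do not come from the original example.

```python
import numpy as np
import pandas as pd

# Same values as above, but with explicit labels instead of the default 0..n-1 index.
# The labels and the name are hypothetical, chosen only for this sketch.
labeled = pd.Series([4, 7, -5, 3, 7, np.nan], index=list("abcdef"), name="score")

print(labeled["c"])            # look up a value by its label
print(labeled.index.tolist())  # the labels supplied above
print(labeled.name)            # the name passed to the constructor
```

The values are stored as float64 because NaN cannot be represented in a plain integer array.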
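arr = np.arange(6)  # data can also be a NumPy ndarray
s = pd.Series(arr)
s
Output (the integer dtype follows NumPy's platform default: int32 here, as on Windows with NumPy 1.x, and typically int64 on Linux and macOS):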
0 0
1 1
2 2
3 3
4 4
5 5
dtype: int32
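Because the default integer width of np.arange can differ between platforms, the dtype parameter listed in the signature can be used to pin it explicitly. A minimal sketch, with s64 and sf as illustrative variable names:

```python
import numpy as np
import pandas as pd

arr = np.arange(6)

# Request a specific dtype instead of inheriting the array's platform default.
s64 = pd.Series(arr, dtype="int64")
sf = pd.Series(arr, dtype="float64")

print(s64.dtype)  # int64
print(sf.dtype)   # float64
```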
The data argument can also be a dict, pd.Series({dict}); the keys become the index:
d = {'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}
s = pd.Series(d)
s
Output:
a 10
b 20
c 30
d 40
e 50
dtype: int64
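When a dict is combined with an explicit index, pandas selects and orders the entries by that index, and labels missing from the dict become NaN. A small sketch, where the label 'z' is a deliberately missing key chosen for illustration:

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}

# Only 'c' and 'a' exist in the dict; 'z' does not, so its value is NaN
# and the result is upcast to float64.
picked = pd.Series(d, index=['c', 'a', 'z'])
print(picked)
```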
A Series can also be created from a single row or a single column of a DataFrame.
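A minimal sketch of pulling a Series out of a DataFrame. The frame df, its column names, and its row labels are invented for this example.

```python
import pandas as pd

# Hypothetical DataFrame used only for illustration.
df = pd.DataFrame(
    {"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]},
    index=["r1", "r2", "r3"],
)

col = df["x"]       # one column: indexed by the row labels, named 'x'
row = df.loc["r2"]  # one row: indexed by the column labels, named 'r2'

print(type(col).__name__, col.name)
print(type(row).__name__, row.name)
```

A row that spans columns of different dtypes is upcast to a common dtype, so row here holds floats.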
Series.values: Return Series as ndarray or ndarray-like, depending on the dtype.
obj.values  # array([ 4., 7., -5., 3., 7., nan])
Series.index: The index (axis labels) of the Series.
obj.index  # RangeIndex(start=0, stop=6, step=1)
Series.name: Return the name of the Series.
Series.loc: Access a group of rows and columns by label(s) or a boolean array.
Series.iloc: Purely integer-location based indexing for selection by position.
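The difference between name, loc, and iloc is easier to see on a Series with string labels. The sketch below assumes a Series with duplicated labels and a made-up name 'score'.

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index, name='score')  # 'score' is illustrative

series_name = obj.name       # the name given at construction
bob = obj.loc['Bob']         # label lookup, unique label: a scalar
jeff = obj.loc['Jeff']       # label lookup, duplicated label: a Series with both entries
first = obj.iloc[0]          # position lookup: first element
last = obj.iloc[-1]          # position lookup: last element (NaN here)
above3 = obj.loc[obj > 3]    # boolean mask: keeps entries greater than 3; NaN compares as False

print(series_name, bob, first)
print(jeff)
print(above3)
```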
Series.value_counts: Return a Series containing counts of unique values. NaN is excluded by default.
index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)
obj.value_counts()
Output:
7.0 2
3.0 1
-5.0 1
4.0 1
dtype: int64
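The NaN entry is missing from the counts above. As a sketch of two commonly used options, dropna=False keeps NaN as its own category and normalize=True returns relative frequencies instead of counts:

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

with_nan = obj.value_counts(dropna=False)  # NaN is counted as a separate value
shares = obj.value_counts(normalize=True)  # fractions of the non-NaN entries

print(with_nan)
print(shares)
```

The order of values that share the same count can differ between pandas versions, so it is safer not to rely on it.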
Series.sort_values
Series.sort_values(axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')
Parameters:
Parameter | Description |
---|---|
axis | {0 or 'index'}, default 0. Axis to direct sorting. The value 'index' is accepted for compatibility with DataFrame.sort_values. |
ascending | bool, default True. If True, sort values in ascending order, otherwise descending. |
inplace | bool, default False. If True, perform the operation in-place. |
kind | {'quicksort', 'mergesort' or 'heapsort'}, default 'quicksort'. Choice of sorting algorithm. See also numpy.sort() for more information. Of these three, 'mergesort' is the only stable algorithm. |
na_position | {'first' or 'last'}, default 'last'. 'first' puts NaNs at the beginning, 'last' puts NaNs at the end. |
Returns:
Series: Series ordered by values.
obj.sort_values()
Output:
Jeff -5.0
Ryan 3.0
Bob 4.0
Steve 7.0
Jeff 7.0
Ryan NaN
dtype: float64
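The parameters in the table can be combined. The following sketch shows a descending sort, NaN placed first, and a stable sort that keeps tied values in their original order; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

desc = obj.sort_values(ascending=False)           # largest first, NaN still last
nan_first = obj.sort_values(na_position='first')  # NaN moved to the front
stable = obj.sort_values(kind='mergesort')        # ties keep their original order

print(desc)
print(nan_first)
print(stable)
```

Since inplace defaults to False, each call returns a new Series and obj itself is left unchanged.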
Series.rank
Series.rank(axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)
Parameters:
Parameter | Description |
---|---|
axis | {0 or 'index', 1 or 'columns'}, default 0. Index to direct ranking. |
method | {'average', 'min', 'max', 'first', 'dense'}, default 'average'. How to rank the group of records that have the same value (i.e. ties). average: average rank of the group; min: lowest rank in the group; max: highest rank in the group; first: ranks assigned in the order they appear in the array; dense: like 'min', but rank always increases by 1 between groups. |
numeric_only | bool, optional. For DataFrame objects, rank only numeric columns if set to True. |
na_option | {'keep', 'top', 'bottom'}, default 'keep'. How to rank NaN values. keep: assign NaN rank to NaN values; top: assign smallest rank to NaN values if ascending; bottom: assign highest rank to NaN values if ascending. |
ascending | bool, default True. Whether or not the elements should be ranked in ascending order. |
pct | bool, default False. Whether or not to display the returned rankings in percentile form. |
Returns:
Same type as caller: Return a Series or DataFrame with data ranks as values.
# obj.rank()  # default method='average': the smallest value gets rank 1, and NaN stays NaN
obj.rank(method='dense')
# obj.rank(method='min')
# obj.rank(method='max')
# obj.rank(method='first')
Output:
Bob 3.0
Steve 4.0
Jeff 1.0
Ryan 2.0
Jeff 4.0
Ryan NaN
dtype: float64
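The tie-breaking methods differ only in how the two entries with value 7 are ranked. A small sketch that collects the ranks for each method side by side (the dict ranks and the variable descending are illustrative names):

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

# Rank the same data with every tie-breaking method.
methods = ['average', 'min', 'max', 'first', 'dense']
ranks = {m: obj.rank(method=m).tolist() for m in methods}

for m, r in ranks.items():
    print(f"{m:>7}: {r}")

# Descending ranks: the largest value gets rank 1.
descending = obj.rank(method='min', ascending=False).tolist()
print(descending)
```

With the default na_option='keep', the NaN entry keeps a NaN rank in every variant.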