How to Use the pd.Series() Function for Data Processing in Python

Published: 2022-06-23 09:51:49 | Source: 亿速云 | Views: 404 | Author: iii | Category: Development Technology

This article explains step by step how to use the pd.Series() function for data processing in Python: what a Series is, how to create one, and how to inspect, index, summarize, sort, and rank it.

    1. Introduction to Series

    The pandas library has two main data structures: Series and DataFrame.

    A Series is a one-dimensional labeled array built on NumPy's ndarray.

    Series([data, index, dtype, name, copy, …])    
    # One-dimensional ndarray with axis labels (including time series).
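
As a minimal sketch of the constructor arguments listed above, the following builds a Series with explicit data, index, dtype, and name. The variable name `temps`, the labels, and the values are made up for illustration and do not come from the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three readings labeled by weekday.
temps = pd.Series(
    [21.5, 23.0, 19.8],
    index=["mon", "tue", "wed"],   # axis labels
    dtype="float64",               # element type
    name="temperature",            # name of the Series
)

print(temps.ndim)                  # 1: a Series is one-dimensional
print(type(temps.to_numpy()))      # <class 'numpy.ndarray'>
print(temps["tue"])                # 23.0
```

The values are stored in a NumPy array, which is why NumPy dtypes and vectorized operations carry over to a Series.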

    2. Creating a Series

    import pandas as pd
    import numpy as np

    1. pd.Series([list], index=[list])

    The data argument is a list. index is optional; if it is omitted, a default integer index starting at 0 is used.

    obj = pd.Series([4, 7, -5, 3, 7, np.nan])
    obj

    Output:

    0    4.0
    1    7.0
    2   -5.0
    3    3.0
    4    7.0
    5    NaN
    dtype: float64
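
The example above relies on the default index. The sketch below passes the optional index argument instead. The labels are hypothetical, and the second part shows what happens when the number of labels does not match the number of values.

```python
import numpy as np
import pandas as pd

# Hypothetical labels, one per value.
labels = ["a", "b", "c", "d", "e", "f"]
obj2 = pd.Series([4, 7, -5, 3, 7, np.nan], index=labels)

print(obj2["c"])          # -5.0 (lookup by label)
print(list(obj2.index))   # ['a', 'b', 'c', 'd', 'e', 'f']

# The index must have the same length as the data.
try:
    pd.Series([1, 2, 3], index=["a", "b"])
    length_error = None
except ValueError as exc:
    length_error = str(exc)
print(length_error)
```

Because the list contains np.nan, the integers are converted to float64, which matches the dtype in the output above.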

    2. pd.Series(np.arange())

    arr = np.arange(6)
    s = pd.Series(arr)
    s

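    Output (the integer dtype follows NumPy's default for the platform and version, so it may be int32 or int64):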

    0    0
    1    1
    2    2
    3    3
    4    4
    5    5
    dtype: int32
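
Since the dtype in this output is inherited from the NumPy array, it can differ between machines. A small sketch, assuming a fixed integer width is wanted regardless of platform, is to request the dtype explicitly:

```python
import numpy as np
import pandas as pd

arr = np.arange(6)

s_default = pd.Series(arr)                 # dtype follows the array
s_int64 = pd.Series(arr, dtype="int64")    # dtype requested explicitly

print(s_default.dtype == arr.dtype)        # True
print(s_int64.dtype)                       # int64
```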

    3. pd.Series({dict})

    The dict keys become the index labels and the dict values become the data.

    d = {'a':10,'b':20,'c':30,'d':40,'e':50}
    s = pd.Series(d)
    s

    Output:

    a    10
    b    20
    c    30
    d    40
    e    50
    dtype: int64
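
When a dict is combined with the index argument, the index selects and orders the keys. The sketch below uses a hypothetical label 'z' that is not in the dict to show that missing keys become NaN.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}

# Keep only 'c' and 'a' (in that order) and ask for a key that does not exist.
s_sub = pd.Series(d, index=['c', 'a', 'z'])
print(s_sub)

print(s_sub.dtype)   # float64, because NaN forces a floating-point dtype
```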

    A Series can also be obtained by selecting a single row or a single column of a DataFrame.
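
A minimal sketch of both cases, using a small made-up DataFrame (the column names, labels, and values are illustrative only):

```python
import pandas as pd

# Hypothetical table with two columns and three labeled rows.
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [10, 20, 30]},
    index=["x", "y", "z"],
)

col = df["price"]    # one column -> Series indexed by the row labels
row = df.loc["y"]    # one row    -> Series indexed by the column names

print(type(col).__name__, col.name)   # Series price
print(type(row).__name__, row.name)   # Series y
print(list(row.index))                # ['price', 'qty']
```

The column keeps the column name as its Series name, while the row takes the row label as its name.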

    3. Basic Series attributes

    • Series.values: Return Series as ndarray or ndarray-like depending on the dtype.

    obj.values
    # array([ 4.,  7., -5.,  3.,  7., nan])

    • Series.index: The index (axis labels) of the Series.

    obj.index
    # RangeIndex(start=0, stop=6, step=1)

    • Series.name: Return the name of the Series.
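
The name attribute has no example above, so here is a brief sketch. The names "score" and "points" are hypothetical.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3, 7, np.nan])
print(obj.name)                  # None: no name was given at creation

obj.name = "score"               # assign a name in place
frame = obj.to_frame()           # the name becomes the column label
print(list(frame.columns))       # ['score']

renamed = obj.rename("points")   # returns a new Series with another name
print(renamed.name, obj.name)    # points score
```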

    4. Indexing

    • Series.loc: Access a group of rows and columns by label(s) or a boolean array.

    • Series.iloc: Purely integer-location based indexing for selection by position.
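
The difference between the two accessors is easiest to see side by side. This sketch uses a small hypothetical Series with string labels:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

print(s.loc["b"])                 # 20: selected by label
print(s.iloc[1])                  # 20: selected by position

# Label slices include both endpoints; positional slices exclude the stop.
print(s.loc["a":"c"].tolist())    # [10, 20, 30]
print(s.iloc[0:2].tolist())       # [10, 20]

# .loc also accepts a boolean mask.
print(s.loc[s > 25].tolist())     # [30, 40]
```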

    5. Computation and descriptive statistics

    • Series.value_counts: Return a Series containing counts of unique values. NaN is excluded by default.

    index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan'] 
    obj = pd.Series([4, 7, -5, 3, 7, np.nan],index = index)
    obj.value_counts()

    Output:

     7.0    2
     3.0    1
    -5.0    1
     4.0    1
    dtype: int64
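
The count above omits the NaN entry. The following sketch, reusing the same data, shows two options of value_counts that change this behavior: dropna=False counts NaN as well, and normalize=True returns proportions instead of counts.

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

counts = obj.value_counts()                 # NaN excluded
with_nan = obj.value_counts(dropna=False)   # NaN counted as its own value
shares = obj.value_counts(normalize=True)   # fractions of the non-NaN values

print(counts.sum())      # 5
print(with_nan.sum())    # 6
print(shares.loc[7.0])   # 0.4
```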

    6. Sorting

    • Series.sort_values

    Series.sort_values(self, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

    Parameters:

    Parameter      Description
    axis           {0 or 'index'}, default 0. Axis to direct sorting. The value 'index' is accepted for compatibility with DataFrame.sort_values.
    ascending      bool, default True. If True, sort values in ascending order, otherwise descending.
    inplace        bool, default False. If True, perform operation in-place.
    kind           {'quicksort', 'mergesort' or 'heapsort'}, default 'quicksort'. Choice of sorting algorithm. See also numpy.sort() for more information. 'mergesort' is the only stable algorithm.
    na_position    {'first' or 'last'}, default 'last'. Argument 'first' puts NaNs at the beginning, 'last' puts NaNs at the end.

    Returns:

    Series: Series ordered by values.

    obj.sort_values()

    Output:

    Jeff    -5.0
    Ryan     3.0
    Bob      4.0
    Steve    7.0
    Jeff     7.0
    Ryan     NaN
    dtype: float64
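
The call above uses the defaults. As a sketch of the ascending and na_position parameters from the table, the same data can be sorted in descending order with NaN placed first:

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

# Largest values first, NaN moved to the top.
desc = obj.sort_values(ascending=False, na_position='first')
print(desc)

# inplace=False is the default, so the original Series keeps its order.
print(obj.iloc[0])   # 4.0
```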

    • Series.rank

    Series.rank(self, axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)

    Parameters:

    Parameter       Description
    axis            {0 or 'index', 1 or 'columns'}, default 0. Index to direct ranking.
    method          {'average', 'min', 'max', 'first', 'dense'}, default 'average'. How to rank the group of records that have the same value (i.e. ties): average: average rank of the group; min: lowest rank in the group; max: highest rank in the group; first: ranks assigned in the order they appear in the array; dense: like 'min', but rank always increases by 1 between groups.
    numeric_only    bool, optional. For DataFrame objects, rank only numeric columns if set to True.
    na_option       {'keep', 'top', 'bottom'}, default 'keep'. How to rank NaN values: keep: assign NaN rank to NaN values; top: assign smallest rank to NaN values if ascending; bottom: assign highest rank to NaN values if ascending.
    ascending       bool, default True. Whether or not the elements should be ranked in ascending order.
    pct             bool, default False. Whether or not to display the returned rankings in percentile form.

    Returns:

    Same type as caller: Return a Series or DataFrame with data ranks as values.

    # obj.rank()              # default method='average'; ascending, so the smallest value gets rank 1; NaN stays NaN
    obj.rank(method='dense')
    # obj.rank(method='min')
    # obj.rank(method='max')
    # obj.rank(method='first')

    Output:

    Bob      3.0
    Steve    4.0
    Jeff     1.0
    Ryan     2.0
    Jeff     4.0
    Ryan     NaN
    dtype: float64
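
The methods differ only in how they treat ties, here the two 7.0 entries. The sketch below runs every method on the same data and collects the results in a plain dict (the name `ranks` is just for illustration) so they can be compared directly.

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

# Rank the same values with each tie-breaking method.
ranks = {
    method: obj.rank(method=method).tolist()
    for method in ['average', 'min', 'max', 'first', 'dense']
}

for method, values in ranks.items():
    print(f"{method:>8}: {values}")
```

The tied values share rank 4.5 under 'average', 4 under 'min' and 'dense', and 5 under 'max'. Under 'first' they receive 4 and 5 in order of appearance. With the default na_option='keep', the NaN entry is left unranked in every case.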

    That concludes this introduction to using pd.Series() for data processing in Python. To really absorb these points, run the examples yourself and experiment with them.
