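This article explains how to use pd.Series() in pandas, covering creation, value access, common operations, addition, lookup, and assignment, with short examples for reference.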
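1. Introduction to Series
pandas has two main data structures: 1. Series; 2. DataFrame
A Series is a one-dimensional labeled array built on NumPy's ndarray. By default pandas uses 0 to n-1 as the index of a Series, but you can also specify the index yourself (think of the index as the keys of a dict).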
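To make the dict analogy concrete, here is a minimal sketch comparing the default index with a custom one. The variable names default_s and labeled_s are illustrative only.

```python
import pandas as pd

# Default index: pandas assigns the integers 0..n-1.
default_s = pd.Series([10, 20, 30])
print(list(default_s.index))   # [0, 1, 2]

# Custom index: labels behave much like dict keys.
labeled_s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(labeled_s["b"])          # 20
print("c" in labeled_s)        # True (membership is tested against the index)
```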
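2. Creating a Series
pd.Series([list],index=[list])
The first argument is a list. index is optional: if omitted, the index defaults to integers starting at 0; if given, it must be the same length as the values.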
import pandas as pd
s=pd.Series([1,2,3,4,5],index=['a','b','c','f','e'])
print(s)
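The length rule can be checked directly. The sketch below passes a list of three values with only two labels and catches the resulting error; the name mismatch_error is used here purely for illustration.

```python
import pandas as pd

values = [1, 2, 3]
mismatch_error = None
try:
    # Three values but only two labels: pandas rejects this.
    pd.Series(values, index=["a", "b"])
except ValueError as exc:
    mismatch_error = exc
    print("ValueError:", exc)
```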
pd.Series({dict})
Takes a dict as the argument; the keys become the index and the values become the data.
import pandas as pd
s=pd.Series({'a':1,'b':2,'c':3,'f':4,'e':5})
print(s)
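A dict can also be combined with an explicit index. In that case pandas looks each label up in the dict, and labels with no matching key get NaN. A small sketch with made-up data:

```python
import pandas as pd

data = {"a": 1, "b": 2, "c": 3}

# Only the requested labels are kept, in the requested order.
# "z" is not a key of the dict, so its value is NaN.
picked = pd.Series(data, index=["c", "a", "z"])
print(picked)
```

Because NaN is a float, the resulting Series has a floating-point dtype even though the dict held integers.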
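3. Getting values from a Series
s[index] or s[[list of index labels]]
Indexing works much like an array. To select several non-contiguous values, pass a list of index labels.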
import pandas as pd
import numpy as np
v = np.random.random_sample(50)
s = pd.Series(v)
s1 = s[[3, 13, 23, 33]]
s2 = s[3:13]
s3 = s[43]
print("s1", s1)
print("s2", s2)
print("s3", s3)
s1 3 0.064095
13 0.354023
23 0.225739
33 0.959288
dtype: float64
s2 3 0.064095
4 0.405651
5 0.024181
6 0.367606
7 0.844005
8 0.405313
9 0.102824
10 0.806400
11 0.950502
12 0.735310
dtype: float64
s3 0.42803253918
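With the default 0..n-1 index, labels and positions coincide, so s[3] is unambiguous. When the index holds other integers, the explicit accessors .loc (by label) and .iloc (by position) avoid confusion. A minimal sketch with made-up data:

```python
import pandas as pd

s = pd.Series([0.1, 0.2, 0.3, 0.4], index=[10, 20, 30, 40])

by_label = s.loc[20]        # value stored under the label 20
by_position = s.iloc[1]     # value at position 1 (the second element)

# Label slices include both endpoints; position slices exclude the end.
label_slice = s.loc[10:30]      # labels 10, 20, 30
position_slice = s.iloc[0:2]    # positions 0 and 1

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```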
4. Getting the first and last values of a Series
.head(n);.tail(n)
Return the first or last n rows. n is optional and defaults to 5.
import pandas as pd
import numpy as np
v = np.random.random_sample(50)
s = pd.Series(v)
print("s.head()", s.head())
print("s.head(3)", s.head(3))
print("s.tail()", s.tail())
print("s.tail(3)", s.tail(3))
s.head() 0 0.714136
1 0.333600
2 0.683784
3 0.044002
4 0.147745
dtype: float64
s.head(3) 0 0.714136
1 0.333600
2 0.683784
dtype: float64
s.tail() 45 0.779509
46 0.778341
47 0.331999
48 0.444811
49 0.028520
dtype: float64
s.tail(3) 47 0.331999
48 0.444811
49 0.028520
dtype: float64
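Two edge cases of head and tail are worth knowing: a negative n means "everything except |n| rows", and an n larger than the Series simply returns the whole Series. The sketch below uses a small integer Series so the results are easy to verify.

```python
import pandas as pd

s = pd.Series(range(10))

all_but_last_two = s.head(-2)    # drops the last 2 rows
all_but_first_two = s.tail(-2)   # drops the first 2 rows
too_many = s.head(100)           # n larger than len(s): returns everything

print(list(all_but_last_two))
print(list(all_but_first_two))
print(len(too_many))
```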
5. Common Series operations
import pandas as pd
import numpy as np
v = [10, 3, 2, 2, np.nan]
v = pd.Series(v)
print("len():", len(v)) # length of the Series, including NaN
print("shape():", np.shape(v)) # shape as a tuple; (5,) for this Series
print("count():", v.count()) # number of non-NaN values
print("unique():", v.unique()) # distinct values, NaN included
print("value_counts():\n", v.value_counts()) # occurrences of each value (NaN excluded by default)
len(): 5
shape(): (5,)
count(): 4
unique(): [ 10. 3. 2. nan]
value_counts():
2.0 2
3.0 1
10.0 1
dtype: int64
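The output above shows that count() and value_counts() skip NaN while len() and unique() do not. If the NaN entries should be counted too, the dropna parameter controls that. A short sketch on the same data:

```python
import numpy as np
import pandas as pd

v = pd.Series([10, 3, 2, 2, np.nan])

n_distinct = v.nunique()                      # distinct non-NaN values
n_distinct_with_nan = v.nunique(dropna=False) # NaN counted as one more value
counts = v.value_counts(dropna=False)         # includes a row for NaN
n_missing = v.isna().sum()                    # number of NaN entries

print(n_distinct, n_distinct_with_nan, n_missing)
print(counts)
```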
6. Adding Series
import pandas as pd
import numpy as np
v = [10, 3, 2, 2, np.nan]
v = pd.Series(v)
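# Addition aligns on index labels; a label missing from either side yields NaN.
total = v[1:3] + v[1:3]  # named total to avoid shadowing the built-in sum()
sum1 = v[1:4] + v[1:4]
sum2 = v[1:3] + v[1:4]
sum3 = v[:3] + v[1:]
print("sum", total)
print("sum1", sum1)
print("sum2", sum2)
print("sum3", sum3)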
sum 1 6.0
2 4.0
dtype: float64
sum1 1 6.0
2 4.0
3 4.0
dtype: float64
sum2 1 6.0
2 4.0
3 NaN
dtype: float64
sum3 0 NaN
1 6.0
2 4.0
3 NaN
4 NaN
dtype: float64
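The NaN rows in sum2 and sum3 come from index alignment: the + operator matches values by label, and a label present on only one side produces NaN. When a missing label should instead count as 0, Series.add with fill_value is one option. A minimal sketch with made-up data:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["y", "z"])

plain = a + b                    # "x" exists only in a, so the result is NaN
filled = a.add(b, fill_value=0)  # the missing side is treated as 0

print(plain)
print(filled)
```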
7. Searching a Series
Filtering by range
import pandas as pd
import numpy as np
s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(s, name="age")
print(sa[sa>19])
jim 22.0
lj 24.0
ton 20.0
Name: age, dtype: float64
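The expression sa>19 builds a boolean Series that is then used as a mask; NaN compares as False, which is why "car" is dropped. Several conditions can be combined with & and |, each wrapped in parentheses, or expressed with between. A sketch on the same data (the row order of the printed result can differ between pandas versions):

```python
import pandas as pd

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(s, name="age")

# Parentheses are required: & binds tighter than the comparisons.
in_range = sa[(sa >= 19) & (sa <= 22)]

# between() is inclusive on both ends by default.
same = sa[sa.between(19, 22)]

print(in_range)
```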
Median
import pandas as pd
import numpy as np
s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(s, name="age")
print("sa.median()", sa.median())
sa.median() 20.0
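The median above is computed over the five non-missing ages because pandas reductions skip NaN by default. Passing skipna=False makes a missing value propagate instead. A short sketch on the same data:

```python
import pandas as pd

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(s, name="age")

median_default = sa.median()             # NaN ignored
median_strict = sa.median(skipna=False)  # NaN propagates
mean_default = sa.mean()                 # also ignores NaN

print(median_default, median_strict, mean_default)
```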
8. Assigning values to a Series
import pandas as pd
import numpy as np
s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(s, name="age")
print(s)
print('----------------')
sa['ton'] = 99
print(sa)
{'ton': 20, 'mary': 18, 'jack': 19, 'jim': 22, 'lj': 24, 'car': None}
----------------
car NaN
jack 19.0
jim 22.0
lj 24.0
mary 18.0
ton 99.0
Name: age, dtype: float64
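Assignment is not limited to a single existing label. The sketch below, using a smaller made-up Series, shows three other common forms: assigning to a new label, assigning through a boolean mask, and updating several labels at once.

```python
import pandas as pd

ages = pd.Series({"ton": 20.0, "mary": 18.0, "car": None}, name="age")

ages.loc["amy"] = 30              # a label that does not exist yet is appended
ages[ages.isna()] = 0             # boolean mask: replace every NaN with 0
ages.loc[["ton", "mary"]] += 1    # update several labels in one step

print(ages)
```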
That concludes this walkthrough of how to use pd.Series(); hopefully the examples above are a useful reference.