pandas_series

Differences

This shows you the differences between two versions of the page.

pandas_series [2021/02/04 19:55] – created raju
pandas_series [2024/02/06 05:11] raju
Line 1: Line 1:
 +===== creating a series =====
 +==== create a series from a list ====
 +<code>
 +>>> import pandas as pd
 +>>> a = pd.Series(['sun', 'mon', 'tue'])
 +>>> a
 +0    sun
 +1    mon
 +2    tue
 +dtype: object
 +</code>
 +
 +To assign an index
 +<code>
 +>>> b = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'])
 +>>> b
 +s    sun
 +m    mon
 +t    tue
 +dtype: object
 +</code>
 +
 +To assign a name to the series (this becomes the column name in a dataframe)
 +<code>
 +>>> c = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'], name='day')
 +>>> c
 +s    sun
 +m    mon
 +t    tue
 +Name: day, dtype: object
 +</code>
 +
 +To assign a name to the index
 +<code>
 +>>> d = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'], name='day')
 +>>> d.index.name = 'letter'
 +>>> d
 +letter
 +s    sun
 +m    mon
 +t    tue
 +Name: day, dtype: object
 +</code>
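
As a possible alternative to setting index.name afterwards, the index can be named in the same expression with pandas.Series.rename_axis. This is only a minimal sketch that reuses the values from the examples above; rename_axis returns a new series rather than modifying the original.

```python
import pandas as pd

# Build the named series and label its index in a single expression.
# rename_axis returns a new Series; nothing is modified in place.
d = pd.Series(['sun', 'mon', 'tue'],
              index=['s', 'm', 't'],
              name='day').rename_axis('letter')

print(d.index.name)  # letter
print(d.name)        # day
```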
 +
 +The series name becomes the column name when converting the series to a dataframe. An unnamed series gets the column label 0.
 +<code>
 +>>> b.to_frame()
 +     0
 +s  sun
 +m  mon
 +t  tue
 +
 +>>> c.to_frame()
 +   day
 +s  sun
 +m  mon
 +t  tue
 +</code>
 +
 +If the series has no name, one can be supplied while converting it to a dataframe
 +<code>
 +>>> b.to_frame(name='days')
 +  days
 +s  sun
 +m  mon
 +t  tue
 +</code>
 +
 +The index name comes in handy when resetting the index: it becomes the name of the new column, which otherwise defaults to "index"
 +<code>
 +>>> c.reset_index()
 +  index  day
 +0     s  sun
 +1     m  mon
 +2     t  tue
 +>>> d.reset_index()
 +  letter  day
 +0      s  sun
 +1      m  mon
 +2      t  tue
 +</code>
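
Two further options of pandas.Series.reset_index may be useful here. The sketch below assumes the series c from above; the column label 'weekday' is made up for illustration. The name parameter labels the column that holds the values, and drop=True discards the old index instead of turning it into a column.

```python
import pandas as pd

c = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'], name='day')

# name= sets the label of the column that holds the series values.
# 'weekday' is an arbitrary label chosen for this example.
renamed = c.reset_index(name='weekday')
print(list(renamed.columns))  # ['index', 'weekday']

# drop=True throws the old index away, so the result is still a Series,
# now with the default 0..n-1 index.
dropped = c.reset_index(drop=True)
print(list(dropped.index))  # [0, 1, 2]
```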
 +
 +===== dummy =====
 ==== append element to series ====
 <code>
Line 19: Line 101:
 '1.2.1'
 </code>
 +
 +==== return a random element ====
 +Use pandas.Series.sample
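
A minimal sketch of how this might look, reusing the day names from earlier on this page. sample returns a one-element series, so .iloc[0] is used to pull out the bare value. The random_state argument is optional and is fixed here only so that the draw is repeatable.

```python
import pandas as pd

days = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'])

# sample() draws n=1 element by default and returns a one-element Series.
# random_state is fixed only to make this example repeatable.
picked = days.sample(n=1, random_state=0)

# .iloc[0] extracts the bare value from the one-element Series.
value = picked.iloc[0]
print(value in list(days))  # True
```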
 +
 +Ref:-
 +  * https://pandas.pydata.org/docs/reference/api/pandas.Series.sample.html
 +===== check if =====
 +==== check if a series is empty ====
 +Use pandas.Series.empty
 +
 +<code>
 +$ ipython
 +
 +In [1]:
 +import pandas as pd
 +import numpy as np
 +df1 = pd.DataFrame({'A':  []})
 +df1
 +Out[1]:
 +Empty DataFrame
 +Columns: [A]
 +Index: []
 +
 +In [2]:
 +df1['A'].empty
 +Out[2]:
 +True
 +</code>
 +
 +A series containing only NaNs is considered "non-empty", because empty only checks whether the series has zero elements. Drop the NaNs first if they should not count.
 +<code>
 +In [3]:
 +df2 = pd.DataFrame({'A':  [np.nan]})
 +df2
 +Out[3]:
 +    A
 +0 NaN
 +
 +In [4]:
 +df2['A'].empty
 +Out[4]:
 +False
 +
 +In [5]:
 +df2['A'].dropna().empty
 +Out[5]:
 +True
 +</code>
 +
 +Used Python 3.9.4 and IPython 7.22.0
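
If the real question is whether the series holds at least one non-missing value, dropna().empty is one way. Two other possibilities are sketched below, as alternatives rather than as the recommended method: notna().any() and count(), which returns the number of non-missing entries.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, np.nan])

# notna() marks the non-missing entries; any() is True if at least one exists.
has_value = s.notna().any()
print(has_value)  # False

# count() returns the number of non-missing entries.
n_values = s.count()
print(n_values)  # 0
```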
 +
 +tags | check if a series has at least one element
 +
 +==== check if all elements in a series are unique ====
 +Use pandas.Series.is_unique
 +
 +<code>
 +In [1]: 
 +import pandas as pd
 +
 +In [2]: 
 +pd.Series([1, 2, 3]).is_unique
 +Out[2]: 
 +True
 +
 +In [3]: 
 +pd.Series([1, 2, 2]).is_unique
 +Out[3]: 
 +False
 +</code>
 +
 +Missing values are treated like any other value, so if there are multiple NaNs, it will return False. If this is not desired, drop the NaNs first.
 +<code>
 +In [4]: 
 +import numpy as np
 +pd.Series([1, 2, 3, np.nan, np.nan]).is_unique
 +Out[4]: 
 +False
 +
 +In [5]: 
 +pd.Series([1, 2, 3, np.nan, np.nan]).dropna().is_unique
 +Out[5]: 
 +True
 +</code>
 +
 +For completeness, a series with repeated non-missing values is not unique with or without the NaNs
 +<code>
 +In [6]: 
 +pd.Series([1, 2, 2, np.nan, np.nan]).is_unique
 +Out[6]: 
 +False
 +
 +In [7]: 
 +pd.Series([1, 2, 2, np.nan, np.nan]).dropna().is_unique
 +Out[7]: 
 +False
 +</code>
 +
 +Using | pandas 1.5.3, python 3.11.4, ipython 8.12.0
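
is_unique only answers yes or no. To see which elements are repeated, pandas.Series.duplicated can be used. The following is a sketch on the same data as above, not something taken from the references below. Like is_unique, duplicated treats NaNs as equal to each other.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 2, np.nan, np.nan])

# duplicated(keep=False) flags every member of a repeated group.
# NaNs count as equal to each other here, just as with is_unique.
repeats = s[s.duplicated(keep=False)]
print(repeats.index.tolist())  # [1, 2, 3, 4]

# Drop the NaNs first to ignore missing values in the check.
has_repeats = s.dropna().duplicated().any()
print(has_repeats)  # True, because 2 appears twice
```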
 +
 +Ref:-
 +  * https://pandas.pydata.org/docs/reference/api/pandas.Series.is_unique.html
 +  * https://stackoverflow.com/questions/48838247/how-to-check-every-pandas-series-value-is-unique
 +
pandas_series.txt · Last modified: 2024/02/06 05:18 by raju