===== creating a series ===== ==== create a series from a list ==== >>> a = pd.Series(['sun', 'mon', 'tue']) >>> a 0 sun 1 mon 2 tue dtype: object To assign an index >>> b = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't']) >>> b s sun m mon t tue dtype: object To assign a name to the column >>> c = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'], name='day') >>> c s sun m mon t tue Name: day, dtype: object To assign a name to the index >>> d = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'], name='day') >>> d.index.name = 'letter' >>> d letter s sun m mon t tue Name: day, dtype: object Column name is useful when converting the series to dataframe. >>> b.to_frame() 0 s sun m mon t tue >>> c.to_frame() day s sun m mon t tue If the series did not have a name to begin with but we desire to have one while converting to the dataframe >>> b.to_frame(name='days') days s sun m mon t tue The index name comes in handy while resetting the index >>> c.reset_index() index day 0 s sun 1 m mon 2 t tue >>> d.reset_index() letter day 0 s sun 1 m mon 2 t tue ===== dummy ===== ==== append element to series ==== In [1]: import pandas as pd s = pd.Series(dtype='int') N = 4 for i in range(N): s.at[i**2] = i print(s) 0 0 1 1 4 2 9 3 dtype: int64 In [2]: pd.__version__ Out[2]: '1.2.1' ==== return a random element ==== Use pandas.Series.sample Ref:- * https://pandas.pydata.org/docs/reference/api/pandas.Series.sample.html ==== expand a series ==== tags | using reindex, change index Given two series S, I of length n, and an integer N which is >= n, the idea here is to expand S into an N-element vector, E so that E[I[:]] = S[:]. For example if S is [3.4, 1.8], I is [3, 5] and N is 10, we want E to be [0, 0, 0, 3.4, 0, 1.8, 0, 0, 0, 0] import pandas as pd import numpy as np def expand_series(S, I, N, id='val'): E = pd.Series(S.values, index=I, name=id).reindex(np.arange(0, N)).fillna(0) return E df = pd.DataFrame({'id': [3,5], 'val': [3.4, 1.8]}) print(df) id val 0 3 3.4 1 5 1.8 unravelled_series = expand_series(df['val'], df['id'], 10) print(unravelled_series) id 0 0.0 1 0.0 2 0.0 3 3.4 4 0.0 5 1.8 6 0.0 7 0.0 8 0.0 9 0.0 Name: val, dtype: float64 Sample code: https://github.com/KamarajuKusumanchi/notebooks/blob/master/pandas/expand%20a%20series.ipynb Ref: * https://stackoverflow.com/questions/40029071/setting-series-as-index * https://chrisalbon.com/python/data_wrangling/pandas_dataframe_reindexing/ - contains some examples on using pandas.Series.reindex * https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reindex.html - API ==== Convert series to a dataframe ==== Use to_frame(). By default, it will use the series name to set the column name in the dataframe. But you can also assign one while calling the to_frame function. >>> import pandas as pd >>> b = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't']) >>> b s sun m mon t tue dtype: object >>> b.to_frame() 0 s sun m mon t tue >>> c = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'], name='day') >>> c s sun m mon t tue Name: day, dtype: object >>> c.to_frame() day s sun m mon t tue >>> b.to_frame(name='days') days s sun m mon t tue ===== check if ===== ==== check if a series is empty ==== Use pandas.Series.empty . $ ipython In [1]: import pandas as pd import numpy as np df1 = pd.DataFrame({'A': []}) df1 Out[1]: Empty DataFrame Columns: [A] Index: [] In [2]: df1['A'].empty Out[2]: True A series with just NaNs is considered "non-empty". Drop the NaNs to make it "empty". In [3]: df2 = pd.DataFrame({'A': [np.nan]}) df2 Out[3]: A 0 NaN In [4]: df2['A'].empty Out[4]: False In [5]: df2['A'].dropna().empty Out[5]: True Used Python 3.9.4 and IPython 7.22.0 tags | check if a series has at least one element ==== check if all elements in a series are unique ==== Use pandas.Series.is_unique In [1]: import pandas as pd In [2]: pd.Series([1, 2, 3]).is_unique Out[2]: True In [3]: pd.Series([1, 2, 2]).is_unique Out[3]: False Missing values are treated as any other value. So if there are multiple NaNs, it will return True. If this is not desired, drop the NaNs first. In [4]: import numpy as np pd.Series([1, 2, 3, np.nan, np.nan]).is_unique Out[4]: False In [5]: pd.Series([1, 2, 3, np.nan, np.nan]).dropna().is_unique Out[5]: True For completeness In [6]: pd.Series([1, 2, 2, np.nan, np.nan]).is_unique Out[6]: False In [7]: pd.Series([1, 2, 2, np.nan, np.nan]).dropna().is_unique Out[7]: False Using | pandas 1.5.3, python 3.11.4, ipython 8.12.0 Ref:- * https://pandas.pydata.org/docs/reference/api/pandas.Series.is_unique.html * https://stackoverflow.com/questions/48838247/how-to-check-every-pandas-series-value-is-unique