pandas_series

Differences

This shows you the differences between two versions of the page.

pandas_series [2024/02/06 05:11] raju
pandas_series [2024/02/06 05:18] (current) – [return a random element] raju
Line 107:
 Ref:-
   * https://pandas.pydata.org/docs/reference/api/pandas.Series.sample.html
 +
 +==== expand a series ====
 +tags | using reindex, change index
 +
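 +Given two series S and I, each of length n, and an integer N that is greater than every value in I, the idea is to expand S into an N-element series E such that E[I[:]] = S[:]. The values in I must be distinct, non-negative integers (so N >= n), and every element of E whose position is not in I is set to 0.
 +
 +For example, if S is [3.4, 1.8], I is [3, 5] and N is 10, we want E to be [0, 0, 0, 3.4, 0, 1.8, 0, 0, 0, 0].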
 +
 +<code>
 +import pandas as pd
 +import numpy as np
 +
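 +def expand_series(S, I, N, id='val'):
 +    # Place the values of S at the positions given by I, then reindex to
 +    # 0..N-1; positions absent from I become NaN and are filled with 0.
 +    E = pd.Series(S.values, index=I, name=id).reindex(np.arange(0, N)).fillna(0)
 +    return E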
 +</code>
 +
 +<code>
 +df = pd.DataFrame({'id': [3,5], 'val': [3.4, 1.8]})
 +print(df)
 +</code>
 +
 +<code>
 +   id  val
 +0   3  3.4
 +1   5  1.8
 +</code>
 +
 +<code>
 +unravelled_series = expand_series(df['val'], df['id'], 10)
 +print(unravelled_series)
 +</code>
 +
 +<code>
 +id
 +0    0.0
 +1    0.0
 +2    0.0
 +3    3.4
 +4    0.0
 +5    1.8
 +6    0.0
 +7    0.0
 +8    0.0
 +9    0.0
 +Name: val, dtype: float64
 +</code>
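
The function above assumes that the positions in I are distinct and all fall inside 0..N-1. The following is only a sketch of how those assumptions could be checked; expand_series_checked is a hypothetical name that does not appear in the original notebook. It also passes fill_value=0 to reindex, which fills the new slots directly instead of going through NaN and fillna.

```python
import numpy as np
import pandas as pd


def expand_series_checked(S, I, N, name="val"):
    """Hypothetical variant of expand_series that validates its inputs."""
    positions = np.asarray(I)
    if len(positions) != len(S):
        raise ValueError("S and I must have the same length")
    if len(np.unique(positions)) != len(positions):
        raise ValueError("I must not contain duplicate positions")
    if len(positions) and (positions.min() < 0 or positions.max() >= N):
        raise ValueError("every position in I must lie in [0, N)")
    # fill_value=0 fills the slots that are absent from I without creating NaN.
    return pd.Series(np.asarray(S), index=positions, name=name).reindex(
        np.arange(N), fill_value=0
    )


S = pd.Series([3.4, 1.8])
I = pd.Series([3, 5])
E = expand_series_checked(S, I, 10)

# Check the defining property E[I[:]] = S[:].
assert np.allclose(E.loc[I.to_numpy()].to_numpy(), S.to_numpy())
print(E.tolist())
```

With these checks, a position outside 0..N-1 raises a ValueError instead of being silently dropped by reindex.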
 +
 +Sample code: https://github.com/KamarajuKusumanchi/notebooks/blob/master/pandas/expand%20a%20series.ipynb
 +
 +Ref:
 +
 +  * https://stackoverflow.com/questions/40029071/setting-series-as-index
 +  * https://chrisalbon.com/python/data_wrangling/pandas_dataframe_reindexing/ - contains some examples on using pandas.Series.reindex
 +  * https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reindex.html - API
 +
 +==== Convert series to a dataframe ====
 +
 +Use to_frame(). By default, the column takes the name of the series; if the series has no name, the column is labelled 0. To choose a different column name, pass name= when calling to_frame().
 +
 +<code>
 +>>> import pandas as pd
 +>>> b = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'])
 +>>> b
 +s    sun
 +m    mon
 +t    tue
 +dtype: object
 +>>> b.to_frame()
 +     0
 +s  sun
 +m  mon
 +t  tue
 +>>> c = pd.Series(['sun', 'mon', 'tue'], index=['s', 'm', 't'], name='day')
 +>>> c
 +s    sun
 +m    mon
 +t    tue
 +Name: day, dtype: object
 +>>> c.to_frame()
 +   day
 +s  sun
 +m  mon
 +t  tue
 +>>> b.to_frame(name='days')
 +  days
 +s  sun
 +m  mon
 +t  tue
 +</code>
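
As a small script-form check of the naming rules shown in the session above, the sketch below also illustrates that passing name= affects only the new dataframe and leaves the series name unchanged. The variable names (unnamed, named, frame) and the column name 'weekday' are illustrative only.

```python
import pandas as pd

unnamed = pd.Series(["sun", "mon", "tue"], index=["s", "m", "t"])
named = unnamed.rename("day")  # returns a new series whose name is 'day'

# An unnamed series yields a single column labelled with the integer 0.
print(unnamed.to_frame().columns.tolist())

# An explicit name overrides the series name in the new frame only.
frame = named.to_frame(name="weekday")
print(frame.columns.tolist())
print(named.name)
```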
 ===== check if =====
 ==== check if a series is empty ====
pandas_series.txt · Last modified: 2024/02/06 05:18 by raju