reshape_fns module¶
Functions for reshaping arrays.
Reshape functions transform a pandas object/NumPy array in some way, such as tiling, broadcasting, and unstacking.
IndexFromLike _UnionGenericAlias¶
Any object that can be coerced into an index_from argument.
broadcast function¶
broadcast(
*args,
to_shape=None,
to_pd=None,
to_frame=None,
align_index=None,
align_columns=None,
index_from=None,
columns_from=None,
require_kwargs=None,
keep_raw=False,
return_meta=False,
**kwargs
)
Bring all array-like objects in args to the same shape using NumPy broadcasting.
See Broadcasting.
Can broadcast pandas objects by broadcasting their index/columns with broadcast_index().
Args
*args
:array_like
- Array-like objects.
to_shape
:tuple of int
- Target shape. If set, will broadcast every element in args to to_shape.
to_pd
:bool or list of bool
-
Whether to convert all output arrays to pandas, otherwise returns raw NumPy arrays. If None, converts only if there is at least one pandas object among them.
If sequence, applies to each argument.
to_frame
:bool
- Whether to convert all Series to DataFrames.
align_index
:bool
-
Whether to align index of pandas objects using multi-index.
Pass None to use the default.
align_columns
:bool
-
Whether to align columns of pandas objects using multi-index.
Pass None to use the default.
index_from
:any
-
Broadcasting rule for index.
Pass None to use the default.
columns_from
:any
-
Broadcasting rule for columns.
Pass None to use the default.
require_kwargs
:dict or list of dict
-
Keyword arguments passed to np.require.
If sequence, applies to each argument.
keep_raw
:bool or list of bool
-
Whether to keep the unbroadcasted version of the array.
Only makes sure that the array can be broadcast to the target shape.
If sequence, applies to each argument.
return_meta
:bool
- Whether to also return new shape, index and columns.
**kwargs
- Keyword arguments passed to broadcast_index().
For defaults, see broadcasting in settings.
Usage
- Without broadcasting index and columns:
>>> import numpy as np
>>> import pandas as pd
>>> from vectorbt.base.reshape_fns import broadcast
>>> v = 0
>>> a = np.array([1, 2, 3])
>>> sr = pd.Series([1, 2, 3], index=pd.Index(['x', 'y', 'z']), name='a')
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
...     index=pd.Index(['x2', 'y2', 'z2']),
...     columns=pd.Index(['a2', 'b2', 'c2']))
>>> for i in broadcast(
...     v, a, sr, df,
...     index_from='keep',
...     columns_from='keep',
... ): print(i)
   0  1  2
0  0  0  0
1  0  0  0
2  0  0  0
   0  1  2
0  1  2  3
1  1  2  3
2  1  2  3
   a  a  a
x  1  1  1
y  2  2  2
z  3  3  3
    a2  b2  c2
x2   1   2   3
y2   4   5   6
z2   7   8   9
- Taking new index and columns from position:
>>> for i in broadcast(
...     v, a, sr, df,
...     index_from=2,
...     columns_from=3
... ): print(i)
   a2  b2  c2
x   0   0   0
y   0   0   0
z   0   0   0
   a2  b2  c2
x   1   2   3
y   1   2   3
z   1   2   3
   a2  b2  c2
x   1   1   1
y   2   2   2
z   3   3   3
   a2  b2  c2
x   1   2   3
y   4   5   6
z   7   8   9
- Broadcasting index and columns through stacking:
>>> for i in broadcast(
...     v, a, sr, df,
...     index_from='stack',
...     columns_from='stack'
... ): print(i)
      a2  b2  c2
x x2   0   0   0
y y2   0   0   0
z z2   0   0   0
      a2  b2  c2
x x2   1   2   3
y y2   1   2   3
z z2   1   2   3
      a2  b2  c2
x x2   1   1   1
y y2   2   2   2
z z2   3   3   3
      a2  b2  c2
x x2   1   2   3
y y2   4   5   6
z z2   7   8   9
- Setting index and columns manually:
>>> for i in broadcast(
...     v, a, sr, df,
...     index_from=['a', 'b', 'c'],
...     columns_from=['d', 'e', 'f']
... ): print(i)
   d  e  f
a  0  0  0
b  0  0  0
c  0  0  0
   d  e  f
a  1  2  3
b  1  2  3
c  1  2  3
   d  e  f
a  1  1  1
b  2  2  2
c  3  3  3
   d  e  f
a  1  2  3
b  4  5  6
c  7  8  9
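The shape logic behind the examples above can be reproduced with plain NumPy. The following is only a sketch of the raw-array part, not vectorbt's implementation: it assumes a Series behaves like a column vector and ignores all index/columns handling. The names sr_values and df_values are illustrative.

```python
import numpy as np

v = 0
a = np.array([1, 2, 3])
# Assumption: a Series is treated as a column, so its values get a trailing axis
sr_values = np.array([1, 2, 3])[:, None]
df_values = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# NumPy brings all four inputs to the common shape (3, 3)
v_b, a_b, sr_b, df_b = np.broadcast_arrays(v, a, sr_values, df_values)

print(a_b)   # the 1-dim array is repeated along rows
print(sr_b)  # the column vector is repeated along columns
```

The results match the values printed in the usage examples; what broadcast() adds on top is the construction of the new index and columns.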
broadcast_index function¶
broadcast_index(
args,
to_shape,
index_from=None,
axis=0,
ignore_sr_names=None,
**kwargs
)
Produce a broadcast index/columns.
Args
args
:list of array_like
- Array-like objects.
to_shape
:tuple of int
- Target shape.
index_from
:any
-
Broadcasting rule for this index/these columns.
Accepts the following values:
- 'keep' or None - keep the original index/columns of the objects in args
- 'stack' - stack different indexes/columns using stack_indexes()
- 'strict' - ensure that all pandas objects have the same index/columns
- 'reset' - reset any index/columns (they become a simple range)
- integer - use the index/columns of the i-th object in args
- everything else will be converted to pd.Index
axis
:int
- Set to 0 for index and 1 for columns.
ignore_sr_names
:bool
-
Whether to ignore Series names if they are in conflict.
Conflicting Series names are those that are different but not None.
**kwargs
- Keyword arguments passed to stack_indexes().
For defaults, see broadcasting in settings.
Note
Series names are treated as columns with a single element but without a name. If a column level without a name loses its meaning, it is better to convert Series to DataFrames with one column prior to broadcasting. If the name of a Series is not important, it is better to drop it altogether by setting it to None.
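To make the index_from rules more concrete, here is a rough pandas-only approximation of what each rule produces for two indexes. It is a sketch under assumptions: stack_indexes() may do more than a plain level-by-level combination, and the variable names are illustrative.

```python
import pandas as pd

idx1 = pd.Index(['x', 'y', 'z'])
idx2 = pd.Index(['x2', 'y2', 'z2'])

# 'stack': combine both indexes level by level into a MultiIndex
stacked = pd.MultiIndex.from_arrays([idx1, idx2])

# 'reset': replace the labels with a simple range
reset = pd.RangeIndex(len(idx1))

# integer: take the index of the i-th object (here i = 1)
picked = [idx1, idx2][1]

# anything else: convert the value to pd.Index
manual = pd.Index(['a', 'b', 'c'])

print(stacked.tolist())
```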
broadcast_to function¶
broadcast_to(
arg1,
arg2,
to_pd=None,
index_from=None,
columns_from=None,
**kwargs
)
Broadcast arg1 to arg2.
Pass None to index_from/columns_from to use the index/columns of the second argument.
Keyword arguments **kwargs are passed to broadcast().
Usage
>>> import numpy as np
>>> import pandas as pd
>>> from vectorbt.base.reshape_fns import broadcast_to
>>> a = np.array([1, 2, 3])
>>> sr = pd.Series([4, 5, 6], index=pd.Index(['x', 'y', 'z']), name='a')
>>> broadcast_to(a, sr)
x    1
y    2
z    3
Name: a, dtype: int64
>>> broadcast_to(sr, a)
array([4, 5, 6])
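The same two results can be approximated with np.broadcast_to plus manual wrapping. This is a sketch of the idea rather than the library's code: it assumes the target's index and name are simply reused when the target is a Series.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 3])
sr = pd.Series([4, 5, 6], index=pd.Index(['x', 'y', 'z']), name='a')

# Broadcast the raw values to the target's shape, then reuse the target's metadata
values = np.broadcast_to(a, sr.shape)
as_series = pd.Series(values, index=sr.index, name=sr.name)

# In the opposite direction the target is a NumPy array, so no wrapping is needed
as_array = np.broadcast_to(sr.to_numpy(), a.shape)

print(as_series)
print(as_array)
```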
broadcast_to_array_of function¶
broadcast_to_array_of(
arg1,
arg2
)
Broadcast arg1 to the shape (n, *arg2.shape), where n is the length of arg1 along its first axis (1 for a scalar).
arg1 must be either a scalar, a 1-dim array, or have 1 dimension more than arg2.
Usage
>>> import numpy as np
>>> from vectorbt.base.reshape_fns import broadcast_to_array_of
>>> broadcast_to_array_of([0.1, 0.2], np.empty((2, 2)))
[[[0.1 0.1]
  [0.1 0.1]]

 [[0.2 0.2]
  [0.2 0.2]]]
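A NumPy-only sketch of this behavior is shown below. The function name broadcast_to_array_of_sketch is hypothetical, and the logic is inferred from the description and the usage example (a 2-element arg1 and a (2, 2) arg2 give a (2, 2, 2) result), so treat it as an approximation.

```python
import numpy as np

def broadcast_to_array_of_sketch(arg1, arg2):
    """Hypothetical helper: give each element of arg1 its own copy of arg2's shape."""
    arg1 = np.asarray(arg1)
    arg2 = np.asarray(arg2)
    # Already one dimension more than arg2 and compatible: nothing to do
    if arg1.ndim == arg2.ndim + 1 and arg1.shape[1:] == arg2.shape:
        return arg1
    arg1 = np.atleast_1d(arg1)  # a scalar becomes a 1-element array
    if arg1.ndim != 1:
        raise ValueError("arg1 must be a scalar, 1-dim, or have one more dimension than arg2")
    # Append one singleton axis per dimension of arg2, then broadcast
    expanded = arg1.reshape((-1,) + (1,) * arg2.ndim)
    return np.broadcast_to(expanded, (arg1.shape[0],) + arg2.shape)

out = broadcast_to_array_of_sketch([0.1, 0.2], np.empty((2, 2)))
print(out.shape)  # (2, 2, 2)
```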
broadcast_to_axis_of function¶
broadcast_to_axis_of(
arg1,
arg2,
axis,
require_kwargs=None
)
Broadcast arg1 to an axis of arg2.
If arg2 has fewer dimensions than requested, will broadcast arg1 to a single number.
For other keyword arguments, see broadcast().
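As an illustration, the described behavior can be sketched with np.broadcast_to. The helper name is hypothetical and require_kwargs is left out; this is an assumption-based approximation, not the library's code.

```python
import numpy as np

def broadcast_to_axis_of_sketch(arg1, arg2, axis):
    """Hypothetical helper: match arg1 to the length of one axis of arg2."""
    arg2 = np.asarray(arg2)
    if arg2.ndim < axis + 1:
        # The requested axis does not exist: reduce to a single number
        return np.broadcast_to(arg1, (1,))[0]
    return np.broadcast_to(arg1, (arg2.shape[axis],))

per_column = broadcast_to_axis_of_sketch(5, np.empty((2, 3)), axis=1)
single = broadcast_to_axis_of_sketch(5, np.empty(2), axis=1)
print(per_column, single)
```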
flex_choose_i_and_col_nb function¶
flex_choose_i_and_col_nb(
a,
flex_2d=True
)
Choose selection index and column based on the array's shape.
Instead of expensive broadcasting, keep the original shape and do indexing in a smart way. A nice feature of this is that it has almost no memory footprint and can broadcast in any direction infinitely.
Call it once before using flex_select_nb().
If flex_2d is True, a 1-dim array will correspond to columns, otherwise to rows.
flex_select_auto_nb function¶
flex_select_auto_nb(
a,
i,
col,
flex_2d=True
)
Combines flex_choose_i_and_col_nb() and flex_select_nb().
flex_select_nb function¶
flex_select_nb(
a,
i,
col,
flex_i,
flex_col,
flex_2d=True
)
Select an element of a as if it had been broadcast.
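The idea behind flexible selection can be sketched in plain Python and NumPy, without Numba. The function names and the use of -1 as a "use the requested position" marker are assumptions made for this illustration; the sketch only aims to show how indexing can replace materialized broadcasting.

```python
import numpy as np

def flex_choose_sketch(a, flex_2d=True):
    """Return (flex_i, flex_col); -1 means 'use the requested position'."""
    if a.ndim == 0:
        return 0, 0
    if a.ndim == 1:
        single = 0 if a.shape[0] == 1 else -1
        return (0, single) if flex_2d else (single, 0)
    return (0 if a.shape[0] == 1 else -1, 0 if a.shape[1] == 1 else -1)

def flex_select_sketch(a, i, col, flex_i, flex_col, flex_2d=True):
    """Pick a[i, col] as if a had been broadcast to a 2-dim target shape."""
    row = i if flex_i == -1 else flex_i
    column = col if flex_col == -1 else flex_col
    if a.ndim == 0:
        return a.item()
    if a.ndim == 1:
        return a[column] if flex_2d else a[row]
    return a[row, column]

per_column = np.array([10, 20, 30])                  # one value per column
flex_i, flex_col = flex_choose_sketch(per_column)    # computed once
value = flex_select_sketch(per_column, 1, 2, flex_i, flex_col)
print(value)
```

The array is never expanded, so the memory cost stays that of the original input regardless of the target shape.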
get_multiindex_series function¶
get_multiindex_series(
arg
)
Get Series with a multi-index.
If a DataFrame is passed, it must have at most one row or one column.
make_symmetric function¶
make_symmetric(
arg,
sort=True
)
Make arg symmetric.
The index and columns of the resulting DataFrame will be identical.
Requires the index and columns to have the same number of levels.
Pass sort=False if the index and columns should not be sorted but concatenated, with duplicates removed.
Usage
>>> import pandas as pd
>>> from vectorbt.base.reshape_fns import make_symmetric
>>> df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['c', 'd'])
>>> make_symmetric(df)
     a    b    c    d
a  NaN  NaN  1.0  2.0
b  NaN  NaN  3.0  4.0
c  1.0  3.0  NaN  NaN
d  2.0  4.0  NaN  NaN
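The result above can be approximated with plain pandas: reindex both axes to the union of the labels, then fill the gaps from the transpose. This is a sketch for single-level indexes with sorted labels, not the library's implementation.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['c', 'd'])

# Use the (sorted) union of row and column labels on both axes
labels = df.index.union(df.columns)
full = df.reindex(index=labels, columns=labels)

# Where a cell is missing, take the mirrored cell from the transpose
sym = full.where(full.notna(), full.T)
print(sym)
```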
repeat function¶
repeat(
arg,
n,
axis=1,
raw=False
)
Repeat each element in arg n times along the specified axis.
soft_to_ndim function¶
soft_to_ndim(
arg,
ndim,
raw=False
)
Try to softly bring arg to the specified number of dimensions ndim (max 2).
tile function¶
tile(
arg,
n,
axis=1,
raw=False
)
Repeat the whole arg n times along the specified axis.
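The difference between repeating each element and repeating the whole array is easiest to see on raw NumPy arrays. The snippet below uses np.repeat and np.tile directly as a stand-in for repeat() and tile() with axis=1; any index or column handling of the library functions is not shown.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

# Each column is duplicated in place: columns become 1, 1, 2, 2
repeated = np.repeat(a, 2, axis=1)

# The whole block is laid out twice side by side: columns become 1, 2, 1, 2
tiled = np.tile(a, (1, 2))

print(repeated)
print(tiled)
```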
to_1d function¶
to_1d(
arg,
raw=False
)
Reshape argument to one dimension.
If raw is True, returns a NumPy array. If 2-dim, will collapse along axis 1 (i.e., a DataFrame with one column becomes a Series).
to_2d function¶
to_2d(
arg,
raw=False,
expand_axis=1
)
Reshape argument to two dimensions.
If raw is True, returns a NumPy array. If 1-dim, will expand along the axis given by expand_axis, which defaults to 1 (i.e., a Series becomes a DataFrame with one column).
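For raw arrays, the two reshaping directions can be sketched as follows. The helper names are hypothetical and the pandas branch is omitted; the error raised for inputs that cannot be collapsed is an assumption.

```python
import numpy as np

def to_2d_sketch(arr, expand_axis=1):
    """Hypothetical helper: add an axis to 0-dim and 1-dim inputs."""
    arr = np.asarray(arr)
    if arr.ndim == 0:
        return arr.reshape(1, 1)
    if arr.ndim == 1:
        return np.expand_dims(arr, expand_axis)
    return arr

def to_1d_sketch(arr):
    """Hypothetical helper: collapse a single-column 2-dim input."""
    arr = np.asarray(arr)
    if arr.ndim == 0:
        return arr.reshape(1)
    if arr.ndim == 1:
        return arr
    if arr.ndim == 2 and arr.shape[1] == 1:
        return arr[:, 0]
    raise ValueError("cannot reshape to one dimension")

print(to_2d_sketch([1, 2, 3]).shape)                 # one column
print(to_2d_sketch([1, 2, 3], expand_axis=0).shape)  # one row
print(to_1d_sketch([[1], [2], [3]]))
```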
to_any_array function¶
to_any_array(
arg,
raw=False
)
Convert any array-like object to an array.
Pandas objects are kept as-is.
to_dict function¶
to_dict(
arg,
orient='dict'
)
Convert object to dict.
to_pd_array function¶
to_pd_array(
arg
)
Convert any array-like object to a pandas object.
unstack_to_array function¶
unstack_to_array(
arg,
levels=None
)
Reshape arg based on its multi-index into a multi-dimensional array.
Use levels to specify what index levels to unstack and in which order.
Usage
>>> import pandas as pd
>>> from vectorbt.base.reshape_fns import unstack_to_array
>>> index = pd.MultiIndex.from_arrays(
...     [[1, 1, 2, 2], [3, 4, 3, 4], ['a', 'b', 'c', 'd']])
>>> sr = pd.Series([1, 2, 3, 4], index=index)
>>> unstack_to_array(sr).shape
(2, 2, 4)
>>> unstack_to_array(sr)
[[[ 1. nan nan nan]
  [nan  2. nan nan]]

 [[nan nan  3. nan]
  [nan nan nan  4.]]]
>>> unstack_to_array(sr, levels=(2, 0))
[[ 1. nan]
 [ 2. nan]
 [nan  3.]
 [nan  4.]]
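One way to picture the reshaping is to turn each index level into integer positions and scatter the values into a NaN-filled array. The snippet below is a sketch of that idea using pd.factorize; it is not claimed to be how the library does it.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [[1, 1, 2, 2], [3, 4, 3, 4], ['a', 'b', 'c', 'd']])
sr = pd.Series([1, 2, 3, 4], index=index)

# One integer position per row and per level (sorted unique labels)
codes = [pd.factorize(index.get_level_values(i), sort=True)[0]
         for i in range(index.nlevels)]

# One axis per level, sized by the number of unique labels in that level
shape = tuple(int(c.max()) + 1 for c in codes)
out = np.full(shape, np.nan)
out[tuple(codes)] = sr.to_numpy()

print(out.shape)
```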
unstack_to_df function¶
unstack_to_df(
arg,
index_levels=None,
column_levels=None,
symmetric=False,
sort=True
)
Reshape arg based on its multi-index into a DataFrame.
Use index_levels to specify what index levels will form the new index, and column_levels for the new columns. Set symmetric to True to make the DataFrame symmetric.
Usage
>>> import pandas as pd
>>> from vectorbt.base.reshape_fns import unstack_to_df
>>> index = pd.MultiIndex.from_arrays(
...     [[1, 1, 2, 2], [3, 4, 3, 4], ['a', 'b', 'c', 'd']],
...     names=['x', 'y', 'z'])
>>> sr = pd.Series([1, 2, 3, 4], index=index)
>>> unstack_to_df(sr, index_levels=(0, 1), column_levels=2)
z      a    b    c    d
x y
1 3  1.0  NaN  NaN  NaN
1 4  NaN  2.0  NaN  NaN
2 3  NaN  NaN  3.0  NaN
2 4  NaN  NaN  NaN  4.0
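For this particular input, pandas' own Series.unstack gives a comparable frame. It is shown here only as a point of comparison; the symmetric and sort options of unstack_to_df() have no counterpart in this sketch.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [[1, 1, 2, 2], [3, 4, 3, 4], ['a', 'b', 'c', 'd']],
    names=['x', 'y', 'z'])
sr = pd.Series([1, 2, 3, 4], index=index)

# Move level 'z' to the columns; levels 'x' and 'y' stay in the index
df = sr.unstack(level='z')
print(df)
```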
wrap_broadcasted function¶
wrap_broadcasted(
old_arg,
new_arg,
is_pd=False,
new_index=None,
new_columns=None
)
If the newly broadcast array was originally a pandas object, make it a pandas object again and assign it the newly broadcast index/columns.
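The wrapping step can be pictured as follows. The helper below is hypothetical: it decides by the type of old_arg instead of taking an is_pd flag, and it assumes that a 1-dim result keeps the name of the original Series.

```python
import numpy as np
import pandas as pd

def wrap_broadcasted_sketch(old_arg, new_arg, new_index=None, new_columns=None):
    """Hypothetical helper: re-wrap new_arg only if old_arg was a pandas object."""
    if not isinstance(old_arg, (pd.Series, pd.DataFrame)):
        return new_arg  # raw inputs stay raw
    if new_arg.ndim == 1:
        return pd.Series(new_arg, index=new_index, name=getattr(old_arg, "name", None))
    return pd.DataFrame(new_arg, index=new_index, columns=new_columns)

sr = pd.Series([1, 2, 3], index=['x', 'y', 'z'], name='a')
new_values = np.broadcast_to(sr.to_numpy()[:, None], (3, 2))

wrapped = wrap_broadcasted_sketch(sr, new_values, new_index=sr.index, new_columns=['c1', 'c2'])
untouched = wrap_broadcasted_sketch(np.array([1, 2, 3]), new_values)
print(wrapped)
```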