splitters module¶
Splitters for cross-validation.
Defines splitter classes similar to (but possibly not compatible with) sklearn.model_selection.BaseCrossValidator.
split_ranges_into_sets function¶
split_ranges_into_sets(
start_idxs,
end_idxs,
set_lens=(),
left_to_right=True
)
Generate a range between each pair of indices in start_idxs and end_idxs, and optionally split each range into one or more sets.
Args
start_idxs : array_like
- Start indices.
end_idxs : array_like
- End indices.
set_lens : list of float
- Lengths of sets in each range. The number of returned sets is the length of set_lens plus one, which stores the remaining elements. Can be passed per range.
left_to_right : bool or list of bool
- Whether to resolve set_lens from left to right. If True, the last set has variable length; otherwise, the first set has variable length. Can be passed per range.
Usage
- set_lens=(0.5,): 50% in training set, the rest in test set
- set_lens=(0.5, 0.25): 50% in training set, 25% in validation set, the rest in test set
- set_lens=(50, 30): 50 in training set, 30 in validation set, the rest in test set
- set_lens=(50, 30) and left_to_right=False: 30 in test set, 50 in validation set, the rest in training set
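The resolution rules above can be illustrated with a minimal pure-Python sketch. The helper `split_range_into_sets` below is hypothetical and handles a single range; it is not the library's implementation. It assumes that the end index is inclusive and that lengths strictly between 0 and 1 are fractions of the range length, rounded down.

```python
import math


def split_range_into_sets(start_idx, end_idx, set_lens=(), left_to_right=True):
    """Hypothetical helper: split one inclusive range into consecutive index lists."""
    range_len = end_idx - start_idx + 1  # assumption: end index is inclusive
    resolved = []
    for set_len in set_lens:
        # Assumption: values strictly between 0 and 1 are fractions of the range.
        if 0 < set_len < 1:
            set_len = math.floor(set_len * range_len)
        resolved.append(int(set_len))
    if set_lens:
        remainder = range_len - sum(resolved)
        if remainder <= 0:
            raise ValueError("range too short for the requested sets")
        # The variable-length set goes last (left to right) or first (right to left).
        resolved = resolved + [remainder] if left_to_right else [remainder] + resolved
    else:
        resolved = [range_len]  # no set_lens: the whole range is one set
    sets, offset = [], start_idx
    for set_len in resolved:
        sets.append(list(range(offset, offset + set_len)))
        offset += set_len
    return tuple(sets)


train, valid, test = split_range_into_sets(0, 99, set_lens=(0.5, 0.25))
print(len(train), len(valid), len(test))  # 50 25 25

train, valid, test = split_range_into_sets(0, 99, set_lens=(50, 30), left_to_right=False)
print(len(train), len(valid), len(test))  # 20 50 30
```

In the second call the fixed lengths are anchored to the right end of the range, so the first (training) set absorbs whatever remains.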
BaseSplitter class¶
BaseSplitter()
Abstract splitter class.
Subclasses
split method¶
BaseSplitter.split(
X,
**kwargs
)
ExpandingSplitter class¶
ExpandingSplitter()
Expanding walk-forward splitter.
Superclasses
split method¶
ExpandingSplitter.split(
X,
n=None,
min_len=1,
**kwargs
)
Similar to RollingSplitter.split(), but expanding.
**kwargs are passed to split_ranges_into_sets().
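As a rough illustration of what "expanding" means here, the sketch below builds ranges that all start at the first position and grow toward the end of the data. The helper `expanding_ranges` is hypothetical. Reading min_len as the minimum number of elements in a range, and spacing the n range ends evenly, are both assumptions; the library may choose positions differently.

```python
def expanding_ranges(total_len, n=None, min_len=1):
    """Hypothetical helper: inclusive (start, end) pairs that start at 0 and grow."""
    # Assumption: min_len is the smallest allowed number of elements in a range.
    ends = list(range(min_len - 1, total_len))
    if n is not None and n < len(ends):
        last = len(ends) - 1
        if n == 1:
            picks = [last]  # a single range covers everything
        else:
            # Assumption: keep n ends spread evenly between the first and last one.
            picks = [round(i * last / (n - 1)) for i in range(n)]
        ends = [ends[p] for p in picks]
    return [(0, end) for end in ends]


print(expanding_ranges(10, n=3, min_len=4))  # [(0, 3), (0, 6), (0, 9)]
```

Each pair could then be split into training and test sets by a function such as split_ranges_into_sets().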
RangeSplitter class¶
RangeSplitter()
Range splitter.
Superclasses
split method¶
RangeSplitter.split(
X,
n=None,
range_len=None,
min_len=1,
start_idxs=None,
end_idxs=None,
**kwargs
)
Either split into n ranges each range_len long, or split into ranges between start_idxs and end_idxs, and concatenate along the column axis.
At least one of range_len, n, or start_idxs and end_idxs must be set:
- If range_len is None, the data is split evenly into n ranges.
- If n is None, returns the maximum number of ranges of length range_len (which can be given as a fraction).
- If start_idxs and end_idxs are set, splits into ranges between both arrays. Both index arrays should be either NumPy arrays with absolute positions or pandas indexes with labels. The last index should be inclusive. The distance between each start and end index can be different, and smaller ranges are filled with NaNs.
range_len can be a floating number between 0 and 1 to indicate a fraction of the total range.
**kwargs are passed to split_ranges_into_sets().
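The first two modes and the NaN padding can be sketched in plain Python. All three helpers below (`even_ranges`, `fixed_len_ranges`, `stack_ranges`) are hypothetical and only approximate the behavior described above. In particular, treating fixed-length ranges as non-overlapping and giving leftover elements to the earliest ranges are assumptions, and the case where both n and range_len are set is not modeled.

```python
import math


def even_ranges(total_len, n):
    """Hypothetical: split positions 0..total_len-1 into n nearly equal inclusive ranges."""
    base, extra = divmod(total_len, n)
    ranges, start = [], 0
    for i in range(n):
        length = base + (1 if i < extra else 0)  # earlier ranges absorb the remainder
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def fixed_len_ranges(total_len, range_len):
    """Hypothetical: as many non-overlapping ranges of range_len as fit."""
    if 0 < range_len < 1:
        range_len = math.floor(range_len * total_len)  # fraction of the total range
    range_len = int(range_len)
    if range_len < 1:
        raise ValueError("range_len resolves to an empty range")
    return [(s, s + range_len - 1) for s in range(0, total_len - range_len + 1, range_len)]


def stack_ranges(values, ranges):
    """Hypothetical: one column per range, shorter ranges padded with NaN."""
    max_len = max(end - start + 1 for start, end in ranges)
    columns = [values[start:end + 1] for start, end in ranges]  # end is inclusive
    return [col + [math.nan] * (max_len - len(col)) for col in columns]


print(even_ranges(10, 3))        # [(0, 3), (4, 6), (7, 9)]
print(fixed_len_ranges(10, 4))   # [(0, 3), (4, 7)]
print(stack_ranges(list(range(10)), [(0, 3), (4, 5)]))  # [[0, 1, 2, 3], [4, 5, nan, nan]]
```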
RollingSplitter class¶
RollingSplitter()
Rolling walk-forward splitter.
Superclasses
split method¶
RollingSplitter.split(
X,
n=None,
window_len=None,
min_len=1,
**kwargs
)
Split by rolling a window.
**kwargs are passed to split_ranges_into_sets().
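A rolling window can be pictured as a fixed-length range that slides across the data. The sketch below uses a hypothetical helper, `rolling_ranges`, to list the inclusive start and end positions of each window. Spacing the n windows evenly between the first and last possible position is an assumption, and min_len is not modeled.

```python
def rolling_ranges(total_len, window_len, n=None):
    """Hypothetical helper: inclusive (start, end) pairs of a rolling window."""
    if window_len > total_len:
        raise ValueError("window_len exceeds the data length")
    starts = list(range(total_len - window_len + 1))  # every possible window start
    if n is not None and n < len(starts):
        if n == 1:
            picks = [0]
        else:
            # Assumption: keep n starts spread evenly from the first to the last.
            last = len(starts) - 1
            picks = [round(i * last / (n - 1)) for i in range(n)]
        starts = [starts[p] for p in picks]
    return [(start, start + window_len - 1) for start in starts]


print(rolling_ranges(10, window_len=4, n=3))  # [(0, 3), (3, 6), (6, 9)]
```

Unlike an expanding scheme, every range here has the same length, so each training and test set derived from it is directly comparable in size.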
SplitterT class¶
SplitterT(
*args,
**kwargs
)
Base class for protocol classes.
Protocol classes are defined as::

    class Proto(Protocol):
        def meth(self) -> int:
            ...

Such classes are primarily used with static type checkers that recognize structural subtyping (static duck-typing).
For example::

    class C:
        def meth(self) -> int:
            return 0

    def func(x: Proto) -> int:
        return x.meth()

    func(C())  # Passes static type check

See PEP 544 for details. Protocol classes decorated with @typing.runtime_checkable act as simple-minded runtime protocols that check only the presence of given attributes, ignoring their type signatures.
Protocol classes can be generic; they are defined as::

    class GenProto(Protocol[T]):
        def meth(self) -> T:
            ...
Superclasses
typing.Generic
typing.Protocol
split method¶
SplitterT.split(
X,
**kwargs
)
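To show how a protocol with a split method can be satisfied structurally, here is a small self-contained sketch. `SplitterLike`, `HalfSplitter`, and `count_splits` are hypothetical names introduced for illustration only; they are not part of this module, and the return type annotation is an assumption.

```python
from typing import Iterator, List, Protocol, Tuple, runtime_checkable


@runtime_checkable
class SplitterLike(Protocol):
    """Hypothetical protocol: anything exposing split(X, **kwargs)."""

    def split(self, X, **kwargs) -> Iterator[Tuple[List[int], ...]]:
        ...


class HalfSplitter:
    """Hypothetical splitter; it does not inherit from SplitterLike."""

    def split(self, X, **kwargs):
        mid = len(X) // 2
        # Yield a single (first half, second half) pair of positions.
        yield list(range(mid)), list(range(mid, len(X)))


def count_splits(splitter: SplitterLike, X) -> int:
    """Accept any object that structurally matches SplitterLike."""
    return sum(1 for _ in splitter.split(X))


print(isinstance(HalfSplitter(), SplitterLike))  # True
print(isinstance(object(), SplitterLike))        # False
print(count_splits(HalfSplitter(), [1, 2, 3, 4]))  # 1
```

The runtime check only confirms that a split attribute exists; it does not verify the method's signature, which matches the "simple-minded" behavior described above.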