splitters module¶
Splitters for cross-validation.
Defines splitter classes similar to (but possibly not compatible with) sklearn.model_selection.BaseCrossValidator.
split_ranges_into_sets function¶
split_ranges_into_sets(
start_idxs,
end_idxs,
set_lens=(),
left_to_right=True
)
Generate a range between each pair of indices in start_idxs and end_idxs, and optionally split each range into one or more sets.
Args
start_idxs: array_like - Start indices.
end_idxs: array_like - End indices.
set_lens: list of float -
Lengths of sets in each range.
The number of returned sets is the length of set_lens plus one, which stores the remaining elements.
Can be passed per range.
left_to_right: bool or list of bool -
Whether to resolve set_lens from left to right.
Makes the last set variable, otherwise makes the first set variable.
Can be passed per range.
Usage
- set_lens=(0.5,): 50% in training set, the rest in test set
- set_lens=(0.5, 0.25): 50% in training set, 25% in validation set, the rest in test set
- set_lens=(50, 30): 50 in training set, 30 in validation set, the rest in test set
- set_lens=(50, 30) and left_to_right=False: 30 in test set, 50 in validation set, the rest in training set
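The usage patterns above can be illustrated with a minimal pure-Python sketch. The helper `resolve_set_lens` is hypothetical and not part of this module. It assumes that values strictly between 0 and 1 are fractions of the range length (rounded down) and that the remaining elements form the variable set.

```python
import math


def resolve_set_lens(range_len, set_lens, left_to_right=True):
    """Hypothetical helper: turn `set_lens` into absolute set lengths for one range."""
    resolved = []
    for set_len in set_lens:
        # Assumption: values strictly between 0 and 1 are fractions of the range.
        if 0 < set_len < 1:
            set_len = math.floor(set_len * range_len)
        resolved.append(int(set_len))
    remainder = range_len - sum(resolved)
    if remainder <= 0:
        raise ValueError("range too short for the requested sets")
    # The remainder is the variable set: last if left to right, first otherwise.
    return resolved + [remainder] if left_to_right else [remainder] + resolved


print(resolve_set_lens(100, (0.5,)))                         # [50, 50]
print(resolve_set_lens(100, (0.5, 0.25)))                    # [50, 25, 25]
print(resolve_set_lens(100, (50, 30)))                       # [50, 30, 20]
print(resolve_set_lens(100, (50, 30), left_to_right=False))  # [20, 50, 30]
```

Note the trailing comma in `(0.5,)`: in Python, `(0.5)` is a plain float, not a one-element tuple.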
BaseSplitter class¶
BaseSplitter()
Abstract splitter class.
Subclasses
split method¶
BaseSplitter.split(
X,
**kwargs
)
ExpandingSplitter class¶
ExpandingSplitter()
Expanding walk-forward splitter.
Superclasses
split method¶
ExpandingSplitter.split(
X,
n=None,
min_len=1,
**kwargs
)
Similar to RollingSplitter.split(), but expanding.
**kwargs are passed to split_ranges_into_sets().
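As a rough illustration of what "expanding" means here, the sketch below lists inclusive (start, end) positions for windows that all begin at the first element and grow by one element at a time. The helper `expanding_ranges` is hypothetical; the splitter's handling of `n` is not modeled.

```python
def expanding_ranges(total_len, min_len=1):
    """Hypothetical helper: inclusive (start, end) pairs for expanding windows."""
    # Every window starts at position 0; only the end moves forward.
    # Windows shorter than `min_len` are skipped.
    return [(0, end) for end in range(min_len - 1, total_len)]


print(expanding_ranges(5, min_len=3))  # [(0, 2), (0, 3), (0, 4)]
```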
RangeSplitter class¶
RangeSplitter()
Range splitter.
Superclasses
split method¶
RangeSplitter.split(
X,
n=None,
range_len=None,
min_len=1,
start_idxs=None,
end_idxs=None,
**kwargs
)
Either split into n ranges each range_len long, or split into ranges between start_idxs and end_idxs, and concatenate along the column axis.
At least one of range_len, n, or start_idxs and end_idxs must be set:
- If range_len is None, splits evenly into n ranges.
- If n is None, returns the maximum number of ranges of length range_len (can be a percentage).
- If start_idxs and end_idxs are set, splits into ranges between both arrays. Both index arrays should be either NumPy arrays with absolute positions or pandas indexes with labels. The last index should be inclusive. The distance between each start and end index can be different, and smaller ranges are filled with NaNs.
range_len can be a floating-point number between 0 and 1 to indicate a fraction of the total range.
**kwargs are passed to split_ranges_into_sets().
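The inclusive end index and the NaN filling of shorter ranges can be pictured with a small pure-Python sketch. The helper `ranges_to_columns` is hypothetical and works on plain lists. Placing the NaNs at the end of a shorter range is an assumption made for illustration, not something the description above confirms.

```python
import math


def ranges_to_columns(values, start_idxs, end_idxs):
    """Hypothetical helper: slice inclusive ranges and lay them out as columns."""
    # The end index is inclusive, hence `e + 1` in the slice.
    chunks = [values[s:e + 1] for s, e in zip(start_idxs, end_idxs)]
    max_len = max(len(chunk) for chunk in chunks)
    # Assumption: shorter ranges are padded with NaN at the end.
    padded = [list(chunk) + [math.nan] * (max_len - len(chunk)) for chunk in chunks]
    # Transpose so that each range becomes one column.
    return [list(row) for row in zip(*padded)]


rows = ranges_to_columns([10, 11, 12, 13, 14, 15], start_idxs=[0, 3], end_idxs=[2, 4])
for row in rows:
    print(row)
```

The first column holds the three values at positions 0 to 2, and the second holds the two values at positions 3 to 4 followed by one NaN.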
RollingSplitter class¶
RollingSplitter()
Rolling walk-forward splitter.
Superclasses
split method¶
RollingSplitter.split(
X,
n=None,
window_len=None,
min_len=1,
**kwargs
)
Split by rolling a window.
**kwargs are passed to split_ranges_into_sets().
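A minimal sketch of rolling a fixed-length window, assuming the window advances one position at a time. The helper `rolling_ranges` is hypothetical; `n` and `min_len` are not modeled.

```python
def rolling_ranges(total_len, window_len):
    """Hypothetical helper: inclusive (start, end) pairs for a rolling window."""
    # Assumption: the window moves forward by one position per step.
    return [
        (start, start + window_len - 1)
        for start in range(total_len - window_len + 1)
    ]


print(rolling_ranges(5, 3))  # [(0, 2), (1, 3), (2, 4)]
```

Unlike an expanding window, both the start and the end move, so every range has the same length.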
SplitterT class¶
SplitterT(
*args,
**kwargs
)
Base class for protocol classes.
Protocol classes are defined as::

    class Proto(Protocol):
        def meth(self) -> int:
            ...

Such classes are primarily used with static type checkers that recognize structural subtyping (static duck-typing).
For example::

    class C:
        def meth(self) -> int:
            return 0

    def func(x: Proto) -> int:
        return x.meth()

    func(C())  # Passes static type check

See PEP 544 for details. Protocol classes decorated with @typing.runtime_checkable act as simple-minded runtime protocols that check only the presence of given attributes, ignoring their type signatures. Protocol classes can be generic, they are defined as::

    class GenProto(Protocol[T]):
        def meth(self) -> T:
            ...
Superclasses
- typing.Generic
- typing.Protocol
split method¶
SplitterT.split(
X,
**kwargs
)
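To show how structural typing applies to a `split(X, **kwargs)` method, here is a self-contained sketch using only the standard library. `HasSplit`, `HalfSplitter`, and `NotASplitter` are hypothetical names invented for this example and are not part of this module. The runtime check only verifies that a `split` attribute exists, not its signature.

```python
from typing import Any, Iterator, Protocol, Tuple, runtime_checkable


@runtime_checkable
class HasSplit(Protocol):
    """Hypothetical protocol: anything exposing a `split(X, **kwargs)` method."""

    def split(self, X: Any, **kwargs: Any) -> Iterator[Tuple[Any, ...]]:
        ...


class HalfSplitter:
    """Hypothetical splitter that never inherits from `HasSplit`."""

    def split(self, X: Any, **kwargs: Any) -> Iterator[Tuple[Any, ...]]:
        # Yield one (first half, second half) pair of positions.
        mid = len(X) // 2
        yield tuple(range(mid)), tuple(range(mid, len(X)))


class NotASplitter:
    pass


print(isinstance(HalfSplitter(), HasSplit))      # True
print(isinstance(NotASplitter(), HasSplit))      # False
print(list(HalfSplitter().split([1, 2, 3, 4])))  # [((0, 1), (2, 3))]
```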