SHOGUN
v1.1.0
Streaming features are features which are used for online algorithms.
Reading/parsing of input, and learning through the algorithm are carried out in separate threads. Input is from a CStreamingFile object.
A StreamingFeatures object usually stores only one example at a time, and functions such as dot() and add_to_dense_vec() operate with this example as one implicit operand.
Similarly, when we refer to the "feature vector" of a StreamingFeatures object, it refers to the vector of the example currently stored in that object.
It is up to the user to indicate when they are done with the example so that the next one can be fetched and stored in its place.
Example objects are fetched one-by-one through a CInputParser object, and therefore a StreamingFeatures object must implement the following methods in the derived class:
get_num_features(): returns the number of features for the current example.
release_example() must be called before get_next_example().
The feature vector itself is returned through the derived class, because at the moment the parser is templated on the data type.
Thus, a templated or specialized version of get_vector(SGVector<T>) must be implemented in the derived class.
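The calling protocol described above can be sketched with a small stand-in class. This is an illustration only: ToyStreamingFeatures, ToyExample and sum_of_margins are hypothetical names, the examples come from memory rather than from a CStreamingFile, and no parsing thread is involved. Only the method names are taken from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in, NOT the Shogun class: it mirrors only the method
// names described above and serves examples from memory.
struct ToyExample {
    std::vector<double> features;
    double label;
};

class ToyStreamingFeatures {
public:
    explicit ToyStreamingFeatures(std::vector<ToyExample> data)
        : data_(std::move(data)) {}

    void start_parser() { next_ = 0; holding_ = false; }
    void end_parser() { holding_ = false; }

    // Makes the next example current; returns false once input is exhausted.
    bool get_next_example() {
        assert(!holding_ && "call release_example() before get_next_example()");
        if (next_ >= data_.size()) return false;
        current_ = next_++;
        holding_ = true;
        return true;
    }

    // Signals that the current example is no longer needed.
    void release_example() {
        assert(holding_);
        holding_ = false;
    }

    int get_num_features() const {
        assert(holding_);
        return static_cast<int>(data_[current_].features.size());
    }

    double get_label() const {
        assert(holding_);
        return data_[current_].label;
    }

    // The current example is the implicit operand of the dot product.
    double dot(const std::vector<double>& w) const {
        assert(holding_);
        const std::vector<double>& x = data_[current_].features;
        assert(w.size() >= x.size());
        double sum = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) sum += w[i] * x[i];
        return sum;
    }

private:
    std::vector<ToyExample> data_;
    std::size_t next_ = 0;
    std::size_t current_ = 0;
    bool holding_ = false;
};

// One online pass: fetch, use, release, repeat.
double sum_of_margins(ToyStreamingFeatures& features,
                      const std::vector<double>& w) {
    double total = 0.0;
    features.start_parser();
    while (features.get_next_example()) {
        total += features.get_label() * features.dot(w);
        features.release_example();
    }
    features.end_parser();
    return total;
}
```

In this sketch the assertion in get_next_example() enforces the ordering rule stated above: an example that is still held must be released before the next one is requested.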
Definition at line 63 of file StreamingFeatures.h.
Public Member Functions | |
CStreamingFeatures () | |
CStreamingFeatures (CStreamingFile *file, bool is_labelled, int32_t size) | |
virtual | ~CStreamingFeatures () |
void | set_read_functions () |
virtual void | set_vector_reader ()=0 |
virtual void | set_vector_and_label_reader ()=0 |
virtual void | start_parser ()=0 |
virtual void | end_parser ()=0 |
virtual float64_t | get_label ()=0 |
virtual bool | get_next_example ()=0 |
virtual void | release_example ()=0 |
virtual int32_t | get_num_features ()=0 |
virtual bool | get_has_labels () |
virtual bool | is_seekable () |
virtual void | reset_stream () |
Public Member Functions inherited from CFeatures | |
CFeatures (int32_t size=0) | |
CFeatures (const CFeatures &orig) | |
CFeatures (CFile *loader) | |
virtual CFeatures * | duplicate () const =0 |
virtual | ~CFeatures () |
virtual EFeatureType | get_feature_type ()=0 |
virtual EFeatureClass | get_feature_class ()=0 |
virtual int32_t | add_preprocessor (CPreprocessor *p) |
set preprocessor | |
virtual CPreprocessor * | del_preprocessor (int32_t num) |
del current preprocessor | |
CPreprocessor * | get_preprocessor (int32_t num) |
get current preprocessor | |
void | set_preprocessed (int32_t num) |
bool | is_preprocessed (int32_t num) |
int32_t | get_num_preprocessed () |
get whether specified preprocessor (or all if num=1) was/were already applied | |
int32_t | get_num_preprocessors () const |
void | clean_preprocessors () |
int32_t | get_cache_size () |
virtual int32_t | get_num_vectors () const =0 |
virtual bool | reshape (int32_t num_features, int32_t num_vectors) |
virtual int32_t | get_size ()=0 |
void | list_feature_obj () |
virtual void | load (CFile *loader) |
virtual void | save (CFile *writer) |
bool | check_feature_compatibility (CFeatures *f) |
bool | has_property (EFeatureProperty p) |
void | set_property (EFeatureProperty p) |
void | unset_property (EFeatureProperty p) |
virtual void | set_subset (CSubset *subset) |
virtual void | remove_subset () |
virtual void | subset_changed_post () |
index_t | subset_idx_conversion (index_t idx) const |
bool | has_subset () const |
virtual CFeatures * | copy_subset (SGVector< index_t > indices) |
Public Member Functions inherited from CSGObject | |
CSGObject () | |
CSGObject (const CSGObject &orig) | |
virtual | ~CSGObject () |
virtual const char * | get_name () const =0 |
virtual bool | is_generic (EPrimitiveType *generic) const |
template<class T > | |
void | set_generic () |
void | unset_generic () |
virtual void | print_serializable (const char *prefix="") |
virtual bool | save_serializable (CSerializableFile *file, const char *prefix="") |
virtual bool | load_serializable (CSerializableFile *file, const char *prefix="") |
void | set_global_io (SGIO *io) |
SGIO * | get_global_io () |
void | set_global_parallel (Parallel *parallel) |
Parallel * | get_global_parallel () |
void | set_global_version (Version *version) |
Version * | get_global_version () |
SGVector< char * > | get_modelsel_names () |
char * | get_modsel_param_descr (const char *param_name) |
index_t | get_modsel_param_index (const char *param_name) |
Protected Attributes | |
bool | has_labels |
Whether examples are labelled or not. | |
CStreamingFile * | working_file |
The StreamingFile object to read from. | |
bool | seekable |
Whether the stream is seekable. | |
Protected Attributes inherited from CFeatures | |
CSubset * | m_subset |
Additional Inherited Members | |
Public Attributes inherited from CSGObject | |
SGIO * | io |
Parallel * | parallel |
Version * | version |
Parameter * | m_parameters |
Parameter * | m_model_selection_parameters |
Protected Member Functions inherited from CSGObject | |
virtual void | load_serializable_pre () throw (ShogunException) |
virtual void | load_serializable_post () throw (ShogunException) |
virtual void | save_serializable_pre () throw (ShogunException) |
virtual void | save_serializable_post () throw (ShogunException) |
CStreamingFeatures | ( | ) |
Default constructor with no arguments. It does not do anything yet.
Definition at line 5 of file StreamingFeatures.cpp.
CStreamingFeatures | ( | CStreamingFile * | file, |
bool | is_labelled, | ||
int32_t | size | ||
) |
Constructor with input information passed.
file | CStreamingFile to take input from. |
is_labelled | Whether examples are labelled or not. |
size | Number of examples to be held in the parser's "ring". |
Definition at line 9 of file StreamingFeatures.cpp.
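The role of the size parameter can be illustrated with a fixed-capacity ring of parsed examples. The sketch below is an assumption about the general idea, not a description of CInputParser: ExampleRing and its methods are hypothetical names, each example is reduced to a single double, and the threading and locking that a real parser would need are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ring of parsed examples (not Shogun's CInputParser).
// The producer may run ahead of the consumer by at most `size` examples.
class ExampleRing {
public:
    explicit ExampleRing(std::size_t size) : slots_(size) {
        assert(size > 0);
    }

    // Parser side: store one parsed example.
    // Returns false when every slot still holds an unreleased example.
    bool try_push(double example) {
        if (used_ == slots_.size()) return false;
        slots_[tail_] = example;
        tail_ = (tail_ + 1) % slots_.size();
        ++used_;
        return true;
    }

    // Learner side: look at the oldest stored example without freeing it.
    bool try_front(double& out) const {
        if (used_ == 0) return false;
        out = slots_[head_];
        return true;
    }

    // Learner side: free the oldest slot so that it can be overwritten.
    void release() {
        assert(used_ > 0);
        head_ = (head_ + 1) % slots_.size();
        --used_;
    }

    std::size_t in_use() const { return used_; }

private:
    std::vector<double> slots_;
    std::size_t head_ = 0;  // index of the oldest stored example
    std::size_t tail_ = 0;  // index of the next free slot
    std::size_t used_ = 0;  // number of occupied slots
};
```

With a ring of two slots, a third push fails until the learner calls release(), which is the same back-pressure that releasing an example provides in the streaming interface.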
virtual ~CStreamingFeatures | ( | ) | virtual |
Destructor
Definition at line 86 of file StreamingFeatures.h.
virtual void end_parser | ( | ) | pure virtual |
End the parser. Wait for the parsing thread to complete.
Implemented in CStreamingStringFeatures< T >, CStreamingVwFeatures, CStreamingSparseFeatures< T >, and CStreamingSimpleFeatures< T >.
virtual bool get_has_labels | ( | ) | virtual |
Return whether the examples are labelled or not.
Definition at line 20 of file StreamingFeatures.cpp.
virtual float64_t get_label | ( | ) | pure virtual |
Return the label of the current example.
Raise an error if the input has been specified as unlabelled.
Implemented in CStreamingStringFeatures< T >, CStreamingVwFeatures, CStreamingSparseFeatures< T >, and CStreamingSimpleFeatures< T >.
virtual bool get_next_example | ( | ) | pure virtual |
Indicate to the parser that it must fetch the next example.
Implemented in CStreamingStringFeatures< T >, CStreamingVwFeatures, CStreamingSparseFeatures< T >, and CStreamingSimpleFeatures< T >.
virtual int32_t get_num_features | ( | ) | pure virtual |
Get the number of features in the current example.
Implemented in CStreamingVwFeatures, CStreamingSparseFeatures< T >, CStreamingStringFeatures< T >, and CStreamingSimpleFeatures< T >.
virtual bool is_seekable | ( | ) | virtual |
Whether the stream is seekable (to check if multiple epochs are possible), i.e., whether we can process examples in a batch fashion.
A stream is usually seekable when it comes from a file or from another conventional CFeatures object.
Definition at line 25 of file StreamingFeatures.cpp.
virtual void release_example | ( | ) | pure virtual |
Indicate that processing of the current example is done. The parser then considers it safe to dispose of that example and replace it with another one.
Implemented in CStreamingStringFeatures< T >, CStreamingVwFeatures, CStreamingSparseFeatures< T >, and CStreamingSimpleFeatures< T >.
virtual void reset_stream | ( | ) | virtual |
Function to reset the stream (if possible).
Reimplemented in CStreamingSparseFeatures< T >, CStreamingVwFeatures, and CStreamingSimpleFeatures< T >.
Definition at line 30 of file StreamingFeatures.cpp.
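How is_seekable() and reset_stream() might combine to allow several epochs can be sketched as follows. ToyStream and run_epochs are hypothetical names used for illustration, and the in-memory stream stands in for a real input source. Only the method names come from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stream (not a Shogun class) that may or may not be seekable.
class ToyStream {
public:
    ToyStream(std::vector<double> values, bool seekable)
        : values_(std::move(values)), seekable_(seekable) {}

    bool is_seekable() const { return seekable_; }

    // Rewinds to the first example; does nothing on a non-seekable stream.
    void reset_stream() {
        if (seekable_) pos_ = 0;
    }

    // Advances to the next example; returns false at end of input.
    bool get_next_example() {
        if (pos_ >= values_.size()) return false;
        ++pos_;
        return true;
    }

    void release_example() {}

private:
    std::vector<double> values_;
    bool seekable_;
    std::size_t pos_ = 0;
};

// Runs up to max_epochs passes and returns how many examples were processed.
std::size_t run_epochs(ToyStream& stream, int max_epochs) {
    std::size_t seen = 0;
    for (int epoch = 0; epoch < max_epochs; ++epoch) {
        while (stream.get_next_example()) {
            ++seen;
            stream.release_example();
        }
        if (!stream.is_seekable()) break;  // only a single pass is possible
        stream.reset_stream();
    }
    return seen;
}
```

With three stored examples and two requested epochs, the seekable variant processes six examples while the non-seekable one stops after three.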
void set_read_functions | ( | ) |
Set the vector reading functions.
The functions are implemented specific to the type in the derived class.
Definition at line 14 of file StreamingFeatures.cpp.
virtual void set_vector_and_label_reader | ( | ) | pure virtual |
The derived object must set the function which will be used by the parser for reading one vector and label from the file. This function should be a member of the CStreamingFile class.
See the implementation in StreamingSimpleFeatures for details.
Implemented in CStreamingVwFeatures, CStreamingSparseFeatures< T >, CStreamingSimpleFeatures< T >, and CStreamingStringFeatures< T >.
virtual void set_vector_reader | ( | ) | pure virtual |
The derived object must set the function which will be used for reading one vector from the file. This function should be a member of the CStreamingFile class.
See the implementation in StreamingSimpleFeatures for details.
Implemented in CStreamingVwFeatures, CStreamingSparseFeatures< T >, CStreamingSimpleFeatures< T >, and CStreamingStringFeatures< T >.
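One common C++ way to let a parser call "a member of the file class" is to store a pointer to a member function. The sketch below shows that mechanism under stated assumptions: ToyFile, ToyParser and every method on them are hypothetical, and the actual signatures used by CStreamingFile and CInputParser are not given on this page.

```cpp
#include <cassert>
#include <vector>

// Hypothetical file class (not CStreamingFile) with two reading functions.
class ToyFile {
public:
    // Reads one unlabelled vector.
    void get_vector(std::vector<float>& vec) { vec = {1.0f, 2.0f}; }

    // Reads one vector together with its label.
    void get_vector_and_label(std::vector<float>& vec, double& label) {
        vec = {1.0f, 2.0f};
        label = 1.0;
    }
};

// Hypothetical parser that is told which ToyFile members to call.
class ToyParser {
public:
    using VectorReader = void (ToyFile::*)(std::vector<float>&);
    using LabelledReader = void (ToyFile::*)(std::vector<float>&, double&);

    void set_read_vector(VectorReader reader) { read_vector_ = reader; }
    void set_read_vector_and_label(LabelledReader reader) {
        read_labelled_ = reader;
    }

    // Parses one example using whichever reader matches the input type.
    void parse_one(ToyFile& file, bool labelled,
                   std::vector<float>& vec, double& label) {
        if (labelled) {
            assert(read_labelled_ != nullptr);
            (file.*read_labelled_)(vec, label);
        } else {
            assert(read_vector_ != nullptr);
            (file.*read_vector_)(vec);
        }
    }

private:
    VectorReader read_vector_ = nullptr;
    LabelledReader read_labelled_ = nullptr;
};
```

A derived features class would register both readers once, for instance with parser.set_read_vector(&ToyFile::get_vector), and the parser would then pick the labelled or unlabelled variant for each example.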
virtual void start_parser | ( | ) | pure virtual |
Start the parser. It stores parsed examples from the input in a separate thread.
Implemented in CStreamingStringFeatures< T >, CStreamingVwFeatures, CStreamingSparseFeatures< T >, and CStreamingSimpleFeatures< T >.
bool has_labels | protected |
Whether examples are labelled or not.
Definition at line 185 of file StreamingFeatures.h.
bool seekable | protected |
Whether the stream is seekable.
Definition at line 191 of file StreamingFeatures.h.
CStreamingFile* working_file | protected |
The StreamingFile object to read from.
Definition at line 188 of file StreamingFeatures.h.