java.lang.Object
  weka.filters.Filter
    weka.filters.unsupervised.attribute.RandomProjection
public class RandomProjection
Reduces the dimensionality of the data by projecting it onto a lower-dimensional subspace using a random matrix with columns of unit length. Like PCA, it reduces the number of attributes in the data while preserving much of its variation, but at a much lower computational cost.
It first applies the NominalToBinary filter to convert all attributes to numeric before reducing the dimension. It preserves the class attribute.
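The projection step described above can be sketched in plain Java. This is a minimal illustration and not the Weka implementation: the class name `RandomProjectionSketch` and the helpers `unitColumnGaussianMatrix` and `project` are hypothetical, and it assumes the input is already fully numeric with no class attribute.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch of random projection; names here are hypothetical, not Weka API. */
class RandomProjectionSketch {

    /** Builds a d x k Gaussian random matrix and scales each column to unit length. */
    static double[][] unitColumnGaussianMatrix(int d, int k, long seed) {
        Random rnd = new Random(seed);
        double[][] r = new double[d][k];
        for (int j = 0; j < k; j++) {
            double sumSq = 0.0;
            for (int i = 0; i < d; i++) {
                r[i][j] = rnd.nextGaussian();
                sumSq += r[i][j] * r[i][j];
            }
            double norm = Math.sqrt(sumSq);
            for (int i = 0; i < d; i++) {
                r[i][j] /= norm; // column j now has Euclidean length 1
            }
        }
        return r;
    }

    /** Projects one d-dimensional row onto k dimensions: y = x * R. */
    static double[] project(double[] x, double[][] r) {
        int k = r[0].length;
        double[] y = new double[k];
        for (int j = 0; j < k; j++) {
            for (int i = 0; i < x.length; i++) {
                y[j] += x[i] * r[i][j];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] r = unitColumnGaussianMatrix(5, 2, 42L);
        double[] y = project(new double[] {1.0, 2.0, 3.0, 4.0, 5.0}, r);
        System.out.println(Arrays.toString(y));
    }
}
```

Each instance costs one vector-matrix product, and the matrix itself needs no fitting to the data, which is where the saving over PCA comes from.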
For more information, see:
Dmitriy Fradkin, David Madigan: Experiments with random projections for machine learning. In: KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, New York, NY, USA, 517-522, 2003.
@inproceedings{Fradkin2003,
address = {New York, NY, USA},
author = {Dmitriy Fradkin and David Madigan},
booktitle = {KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining},
pages = {517-522},
publisher = {ACM Press},
title = {Experiments with random projections for machine learning},
year = {2003}
}
Valid options are:
-N <number> The number of dimensions (attributes) the data should be reduced to (default 10; exclusive of the class attribute, if it is set).
-D [SPARSE1|SPARSE2|GAUSSIAN]
The distribution to use for calculating the random matrix.
Sparse1 is:
sqrt(3)*{-1 with prob(1/6), 0 with prob(2/3), +1 with prob(1/6)}
Sparse2 is:
{-1 with prob(1/2), +1 with prob(1/2)}
-P <percent> The percentage of dimensions (attributes) the data should be reduced to (exclusive of the class attribute, if it is set). The -N option is ignored if this option is present and greater than zero.
-M Replace missing values using the ReplaceMissingValues filter
-R <num> The random seed for the random number generator used for calculating the random matrix (default 42).
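The three distribution choices for -D and the seed from -R can be illustrated with a short, self-contained sketch. The class `RandomMatrixEntries` and its methods are hypothetical names, and the way uniform draws are mapped to values is an assumption; Weka's own generator may consume its random stream differently, so the values will not match Weka's matrix entry for entry.

```java
import java.util.Random;

/** Illustrative samplers for the entry distributions listed above; not Weka API. */
class RandomMatrixEntries {

    /** Sparse1: sqrt(3) * {-1 with prob 1/6, 0 with prob 2/3, +1 with prob 1/6}. */
    static double sparse1(Random rnd) {
        double u = rnd.nextDouble();
        if (u < 1.0 / 6.0) {
            return -Math.sqrt(3.0);
        }
        if (u < 5.0 / 6.0) {
            return 0.0; // the middle 2/3 of the unit interval
        }
        return Math.sqrt(3.0);
    }

    /** Sparse2: {-1 with prob 1/2, +1 with prob 1/2}. */
    static double sparse2(Random rnd) {
        return rnd.nextBoolean() ? 1.0 : -1.0;
    }

    /** Draws n entries of the named type from a generator seeded with the given seed. */
    static double[] sample(String type, int n, long seed) {
        Random rnd = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            switch (type) {
                case "SPARSE1": out[i] = sparse1(rnd); break;
                case "SPARSE2": out[i] = sparse2(rnd); break;
                case "GAUSSIAN": out[i] = rnd.nextGaussian(); break;
                default: throw new IllegalArgumentException("Unknown distribution: " + type);
            }
        }
        return out;
    }
}
```

Calling `sample` twice with the same type and seed returns identical arrays, which is the reproducibility a fixed random seed is meant to give.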
| Field Summary | |
|---|---|
| static int | GAUSSIAN - distribution type: gaussian |
| static int | SPARSE1 - distribution type: sparse 1 |
| static int | SPARSE2 - distribution type: sparse 2 |
| static Tag[] | TAGS_DSTRS_TYPE - The types of distributions that can be used for calculating the random matrix |
| Constructor Summary | |
|---|---|
| RandomProjection() | |
| Method Summary | |
|---|---|
| boolean | batchFinished() - Signify that this batch of input to the filter is finished. |
| java.lang.String | distributionTipText() - Returns the tip text for this property. |
| Capabilities | getCapabilities() - Returns the Capabilities of this filter. |
| SelectedTag | getDistribution() - Returns the current distribution that will be used for calculating the random matrix. |
| int | getNumberOfAttributes() - Gets the current number of attributes (dimensionality) to which the data will be reduced. |
| java.lang.String[] | getOptions() - Gets the current settings of the filter. |
| double | getPercent() - Gets the percentage of attributes (dimensions) the data will be reduced to. |
| long | getRandomSeed() - Gets the random seed of the random number generator. |
| boolean | getReplaceMissingValues() - Gets the current setting for using the ReplaceMissingValues filter. |
| java.lang.String | getRevision() - Returns the revision string. |
| TechnicalInformation | getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| java.lang.String | globalInfo() - Returns a string describing this filter. |
| boolean | input(Instance instance) - Input an instance for filtering. |
| java.util.Enumeration | listOptions() - Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] argv) - Main method for testing this class. |
| java.lang.String | numberOfAttributesTipText() - Returns the tip text for this property. |
| java.lang.String | percentTipText() - Returns the tip text for this property. |
| java.lang.String | randomSeedTipText() - Returns the tip text for this property. |
| java.lang.String | replaceMissingValuesTipText() - Returns the tip text for this property. |
| void | setDistribution(SelectedTag newDstr) - Sets the distribution to use for calculating the random matrix. |
| boolean | setInputFormat(Instances instanceInfo) - Sets the format of the input instances. |
| void | setNumberOfAttributes(int newAttNum) - Sets the number of attributes (dimensions) the data should be reduced to. |
| void | setOptions(java.lang.String[] options) - Parses a given list of options. |
| void | setPercent(double newPercent) - Sets the percentage of attributes (dimensions) the data should be reduced to. |
| void | setRandomSeed(long seed) - Sets the random seed of the random number generator. |
| void | setReplaceMissingValues(boolean t) - Sets whether to use the ReplaceMissingValues filter. |
| Methods inherited from class weka.filters.Filter |
|---|
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper |
| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final int SPARSE1
public static final int SPARSE2
public static final int GAUSSIAN
public static final Tag[] TAGS_DSTRS_TYPE
| Constructor Detail |
|---|
public RandomProjection()
| Method Detail |
|---|
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
-N <number> The number of dimensions (attributes) the data should be reduced to (default 10; exclusive of the class attribute, if it is set).
-D [SPARSE1|SPARSE2|GAUSSIAN]
The distribution to use for calculating the random matrix.
Sparse1 is:
sqrt(3)*{-1 with prob(1/6), 0 with prob(2/3), +1 with prob(1/6)}
Sparse2 is:
{-1 with prob(1/2), +1 with prob(1/2)}
-P <percent> The percentage of dimensions (attributes) the data should be reduced to (exclusive of the class attribute, if it is set). The -N option is ignored if this option is present and greater than zero.
-M Replace missing values using the ReplaceMissingValues filter
-R <num> The random seed for the random number generator used for calculating the random matrix (default 42).
setOptions in interface OptionHandler
options - the list of options as an array of strings
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation in interface TechnicalInformationHandler
public java.lang.String numberOfAttributesTipText()
public void setNumberOfAttributes(int newAttNum)
newAttNum - the goal for the dimensions
public int getNumberOfAttributes()
public java.lang.String percentTipText()
public void setPercent(double newPercent)
newPercent - the percentage of attributes
public double getPercent()
public java.lang.String randomSeedTipText()
public void setRandomSeed(long seed)
seed - the random seed value
public long getRandomSeed()
public java.lang.String distributionTipText()
public void setDistribution(SelectedTag newDstr)
newDstr - the distribution to use
public SelectedTag getDistribution()
public java.lang.String replaceMissingValuesTipText()
public void setReplaceMissingValues(boolean t)
t - if true, the ReplaceMissingValues filter is used
public boolean getReplaceMissingValues()
public Capabilities getCapabilities()
getCapabilities in interface CapabilitiesHandler
getCapabilities in class Filter
See also: Capabilities
public boolean setInputFormat(Instances instanceInfo)
throws java.lang.Exception
setInputFormat in class Filter
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
java.lang.Exception - if the input format can't be set successfully
public boolean input(Instance instance)
throws java.lang.Exception
input in class Filter
instance - the input instance
java.lang.IllegalStateException - if no input format has been set
java.lang.NullPointerException - if the input format has not been defined.
java.lang.Exception - if the input instance was not of the correct format or if there was a problem with the filtering.
public boolean batchFinished()
throws java.lang.Exception
batchFinished in class Filter
java.lang.NullPointerException - if no input structure has been defined.
java.lang.Exception - if there was a problem finishing the batch.
public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class Filter
public static void main(java.lang.String[] argv)
argv - should contain arguments to the filter: use -h for help