Package es.upm.etsisi.cf4j.data
Class RandomSplitDataSet
java.lang.Object
    es.upm.etsisi.cf4j.data.RandomSplitDataSet
All Implemented Interfaces: DataSet
public class RandomSplitDataSet extends Object implements DataSet
This class implements the DataSet interface by randomly splitting the collaborative filtering ratings stored in a text file. Each line of the ratings file must have the following format: <userId><separator><itemId><separator><rating>
where <separator> is a special character that delimits the rating fields (semicolon by default).
Training and test ratings are selected randomly, according to the probability that a user and an item belong to the test set.
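To make the line format concrete, here is a minimal sketch of how a single line could be split into its three fields. The class RatingLineParser and its nested type ParsedRating are hypothetical names introduced for illustration; they are not part of CF4J, and the library's own parsing may differ (for instance in how it treats whitespace or regex metacharacters in the separator).

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of CF4J: parses one line of a ratings file. */
final class RatingLineParser {

    /** Simple value holder for one parsed rating (hypothetical type). */
    static final class ParsedRating {
        final String userId;
        final String itemId;
        final double rating;

        ParsedRating(String userId, String itemId, double rating) {
            this.userId = userId;
            this.itemId = itemId;
            this.rating = rating;
        }
    }

    /**
     * Splits a line of the form userId + separator + itemId + separator + rating.
     * The separator is quoted so that characters such as "|" are matched literally
     * (an assumption of this sketch, not a documented behavior of the library).
     */
    static ParsedRating parse(String line, String separator) {
        String[] fields = line.split(Pattern.quote(separator), -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException(
                "Expected 3 fields but found " + fields.length + ": " + line);
        }
        return new ParsedRating(
            fields[0].trim(), fields[1].trim(), Double.parseDouble(fields[2].trim()));
    }

    public static void main(String[] args) {
        // Default separator described above: a semicolon.
        ParsedRating r = parse("42;1337;4.5", ";");
        System.out.println(r.userId + " rated " + r.itemId + " with " + r.rating);
    }
}
```

Rejecting lines that do not have exactly three fields is a design choice of this sketch; it surfaces a wrong separator early instead of silently producing malformed entries.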
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected static String DEFAULT_SEPARATOR
protected List<DataSetEntry> ratings
Raw stored ratings
protected List<DataSetEntry> testRatings
Raw stored test ratings
-
Constructor Summary
Constructors (Constructor, Description)
RandomSplitDataSet(String filename)
Generates a DataSet from a text file.
RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent)
Generates a DataSet from a text file.
RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent, long seed)
Generates a DataSet from a text file.
RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent, String separator)
Generates a DataSet from a text file.
RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent, String separator, long seed)
Generates a DataSet from a text file.
RandomSplitDataSet(String filename, String separator)
Generates a DataSet from a text file.
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method, Description)
int getNumberOfRatings()
This method indicates the number of (training) ratings.
int getNumberOfTestRatings()
This method indicates the number of test ratings.
Iterator<DataSetEntry> getRatingsIterator()
This method generates an iterator to navigate through the raw ratings stored in DataSetEntries.
Iterator<DataSetEntry> getTestRatingsIterator()
This method generates an iterator to navigate through the raw test ratings stored in DataSetEntries.
-
-
-
Field Detail
-
DEFAULT_SEPARATOR
protected static final String DEFAULT_SEPARATOR
See Also: Constant Field Values
-
ratings
protected List<DataSetEntry> ratings
Raw stored ratings
-
testRatings
protected List<DataSetEntry> testRatings
Raw stored test ratings
-
-
Constructor Detail
-
RandomSplitDataSet
public RandomSplitDataSet(String filename) throws IOException
Generates a DataSet from a text file. The DataSet is loaded without test items or test users.
Parameters:
filename - File with the ratings.
Throws:
IOException - When the file cannot be accessed by the system with read permissions.
-
RandomSplitDataSet
public RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent) throws IOException
Generates a DataSet from a text file. The DataSet is loaded with a specific percentage of test items and test users.
Parameters:
filename - File with the ratings.
testUsersPercent - Percentage of users assigned to the test set.
testItemsPercent - Percentage of items assigned to the test set.
Throws:
IOException - When the file cannot be accessed by the system with read permissions.
-
RandomSplitDataSet
public RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent, long seed) throws IOException
Generates a DataSet from a text file. The DataSet is loaded with a specific percentage of test items and test users. This constructor accepts a specific random seed to ensure the reproducibility of experiments.
Parameters:
filename - File with the ratings.
testUsersPercent - Percentage of users assigned to the test set.
testItemsPercent - Percentage of items assigned to the test set.
seed - Seed applied to the random number generator.
Throws:
IOException - When the file cannot be accessed by the system with read permissions.
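The role of the seed can be illustrated with a small, self-contained sketch of a seeded split. Everything here is hypothetical and is not taken from the CF4J source: the class SeededSplitSketch, the representation of a rating as a String[] of {userId, itemId, rating}, the treatment of the two percentages as probabilities in [0, 1], and the rule that a rating goes to the test set only when both its user and its item were drawn as test. The point is only that the same seed yields the same partition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch, not CF4J code: a seeded random train/test split. */
final class SeededSplitSketch {

    /** Holds the two partitions; each rating is {userId, itemId, rating}. */
    static final class Split {
        final List<String[]> train = new ArrayList<>();
        final List<String[]> test = new ArrayList<>();
    }

    static Split split(List<String[]> ratings, double testUsers, double testItems, long seed) {
        Random rand = new Random(seed); // same seed, same sequence of draws
        Map<String, Boolean> userIsTest = new HashMap<>();
        Map<String, Boolean> itemIsTest = new HashMap<>();
        Split result = new Split();
        for (String[] r : ratings) {
            // Decide once per user and once per item, in order of first appearance.
            boolean u = userIsTest.computeIfAbsent(r[0], k -> rand.nextDouble() < testUsers);
            boolean i = itemIsTest.computeIfAbsent(r[1], k -> rand.nextDouble() < testItems);
            // Assumption: a rating is a test rating only if both user and item are test.
            (u && i ? result.test : result.train).add(r);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> data = new ArrayList<>();
        for (int u = 0; u < 50; u++) {
            for (int i = 0; i < 50; i++) {
                data.add(new String[] {"u" + u, "i" + i, "3.0"});
            }
        }
        Split first = split(data, 0.2, 0.2, 43L);
        Split second = split(data, 0.2, 0.2, 43L);
        System.out.println("Same test size with same seed: "
            + (first.test.size() == second.test.size()));
    }
}
```

Because the draws depend on the order in which users and items first appear, reproducibility in a sketch like this also requires the input file to be read in the same order on every run.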
-
RandomSplitDataSet
public RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent, String separator) throws IOException
Generates a DataSet from a text file. The DataSet is loaded with a specific percentage of test items and test users.
Parameters:
filename - File with the ratings.
testUsersPercent - Percentage of users assigned to the test set.
testItemsPercent - Percentage of items assigned to the test set.
separator - Separator string between file fields.
Throws:
IOException - When the file cannot be accessed by the system with read permissions.
-
RandomSplitDataSet
public RandomSplitDataSet(String filename, String separator) throws IOException
Generates a DataSet from a text file. The DataSet is loaded without test items or test users.
Parameters:
filename - File with the ratings.
separator - Separator string between file fields.
Throws:
IOException - When the file cannot be accessed by the system with read permissions.
-
RandomSplitDataSet
public RandomSplitDataSet(String filename, double testUsersPercent, double testItemsPercent, String separator, long seed) throws IOException
Generates a DataSet from a text file. The DataSet is loaded with a specific percentage of test items and test users. This constructor accepts a specific random seed to ensure the reproducibility of experiments.
Parameters:
filename - File with the ratings.
testUsersPercent - Percentage of users assigned to the test set.
testItemsPercent - Percentage of items assigned to the test set.
separator - Separator string between file fields.
seed - Seed applied to the random number generator.
Throws:
IOException - When the file cannot be accessed by the system with read permissions.
-
-
Method Detail
-
getRatingsIterator
public Iterator<DataSetEntry> getRatingsIterator()
Description copied from interface: DataSet
This method generates an iterator to navigate through the raw ratings stored in DataSetEntries.
Specified by:
getRatingsIterator in interface DataSet
Returns:
Iterator of ratings
-
getTestRatingsIterator
public Iterator<DataSetEntry> getTestRatingsIterator()
Description copied from interface: DataSet
This method generates an iterator to navigate through the raw test ratings stored in DataSetEntries.
Specified by:
getTestRatingsIterator in interface DataSet
Returns:
Iterator of test ratings
-
getNumberOfRatings
public int getNumberOfRatings()
Description copied from interface: DataSet
This method indicates the number of (training) ratings.
Specified by:
getNumberOfRatings in interface DataSet
Returns:
Number of (training) ratings
-
getNumberOfTestRatings
public int getNumberOfTestRatings()
Description copied from interface: DataSet
This method indicates the number of test ratings.
Specified by:
getNumberOfTestRatings in interface DataSet
Returns:
Number of test ratings
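The four methods above follow a common pattern: a count plus an iterator for each partition. The sketch below shows how a caller might consume that pattern. It deliberately avoids the real CF4J types: RatingSource is a hypothetical stand-in that mirrors the four method names, and it yields plain Double values rather than DataSetEntry objects, whose accessors are not described on this page.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch, not CF4J code: consuming a count-plus-iterator API. */
final class IteratorUsageSketch {

    /** Stand-in mirroring the four method names documented above. */
    interface RatingSource {
        Iterator<Double> getRatingsIterator();
        Iterator<Double> getTestRatingsIterator();
        int getNumberOfRatings();
        int getNumberOfTestRatings();
    }

    /** Builds an in-memory source from two lists of rating values. */
    static RatingSource of(final List<Double> train, final List<Double> test) {
        return new RatingSource() {
            public Iterator<Double> getRatingsIterator() { return train.iterator(); }
            public Iterator<Double> getTestRatingsIterator() { return test.iterator(); }
            public int getNumberOfRatings() { return train.size(); }
            public int getNumberOfTestRatings() { return test.size(); }
        };
    }

    /** Walks an iterator once and returns the mean, or NaN if it is empty. */
    static double mean(Iterator<Double> it) {
        double sum = 0.0;
        int count = 0;
        while (it.hasNext()) {
            sum += it.next();
            count++;
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) {
        RatingSource source = of(List.of(4.0, 2.0, 3.0), List.of(5.0));
        System.out.println("Training ratings: " + source.getNumberOfRatings());
        System.out.println("Test ratings: " + source.getNumberOfTestRatings());
        System.out.println("Training mean: " + mean(source.getRatingsIterator()));
    }
}
```

Each call to an iterator getter in this sketch returns a fresh iterator, so the ratings can be traversed more than once; whether the library gives the same guarantee is not stated here.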
-
-