- Download library Version 1.2 - Now reads out and saves counts for each node. This has been added to conform to the new datasets for Assignment 2. (Java 1.4 compatible)
- Update 02/28: Datasets for Assignment 2 (with labels)
Info on the Data: The datasets we are using for Assignment 2 are hierarchies of word relationships. The underlying text is the book "Gamer Theory 2.0" by McKenzie Wark. A word has been chosen randomly from the book and placed at the root of the tree; in the dataset, you will see trees starting at the words "activity", "artifact", "book", etc. The rest of the hierarchy below this root word follows the linguistic "IS-A" relationship, or hyponymy relation, extracted from WordNet. Each node in the tree descends from the root according to this relationship.
A new addition to the dataset is that each node has a weight, or count. The "count" value is the number of occurrences of that word in the book divided by the number of meanings for that word in WordNet. This reduces the visual strength of words whose meaning is ambiguous and emphasizes the ones that are more certain. You can use this weight in your interaction, for example as the basis of a DOI (degree of interest) calculation or to visually enlarge nodes that have a higher weight, but you don't have to.
Each dataset is named with the root word, the maximum depth, and the number of nodes. A number of smaller datasets are included for testing purposes, but the assignment asks you to explore those of at least depth 8.
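As a rough illustration of the count weighting described above, here is a minimal sketch. The class and method names (CountWeight, count, doi) are hypothetical and are not part of the course library. The DOI formula shown (count minus tree distance from the focus node) is only one simple assumption about how the weight could feed into a degree-of-interest value; the assignment does not prescribe it.

```java
public class CountWeight {

    // count = occurrences of the word in the book
    //         divided by the number of WordNet meanings for that word.
    static double count(int occurrences, int meanings) {
        if (meanings <= 0) {
            throw new IllegalArgumentException("meanings must be positive");
        }
        return (double) occurrences / meanings;
    }

    // Assumed DOI: the count acts as a priori interest and is reduced by
    // the node's distance (in tree edges) from the current focus node.
    static double doi(double count, int distanceFromFocus) {
        return count - distanceFromFocus;
    }

    public static void main(String[] args) {
        // Illustrative numbers only, not taken from the datasets.
        double ambiguous = count(12, 4); // frequent word with many meanings
        double certain = count(5, 1);    // rarer word with a single meaning

        System.out.println("ambiguous word count: " + ambiguous);
        System.out.println("certain word count:   " + certain);
        System.out.println("DOI two edges from focus: " + doi(certain, 2));
    }
}
```

With these made-up numbers, the word that occurs less often but has only one meaning ends up with the higher weight, which is the effect the count is meant to have.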
Note that some of the trees, such as entity (which contains all the nouns in WordNet), may contain a very small fraction of words considered offensive by some. WordNet endeavours to catalogue the entire language, including common colloquial words.
Thanks to Chris Collins for generating the datasets!
Tips and Tricks
- To load a file path-independently, put all your datasets in your project's data directory and use this call:
File file = new File(dataPath("filename.tree"));
- Datasets for Assignment 2 (without labels)
- Download library Version 1.0 for Java 1.4
- Download library Version 1.1 - Now reads out and saves titles for the nodes which can be displayed in the visual representation. Note: this and all future versions will be Java 1.4 compatible.
- Tree Data Sets for Assignment 1
Please send me bug reports if you encounter any difficulties. Files will be updated.