Acknowledgement - We thank Dr. Raghu Machiraju of the Ohio State University for his permission to use a version of this assignment.
Build a user interface in D3 that displays word tags learned from a text document. Let us call this a wordmiasma. In this lab, you are required to create a word cloud that highlights “words” in a document. You can use any document you want, but if you need one, here is a suggestion :). Many others call this creation a wordcloud. Why use a different name? Because we will implement only the simpler parts of the whole workflow. My pedagogical goal is to let you go behind the scenes and learn the process, rather than to make you the patron saint of wordclouds. Below are examples of word clouds (the one in the middle is from https://tagul.com/).
This technique originated online in the 1990s as tag clouds (famously described as "the mullets of the Internet"), which were used to display the popularity of keywords in bookmarks. However, tag clouds are somewhat controversial for a variety of reasons. For example, this guy hates them.
Since the goal of the lab is to create a wordcloud, it helps to think about the exercise this way: you are “re-encoding” the document using visual metaphors. Here you amplify information: you want to pick out word gems and highlight or embellish them visually. Since we belong to the dojo of “task-centric” design, we first write down the tasks, which are:
To make wordmiasmas, you can use any tools you want for the initial analysis, but you must use d3.js for the visualization part. You can use websites and each other to help you code, but the code you write must be your own. Do not simply copy and paste code from a website.
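As a starting point for the "initial analysis" step, here is a minimal sketch in plain JavaScript that counts words and maps each count to a font size. The function names (`tokenize`, `countWords`, `toWordTags`), the tiny stop-word list, and the 12 to 48 pixel range are all assumptions made for illustration; they are not part of the assignment, and you are free to do this step with any tool.

```javascript
// Hypothetical helpers for illustration only; nothing here is prescribed by the lab.

// Split text into lowercase word tokens, dropping punctuation and digits.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+(?:'[a-z]+)*/g) || [];
}

// Count how often each word occurs, skipping a small illustrative stop-word list.
function countWords(
  text,
  stopWords = new Set(["the", "a", "an", "and", "of", "to", "in", "is"])
) {
  const counts = new Map();
  for (const word of tokenize(text)) {
    if (stopWords.has(word)) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Map counts linearly onto a font-size range (pixels), most frequent word first.
function toWordTags(counts, minSize = 12, maxSize = 48) {
  const values = [...counts.values()];
  const lo = Math.min(...values);
  const hi = Math.max(...values);
  const span = hi - lo || 1; // avoid dividing by zero when all counts are equal
  return [...counts.entries()]
    .map(([text, count]) => ({
      text,
      count,
      size: minSize + ((count - lo) / span) * (maxSize - minSize),
    }))
    .sort((a, b) => b.count - a.count);
}

const tags = toWordTags(countWords("The cat and the hat. The cat sat; a hat is a hat."));
console.log(tags);
```

The resulting array of `{ text, count, size }` objects is the kind of data you would then bind to SVG text elements in d3.js. The linear mapping written out by hand here is the same idea as a `d3.scaleLinear()` with the counts as its domain and the font sizes as its range.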
To give you ideas for creating interesting word miasmas, we've listed some examples below. Take note of the problem context and identify the tasks, which include comparison of word clouds and likely inferences and hypotheses from the word miasma. For any of the applications below, ensure that there is enough user interaction to generate hypotheses.
Create a simple food chain word miasma representing the population of each animal species by font size. Then extend it to a whole food web for two geographical areas, or for the same geographical area at multiple times.
You can do the same with cities and climates, with the size of the city represented by font size, location, and orientation.
Planets, galaxies, and their size represented by font size.
Miasma for text classification: positive words in green and negative words in red. A dictionary of positive and negative words is needed.
Compare the text of novels from different genres. Make a word cloud for each genre, such as horror, sci-fi, thriller, non-fiction, etc.
Make word clouds of speeches/ramblings of famous and infamous folks and have the class guess the speaker. The goal is to increase the success of recognition.
Analyze presidential speeches, or historical speeches by MK Gandhi or Martin Luther. (I want to see who uses the word “non-violence” more.)
Shakespearean English vs. Normal Joe English. Here the emphasis will be on phrases rather than on individual words, so the tokenizer should tokenize phrases and not words (HARD).
Compare the works of rappers and find out who makes the most use of English vocabulary :).
Or make your own ...
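Two of the ideas above need a little extra data preparation: coloring words by sentiment, and tokenizing phrases instead of single words. The sketch below shows one simple way to approach each in plain JavaScript. The names `POSITIVE`, `NEGATIVE`, `sentimentColor`, and `countPhrases` are hypothetical, the two word lists are toy stand-ins for a real dictionary, and treating a "phrase" as a run of n consecutive words (an n-gram) is an assumption, not the only reasonable definition.

```javascript
// Toy lexicon for illustration; a real project would load a proper dictionary.
const POSITIVE = new Set(["good", "great", "love", "happy"]);
const NEGATIVE = new Set(["bad", "awful", "hate", "sad"]);

// Pick a fill color for a word: green if positive, red if negative, gray otherwise.
function sentimentColor(word) {
  const w = word.toLowerCase();
  if (POSITIVE.has(w)) return "green";
  if (NEGATIVE.has(w)) return "red";
  return "gray";
}

// Count phrases of n consecutive words (n-grams) instead of single words.
// Note: this simple version lets phrases run across sentence boundaries.
function countPhrases(text, n = 2) {
  const words = text.toLowerCase().match(/[a-z]+(?:'[a-z]+)*/g) || [];
  const counts = new Map();
  for (let i = 0; i + n <= words.length; i++) {
    const phrase = words.slice(i, i + n).join(" ");
    counts.set(phrase, (counts.get(phrase) || 0) + 1);
  }
  return counts;
}

console.log(sentimentColor("Love"));
console.log(countPhrases("to be or not to be"));
```

In a d3.js visualization, a function like `sentimentColor` could be passed as the `fill` style of each text element, and the phrase counts could be sized the same way as single-word counts.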
Please submit to Moodle a folder containing an index.html file that opens the word miasma, the data, a readme describing what you did, why, and the data source(s) you used, and any other associated files.
A selection of JS/d3/mockup tools that may be helpful