Project Topics
I'd like you to come up with your own ideas! However, if you'd like some inspiration, there are possible projects in the following areas:
climate science (together with Niklas Röber, DKRZ, Hamburg)
The goal is the visualization of the thermohaline circulation, which is also known as the ocean conveyor belt. The task is to find it in a large data set and to get an idea of what characteristics (speed, salt concentration, temperature, density) it might have in different parts of the ocean. The data set is a volumetric data set of size 800x400x50. You could use your own software to visualize it or some standard tools, like Avizo, ParaView, SimVis, Vapor, or VisIt. The key is finding the right transfer function! (download data, 2.3GB)
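To make the term concrete: a transfer function maps each scalar value in the volume (for example, temperature) to a color and an opacity. The following is only a minimal sketch of a piecewise-linear transfer function in plain Python; the control points, the value range, and the function name are made-up assumptions for illustration and are not taken from the data set or from any of the tools listed above.

```python
from bisect import bisect_right

def transfer_function(value, control_points):
    """Map a scalar to an (r, g, b, a) tuple by linear interpolation.

    control_points: list of (scalar, (r, g, b, a)) pairs, sorted by scalar.
    Values outside the covered range are clamped to the first/last color.
    """
    xs = [x for x, _ in control_points]
    if value <= xs[0]:
        return control_points[0][1]
    if value >= xs[-1]:
        return control_points[-1][1]
    i = bisect_right(xs, value)  # now xs[i - 1] <= value < xs[i]
    (x0, c0), (x1, c1) = control_points[i - 1], control_points[i]
    t = (value - x0) / (x1 - x0)
    return tuple(a + t * (b - a) for a, b in zip(c0, c1))

# Hypothetical control points: cold water is transparent blue,
# warm water is mostly opaque red.
points = [
    (0.0,  (0.0, 0.0, 1.0, 0.0)),
    (15.0, (1.0, 1.0, 1.0, 0.1)),
    (30.0, (1.0, 0.0, 0.0, 0.8)),
]

mid = transfer_function(7.5, points)    # halfway between blue and white
hot = transfer_function(40.0, points)   # clamped to the last control point
print(mid, hot)
```

Moving the control points, or raising the opacity only in a narrow value range, is how you would try to isolate a structure such as the conveyor belt from the surrounding water.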
summary of algorithmic performance (together with Martin Polascheck)
The lecture on Algorithms and Data Structures includes an assignment in which each student has to implement a data structure for sorting. At the end of the course, it is possible to compare one's own results with everybody else's. However, a large set of graphs is produced and one loses the overview very quickly. The task is to create an interactive program that provides a better overview of all the data, as well as good interaction techniques to quickly drill down into the details a user would like to see. (download data, 365KB)
aiding in the modelling of engines (together with Harald Piringer, VRVis)
The data set consists of 1000 different simulations of an engine. Different design parameters (such as "EVO_shift", "IVO_shift", "ROI_shift", "Vane_p") have been changed for each simulation run. There are further parameters, such as "EnSpeed" and "Load_sig", that can take any value while the engine is running. The engineers are interested in minimizing some output parameters (e.g. "TRAPPED-FUEL_MB", "COMBUSTION-NOISE_C of ENGINE 1") and maximizing others (e.g. "TORQUE_PF"). The simple question is: what is the best parameter setting? Since there is no single parameter setting that fulfills all the constraints, can you build a system that lets the engineer better understand the tradeoffs? (download data csv, 106KB and txt, 1.9KB)
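One possible starting point for exploring such tradeoffs is to filter the runs down to the Pareto front, i.e. the runs that no other run beats on every objective at once. The sketch below is an assumption about how you might approach this, not a prescribed method: the four runs and their values are invented, and only the column names "TRAPPED-FUEL_MB" and "TORQUE_PF" come from the description above.

```python
def dominates(a, b, minimize, maximize):
    """True if run a is at least as good as run b on every objective
    and strictly better on at least one."""
    no_worse = (all(a[k] <= b[k] for k in minimize)
                and all(a[k] >= b[k] for k in maximize))
    better = (any(a[k] < b[k] for k in minimize)
              or any(a[k] > b[k] for k in maximize))
    return no_worse and better

def pareto_front(runs, minimize, maximize):
    """Keep only the runs that are not dominated by any other run."""
    return [r for r in runs
            if not any(dominates(o, r, minimize, maximize) for o in runs)]

# Invented example values, for illustration only.
runs = [
    {"id": 1, "TRAPPED-FUEL_MB": 0.30, "TORQUE_PF": 100.0},
    {"id": 2, "TRAPPED-FUEL_MB": 0.20, "TORQUE_PF": 90.0},
    {"id": 3, "TRAPPED-FUEL_MB": 0.35, "TORQUE_PF": 95.0},  # worse than run 1
    {"id": 4, "TRAPPED-FUEL_MB": 0.20, "TORQUE_PF": 80.0},  # worse than run 2
]

front = pareto_front(runs, minimize=["TRAPPED-FUEL_MB"], maximize=["TORQUE_PF"])
front_ids = [r["id"] for r in front]
print(front_ids)
```

The pairwise comparison is quadratic in the number of runs, which is unproblematic for 1000 simulations. The remaining runs are the ones among which the engineer actually has to trade one objective against another, so they are natural candidates to highlight in a visualization.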
Temperature Data
Would you like to play with the temperature data yourself? This is no problem. I provide it to you as an xls file. What's the best way to convince the Rektor that we need an AC unit in our building? How can you create a proper overview? Do you find questionable data points? Are there differences between the floors? What hypotheses can you create / verify / invalidate with this data?! Help! (download data, 8.9MB and Building's Floor Plan, 300KB)
Open Data
There has been a deluge of open data from various governments and governmental organizations over the last few years. While this is admirable, what good is all this data if the common citizen is not able to understand, explore, or learn from it? Hence, the goal is to develop an (ideally web-based) tool that helps people explore such data. One of the challenges will be to gear this tool toward a broad set of people, so you cannot assume great visual literacy (a problem the New York Times has been struggling with, and which perhaps provides some ideas). Further, it is unrealistic to provide a universal tool with which all types of data can be explored and all questions can be answered. Hence, it'll be important to narrow your focus to a specific aspect of civic life. There are quite a number of open data sources that you can choose from:
IEEE Vis / BioVis contests
There is a visualization contest running right now on the visualization of climate data. Further, BioVis is running a Data Contest as well as a Redesign Contest right now. Don't be afraid! Check them out and see what you can do. It'll be fun, and might get you a trip to Paris in November to attend the Visualization conference. I do believe these are doable for you!
... there may be others :)
Open Data
Agriculture, Food and Nutrition
- World wine statistics - Information on worldwide wine production and consumption.
- USDA PLANTS Database - The PLANTS Database provides standardized information about the vascular plants, mosses, liverworts, hornworts, and lichens of the U.S. and its territories. It includes names, plant symbols, checklists, distributional data, species abstracts, characteristics, images, plant links, references, crop information, and automated tools.
Demographics
- Frequently occurring first and last names - U.S. Census Bureau genealogical data on names.
- Popular baby names - Social Security Administration data on distributions of given names.
- Human Mortality Database - The Human Mortality Database (HMD) was created to provide detailed mortality and population data to researchers, students, journalists, policy analysts, and others interested in the history of human longevity.
National Surveys of 8th Graders
A nationally representative sample of eighth-graders was first surveyed in the spring of 1988. A sample of these respondents was then resurveyed through four follow-ups in 1990, 1992, 1994, and 2000. On the questionnaire, students reported on a range of topics including: school, work, and home experiences; educational resources and support; the role in education of their parents and peers; neighborhood characteristics; educational and occupational aspirations; and other student perceptions.
The .xls file contains 2000 records of students' responses to a variety of questions at different points in time. The codebook explains the question and answer codes.
Other
- Baseball Statistics - The Lahman baseball database, 1871-present.
- Google Trends - Track the average worldwide traffic of any search term. Once you get the results, scroll to the bottom of the page and look for "Export this page as a CSV file". You must be logged into Google for the feature to work.
Politics and Government
Florida 2000 Ballot Data
This data set is Florida election data from the CMU Statistical Data Repository. (Note: when downloading these files, be sure to use the correct "save-file" operation for your browser ... IE tends to add extra characters that confuse the programs.)
U.S. House of Representatives Roll Call Data
This contains roll call data from the 108th House of Representatives: data about 1218 bills introduced in the House and how each of its 439 members voted on them. The data covers the years 2003 and 2004. The individual columns are a mix of information about the bills and about the legislators, so there's quite a bit of redundancy in the file for the sake of easier processing in Tableau.
Government Spending Data
Have you ever wanted to find more information on government spending? Have you ever wondered where federal contracting dollars and grant awards go? Or perhaps you would just like to know, as a citizen, what the government is really doing with your money.
Visualization Contest
For a number of years, the Vis, InfoVis, and VAST conferences have run
a visualization contest. For each contest, a problem scenario together with
the relevant data sets has been provided to the research community and a
prize has been awarded to the best visualization. Some of the problems have
been quite challenging. However, for the most part, these are great problems
to work on. Have a look: