Due March 3 | CS 7450 - Information Visualization | Spring 2011
This assignment will familiarize you with a number of systems that have been built for analyzing multivariate data sets. You will be working with InfoZoom, Spotfire, Tableau, and QlikView.
The goals of the assignment are for you to learn the capabilities provided by these types of systems, learn the visualization methods that they provide, and assess their utility in analyzing information repositories. You will work with some provided data sets in the assignment. Think about the kinds of questions that an analyst would be asking about the data sets. IMPORTANT: For the assignment, you only need to work with three of the four commercial systems. The choice of which three is up to you.
The assignment has four parts:
1. Gain familiarity with the systems
Familiarize yourself
with the visualization techniques and the user interfaces of the
different systems. Each one has a tutorial that you should try out
with a sample data set. Work your way through the tutorial and become
familiar with the system, its interface and its capabilities.
2. Examine the sample data sets
Each tool includes a few sample data sets, but often it's best to
learn with something new. Five data sets are supplied on the Resources
page of t-square for you
to consider: foods' nutritional data (5976 items, 32 vars.), stocks
(500 items, 30 vars.), baseball statistics (322 items, 24 vars.),
college information (51
items, 22 vars.) and professors' salaries (1160 items, 16 vars.).
You must work with the food nutrition data set and you are free to
pick the one other set that is most interesting to you.
Briefly scan the text of the files and familiarize
yourself with the variables. Generate and write down (you will need
to turn them in) a few hypotheses to be
considered, tasks to be performed, or questions to be asked about the
data elements. Think about all the different kinds of analysis tasks
that a person might want to perform in working with data sets such as
these. For instance, someone working with a data set about breakfast
cereals might have tasks like:
3. Load the data sets into the systems and examine them
Load the
nutrition data set and the other data set that you selected into each of the three
visualization tools you chose, then consider your
hypotheses, tasks, and questions. Also use the systems to explore the
data sets and see if you can discover other interesting or unexpected
findings in the data sets. Put yourself in the shoes of a data
analyst, and consider questions that such a person would confront.
4. Write a report on your findings
Write up a summary of your exploration process, findings, and
impressions of the systems. Include your
hypotheses/tasks/questions and
what you found. Furthermore, critique the different tools in a general
sense. (Feel free to include screenshots to help explain your
analyses and critiques.) What are the systems' strengths and
weaknesses? How do their visualization capabilities differ? For what
kinds of user tasks is each tool suited? Focus more here on the
visualization techniques as opposed to the particular user interface
quirks, though you should feel free to comment on UI aspects when they
are particularly good or bad. Additionally, for each tool, list one
unexpected finding, insight, or discovery made while exploring one of
the datasets with that tool.
Explain how the system helped to facilitate the finding.
We recommend that you not walk through each question/task one by one for each of the three systems you used. (There simply won't be space to do so.) You might want to include specific examples of how the systems assisted or did not assist work on specific tasks, however. Point out interesting, insightful observations; you don't need to tell us how a system works -- we already know that. Think of this like a report to your manager who wants to know what each system can provide, its pros and cons. Focus specifically on how its visualizations help or hinder analysis. How do the systems compare?
Your document is limited to a maximum of 8 pages, single-spaced, reasonable font size, including embedded screenshots. Please bring two hardcopies to class on the day that it is due.
Acknowledgments: Special thanks go out to Chris Ahlberg of Spotfire, Jock Mackinlay and Chris Stolte of Tableau, and Michael Spenke and Christian Beilken of the Fraunhofer Institute for Applied Information Technology for all their help in the past working with the systems.