CSCE Capstone

Student Site for Individual and Collaborative Activities

Team 15 – Data Visualization

Team Members:

Michael Fahr

Rafael Del Carmen

Forrest Tennant

Bryce Mendenhall

Pao Yang

Project Summary:

Large amounts of data are generated every day and fall into different categories. At Sorcero, the data may describe the corpus itself or user interactions with the corpora. With so much data, it is hard to navigate and find the information that is needed. For example, this information could help uncover insights into key performance indicators (KPIs). The objective is to analyze large volumes of data and provide a meaningful visual context that supports navigation and offers insight into key performance indicators.

The approach is to design and implement a program that collects the data, develops relations in the data using a set of algorithms, and outputs a straightforward visual context based on the information provided. The data visualization resulting from this approach is important because it will assist the user in processing and understanding the data shown. By being able to navigate through the data in an understandable way, it becomes easier to detect trends or patterns. It also communicates the data quickly and effectively to other people who may not be familiar with the information.
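The collect, relate, and visualize steps described above can be illustrated with a minimal sketch. It is not the team's implementation: the stop-word list, the co-occurrence counting used to relate terms, the text bar chart, and all function names are assumptions chosen only to make the idea concrete.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List

# Hypothetical stop-word list; a real filter would be much larger.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for"}


def tokenize(text: str) -> List[str]:
    """Split on whitespace, strip punctuation, lowercase, drop stop words."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def cooccurrence(docs: Iterable[str]) -> Counter:
    """Count how often each pair of terms appears in the same document."""
    pairs: Counter = Counter()
    # Documents are consumed one at a time, so a generator can feed this
    # function without loading the whole data set into memory.
    for doc in docs:
        terms = sorted(set(tokenize(doc)))
        pairs.update(combinations(terms, 2))
    return pairs


def text_bar_chart(pairs: Counter, top: int = 5) -> str:
    """Render the most frequent term pairs as a plain-text bar chart."""
    lines = []
    for (a, b), n in pairs.most_common(top):
        lines.append(f"{a} + {b}: {'#' * n} ({n})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample records standing in for corpus or interaction data.
    sample = [
        "The user searched the corpus.",
        "A user opened the corpus",
        "User feedback",
    ]
    print(text_bar_chart(cooccurrence(sample)))
```

In this sketch the relation between terms is simply how often they share a document. A real pipeline would likely swap in a stronger grouping method and a graphical chart, but the three stages keep the same shape.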

Schedule of Tasks
Task # | Task | Assigned Members | Date | Status
1 | Research and understand the background of data visualization and natural language processing. | All | 1/13 – 1/20 | Completed
2 | Research other modern implementations to get an idea of other approaches. This includes exploring alternatives for processing large amounts of data and determining the advantages and disadvantages of each. | All | 1/21 – 1/27 | Completed
3 | Finalize the architecture design and the implementation language. | All | 1/28 – 2/10 | Completed
4 | Develop the code to ingest large volumes of data, choosing a data storage method that does not run out of memory. In addition, determine the best way to process large volumes of data without requiring expensive servers with high processing power and large memory. | Michael Fahr, Rafael Del Carmen, Bryce Mendenhall | 2/11 – 2/24 | Completed
5 | Determine the grouping/sorting method that will be used in the algorithm. Create an algorithm that sorts/groups the data into related fields based on the chosen method. In addition, incorporate a filter to reduce the amount of natural language that has to be processed. | Michael Fahr, Rafael Del Carmen, Bryce Mendenhall | 2/25 – 3/9 | Completed
6 | Use the data to provide a meaningful visual context, trying different parameters to find which data matters most for a meaningful visual representation. | Pao Yang, Forrest Tennant | 3/10 – 3/23 | Completed
7 | Finalize the program by testing the application on multiple large data sets. | All | 3/24 – 4/6 | Completed
8 | Document the final results. | All | 4/7 – 4/21 | Completed

Final Proposal: Final Proposal

Poster: Poster

Proposal Slides: Capstone Proposal Presentation

Final Report: Final Report

Final Presentation Slides: Final Capstone Presentation

Design Document: Design Document

GitHub Project Link: