PhyloGeoViz: plan


Monday, August 20, 2007

Phyloinformatics Summer of Code Meetup at NESCent

At the end of last week the NESCent SoCoders all met up in Durham, NC for a wrap-up conference. It was really neat to see what others had been doing all summer long. The diversity of projects was to be expected, though there were some common threads among them.

One message that was hammered home to me was the need for file standards in evolutionary biology. The time spent massaging data into different formats just to interface with interesting programs is often enough to keep people from using those programs at all. I was struck by all the development going on with NEXUS in particular (e.g. Nexplorer, NEXML, and libraries for NEXUS), and I feel like I need to jump on this bandwagon too! I was quick to realize that I'm part of the problem, in that PhyloGeoViz currently doesn't accept NEXUS (or anything else, for that matter, besides my own special format). Well, being able to work with NEXUS files is now at the top of my list of priorities.

Another common theme of the meeting was the visualization of tree data. Multiple projects were coming at this problem using different languages and in different contexts.

A great meeting! I had a lot of fun hearing other students' and mentors' experiences. Hope to run into you again soon!

Monday, July 2, 2007

Weekly plan

This week is a bit frantic because I leave for the Botany conference on Saturday. I've been making a big push to have PhyloGeoViz ready for an alpha release and debut there. Here's my to do list for this week:

Transfer homepage style to the other app pages so the look is consistent.
Figure out export.php bug on server so that it will run!
Write function that will output the legend as overlay data in the exported kml.
Write function that can change the boundaries of the bins for the different pie sizes.
Write feature so that user can choose whether all the pies are the same size or have relative sizes.
Write a color picker function for changing the haplotype colors.
Import and manipulate my own data to create slide for talk!!!

Monday, June 25, 2007

Weekly update

This past week focused on development of a basic data input page. I decided to use form variables (at least to start with) as the way to pass information between pages. I had to learn how to write the html to allow that plus some php to parse the passed information. I also had to figure out how to pass the variables between php and javascript to get the input data and viewer scripts I'd written to get along. Anyways, as a result I have a rudimentary input page tied to my viewer now. I, of course, still need to clean it up, add error handling, different input types, etc. But it's a start!

On the administrative side of things, I added my project blog to the Planet SoC site. Basically it's a neat aggregation of a bunch of blogs concerning many different Google Summer of Code projects. I also updated the wiki with my design doc.

To do for this week:
Get basic output page written and in synchrony with the other pages.
If time, add some functionality to the viewer (like changing colors).

Monday, June 18, 2007

updated project wiki to include design doc

Finally got around to putting the design doc up on the wiki. I didn't get a chance to incorporate the haplotype grouping feature into the design, but I imagine it would mean adding a table of haplotypes x group identifier on the manual data input and the data management pages. Also, I'd like to incorporate Norm's suggestions of a grayscale feature and an export as .ai feature eventually too.

Sunday, June 10, 2007

Detailed design document, draft 1

If the user selects 'input manually',

The default numbers of haplotypes and populations are set to 10. Users can update these values to get the appropriate number of rows and columns in the data matrix. Unless the data matrix is small, the user will likely have to scroll within the table to input all the data. To facilitate this, the population, lat, and long columns and the header row will be frozen. After the data is saved, the application checks the input. If the data is validated, we go to

We arrive at this page once the data have been validated (whether input by hand or uploaded). The purpose of this page is to allow the user to include/exclude populations and/or haplotypes. By default all populations and haplotypes are included. After any edits the user is taken to

This window previews and allows the user to edit the visualization. There are 3 possible visualizations: 1) Show just the sampling localities. On this option, the map options 'haplotype color', 'pie size (absolute)', and 'pie size (relative)' are grayed out. 2) Display with circles relative to the sample size for each locality. In this case, the map option 'marker appearance' is grayed out. 3) Display full haplotype information for each locality. In this case, the map option 'marker appearance' is grayed out.
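The graying-out rules above can be written down as a small lookup table. This is only a sketch: the view names (`localities`, `circles`, `pies`) and the helper `isOptionEnabled` are made up for illustration and are not taken from the PhyloGeoViz code.

```javascript
// Hypothetical view names mapped to the map options that are grayed out,
// copied from the three visualizations described above.
const DISABLED_OPTIONS = {
  localities: ["haplotype color", "pie size (absolute)", "pie size (relative)"],
  circles: ["marker appearance"],
  pies: ["marker appearance"],
};

// Returns true if the given map option should be editable in the given view.
function isOptionEnabled(view, option) {
  const disabled = DISABLED_OPTIONS[view];
  if (!disabled) {
    throw new Error(`unknown view: ${view}`);
  }
  return !disabled.includes(option);
}
```

Keeping the rules in one table would mean the preview page only has to look up the current view when deciding which controls to gray out.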

When editing any of the map options a panel will pop out with options for that task. Map option definitions:

marker appearance: select what icons you want to identify each population
haplotype color: select colors for each haplotype
pie size (absolute): set the max diameter for each circle or pie
pie size (relative): choose if pies are all the same size or relative to the sample size; allows the user to set the bounds on the sample size bins
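To make the two pie size options concrete, here is a minimal sketch of how a sample size could be turned into a pie diameter. The function name `pieDiameter`, its parameters, and the rule that bins get evenly spaced diameters up to the maximum are all assumptions for illustration; the design doc does not specify them.

```javascript
// Hypothetical helper: map a locality's sample size to a pie diameter.
// binUpperBounds: ascending upper bounds of the sample-size bins, e.g. [5, 10, 20];
//                 sample sizes above the last bound fall into a final open-ended bin.
// maxDiameter:    the 'pie size (absolute)' setting.
// relative:       false = all pies the same size, true = size depends on the bin.
function pieDiameter(sampleSize, binUpperBounds, maxDiameter, relative) {
  if (!relative) {
    return maxDiameter;
  }
  const binCount = binUpperBounds.length + 1;
  let bin = binUpperBounds.findIndex((bound) => sampleSize <= bound);
  if (bin === -1) {
    bin = binCount - 1; // larger than every bound: use the largest bin
  }
  // Assumed rule: bin 0 gets the smallest diameter, the last bin gets maxDiameter.
  return (maxDiameter * (bin + 1)) / binCount;
}
```

For example, with bounds `[5, 10, 20]` and a maximum diameter of 40, a locality with 3 samples would get a diameter of 10 and one with 50 samples would get 40.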

The map itself is fully functional. Users can zoom in, pan, and access satellite imagery as they can in other Google Maps applications. Furthermore, users can click and drag any of the markers, circles, or pies to reposition them. This should be very useful, especially for avoiding overlapping pies. If there's time, I'd like to add a button here for 'auto-fix overlapping pies', where the application detects collisions and repositions the pies for the user.
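One possible approach to the 'auto-fix overlapping pies' idea is sketched below. It assumes each pie is a circle in pixel coordinates with fields `id`, `x`, `y`, and `radius` (a made-up layout), treats two pies as colliding when their centers are closer than the sum of their radii, and nudges colliding pairs apart. The function names and the half-pixel padding are assumptions, not part of the design.

```javascript
// Return the id pairs of pies whose circles overlap.
function findOverlaps(pies) {
  const overlaps = [];
  for (let i = 0; i < pies.length; i++) {
    for (let j = i + 1; j < pies.length; j++) {
      const distance = Math.hypot(pies[j].x - pies[i].x, pies[j].y - pies[i].y);
      if (distance < pies[i].radius + pies[j].radius) {
        overlaps.push([pies[i].id, pies[j].id]);
      }
    }
  }
  return overlaps;
}

// Push overlapping pairs apart along the line joining their centers.
// Works on a copy and stops after maxPasses passes or once nothing moves.
function separatePies(pies, maxPasses = 50) {
  const moved = pies.map((pie) => ({ ...pie }));
  for (let pass = 0; pass < maxPasses; pass++) {
    let changed = false;
    for (let i = 0; i < moved.length; i++) {
      for (let j = i + 1; j < moved.length; j++) {
        const a = moved[i];
        const b = moved[j];
        const dx = b.x - a.x;
        const dy = b.y - a.y;
        const actual = Math.hypot(dx, dy);
        const minDistance = a.radius + b.radius;
        if (actual >= minDistance) continue;
        // Coincident centers have no direction, so pick the x axis arbitrarily.
        const ux = actual === 0 ? 1 : dx / actual;
        const uy = actual === 0 ? 0 : dy / actual;
        // Each pie moves half the overlap, plus half a pixel of padding.
        const shift = (minDistance - actual) / 2 + 0.5;
        a.x -= ux * shift;
        a.y -= uy * shift;
        b.x += ux * shift;
        b.y += uy * shift;
        changed = true;
      }
    }
    if (!changed) break;
  }
  return moved;
}
```

With many crowded pies this simple nudging may not settle within the pass limit, and it can move a pie far from its sampling locality, so the user would still want to be able to drag pies by hand afterwards.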

Below the preview screen is the legend. It shows the current color of each haplotype as well as the relative circle sizes and their corresponding sample sizes.

I know on the mock-up the preview screen is fairly small. However, the page will scale with the window, so it should be big enough for most folks. I will also consider moving the 'map options' below the map, so the map can be bigger. I was thinking, though, that the user might find it annoying to constantly scroll up and down to see the effects of the 'map options'.

Now to export the finalized map and legend.

From this page the user can export the visualization in four ways. It's important for the user to do the repositioning, coloring, and other editing work here before exporting to Google Earth. It's not possible to click and drag polygons in Google Earth as it is in Google Maps. If the user selects the .jpg option, they will be prompted to choose either saving the map and legend together or separately. The other formats handle the map and legend separately anyways.

Error handling: If the application has trouble reading the input file, then the validation will fail and generate this error page.

The user is then directed back to either uploading or manually inputting their data.

Overall flow chart for the application

In words:
The user starts by inputting their data. There are three options for data input. After the data has been input, we validate that the data is appropriate and interpretable. If not, we send an error message to the user, and ask them to resubmit their data. If the data input is successful, we display the data back to the user and allow the user to include/exclude any populations and/or haplotypes. Following data management, the user is taken to the preview visualization page. Here a google map is displayed showing a visualization of the data. There are various map and view options here that update the page. The user can also return to the data management screen and edit previous choices. Finally, the visualization can be exported in four formats.
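The validation step in this flow could look something like the sketch below. The row shape (`population`, `lat`, `long`, `counts`) mirrors the columns of the manual input table, but the exact field names, the function `validateRows`, and the specific checks are assumptions for illustration only.

```javascript
// Hypothetical validator for parsed input rows.
// rows:           [{ population, lat, long, counts: [n1, n2, ...] }]
// haplotypeCount: number of haplotype columns expected in every row
// Returns { valid, errors } so the caller can either continue to data
// management or show the error page and ask the user to resubmit.
function validateRows(rows, haplotypeCount) {
  const errors = [];
  rows.forEach((row, index) => {
    const label = `row ${index + 1}`;
    if (typeof row.population !== "string" || row.population.trim() === "") {
      errors.push(`${label}: missing population name`);
    }
    if (!Number.isFinite(row.lat) || row.lat < -90 || row.lat > 90) {
      errors.push(`${label}: latitude must be between -90 and 90`);
    }
    if (!Number.isFinite(row.long) || row.long < -180 || row.long > 180) {
      errors.push(`${label}: longitude must be between -180 and 180`);
    }
    if (!Array.isArray(row.counts) || row.counts.length !== haplotypeCount) {
      errors.push(`${label}: expected ${haplotypeCount} haplotype counts`);
    } else if (!row.counts.every((n) => Number.isInteger(n) && n >= 0)) {
      errors.push(`${label}: haplotype counts must be non-negative integers`);
    }
  });
  return { valid: errors.length === 0, errors };
}
```

Collecting every problem rather than stopping at the first one would let the error page list everything the user needs to fix before resubmitting.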

Tuesday, May 15, 2007

Detailed project plan

Here's my preliminary weekly project plan. Dates refer to the beginning of that work week. Any comments are appreciated!

Now til Start:

Phase 0: Getting development environment set up
- Set up homepage, wiki, repositories, etc.

May 28

Phase 1: Exploratory phase
- Learn how to embed maps on a webpage.
- Learn the relationship between Google Earth and Google Maps.
- How are they the same, how are they different? What can you do with one that you can't do with the other?
- Explore KML and general XML.
- Explore Google Earth and Google Maps APIs.
- Create a pie chart using KML.

June 4

Finish exploratory work if necessary.
Phase 2: Finalize design
- Page by page description of what the user sees.
- How to input data. Are they going to upload files, input in a text box? What format?
- How to export data. Format? Data persistence? Can users store data, results, maps, etc.?
- What is the viewer? Google earth? Google maps?
- How large are the pie charts going to be in comparison with the geography? How do we deal with the problem of overlapping pie charts?
- How are we going to color the pie charts? What if there are large numbers of haplotypes, how do we color them all distinctly and usefully?
- The results of these decisions will be a comprehensive design document posted on the wiki.

June 11

Implement the basic pie chart generation functionality. Functionality should include:
- Basic import of data.
- Basic KML output writer.
- Function that draws a pie chart.
- Function that plots objects (working up to pie charts, but starting with placemarks) on a map.
Write corresponding documentation.
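As a rough illustration of the pie chart and KML writer items above, here is a sketch that draws each haplotype as a wedge-shaped KML polygon around a sampling locality. The function names, the `slices` shape, and the choice to give the radius in degrees of latitude are assumptions; only the KML elements themselves (`Placemark`, `Style`, `PolyStyle`, `Polygon`, `outerBoundaryIs`, `LinearRing`, `coordinates`) come from the KML format, which lists coordinates as longitude,latitude,altitude and colors as aabbggrr hex.

```javascript
// Escape text for use inside an XML element.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the closed ring of "lon,lat,0" points for one wedge.
// Angles are in degrees, measured clockwise from north.
function wedgeCoordinates(lat, lon, radiusDeg, startDeg, endDeg, stepDeg = 10) {
  // Stretch longitudes so the pie looks round away from the equator (approximation).
  const lonScale = 1 / Math.cos((lat * Math.PI) / 180);
  const arcPoint = (angleDeg) => {
    const a = (angleDeg * Math.PI) / 180;
    return [lon + radiusDeg * lonScale * Math.sin(a), lat + radiusDeg * Math.cos(a)];
  };
  const points = [[lon, lat]];
  for (let angle = startDeg; angle < endDeg; angle += stepDeg) {
    points.push(arcPoint(angle));
  }
  points.push(arcPoint(endDeg));
  points.push([lon, lat]); // close the ring at the center
  return points.map(([x, y]) => `${x.toFixed(6)},${y.toFixed(6)},0`).join(" ");
}

// slices: [{ count, color }] with color as a KML aabbggrr hex string (assumed shape).
function pieToKml(name, lat, lon, radiusDeg, slices) {
  const total = slices.reduce((sum, slice) => sum + slice.count, 0);
  let start = 0;
  return slices
    .filter((slice) => slice.count > 0)
    .map((slice, index) => {
      const end = start + (360 * slice.count) / total;
      const coords = wedgeCoordinates(lat, lon, radiusDeg, start, end);
      start = end;
      return [
        "<Placemark>",
        `  <name>${escapeXml(name)} haplotype ${index + 1}</name>`,
        `  <Style><PolyStyle><color>${slice.color}</color></PolyStyle></Style>`,
        "  <Polygon><outerBoundaryIs><LinearRing>",
        `    <coordinates>${coords}</coordinates>`,
        "  </LinearRing></outerBoundaryIs></Polygon>",
        "</Placemark>",
      ].join("\n");
    })
    .join("\n");
}
```

The returned placemarks would still need to be wrapped in the usual `<kml>` and `<Document>` elements before Google Earth could open the file.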

June 18

Combine pie chart generation and chart plotting functionality.
Write the functions that allow adjustments to the output (e.g. changing pie sizes, allowing the user to change haplotype colors, etc).
Write corresponding documentation.

June 18

Work on a function that allows the user to move pie charts around spatially (and to save those movements).
Write corresponding documentation.

June 25

Write the functions that display the KML back to the browser.
Write corresponding documentation.

July 2

Prepare for Botany conference.
Get code submitted to Google for midterm code check in.
Get a prototype of the viewer available for download.
Make sure I'm meeting the midterm evaluation criteria.

July 9

Work on bugs that arose from earlier code.
Revisit full data manager design. Finalize UI design.

July 16

Implement UI for the data import functionality.
Expand (?) data files that are acceptable (haplotype, genotype, etc.)
Write the corresponding documentation.

July 23

Implement UI for customizing data analysis.
- Example: the user should be able to select what loci/alleles/populations to include/exclude in the analyses.

July 30

Implement UI for output data manipulation.
- Example: the user should be able to change the relative pie sizes, move pies around, change haplotype colors, etc.

August 6

Implement functions that allow the user to save the map visualization (e.g. jpg) or to save the KML file.

August 13

Perform user tests.
Ensure that the viewer and the data manager are well integrated.
Deposit code with Google.
Update website with new product, and all documentation.

August 20

Done coding. Final evaluations.

Sunday, May 13, 2007

Goals for this week

~~Complete detailed project plan, submit it to David, post on wiki.~~
~~Focus on how to generate these KML files:~~
~~Install Apache and PHP on laptop.~~
~~Investigate PHP.~~
Investigate java as an alternative.
~~Set up code repository on google.~~

Friday, April 27, 2007

Minutes from meeting 1 with David, Hilmar, and Xianhua

Yesterday I had my first meeting with my mentor, David, and others from NESCent (Hilmar and Xianhua). See the "meeting1 agenda.txt" for agenda items.

We answered most of the (I) general IT support questions.

We'll be using some corner of EvoViz for documentation, ideas, pages, project plan, etc. David reassured me that he doesn't have any particular format or structure in mind for the page.
I should set up the code repository through google.
The development/testing server will be hosted by NESCent.
For now the project homepage can be the wiki. We can reevaluate later if there needs to be an 'official' page somewhere.
For mailing list help, wg-phyloinformatics sounds right.

A lot of the meeting centered on which language I should learn and use for the application. Hilmar said, why not C++!?! David is partial to java because he wants to learn it. There was a lot of confusion about java and javascript on my part (which Jack later cleared up). Hilmar mentioned that javascript is difficult because there is no debugger. Xianhua mentioned that PHP is not object-oriented like C++. Their suggestion was for me to consider the following in my decision: What do I want to get out of this summer? A better understanding of java? PHP? Let that guide my decision. What libraries are available in each language that would be useful? I think Xianhua mentioned GRASS functions or open source GIS functions in C, hmmm... Hilmar made it sound like they could make any of these languages work server side at NESCent.

We didn't discuss the project plan or overall timeline, or expectations too much. I told David multiple times that I am unable to devote full time to this project. I did mention keeping the project to something within 10-15hrs/week. He seemed ok with this. Hilmar said that we are evaluated based on our completion of milestones based on our project plan. I think that just means I need to make a realistic plan with the scope of 120-150 hours total for the project.

Things to do in the next two weeks:
Research languages.
Decide on a language.
Do phase 0.
Write up detailed (weekly) project plan.
Post things to the wiki.
Post things to NESCent's SoC wiki.
