Workshop Data transformation 2008 Nov

Data Transformation Workshop 2. Nov 2008

Present:

Ryan Brinkman - Terry Fox Laboratory, Vancouver

Melanie Courtot - Terry Fox Laboratory, Vancouver

Philippe Rocca-Serra - EBI

Monnie McGee - Southern Methodist University, Dallas

Tina Boussard, Dept Surgery, Stanford

Helen Parkinson, EBI

James Malone, EBI

Ricardo Pietrobon, Duke University

Ted Liefeld, Broad Institute

Robert Stevens, Manchester University


Action Items by Person https://wiki.cbil.upenn.edu/obiwiki/index.php/Data_transformation_workshop_action_items

Day 1

Day 2

Day 3

Day 4

Variables document

http://docs.google.com/Doc?id=dzprnmw_68cs654hfh&invite=cpqk62b

Variables notes page https://wiki.cbil.upenn.edu/obiwiki/index.php/Data_transformation_workshop2_variables


Action Items Day 1:

AI:Hammer BFO/IO for answers on where variables will be in the ontology. Add this to the agenda for Vancouver. MC

AI:Need to add the fuzzy clustering methods - http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html - RB added c-means term proposal to tracker and fuzzy clustering as an objective

AI:'center calculation' and 'averaging data transformation' defined classes need attention. MM thinks that these could be conflated. Moving average needs to be under both. Data imputation possibly incorrect - could be a class in itself. Need a new defined class for that - probably objective. - MM

AI:We will ask the metagenomic community for a use case. We will work on this when we have use cases for these. PRS

AI: Sequence analysis becomes an objective; anything underneath will have it as an objective. Objective is 'sequence feature identification' or 'prediction'. JM

AI:Find out what WAS classified under polynomial transformation (in some previous version) - need to have a property that is associated with it somehow. JM

AI:Send an email to RS about Network analysis objectives - ie what these are MC/RS

AI:Add a synonym to Network analysis - network topology analysis. - JM

AI:Need some way to relate multiple testing with the need for multiple test correction. RB has a use case. RB

AI:Add more children for specific methods of multiple testing procedure. - MM

AI:We need a new objective - 'Type 1 Error Rate correction' - JM
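Note: as a minimal illustration of what a data transformation with the proposed 'Type 1 Error Rate correction' objective might look like, here is a sketch of a Bonferroni adjustment. Bonferroni is only one example of a multiple test correction and was not named in the workshop; the function name `bonferroni_adjust` is hypothetical.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values (illustrative sketch only).

    Each p-value is multiplied by the number of tests and capped at 1.0.
    """
    number_of_tests = len(p_values)
    return [min(1.0, p * number_of_tests) for p in p_values]


if __name__ == "__main__":
    # Three hypothetical p-values from three tests.
    print(bonferroni_adjust([0.01, 0.04, 0.5]))
```

The input (several p-values from multiple testing) and the output (adjusted p-values) are what would tie 'multiple testing' to the correction step in the ontology.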

AI:JM will email Elisabetta offline to ask about MA transformation objective. JM

AI:MM will look to see whether other scaling adjustments are ever used instead of loess. MM

AI:Loess scale group transformation is a scaling adjustment. Change tree to reflect this. Just the scaling, not the transformation. A scaling objective will be added; loess scale group transformation is-a loess group transformation, followed by a scaling adjustment. This will be a defined class - any dt with objective scaling. JM

AI:Add feature selection - objective. In the machine learning sense, as in Wikipedia http://en.wikipedia.org/wiki/Feature_selection - RB. Random forests should be under feature selection. RB

AI:MM will ask a colleague what to do for 3D and 2D feature extraction and if they are different. MM

AI: Feature extraction suggested to be made an objective; also suggested in branch review. JM

AI:EH transformation - belongs as a sibling of B transformation - B/H/EH/S transformations should have objective normalization. JM

AI:Discriminant analysis should be a synonym of classification or class discovery MM

AI:Differential expression analysis becomes an objective - as any test can be used - this is the objective not a test JM

AI:Review branch to check for missing subsumption of terms for objectives - ALL on branch call

AI:Add CART random forest to OBI - MM will define; QT clustering - RB will define

AI:Need definition for Ward's method under agglomerative hierarchical clustering - MC

AI:Descriptive stats also becomes an objective - JM

AI:Def trimmed mean calc - has part trimming process, succeeded by mean calculation. Outlier removal to be added as an objective. - MM
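Note: the proposed definition (a trimming process succeeded by a mean calculation) can be sketched as two separate steps. This is a minimal sketch under assumptions: the helper names `trim` and `mean` and the `trim_count` parameter (number of values dropped from each end) are hypothetical and not from the workshop.

```python
def trim(values, trim_count):
    """Trimming process: drop the trim_count smallest and largest values."""
    if trim_count < 0:
        raise ValueError("trim_count must be non-negative")
    ordered = sorted(values)
    if 2 * trim_count >= len(ordered):
        raise ValueError("trimming would remove every value")
    return ordered[trim_count:len(ordered) - trim_count]


def mean(values):
    """Mean calculation: arithmetic mean of the remaining values."""
    return sum(values) / len(values)


def trimmed_mean(values, trim_count):
    """Trimmed mean calculation: trimming, succeeded by mean calculation."""
    return mean(trim(values, trim_count))


if __name__ == "__main__":
    # The extreme value 100 is removed before averaging.
    print(trimmed_mean([1, 2, 3, 4, 100], trim_count=1))
```

Keeping the two steps as separate functions mirrors the 'has part' structure of the definition: the trimming step alone is what would carry an outlier removal objective.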

AI:add variable into the DT or DENRIE hierarchy - see how we have done it, and then it will get done. - MC/JM

AI:create a sheet for variables as these determine what test you do - ALL workshop

AI:Address dimensionality reduction vs. data vector reduction - if these are the same thing - then we can have a single term for both. - MC will also add this to the email.

AI:Gating is not a dimensionality reduction and is therefore incorrectly placed under property based vector reduction - PBVR. Gating has objective data partitioning, goes under the root. PBVR - was this created so that gating will fall into the hierarchy? Needs checking with RS - MC. Will send an email to Richard to describe the issue and see if he is happy with the proposed changes in gating - MC

AI:Background correction - changes to an objective. RB

AI:Change 'data partitioning' to an objective - JM

AI:Find information for B transformation (existing term) - need an objective. - RB added note to tracker

AI:Send the variables docs (see link top of page) to IO, Plan, dt for discussion - MC. Will be added to the agenda for the dev call 18 Nov 2008 - MC