WSU & AFRL Window-on-Science Seminar on Data Mining

9:00AM – 5:00PM, Wednesday 05 August 2009

Room 116 Health Sciences Bldg, Wright State University Campus (see: www.wright.edu/aboutwsu/maps/)

The purpose of this seminar is to exchange ideas on data mining and establish communication for possible future collaboration. The Air Force Office of Scientific Research (AFOSR, http://www.afosr.af.mil), through its International Office and foreign detachments (i.e., EOARD and AOARD), acts as a technology broker linking prominent U.S. Air Force scientists and engineers with university and industry counterparts from foreign countries. The Window-on-Science (WOS) program supports AFOSR's policy of building partnerships of excellence and relevance by funding visits by distinguished foreign science and technology researchers to U.S. Air Force Research Laboratory sites and other research organizations. Visitors may also attend technical conferences to present research to USAF scientists. The program is designed primarily to meet the needs and requests of the Air Force Research Laboratory (AFRL, http://www.afrl.af.mil).


ABSTRACT: Automatically finding information structures in huge amounts of data is required in many application fields. Various techniques are bundled under the name 'data mining'. Data mining is part of the process of Knowledge Discovery in Databases (KDD), defined (according to Fayyad et al.) as follows: Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.


SCHEDULE:

09:00 AM - Welcome - Dr. Camberos (AFRL/RB)

09:15 - 10:15AM - Dr. Larry Lambe (MSSRC), Statistical Learning Theory for Engineering Problems [Presentation Download]

10:15 - 10:30AM - Break

10:30 - 11:30AM - WOS Guest: Dr. Grabmeier, University of Applied Sciences Deggendorf, Germany, Data Mining Techniques: Overview & Recent Developments

12:00 - 01:30PM - Lunch

01:30 - 02:00PM - Dr. Mateen Rizki (WSU), Mining and Visualization of Metabonomics Data Sets

02:00 - 02:45PM - Dr. Guozhu Dong (WSU), Overview of Data Mining Research Results and Applications of WSU's Data Mining Research Lab

02:45 - 03:00PM - Break

03:00 - 03:45PM - Dr. Amit Sheth (WSU), Semantics empowered Understanding, Analysis and Mining of Nontraditional and Unstructured Data

03:45 - 04:15PM - Dr. Ray Kolonay (AFRL/RB), Optimal Kriging Model Parameters for an Euler-Based Induced Drag Function [Presentation Download]

04:15 - 05:00PM - Open discussion

5:00PM - Adjourn

Abstracts

Data Mining Techniques: Overview & Recent Developments

Dr. Johannes Ludwig Grabmeier
University of Applied Sciences Deggendorf, Edlmairstr. 6+8, Deggendorf, D-94469, Germany
[Presentation Download]

Automatically finding information structures in huge amounts of data is required in many application fields. Various techniques are bundled under the name “data mining”. Data mining is part of the process of Knowledge Discovery in Databases (KDD), defined (according to Fayyad et al.) as follows: Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. There are standard techniques that are used in data mining to fulfill these requirements, including: knowledge represented in rules (association rules and sequential rules); prediction of discrete values (classification); prediction of continuous values (regression); and clustering (segmentation). In addition to the standard techniques, new approaches will be defined and explained with examples. These will include: Link Analysis; Prediction of discrete variables, in particular decision trees; Regression with continuous variables; and Cluster Analysis.
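As an illustration of the first technique listed above (association rules), the following minimal Python sketch computes the two standard rule measures, support and confidence. It is not taken from the talk: the transaction data and the function names `support` and `confidence` are hypothetical and chosen only for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    s_antecedent = support(transactions, antecedent)
    if s_antecedent == 0:
        return 0.0  # the rule never fires, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / s_antecedent


# Hypothetical market-basket data, for illustration only.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(transactions, {"bread", "milk"})          # 2 of 4 baskets
rule_confidence = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 bread baskets
print(rule_support, rule_confidence)
```

A rule such as bread -> milk is typically reported only when both measures exceed user-chosen thresholds; the thresholds themselves depend on the application.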

Mining and Visualization of Metabonomics Data Sets

Dr. Mateen Rizki
[Presentation Download]
Finding interesting relationships in biological data sets presents a variety of challenging computational problems. In most cases, there is a large volume of raw sensor data available from a relatively small number of samples. This effectively limits the use of traditional statistical data analysis techniques while making techniques based on data mining, pattern recognition and machine learning viable alternatives. In this talk we will describe the challenges associated with mining metabonomics data for chemical metabolites that are responsive to specific toxins. We will describe a system that extracts salient features from raw data produced by a liquid chromatography / mass spectrometry (LC/MS) sensor, registers these features across multiple experiments, and presents this information in an interactive visualization environment that provides a multifaceted view of the experimental results.

Dr. Rizki is a professor in the Department of Computer Science and Engineering at Wright State University. His areas of interest include evolutionary computation, neural networks, data mining, pattern recognition and image processing. Dr. Rizki has published over 60 articles in a variety of areas and has served as principal investigator on numerous federal contracts and grants. Dr. Rizki has received the Wright State University College of Engineering and Computer Science's Award for Overall Faculty Excellence, Award for Excellence in Professional Service, and Award for Excellence in Teaching. Dr. Rizki is an associate editor for the journals IEEE Transactions on Evolutionary Computation, BioSystems and International Journal of Engineering Application. He has served on over 40 conference program committees and is a member of Tau Beta Pi, ACM, SPIE, IEEE, and Sigma Xi.

Overview of Data Mining Research Results and Applications of WSU's Data Mining Research Lab

Dr. Guozhu Dong
[Presentation Download]
Knowledge discovery in large amounts of structured data has continued to grow over the last decade. Such structured data can take various complex forms, including feature vectors, sequences, images, graphs, texts, etc. While such data can contain a wealth of useful knowledge, they can be challenging to mine because they have very high dimensions (involving thousands of variables/factors). Powerful tools that can identify patterns or models in such high-dimensional data are needed to help scientists pinpoint potential hypotheses.

In this talk, the speaker will give an overview of some recently developed data mining algorithms together with example applications. Example topics include: a) How to find multi-factor contrast patterns that distinguish one class of data from another? b) How to find distinguishing sequence patterns that contrast one sequence family against another family? c) How to build highly accurate classifiers using contrast patterns? d) How to use contrast patterns to identify outliers? e) How to perform multi-dimensional multi-level data analysis on high-dimensional data? f) How to study time series of microarray data? g) How to mine knowledge patterns/models to facilitate knowledge transfer between application problems?
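To make the notion of a contrast pattern in topic a) concrete, the sketch below scores a candidate pattern by the ratio of its support in a target class to its support in a contrasting class, often called the growth rate. This is a general illustration and is not claimed to be the algorithm presented in the talk; the data and the function names `support` and `growth_rate` are hypothetical.

```python
import math


def support(rows, pattern):
    """Fraction of rows (sets of items) that contain every item in `pattern`."""
    if not rows:
        return 0.0
    pattern = frozenset(pattern)
    return sum(pattern <= row for row in rows) / len(rows)


def growth_rate(target_rows, other_rows, pattern):
    """Support ratio of `pattern` in the target class versus the other class.

    Returns math.inf when the pattern occurs only in the target class and
    0.0 when it does not occur in the target class at all.
    """
    s_target = support(target_rows, pattern)
    s_other = support(other_rows, pattern)
    if s_other == 0:
        return math.inf if s_target > 0 else 0.0
    return s_target / s_other


# Hypothetical two-class data set: each row is the set of factors observed.
target_class = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
other_class = [{"a"}, {"c"}, {"a", "b"}, {"b"}]

rate_ab = growth_rate(target_class, other_class, {"a", "b"})  # 0.5 / 0.25
rate_bc = growth_rate(target_class, other_class, {"b", "c"})  # absent from other class
print(rate_ab, rate_bc)
```

A pattern with a large growth rate is frequent in one class and rare in the other, which is what makes it a candidate for distinguishing the two; enumerating candidate patterns efficiently in high-dimensional data is the hard part and is not covered by this sketch.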

Semantics empowered Understanding, Analysis and Mining of Nontraditional and Unstructured Data

Dr. Amit Sheth
[Presentation Download]
The Kno.e.sis Center performs research in understanding, extracting, and mining a broad variety of data, including structured (e.g., relational and/or transactional), semistructured (XML, Web), and unstructured (textual, experimental, sensor, social networking/user generated/Web 2.0) data. The unique aspect of our approach and capability is the use of semantic (Web) and knowledge base techniques/technologies, complementing or extending statistical, machine learning, and NLP techniques. The resulting semantic metadata extraction and annotation lead to next-generation (Web 3.0) semantic search, integration, and analysis techniques (e.g., supporting pattern extraction and reasoning), which in turn lead to deeper insights, knowledge discovery, and situational awareness applications. My talk will complement Prof. Dong’s presentation on data mining and will provide a broad overview of our work with biomedical, social media (e.g., Twitter), and sensor data.

Amit Sheth (http://knoesis.org/amit) is the LexisNexis Ohio Eminent Scholar for Advanced Data Management and Analysis. He directs the Kno.e.sis Center for Knowledge enabled Information & Services Science, a world leader in Semantic, Services, Sensor and Social computing over Web and mobile platforms (also called Semantic Web, Web 2.0, and Web 3.0). Kno.e.sis's extensive collaborations currently focus on human sciences and defense & intelligence (including sensor web, human performance enhancement). Prof. Sheth is an IEEE Fellow and has received well over $14 million in basic research funds (from NIH, NSF, AFRL, DoD, IBM, Microsoft, HP, etc.) and over $5 million in R&D/commercialization. Based on h-index, he is currently listed among the 25 most cited computer scientists worldwide (250+ publications, 15,500+ citations, h-index = 58). He has given more than 200 invited talks including 35 keynotes, is the EIC of the Intl. Journal of Semantic Web & Information Systems, is joint-EIC of Distributed & Parallel Databases, and serves on several editorial boards. By licensing his funded university research, he has also founded and managed two successful companies.