The BD2K Guide to the Fundamentals of Data Science Series
Every Friday beginning September 9, 2016
12pm - 1pm Eastern Time / 9am - 10am Pacific Time
Working jointly with the BD2K Centers-Coordination Center (BD2KCCC) and the NIH Office of Data Science, the BD2K Training Coordinating Center (TCC) is spearheading this virtual lecture series on the data science underlying modern biomedical research. Beginning in September 2016, the series will consist of weekly webinar presentations covering the basics of data management, representation, computation, statistical inference, data modeling, and other topics relevant to “big data” biomedicine. The series will provide essential training suitable for individuals at all levels of the biomedical community. All presentations will be streamed live, recorded, and posted online for future viewing and reference. The videos will also be indexed in TCC’s Educational Resource Discovery Index (ERuDIte) and shared/mirrored with the BD2KCCC and other BD2K resources.
View all archived videos on our YouTube channel:
https://www.youtube.com/channel/UCKIDQOa0JcUd3K9C1TS7FLQ
Please join our weekly meetings from your computer, tablet or smartphone.
https://global.gotomeeting.com/join/786506213
You can also dial in using your phone.
United States +1 (872) 240-3311
Access Code: 786-506-213
First GoToMeeting? Try a test session: http://help.citrix.com/getready
SCHEDULE
9/9/16: Introduction to big data and the data lifecycle (Mark Musen, Stanford).
9/16/16: SECTION 1: DATA MANAGEMENT OVERVIEW (Bill Hersh, Oregon Health Sciences).
9/23/16: Finding and accessing datasets, Indexing and Identifiers (Lucila Ohno-Machado, UCSD).
9/30/16: Data curation and Version control (Pascale Gaudet, Swiss Institute of Bioinformatics).
10/7/16: Ontologies (Michel Dumontier, Stanford).
10/14/16: Provenance (Zachary Ives, Penn).
10/21/16: Metadata standards (Susanna-Assunta Sansone, Oxford).
10/28/16: SECTION 2: DATA REPRESENTATION OVERVIEW (Anita Bandrowski, UCSD).
11/4/16: Databases and data warehouses, Data: structures, types, integrations (Chaitan Baru, NSF).
11/11/16: No lecture - Veterans Day.
11/18/16: Social networking data (TBD).
12/2/16: Data wrangling, normalization, preprocessing (Joseph Picone, Temple).
12/9/16: Exploratory Data Analysis (Brian Caffo, Johns Hopkins).
12/16/16: Natural Language Processing (Noemie Elhadad, Columbia).
The following topics will be covered in January through May of 2017:
SECTION 3: COMPUTING OVERVIEW
Workflows/pipelines
Programming and software engineering; API; optimization
Cloud, Parallel, Distributed Computing, and HPC
Commons: lessons learned, current state
SECTION 4: DATA MODELING AND INFERENCE OVERVIEW
Smoothing, Unsupervised Learning/Clustering/Density Estimation
Supervised Learning/prediction/ML, dimensionality reduction
Algorithms, incl. Optimization
Multiple testing, False Discovery rate
Data issues: Bias, Confounding, and Missing data
Causal inference
Data Visualization tools and communication
Modeling Synthesis
SECTION 5: ADDITIONAL TOPICS
Open science
Data sharing (including social obstacles)
Ethical Issues
Extra considerations/limitations for clinical data
Reproducible Research
SUMMARY and NIH context