Site maintained by David Wild, email djwild @ indiana.edu / © All Rights Reserved

Welcome to the homepage of the Integrative Data Science Lab (IDSL) in the School of Informatics, Computing and Engineering (SICE) at Indiana University in Bloomington, Indiana. Integrative Data Science is about bringing together complex data sources, novel analysis methods that cross these networked data silos, and diverse skill sets to solve real-world problems. IDSL's research areas include digital health, molecular therapeutics, and crisis & emergency response.

PEOPLE

Dr David Wild, Director
Samuel Bentum, Ph.D. Student
Natalie Franklin, Ph.D. Student
Stefan Furrer, Ph.D. Student
Chris Gessner, Ph.D. Student
Logan Paul, Ph.D. Student
Jeremy Yang, Ph.D. Student

Former members:

Dr Rajarshi Guha (Vis. Asst. Prof), NIH NCATS
Dr Qian Zhu (Postdoc), Mayo Clinic
Dr Varsha Kulkarni (PhD), Harvard University
Dr Xiao Dong (PhD), UIC
Dr Huijun Wang (PhD), Pfizer Inc
Dr Pulan Yu (PhD), Dow Agrochemical
Dr Bin Chen (PhD), Stanford University
Dr Hari Machina (PhD), Amgen
Dr Abhik Seal (PhD)
Dr Jae Hong Shin (PhD)
Dr Anurag Passi (PhD, Fulbright Fellow)
Dazhi Jiao
Alex Christou

PROJECTS

  • Breaking down data silos. Includes the landmark Chem2Bio2RDF project, which uses semantic technologies to integrate networks of chemistry, biology and biomedical data

  • Association finding across data silos using highly novel path-based prediction tools. Includes the SLAP algorithm and recent random walk methods

  • Integrative data mining of healthcare data, including biomedical, adverse event, and electronic medical records

  • Big data mining for Automated Chemical Synthesis. Cheminformatics big data approaches to the next generation of chemical synthesis

  • Integrative data science for disaster risk, resilience and expenditure. Helping local communities and federal agencies plan better for climate change and make better use of disaster recovery funds to increase resilience

  • Integrative data mining for emergency response. Mining critical information for emergency responders

EDUCATIONAL PROJECTS & RESOURCES

IU Data Science Program
MS and Certificate Programs in Data Science at Indiana University SOIC

Data Science in Drug Discovery, Health and Translational Medicine.
An online IU course with freely accessible online resources

Cheminformatics OLCC
An NSF project to develop a radical new hybrid online/local approach for teaching cheminformatics to undergraduate chemistry students

LearnCheminformatics.com
Free and low cost resources for learning cheminformatics, including an eBook and course materials

Informatics in Disasters and Emergency Response
An online IU course with freely accessible online resources

LATEST PUBLICATIONS

  • Gao, Z., Fu, G., Ouyang, C., Tsutsui, S., Liu, X., Yang, J., Gessner, C., Foote, B., Wild, D.J., Yu, Q., and Ding, Y. Edge2vec: Representation learning using edge semantics for biomedical knowledge discovery. BMC Bioinformatics, 2019, 20:306.

  • Meng, G., Huang, Y., Yu, Q., Ding, Y., Wild, D., Zhao, Y., Liu, X., Min, S. Adopting Literature-based Discovery on Rehabilitation Therapy Repositioning for Stroke. Frontiers in Neuroinformatics, March 2019. https://doi.org/10.3389/fninf.2019.00017

  • Seal, A., Wild, D.J. Netpredictor: R and Shiny package to perform drug-target network analysis and prediction of missing links. BMC Bioinformatics, 2018, 19(1), 265.

  • Passi, A., Rajput, N.K., Wild, D.J., Bhardwaj, A. RepTB: a gene ontology based drug repurposing approach for tuberculosis. Journal of Cheminformatics, 2018, 10:24

  • Correia, R.B., de Araújo, L.P., Mattos, M.M., Wild, D., Rocha, L.M. City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions. arXiv:1803.03571, 2018/3.

  • Djokic-Petrovic, M., Cvjetkovic, V., Yang, J., Zivanovic, M., Wild, D.J. PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets. Journal of Biomedical Semantics, 2017, 8(1), 42.

  • Kulkarni, V., Wild, D.J. An activity canyon characterization of the pharmacological topography. Journal of Cheminformatics 2016, 8 (1), 41.

  • Fox, G., Maini, S., Rosenbaum, H., Wild, D. Data Science and Online Education. 2015 IEEE 7th International Conference on Cloud Computing Technology and Science.

  • Seal, A., Ahn, Y.Y., Wild, D.J. Optimizing drug-target interaction prediction based on random walk on heterogeneous networks. Journal of Cheminformatics. 2015, 7 (1), 1.

  • Lee, J.A., et al. Novel phenotypic outcomes identified for a public collection of approved drugs from a publicly accessible panel of assays. PLoS One, 2015, 10(7), e0130796

  • Chen, B., Wang, H., Ding, Y., Wild, D. Semantic Breakthrough in Drug Discovery. Synthesis Lectures on the Semantic Web: Theory and Technology, Morgan & Claypool, 2014, 4(2) p1-142.

  • Henschel, R., et al. Applications of the YarcData Urika in Drug Discovery and Healthcare.

  • Joshi, H., Parihar, A., Jiao, D., Murali, S., Wild, D.J. A possible gut microbiota basis for weight gain side effects of antipsychotic drugs. Eprint arXiv:1401.2389, 2014/01. 

  • Chen, B., Wild, D.J. Practice and challenges of building a semantic framework for chemogenomics research. Molecular Informatics, 2013, 32:11/12 pp1000-1008

  • Machina, H., Wild, D.J., Dey, P., Merchant, M. Effective integration of informatics tools to enhance the drug discovery process. Industrial & Engineering Chemistry Research, 2013, 52(47), pp16547-16554

  • Wild, D.J. Cheminformatics for the masses: a chance to increase educational opportunities for the next generation of cheminformaticians. Journal of Cheminformatics, 2013, 5:32

  • Willighagen E., Waagmeester, A., Spjuth, O., Ansell, P., Williams, A.J., Tkachenko, V., Hastings, J., Chen, B. and Wild, D.J. The ChEMBL database as linked open data. Journal of Cheminformatics, 2013, 5:23

  • Machina, H.K. and Wild, D.J. Electronic laboratory notebooks: progress and challenges in implementation. Journal of Laboratory Automation, 2013, in press.

  • Seal, A., Yogeeswari, P., Sriram, D., Wild, D.J. Enhanced ranking of PknB inhibitors using data fusion methods. Journal of Cheminformatics, 2013, 5:3.

  • Machina, H.K. and Wild, D.J. Laboratory informatics tools integration strategies for drug discovery, Journal of Laboratory Automation, 2013, 18(2), 126-136.

IDSL TOOLS
Resilience2

Rapid and live virtual analysis of disaster resilience by state and county

NetPredictor

An R package for predicting missing links in any given unipartite or bipartite network using the Random Walk with Restart and Network Inference algorithms.

NetPredictor is described in Seal, A. et al., BMC Bioinformatics, 2018, 19(1), 265.
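
To illustrate the Random Walk with Restart idea mentioned above, here is a minimal Python sketch. It is not NetPredictor's code (NetPredictor is an R package); the function name, the restart probability of 0.3, and the toy drug-target graph are all assumptions made for illustration. A walker starts at a seed node, follows edges at random, and at every step jumps back to the seed with a fixed probability; nodes with higher steady-state probability are treated as more likely missing links.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker that restarts at `seed`.

    `adjacency` is a square list-of-lists; `restart` (0.3 here) is an
    illustrative choice, not a value taken from NetPredictor.
    """
    n = len(adjacency)
    # Column-normalise so transition[i][j] is the probability of stepping j -> i.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    p = start[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        # Stop once the probability vector has stopped changing (L1 distance).
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical bipartite toy network: drugs D0-D2 and targets T0-T3.
nodes = ["D0", "D1", "D2", "T0", "T1", "T2", "T3"]
edges = [("D0", "T0"), ("D0", "T1"), ("D1", "T1"),
         ("D1", "T2"), ("D2", "T2"), ("D2", "T3")]
index = {name: i for i, name in enumerate(nodes)}
adjacency = [[0.0] * len(nodes) for _ in nodes]
for a, b in edges:
    adjacency[index[a]][index[b]] = 1.0
    adjacency[index[b]][index[a]] = 1.0

scores = random_walk_with_restart(adjacency, seed=index["D0"])

# Rank the targets D0 is not yet linked to by their visiting probability.
candidates = ["T2", "T3"]
ranked = sorted(candidates, key=lambda t: scores[index[t]], reverse=True)
print(ranked)
```

In this toy graph T2 ranks above T3 for D0, because T2 is reachable through the shared target T1 and drug D1, while T3 is only reachable through a longer path.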

T2DM-NET for Diabetes

A Knowledge Network (KN) of data linking drugs and targets related to Type-II Diabetes. The network uses SLAP and Chem2Bio2RDF to suggest potential new diabetes drugs and targets.

This work was funded by Indiana CTSI. A publication is in progress.

SLAP

A missing-link prediction tool derived from social networking that uses the Chem2Bio2RDF network to predict associations between drugs and gene targets.

Chem2Bio2RDF

A searchable semantic network of public drug discovery data linking chemical compounds with genes, diseases, targets, pathways and adverse effects. It allows cross-dataset querying using SPARQL.

Chem2Bio2RDF is described in Chen, B. et al., BMC Bioinformatics 2010, 11, 255. The associated Chem2BioOWL ontology is described in Chen, B., et al., Journal of Cheminformatics 2012, 4:6.

Drug Repurposing Explorer - A prototype tool for ranking known drugs to queries using fused similarity of chemical structure fingerprints, side effects, biological targets, shape, and disease association

Bioterm Literature Association Score Calculator (BLASC). Predicts association between genes, drugs and diseases from data mining of recent PubMed scholarly journal articles using a BioLDA Topic Model. You can specify a start node (e.g. a gene), and intermediate and end node types, and the tool will produce a list of the most strongly associated end node types (e.g. drugs). The BioLDA algorithm is described in our recent paper Wang, H. et al., PLoS One, 2011, 6(3), e17243.

Drugbank Semantic Faceted Browser - A prototype semantic browsing tool that demonstrates how semantic annotation can be used for faceted browsing of drug data. Currently works on a small subset of Drugbank as a proof-of-concept only.



WENDI - a tool for finding non-obvious relationships between chemical compounds and biology. It aggregates information from databases, the literature, and computational predictions. Funded by Eli Lilly. Extended to use an RDF inference engine to predict compound-disease relationships from a rule base. For more information, see our papers Zhu et al., Journal of Cheminformatics, 2010, 2:6 and Zhu et al., BMC Bioinformatics, 2011, 12, 256.

 

PARTNERS & RECENT SUPPORT

IDSL is affiliated with the IU Network Science Institute (IUNI)

IDSL works in close collaboration with IU's Web Science Lab, led by Dr Ying Ding

We are an affiliated partner of the EU OpenPHACTS project

Recent support given by
