
Welcome to the homepage of the Integrative Data Science Lab (IDSL) in the School of Informatics, Computing and Engineering (SICE) at Indiana University in Bloomington, Indiana. Integrative Data Science is about bringing together complex data sources, novel analysis methods that cross these networked data silos, and diverse skill sets to solve real-world problems. IDSL's research areas include digital health, molecular therapeutics, and crisis & emergency response.


David Wild, PhD, Professor and Director

Abhik Seal, PhD, Adjunct Professor

Jeremy Yang, PhD, Adjunct Professor

Logan Paul, PhD Student

Former members (and current affiliations):

Samuel Bentum, PhD

Chris Gessner, PhD

Rajarshi Guha, PhD (Vertex)
Qian Zhu, PhD (Mayo Clinic)
Varsha Kulkarni, PhD (Harvard)

Xiao Dong, PhD (UIC)

Huijun Wang, PhD (Pfizer)

Pulan Yu, PhD (Dow Agrochemical)

Bin Chen, PhD (Michigan State)
Hari Machina, PhD (Amgen)

Jae Hong Shin, PhD (NetTargets)
Anurag Passi, PhD (UCSD)

Dazhi Jiao (Amazon)
Alex Christou

Stefan Furrer (Givaudan)

Natalie Franklin (Lilly)



  • Breaking down data silos. Includes the landmark Chem2Bio2RDF project, which uses semantic technologies to integrate networks of chemistry, biology and biomedical data

  • Association finding across data silos using highly novel path-based prediction tools. Includes the SLAP algorithm and recent random walk methods

  • Integrative data mining in healthcare data including biomedical, adverse event, and electronic medical records.

  • Big data mining for Automated Chemical Synthesis. Cheminformatics big data approaches to the next generation of chemical synthesis

  • Integrative data science for disaster risk, resilience and expenditure. Helping local communities and federal agencies plan better for climate change and make better use of disaster recovery funds to increase resilience

  • Integrative data mining for emergency response. Mining critical information for emergency responders

  • Christopher Gessner successfully defends his PhD dissertation: "An Integrative in silico Approach to Preclinical Drug Discovery", on May 26, 2023. Congratulations Dr. Gessner!

  • Samuel Bentum successfully defends his PhD dissertation: "Digital Transformation Strategies for Applied Science Domains", on April 11, 2023. Congratulations Dr. Bentum!

  • Jeremy Yang, PhD, appointed Adjunct Professor in the Department of Informatics and Computing, September 2022.

  • Abhik Seal, PhD, appointed Adjunct Professor in the Department of Informatics and Computing, September 2022.

  • Jeremy Yang successfully defends his PhD dissertation: "Evidence evaluation in biomedical knowledge graphs for pharmaceutical discovery", on March 13, 2022. Congratulations Dr. Yang!


  • Jeremy J. Yang, Christopher R. Gessner, Joel L. Duerksen, Daniel Biber, Jessica L. Binder, Murat Ozturk, Brian Foote, Robin McEntire, Kyle Stirling, Ying Ding & David J. Wild. Knowledge graph analytics platform with LINCS and IDG for Parkinson's disease target illumination. BMC Bioinformatics, 2022.

  • Gao, Z., Fu, G., Ouyang, C., Tsutsui, S., Liu, X., Yang, J., Gessner, C., Foote, B., Wild, D.J., Yu, Q., and Ding, Y. Edge2vec: Representation learning using edge semantics for biomedical knowledge discovery. BMC Bioinformatics. 2019, 20:306.

  • Meng, G., Huang, Y., Yu, Q., Ding, Y., Wild, D., Zhao, Y., Liu, X., Min, S. Adopting Literature-based Discovery on Rehabilitation Therapy Repositioning for Stroke. Frontiers in Neuroscience. March, 2019.

  • Seal, A., Wild, D.J. Netpredictor: R and Shiny package to perform drug-target network analysis and prediction of missing links. BMC Bioinformatics, 2018, 19(1), 265.

  • Passi, A., Rajput, N.K., Wild, D.J., Bhardwaj, A. RepTB: a gene ontology based drug repurposing approach for tuberculosis. Journal of Cheminformatics, 2018, 10:24.

  • Correia, R.B., de Araújo, L.P., Mattos, M.M., Wild, D., Rocha, L.M. City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions. arXiv:1803.03571, March 2018.

  • Djokic-Petrovic, M., Cvjetkovic, V., Yang, J., Zivanovic, M., Wild, D.J. PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets. Journal of Biomedical Semantics, 2017, 8(1) 42.

  • Kulkarni, V., Wild, D.J. An activity canyon characterization of the pharmacological topography. Journal of Cheminformatics 2016, 8 (1), 41.

  • Fox, G., Maini, S., Rosenbaum, H., Wild, D. Data Science and Online Education. 2015 IEEE 7th International Conference on Cloud Computing Technology and Science.

  • Seal, A., Ahn, Y.Y., Wild, D.J. Optimizing drug-target interaction prediction based on random walk on heterogeneous networks. Journal of Cheminformatics. 2015, 7 (1), 1.

  • Lee, J.A., et al. Novel phenotypic outcomes identified for a public collection of approved drugs from a publicly accessible panel of assays. PLoS One, 2015, 10(7), e0130796.

IU Data Science Program
MS and Certificate Programs in Data Science at Indiana University SOIC

Data Science in Drug Discovery, Health and Translational Medicine.
An online IU course with freely accessible online resources

Cheminformatics OLCC
An NSF project to develop a radical new hybrid online/local approach for teaching cheminformatics to undergraduate chemistry students
Free and low cost resources for learning cheminformatics, including an eBook and course materials


Rapid and live virtual analysis of disaster resilience by state and county


NetPredictor: an R package for predicting missing links in any given unipartite or bipartite network using Random Walk with Restart and network inference algorithms.


NetPredictor is described in Seal, A. et al., BMC Bioinformatics, 2018, 19(1), 265.
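For readers unfamiliar with Random Walk with Restart, the idea can be illustrated with a minimal Python sketch. This is not NetPredictor code (NetPredictor is an R package, and its API is not shown here); the function name, the toy five-node graph, and the restart probability of 0.3 are all assumptions chosen for illustration. A walker starts at a seed node, repeatedly steps to a random neighbour, and with some probability jumps back to the seed. The steady-state visiting probabilities then serve as scores for ranking nodes that are not yet linked to the seed.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker that restarts at `seed`.

    `adjacency` is a square list-of-lists matrix; `restart` is the probability
    of jumping back to the seed at each step (0.3 is an illustrative value).
    """
    n = len(adjacency)
    # Column-normalise so that column j holds the step probabilities out of node j.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    p = start[:]
    for _ in range(max_iter):
        # One update: p <- (1 - restart) * W p + restart * start
        nxt = [
            (1 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network with undirected edges 0-1, 0-2, 1-3, 2-3, 3-4.
toy = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
seed = 0
scores = random_walk_with_restart(toy, seed)

# Candidate missing links: nodes not already connected to the seed.
candidates = [i for i in range(len(toy)) if i != seed and toy[seed][i] == 0]
ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
print(ranked)
```

In this toy example node 3 ranks above node 4 because it is reachable from the seed through two short paths, whereas node 4 is reachable only through node 3.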

T2DM-NET for Diabetes

A Knowledge Network (KN) of data relating drugs and targets related to Type-II Diabetes. The network uses SLAP and Chem2Bio2RDF to suggest potential new Diabetes drugs and targets.

This work was funded by Indiana CTSI. A publication is in progress.


A missing-link prediction tool derived from social networking that uses the Chem2Bio2RDF network to predict associations between drugs and gene targets.

Chem2Bio2RDF: a searchable semantic network of public drug discovery data linking chemical compounds with genes, diseases, targets, pathways and adverse effects that allows cross-dataset querying using SPARQL.

Chem2Bio2RDF is described in Chen, B. et al., BMC Bioinformatics 2010, 11, 255. The associated Chem2BioOWL ontology is described in Chen, B., et al., Journal of Cheminformatics 2012, 4:6.

Drug Repurposing Explorer - A prototype tool for ranking known drugs to queries using fused similarity of chemical structure fingerprints, side effects, biological targets, shape, and disease association

Bioterm Literature Association Score Calculator (BLASC). Predicts association between genes, drugs and diseases from data mining of recent PubMed scholarly journal articles using a BioLDA Topic Model. You can specify a start node (e.g. a gene), and intermediate and end node types, and the tool will produce a list of the most strongly associated end node types (e.g. drugs). The BioLDA algorithm is described in our recent paper Wang, H. et al., PLoS One, 2011, 6(3), e17243.

Drugbank Semantic Faceted Browser - A prototype semantic browsing tool that demonstrates how semantic annotation can be used for faceted browsing of drug data. Currently works on a small subset of Drugbank as a proof-of-concept only.

WENDI - a tool for finding non-obvious relationships between chemical compounds and biology that aggregates information from databases, extracted from the literature, and computational predictions. Funded by Eli Lilly. Extended to use an RDF inference engine to make predictions of compound-disease relationships using a rule-base. For more information, see our recent papers Zhu et al., Journal of Cheminformatics, 2010, 2:6 and Zhu et al., BMC Bioinformatics, 2011, 12, 256.



IDSL is affiliated with the IU Network Science Institute (IUNI)

Informatics in Disasters and Emergency Response
An online IU course with freely accessible online resources

IDSL works in close collaboration with IU's Web Science Lab, led by Dr Ying Ding

We are an affiliated partner of the EU OpenPHACTS project

Recent support given by
