Mobilizing the World’s Knowledge to Drive Innovation

Reimagining the Web with NLP + Networks + Knowledge + AI

I'm excited by science that reimagines the Web and how humans navigate knowledge in the world. My expertise is in extracting insights from large corpora of text, structuring knowledge, and connecting the dots to drive innovation. For my PhD, I applied this framework to the complex domain of drug discovery. In my thesis, I demonstrated how extracting a metascientific web of knowledge from research literature can uncover drug candidates to repurpose as new treatments.

Daniel N. Sosa, Georgiana Neculae, Julien Fauqueur, Russ B. Altman, "Elucidating the Semantics-Topology Trade-off for Knowledge Inference-Based Pharmacological Discovery." In preparation.

While knowledge systems have great potential for discovery, using embedding-based methods for inference may result in confounding by network structure instead of emulating logic-like behavior for prediction.

Daniel N. Sosa, Rogier Hintzen, Alex de Giorgio, Julien Fauqueur, Mark Davies, Jake Lever, Russ B. Altman, "Associating Biological Context with Protein-Protein Interactions through Text Mining at PubMed Scale." In review.

Appealing to the KISS principle, we demonstrate a feature-based approach for associating essential context found in biomedical literature (cell type and tissue information) with extracted protein-protein interactions to enrich biological knowledge graphs.

Daniel N. Sosa, Malavika Suresh, Christopher G. Potts, and Russ B. Altman, "Detecting Contradictory COVID-19 Drug Efficacy Claims from Biomedical Literature." Association for Computational Linguistics, 2023.

Large language models (LLMs) can be used to identify inconsistencies in large corpora of text (e.g., misinformation detection), and we show how they can quickly help annotators detect contradictory claims about the efficacy of candidate drugs for COVID-19 treatment across thousands of papers.

Daniel N. Sosa and Russ B. Altman, "Contexts and Contradictions: A Roadmap for Computational Drug Repurposing with Knowledge Inference." Briefings in Bioinformatics, 2022.

Creating knowledge graphs from massive bodies of text is tricky, and biomedical science is a particularly challenging domain because knowledge is dynamic and highly contextual.

Margaret Guo*, Daniel N. Sosa*, and Russ B. Altman, "Challenges and Opportunities in Network-Based Solutions for Biological Questions." Briefings in Bioinformatics, 2022.

Networks are excellent for modeling knowledge of interacting entities, but for biological systems we must be diligent that these models don’t stray too far from reality.

Daniel N. Sosa, Binbin Chen, Amit Kaushal, Adam Lavertu, Jake Lever, Stefano Rensi, and Russ B. Altman, "Repurposing Biomedical Informaticians for COVID-19." Journal of Biomedical Informatics, 2021.

Biomedical informaticians cultivate a sophisticated toolkit of AI, data science, genomics, electronic health record (EHR) analysis, computational imaging, and more, and during the COVID lockdown, informaticians repurposed their tools to help meet the moment.

Daniel N. Sosa*, Alex Derry*, Margaret Guo*, Eric Wei, Connor Brinton, and Russ B. Altman, "A Literature-Based Knowledge Graph Embedding Method for Identifying Drug Repurposing Opportunities in Rare Diseases." Pacific Symposium on Biocomputing, 2020.

By extracting knowledge from millions of biomedical papers and connecting the dots, we can predict how drugs already in the pharmacy can have a second life to address unmet clinical needs.