Episode 187: Karthik Ram on Research Software Sustainability
June 23, 2023 · 41 min
Guest
Karthik Ram
Panelists
Richard Littauer | Abby Cabunoc Mayes
Show Notes
Hello and welcome to Sustain! The podcast where we talk about sustaining open source for the long haul. In this episode, Richard and Abby are joined by guest Karthik Ram, a research scientist at UC Berkeley’s Institute for Data Science and Berkeley Initiative for Global Change Biology, as well as co-founder and director of the rOpenSci Project, and lead at URSSI. Karthik’s journey from field ecologist to data scientist has propelled him into roles that advocate for sustainable scientific software and open science. He currently manages projects, fundraises, and mentors while also overseeing initiatives aimed at developing best practices in software development, advocating for supportive policy, and building user and developer communities. He emphasizes the significance of reproducibility and sustainability in research software and offers an empowering approach to maintaining academic software. Hit download to hear much more!
[00:02:00] Karthik explains what he does as a senior data scientist, and he tells us that he views himself as an “engineering manager” rather than an individual contributor.
[00:03:01] His transition from a field ecologist to a data scientist was triggered by handling large amounts of data and developing software to work with it.
[00:06:21] The conversation turns to JOSS, the Journal of Open Source Software, and Karthik shares the origin story for the software review process.
[00:09:03] Karthik dives into the Berkeley Institute for Data Science, telling us how it started and what his role was there.
[00:11:11] Karthik is involved with URSSI, where they aim to collect and disseminate best practices in software development, advocate for supportive policy at a national level, and grow user and developer communities around their projects.
[00:12:55] One of URSSI’s upcoming projects this fall is a school for research software engineering.
[00:15:16] Karthik and Kyle Niemeyer assembled a course focusing on best practices for developing sustainable research software, drawing on topics from past workshops and classes they’ve conducted.
[00:17:12] We hear how scientific software sustainability compares with general open source software sustainability, and Karthik explains that scientific software is unique because it caters to niche groups, making it expensive to build and maintain.
[00:20:20] Karthik tells us about a project he’s working on with Patrice Lopez and James Howison to identify what tools researchers use in various domains, how their usage evolves over time, and which clusters of tools drive research in certain areas.
[00:23:34] As part of this project, Karthik and his team are using a tool called GROBID to convert documents into structured XML, extract entities, and analyze the usage of software mentioned in scientific papers.
[00:28:23] Karthik highlights the difficulties researchers face in keeping up with best practices for code hosting and archival copies, and discusses the misconception that GitHub is a permanent archive and the need for a safer, more reliable repository like Zenodo.
[00:31:31] Richard brings up the issue of measuring the impact of code repositories and whether a similar system to academic journal impact factors could arise.
[00:33:02] Karthik details an approach for maintaining academic software.
[00:38:02] Find out where you can learn more about Karthik and his work on the web.
Quotes
[00:07:43] “They would bring their puppy and ask us to adopt it.”
[00:15:45] “Even today, we do not have a good appreciation for research software and the role that it plays in driving research on all the things that we care about.”
[00:16:21] “Another pet peeve that I have is that people think money is the solution to everything.”
[00:16:38] “If we teach more projects about best practices, it’s very likely that software that integrates those best practices will actually continue to exist.”
[00:17:51] “The challenge with research software is there’s a lot of software that sits on the long tail.”
[00:28:39] “I think the challenge is that we don’t really need to invent anything new.”
[00:36:14] “Part of the work we want people to do is invest in community early on.”
Spotlight
[00:38:47] Abby’s spotlight is Governing Open by Shauna Gordon-McKeon.
[00:39:15] Richard’s spotlight is Bertram Ludäscher and William Michener.
[00:39:43] Karthik’s spotlight is Patrice Lopez.
Links
SustainOSS (https://sustainoss.org/)
SustainOSS Twitter (https://twitter.com/SustainOSS?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor)
SustainOSS Discourse (https://discourse.sustainoss.org/)
podcast@sustainoss.org (mailto:podcast@sustainoss.org)
SustainOSS Mastodon (https://mastodon.social/tags/sustainoss)
Open Collective-SustainOSS (Contribute) (https://opencollective.com/sustainoss)
Richard Littauer Twitter (https://twitter.com/richlitt?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor)
Richard Littauer Mastodon (https://mastodon.social/@richlitt)
Abby Cabunoc Mayes Twitter (https://twitter.com/abbycabs?lang=en)
Karthik Ram Website (https://ram.berkeley.edu/)
Karthik Ram Twitter (https://twitter.com/_inundata?lang=en)
Karthik Ram GitHub (https://github.com/karthik)
Karthik Ram LinkedIn (https://www.linkedin.com/in/karthik-ram-93334954)
rOpenSci (https://ropensci.org/)
The Journal of Open Source Software (https://joss.theoj.org/)
Arfon Smith-Chatops-Driven Publishing (https://www.arfon.org/)
DJ Patil (https://en.wikipedia.org/wiki/DJ_Patil)
Berkeley Institute for Data Science (https://bids.berkeley.edu/)
URSSI (US Research Software Sustainability Institute) (https://urssi.us/)
Software Carpentry (https://software-carpentry.org/)
Report from the URSSI Winter School pilot (https://urssi.us/blog/2020/01/29/report-from-the-urssi-winter-school-pilot/)
Kyle E. Niemeyer, Ph.D. (https://niemeyer-research-group.github.io/)
Science-miner (https://science-miner.com/)
GROBID (https://github.com/kermitt2/grobid)
James Howison-Associate Professor (https://www.ischool.utexas.edu/people/people-details?PersonID=175)
Issuing a persistent identifier for your repository with Zenodo-GitHub Docs (https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content#issuing-a-persistent-identifier-for-your-repository-with-zenodo)
Governance of Open Source Software by Shauna Gordon-McKeon (https://governingopen.com/)
Bertram Ludäscher (https://scholar.google.com/citations?user=nYx9xasAAAAJ&hl=en)
William Michener (https://scholar.google.com/citations?user=TJ5xlKsAAAAJ&hl=en)
Patrice Lopez (https://scholar.google.com/citations?user=xDfUqfcAAAAJ&hl=en)
Credits
Produced by Richard Littauer (https://www.burntfen.com/)
Edited by Paul M. Bahr at Peachtree Sound (https://www.peachtreesound.com/)
Show notes by DeAnn Bahr at Peachtree Sound (https://www.peachtreesound.com/)
Special Guest: Karthik Ram.