In this interview, Jane Kraprayoon talks about her experience as a researcher for the Data for Good Scholars Rights CoLab team. Jane is a second-year undergraduate student at Columbia University studying Computer Science in the Intelligent Systems track. She is enthusiastic about research in AI and using Data Science for social good.

Rights CoLab: What drew you to the Rights CoLab data science project? 

Jane: I am a Computer Science student and I wanted to work on a Data for Good Scholars project that uses natural language processing (NLP) so that I could get more practical, hands-on experience with it and learn to apply it in a way that would have an impact. The Rights CoLab project's focus on labor rights in supply chains is especially interesting to me because it is a very serious issue and the solutions have been few.

Rights CoLab: What have you been working on and what has been the most interesting aspect of the project? 

Jane: I started as a research intern last summer. My first task was to build a sentiment analysis model that we could apply to our diversity, equity, and inclusion (DEI) data sets. This task helped me familiarize myself with our data and the project while also letting me work independently. My other task was to process data through the DEI pipeline, which was developed by the team's lead DEI researcher, Sabrina Shih. The pipeline consists of a set of processes used to search for terms that we deemed important in a large dataset of articles, while also allowing us to see recurring trends in which terms or company practices were material. I also developed data visualizations, which allowed me to experiment with data visualization tools and libraries as well as gain more experience in website development.

In the fall, together with the rest of our team, I focused on the topic of labor conditions in supply chains. I have been preparing other datasets, including trade publications and NGO reports, and then processing the resulting articles through the pipeline to see what kind of results it yields. Most recently, I sorted through the results from FactSet’s Truvalue SASB Spotlight Events (TVL) dataset. This task involved delving deeper into the data and manually validating articles that were flagged by our pipeline for labor conditions in the supply chain. I am also helping to write the report, which involves working beyond the numbers and sifting through data for valuable findings.

Rights CoLab: What did you find?

Jane: We are looking for instances of co-occurrences between “practice terms” and “outcome terms,” as defined by our keyword dictionary. The practice terms that we find co-occurring with outcome terms point to evidence of their financial materiality. With the TVL articles, the most frequently co-occurring practice terms are “transparency” and “traceability,” whereas the most frequently co-occurring outcome terms are “lawsuit,” “investigation,” and “allegations.” Other frequently occurring practice terms are “direct sourcing” and “gig work.” I find this result validating: these factors are salient to supply chain management, which tells us that we are on the right track.
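
For readers curious about what counting co-occurrences can look like in practice, here is a minimal sketch in Python. It is not the team's actual pipeline, whose implementation is not described in this interview. The term lists reuse only the examples Jane mentions, the sample articles are invented for illustration, and the function names `find_terms` and `count_cooccurrences` are hypothetical. The sketch assumes that a "co-occurrence" means a practice term and an outcome term appearing in the same article.

```python
import re
from collections import Counter

# Hypothetical keyword dictionary, limited to the terms quoted in the interview.
PRACTICE_TERMS = ["transparency", "traceability", "direct sourcing", "gig work"]
OUTCOME_TERMS = ["lawsuit", "investigation", "allegations"]


def find_terms(text, terms):
    """Return the set of terms found in text as whole words or phrases, ignoring case."""
    found = set()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term)
    return found


def count_cooccurrences(articles):
    """Count, per (practice, outcome) pair, the number of articles containing both terms."""
    pair_counts = Counter()
    for text in articles:
        practices = find_terms(text, PRACTICE_TERMS)
        outcomes = find_terms(text, OUTCOME_TERMS)
        for practice in practices:
            for outcome in outcomes:
                pair_counts[(practice, outcome)] += 1
    return pair_counts


# Invented sample articles, for illustration only.
articles = [
    "A lawsuit claims the retailer lacked transparency about its suppliers.",
    "Regulators opened an investigation after traceability gaps were reported.",
    "The brand expanded direct sourcing this year.",
    "Transparency concerns prompted a new lawsuit and an investigation.",
]

counts = count_cooccurrences(articles)
for (practice, outcome), n in counts.most_common():
    print(f"{practice} + {outcome}: {n}")
```

Counting at the article level is only one possible design choice; a real pipeline might instead look for terms within the same sentence or paragraph, and flagged articles would still need the kind of manual validation Jane describes.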

Rights CoLab: How do you plan to build on this experience in the future?

Jane: This experience allowed me to witness firsthand how the fields of data science and human rights intersect – or can intersect productively. During my time on the project, I have worked closely with people with experience applying data science to the real world, which has given me a good foundation to work on interdisciplinary projects. In the future, I want to continue working on projects like this one where I can help use data science to create positive social impact.

About the photo: Jane says, “I love to travel. This is me at the Jim Thompson Farm in Nakhon Ratchasima Province in Thailand, which is also the country where I am from.”