“It is great to witness how data science can be applied, and I have enjoyed working with such talented people.”

Catalina is a recent graduate of the MS in Applied Analytics program at Columbia University. In this interview, Catalina talks about her experience working on the Data for Good (DfG) Project since February 2022 and her interest in actively learning the different tools developed in the data science field. With a versatile background and diverse interests, Catalina shows the importance of applying classroom knowledge to real-life activities and how doing so has helped her grow professionally.


Rights CoLab: What brought you to this project?

Catalina: My story is somewhat different from that of the other team members in the DfG Project since I have a background in finance. I did my bachelor’s degree in economics in Colombia, my home country. I started my professional career working in investment banking, and then joined a digital bank called Lulo Bank, where I was able to use the machine learning models I learned during undergrad to predict financial variables. I was impressed by how efficient processes became through the use of these techniques. To keep expanding my knowledge and grow professionally, I decided to pursue a Master’s in Data Analytics at Columbia.

Rights CoLab: How did you learn about the DfG Project with Rights CoLab?

Catalina: At the end of my first semester, one of my classmates told me about the projects at the Data Science Institute and the Data for Good program. I was interested in Rights CoLab’s project because it uses natural language processing, so it was an excellent opportunity to gain hands-on experience and master this technique. Moreover, the project’s subject matter is fascinating. I come from a country where corporate human rights harms are very common, so I have always wanted to get involved in research projects related to these types of social issues.

Rights CoLab: What is your role in the project?

Catalina: Initially, I participated in the DEI sub-project and helped extract and process Korean financial filings to be parsed through the DEI pipeline. In Fall 2022, I joined the Labor Conditions in Supply Chains (LCSC) sub-project, and I have been working on it since then. For that sub-project, each team member is assigned to run the pipeline and analyze results for one data source. I am working with the Social Science Research Network (SSRN) dataset, which is one of the world’s largest repositories of social science research papers, to investigate whether scholars have identified financially material practices that may not have been picked up in news reports, company financial filings (Form 10-Ks), or datasets that reflect investor interest, such as earnings calls or shareholder resolutions. After running the pipeline on this dataset, I manually verified the results. I’ve been doing the same with earnings calls, sharing that work with project coordinator Yuwen Zhang, as well as contributing to the final report that will include the DfG Project’s findings.

Rights CoLab: Is the Project what you expected?

Catalina: The Project exceeded my expectations. It is constantly evolving. It is great to witness how data science can be applied, and I have enjoyed working with such talented people. All team members are brilliant, and I have learned a lot from them.

Rights CoLab: Any frustrations?

Catalina: Finding a large number of false positives in our results has been frustrating. We have been doing a lot of manual work to go through the results and check whether they are what we are looking for. There is a positive side, though: because of this limitation, we have found ways to improve and automate our validation methods.

Rights CoLab: You are about to graduate, aren’t you? Did you enjoy your experience at Columbia?

Catalina: Yes! I graduated this May. My experience at Columbia was great. I have had the opportunity to meet the best professionals in the field and learn in many different ways. I also love living in NYC, so I think coming here was the right decision.

Rights CoLab: What do you want to do next?

Catalina: I want to continue working in the data field. I have recently joined a project where one of my main tasks is building basic business tools with the help of artificial intelligence, and I think this topic is fascinating. In general, the data domain is constantly advancing, and there is so much to learn and do, which I think is exciting!

Rights CoLab: Any hobbies?

Catalina: I have multiple hobbies, but I’m really into sports. I do CrossFit and horseback riding, and I pursue both with the same dedication and discipline as my academic work. I also love traveling.


Photo: I really enjoy walking around NYC! This photo was taken on one of those sunny days when I was walking around the city and hanging out with friends.