“The project taught me more about the field of data science than I could have imagined.”

Jay Trevino is a versatile undergraduate and researcher who takes pride in his interdisciplinary approach. While his primary focus lies in data science, Jay has also delved into the realms of astronomy and cosmology through previous research experience. Jay’s diverse interests are reflected in his coursework, which encompasses a wide range of subjects, including neuroscience and linguistics. In his free time, Jay pursues his passions for photography and music with the same level of dedication and creativity as his academic work.

Rights CoLab: Tell us a bit about yourself – your background and interests – which brought you to the Rights CoLab DfG project.

Initially, I applied for the Rights CoLab DfG project because I recognized immediately that the work you are doing reflects a genuine interest in understanding human rights issues from a data-driven standpoint. At the time, I wanted to pursue undergraduate research in natural language processing. I found the idea of language in machine learning fascinating, especially given the problem of semantics that until recently had felt impossible to approach computationally. I had done data science research in the past but had no experience in natural language processing specifically, so coming to this project was an opportunity for me to explore this new interest.

Rights CoLab: Is the project what you expected? Were there any surprises?

The project taught me more about the field of data science than I could have imagined. Most of my technical expertise in data processing has come very directly from the work I’ve done here. I’ve had the opportunity to explore every facet of the data pipeline, from collection of novel datasets to communicating technical results to a non-technical audience, and all the “in the mud” work that comes in between. I was surprised at just how seamless it can be to collaborate with my mentors for high-level guidance on the project, while at the same time having the trust and freedom to figure out inevitable technical issues.

Rights CoLab: Any frustrations? 

Compute resources. Never in my life have I been so strictly bound by the resources available for free. Previously, much of my work had been done on HPC clusters (supercomputers), and I never had to worry about optimizing the code I wrote. Not having this available gave me one thing that I will carry with me in all of my future endeavors: perspective. It required that I solve the exact same problem over and over again, exploring every possible representation of the same piece of data and the same problem, until my solution finally worked within those constraints.

Rights CoLab: What has interested you about the project the most?

I’ve been on this project for about a year and a half now, and have had the opportunity to explore every stage of the data science pipeline, start to finish. What fascinated me most was having the opportunity to work on the data collection of Korean financial filings. Here, I saw just how powerful natural language processing could be. I don’t speak Korean. Not in the slightest. However, being able to automate the translation of these documents meant that thousands of pages of new information were now available to me within minutes. This is something that simply was not possible 15 years ago, perhaps even more recently. And the fact that I could do all this on my personal laptop shows just how accessible these powerful tools are becoming. I’m very excited to see where this project lands and how future intersections between data science and human rights can and will legitimately impact the daily lives of all of us.

Rights CoLab: You are graduating soon, so we have to ask the question: What’s next for you? 

I wish I knew, but to be honest, I’ve never tried to lay down a rigid path for myself. I know that just as the world is constantly changing, so am I. One thing that I have always found fascinating is language. Perhaps this is what spurred my interest in natural language processing specifically. The rise of generative language-based AI models has been in the news (and in my browser history) a lot recently. I could absolutely see myself working towards a PhD in this in a few years. However, I’ve always found the brain fascinating as well, and could see myself working towards a PhD in that instead. I’m in no rush. At my age, I’m more than happy using this time to expand my knowledge base as widely as I possibly can. I’m sure this will serve me well once I’m ready to settle down. To answer the question, I’m not sure, but I’m excited to see where the darts land.

About the photo: “As I wrap up my final year here at Columbia University, I’m busy snapping shots around campus and capturing memories. Whether I’m behind the camera or posing for a photo, it’s all part of cherishing my time here.”