‘One of the benefits of growing an open-source product is the community around it. They contribute to the code and grow their skills together.’
I have now interviewed quite a few inspiring members of the data science community, and I have noticed that they all have one thing in common: they are not afraid to take a break from their corporate jobs and spend their time exploring what they like.
Philip Vollet is one of them. We talked about his career path, his passion for communication and NLP, and his participation in the open-source community.
Huifang: What did you learn from your first experience at MCL IT Solutions — from the junior technical role to the managerial position?
Philip: My junior roles required a lot of technical knowledge, for example about the frameworks we were using in the Operations team. However, people skills were also central: managing the team, communicating with clients and dealing with the challenges we were facing. Communication was part of my major and it’s also one of my favourite subjects.
That is probably why, over time, I gravitated towards a role that allows me to use my communication skills more, while also integrating all of the technical knowledge I have acquired over the years.
Huifang: What do you think about the importance of communication, even in the data and technology community?
Philip: It’s essential. Implementing new technologies and approaches takes several steps, and there is always some degree of resistance from stakeholders. It almost always boils down to stakeholder management: understanding the covert interests and communicating the big vision behind a project well.
“People function better with stories — and technology implementations are all about telling a good story”
Huifang: What drove you to start your own consultancy company, the moonwalk venture, and what have you learnt from that experience?
Philip: I needed a break from corporate games. moonwalk GmbH was a small freelancing business that gave me more time to study and learn new stuff. For five years, I got to choose the projects that I wanted to work on and where to focus my attention.
I was freelancing for a KPMG project when they offered me a role there. At that point, I was deciding whether to hire new people and grow my business or to go back to a bigger company.
All freelancers struggle with income stability from time to time — you never know how many projects you will work on next month, maybe many, maybe none.
So I decided to give KPMG a go.
Huifang: What are you working on at KPMG?
Philip: It’s a mix between data science and data engineering. We are the stakeholders of the internal data and we build data modulation pipelines with machine learning to analyze the internal KPMG data. We act as a filtering layer for projects that need an API or our internal data. We also build KPIs for our management.
Huifang: Do you have any tips for someone new to this industry — transitioning from technical roles into the data science arena?
Philip: My role at KPMG is mostly about data and how to train the model. However, looking at our machine learning projects, it boils down to data engineering, which keeps growing in importance.
Overall, companies need two types of data specialists — the data science people and the engineering people. Data is the golden nugget for both, so people should really study how data pipelines work to understand where they fit and what they need to learn to progress in the field.
It’s also good to stay up-to-date with state-of-the-art research and the latest implementations in machine learning. I do that regularly through my LinkedIn hobby and my work at KPMG as a mentor for students writing their theses. I read a lot of academic papers and keep track of what’s new in the industry.
Another interesting point is that people approaching the data community often hide behind their math anxiety. However, you really don’t need to be a math expert to enter the world of machine learning. It’s enough to understand how math works.
Math is about logic — you need to understand the concept rather than being a pro at linear algebra.
Huifang: Did you fall in love with NLP at KPMG?
Philip: NLP is an old love from 2016, when I was getting into machine learning. At the time, however, it was mainly statistical machine learning. It’s a growing field and there’s so much more in store.
As part of the job at KPMG, I supervise academic theses. At the moment, we are writing about supplement reviews and adverse drug effects, using NLP to analyze the data. There are also fun projects such as training chatbots to tell jokes.
Huifang: What’s the relationship between NLP and open source/open science?
Philip: Big companies have big machine learning models in place and spend millions of dollars to train them, but they are always in need of open source and open science to develop.
One of the benefits of growing an open-source product is the community around it. They contribute to the code and grow their skills together, so it’s a win-win for everyone involved.
Open-source communities are super supportive: you get help when you need it. However, we also need to watch out for some aspects of open source, for instance how cloud providers adopt open-source software without adding value or supporting future development.
Huifang: How would you encourage people to contribute to open source?
Philip: Make them curious. If they are interested in the technology, they will dig into it. It’s a journey that you have to start. Don’t hold back because you think your code isn’t good enough or doesn’t work. I also started as a script kid, and now we are deploying big machine learning pipelines!