Jia Wu, a member of the DPhi community, worked her way from a research background at Yale to a data science role in the energy sector at Budderfly, and her journey is an inspiring one. She improved her data science skills through sheer consistency and made the transition happen. Along the way, she documented her journey on LinkedIn under the hashtag #jiawuphdtoindustry. We are immensely proud of all that she has accomplished and have tried our best to capture her journey in a Q&A for the community to learn from and be inspired by. Happy reading 🙂
Q. What was your previous role and how would you describe it?
A. My previous role was as a research scientist at Yale University. We studied EEG in children and adolescents in relation to cognitive processing and psychopathology. Over the years I had built a complete data processing pipeline from data collection, cleaning, reduction, to advanced analytics, statistical analysis, and visualization. I oversaw data processing for all the projects and mentored students and faculty on how to use the data pipeline to analyze their project data.
Q. If you had to compare your previous job duties with industry standards, how would you describe the difference?
A. Now that I have worked in the industry for 2 months, here is what I feel is different from my previous role in academia.
In my previous role, I was the only person responsible for developing tools. So, I learned the concepts of testing, agile development, object-oriented programming, docstrings, etc. through practice. In my current role, I work with a team of software engineers. I am learning best practices from them every day!
In my previous role, I was both the programmer and the domain expert. In my current role, I’m a data scientist who uses code to manipulate data, and I work closely with the software engineering team and the domain experts in the company!
Q. What made the transition process from academia to industry easier for you?
A. First of all, the transition from academia to industry was not easy. I thought that with a solid coding background and tons of experience in statistics, I ought to be useful in the industry. But then I realized that all the companies were looking for data scientists with machine learning skills, which I didn’t have. I quickly went through some courses to learn about machine learning. Fortunately, ML is rooted in math and statistics to a large extent, so I was able to pick it up quickly.
In my previous role, I mostly relied on MATLAB. For industry, I realized I needed Python and R. So, I went through courses to learn them as well. It was easy to learn a language, but using the language the way it is supposed to be used takes a lot of practice. Especially in the open-source world, where there are many ways of doing the same thing, it is important to learn the best practices and to practice on real-world projects.
Q. Taking that first step transitioning into the industry must have been challenging. How did you push yourself to take that first step?
A. It was very challenging because most of the time you receive either no response or a rejection. I started a LinkedIn post series, #jiawuphdtoindustry, to hold myself accountable. Looking back from the end of the tunnel, I can see that doing so was essential. Accountability was one of the major reasons I kept going on the path.
Q. For anyone starting their data science journey, what is that one simple suggestion you would want to give from your personal experience?
A. Keep learning and improving yourself, and socialize with the data science community. I didn’t get any job referrals from socializing, but I have learned a ton and made friends, and it feels good to be part of a community.
Q. With so many people applying for data scientist roles, how easy or difficult is it to get one?
A. I applied for ~150 jobs within 5 months. I got 15 interviews, 7 take-home exams, and 1 offer. I would say it was a very difficult experience.
Q. How has your data science journey been so far?
A. It is fantastic. I work with a software engineering team but am also utilizing my data science expertise to solve real-world problems. We are in the energy sector and are trying to make small to mid-size buildings more energy efficient. I’m having a ton of fun working on the power data, developing metrics, and experimenting with different machine learning algorithms to help with the process.
Q. Last but not least, how would you describe your Deep Learning Bootcamp experience with DPhi?
A. DPhi is a great resource for deep learning as well as general machine learning. (And it’s free!) It provides:
- Great videos and animations on the full stack of topics in ML/DL, featuring code basics
- Both high-level general explanations and low-level algorithm implementations. Choose as you need.
- Both technical knowledge (how to build and optimize models) and helpful utilities (how to set up TensorBoard). Very thoughtful about what you need.
- Code examples, with a point that was emphasized several times: “You don’t need to memorize code!” Instead, try to “understand the code and use it in another context”. I believe this is one of the efforts that make ML and DL more accessible!
Q. With so many data science courses available, can you share some that helped you when you started your journey?
A. As the bar goes up, there is also an abundance of resources. No single resource is going to get you there; it takes a combination of many. The key is to focus on your goal.
- I started with a decade of stats and coding experience in academia. But it wasn’t enough to become a data scientist in industry nowadays.
- To convert my coding skills to Python and R, I used Dataquest.io for a few months.
- To convert my stats to ML, I watched the whole series of Andrew Ng’s videos.
- To learn deep learning, I read books: Deep Learning with Python by François Chollet, and Deep Learning with Python by Jason Brownlee. Jason also has many other awesome coding examples in NLP and time series.
- To practice coding, I participated in DPhi’s bootcamps. A very good resource that people can definitely benefit from!
- To get more exposure to industry projects, I used Kaggle and take-home exams!
- LinkedIn posts and Medium articles gave me the big picture of the field and showed me what interesting projects people are doing.
- Many resources include source code, so try implementing some of it yourself!
- Just keep in mind your goal and the area you need to improve on!
Q. During your initial days of doing data science projects, what is that one thing that you discovered about ML?
A. The traditional way of doing math and engineering is to use a formula that was developed and proved by experts. In ML, you develop your own formula so that it fits your data and environment the best, and you do so in a systematic and automated way.
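The contrast Jia describes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not something from the interview: the data points are made up, the "expert" formula `y = 2x + 1` is hypothetical, and ordinary least squares stands in for "developing your own formula" in the simplest possible way.

```python
# Hypothetical measurements (made up for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]


def fixed_formula(x):
    """A formula chosen in advance, independent of the data (hypothetical)."""
    return 2.0 * x + 1.0


def fit_line(xs, ys):
    """Learn a slope and intercept from the data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(predict, xs, ys):
    """Average squared gap between predictions and observed values."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


slope, intercept = fit_line(xs, ys)


def learned_formula(x):
    """The formula whose coefficients were derived from the data."""
    return slope * x + intercept


fixed_error = mean_squared_error(fixed_formula, xs, ys)
learned_error = mean_squared_error(learned_formula, xs, ys)
print(f"learned: y = {slope:.2f}x + {intercept:.2f}")
print(f"fixed formula MSE = {fixed_error:.4f}, learned formula MSE = {learned_error:.4f}")
```

On this toy data the learned coefficients come out at roughly 1.97 and 1.06, and the learned line has a slightly lower error on these points than the fixed one. Real ML workflows add the parts this sketch leaves out, such as held-out data to check that the learned formula generalizes.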
Q. Most researchers are considered to be really serious all the time. But you do have a funny bone. What would you say about that?
A. Most researchers are not serious at all. They are serious when designing research, running experiments, and writing papers, but day-to-day interactions are always very casual.
Well, that’s the end of the Q&A. We would like to thank Jia Wu for sharing her experience.