Shining a Light on Transferable Skills for Your Data Science Path
Introduction
After dedicating five years to research in laser physics, nonlinear optics, and solid-state laser engineering, I was deeply immersed in that world. Eventually, however, I made the shift to the commercial data science sector.
After an additional six years in data science, I've realized that the skills I cultivated in applied physics translate exceptionally well to commercial projects, even those unrelated to laser physics.
While much has been said about the value of academic experience, I wish to share my personal perspective on this topic.
To illustrate my views, I will rate several skill sets on how well they transfer and explain the reasoning behind each score.
Audience for This Article
This article primarily targets individuals contemplating a shift from academia to the commercial sector, but it also serves as a reflection for myself on the convergence of tools, skills, and mindsets across both domains.
Familiarity with Literature Review? 7/10
Why is conducting literature reviews such a vital and transferable skill for commercial data science?
I believe that literature reviews are often undervalued and misconstrued in the commercial data science realm. While we excel at staying updated on new model architectures and frameworks, the ability to efficiently gather structured and valuable information on project-related topics remains a significant gap in the field.
The term "literature review" might not even be the most appropriate; alternatives such as "background research" or "state-of-the-art analysis" could fit better.
When addressing a business challenge, establishing a theoretical foundation is crucial. A literature review can:
- Establish a basis for informed data strategy decisions. Familiarize yourself with existing methods and practices in your domain.
- Accelerate the onboarding process. Quickly acquiring knowledge about a new domain is essential for generating value.
- Enhance communication with field experts. Collaborating with domain experts requires data scientists to grasp domain-specific terminology and concepts for effective dialogue.
- Significantly elevate insight quality. In my experience, a literature review informs decisions regarding data collection, preprocessing, modeling, and evaluation, ultimately leading to better insights.
Investing time and effort into literature reviews embodies an open-minded, humble, and inquisitive attitude. This process helps avoid the pitfalls of reinventing the wheel and confirmation bias.
I anticipate that the advent of large language models will transform literature review processes, but we haven't reached that point yet.
Journaling? 9/10
Transferring journaling habits from academia to commercial data science has been immensely beneficial for me. Beyond practical advantages, it offers a valuable sense of continuity through the fluctuations of a researcher's work life. By adopting the key habit of maintaining a lab notebook, data scientists can effectively track experiments, record ideas and observations, and monitor personal and professional growth. I've written a separate piece discussing the merits of this practice, which you can explore!
Lab Notebook as a Weapon of Choice for a Data Science Practitioner
My set of rules and principles for effective note-taking in a lab notebook
Programming Knowledge? 6/10
Throughout my scientific career, I engaged in experimental data processing, numerical simulations, and statistical learning daily. Programming was essential for developing and testing new laser designs before physical prototypes were evaluated.
I employed various tools for typical data science tasks:
- Experimental data processing (Python, Wolfram)
- Numerical simulations (Wolfram, Matlab, Python)
- Statistical learning (Wolfram, Matlab, Python)
- Data visualization (Origin Pro, Python, R)
Wolfram Mathematica was my go-to tool, with its robust capabilities for solving non-linear differential equations. Python was my preferred choice for processing experimental data, such as beam shapes and oscillograms.
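To give a flavour of what those simulations involved, here is a minimal Python sketch of a simplified point rate-equation model for a pulsed solid-state laser: population inversion coupled to photon density, integrated numerically. Every parameter value is illustrative rather than taken from my actual work, and I mostly ran this kind of model in Mathematica, so treat it as a sketch of the approach, not a reproduction of my code.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Simplified point rate-equation model of a pulsed solid-state laser:
# population inversion n and photon density phi, with a small fraction
# of spontaneous emission feeding the lasing mode so that a pulse can
# build up from noise. All parameter values are purely illustrative.
TAU_F = 230e-6   # upper-state lifetime, s (Nd:YAG-like order of magnitude)
TAU_C = 10e-9    # cavity photon lifetime, s
B = 1e-7         # inversion-photon coupling coefficient (illustrative units)
BETA = 1e-4      # fraction of spontaneous emission captured by the mode
PUMP = 1e19      # pump rate (illustrative units)

def rate_equations(t, y):
    """Coupled ODEs for population inversion n and photon density phi."""
    n, phi = y
    dn_dt = PUMP - n / TAU_F - B * n * phi
    dphi_dt = B * n * phi + BETA * n / TAU_F - phi / TAU_C
    return [dn_dt, dphi_dt]

# Integrate from an empty cavity; LSODA switches between stiff and
# non-stiff methods, which suits the very different time scales here.
sol = solve_ivp(rate_equations, t_span=(0.0, 300e-6), y0=[0.0, 0.0],
                method="LSODA", dense_output=True, rtol=1e-8)

t = np.linspace(0.0, 300e-6, 3000)
n, phi = sol.sol(t)
print(f"Peak photon density: {phi.max():.3e} at t = {t[phi.argmax()] * 1e6:.1f} us")
```

The real models were considerably richer, but the workflow is the same: write down the coupled equations, integrate them numerically, and compare the result against measurements.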
For data visualization, Origin was my primary tool, allowing me to embed visuals in text documents while keeping them editable. It was excellent for creating line charts, histograms, and conducting regression analysis, employing its GUI rather than coding.
Despite my solid experience with these tools, I rate this skill only 6/10 for transferability. This is because good software practices are often overlooked in many academic settings.
> Caution: This observation is based solely on my experience in applied physics and may not be representative of all academic environments.
Researchers often prioritize the speed of research and publication quantity over code quality and maintainability. Additionally, there is a lack of individuals with software development expertise in academia, leading to insufficient production-level knowledge. The demanding nature of simultaneously designing experiments, conducting literature reviews, gathering measurements, and coding leaves little room for learning software development practices.
Measurement Proficiency? 9/10
This skill can be challenging to articulate, so bear with me. Measuring in applied laser physics is a specialized discipline that takes years of training to master. Delivering precise measurements requires understanding the underlying physics, adhering to measurement protocols, and having specialized knowledge to operate complex and costly instruments.
For instance, I've worked with diode-pumped pulsed solid-state lasers, measuring various beam parameters: pulse duration, pulse energy, repetition rate, beam profile, divergence, polarization, spectral content, temporal profile, and beam waist. Each of these measurements presents unique challenges.
In theory, you can direct a laser beam to a CCD camera to measure the beam profile quickly. However, in practice, this process is much more complex. For instance, if you're working with a pulsed solid-state laser, you need to carefully manage pulse energy to protect the CCD camera from damage. This requires a series of meticulous adjustments, including using beam attenuators and ensuring proper synchronization for stable images.
The skill of measuring beam shape therefore rests on vigilance (never take anything at face value) and meticulous attention to metadata (how the data was recorded, which tools were used, and why each step was taken). Both habits are invaluable when handling real-world data: they make it much easier to produce trustworthy, impactful insights, and they are appreciated in both academia and commercial data science.
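To make the metadata habit concrete, here is a minimal Python sketch of how such context could be stored next to every raw measurement. The field names and file layout are hypothetical, chosen purely for illustration; the point is that the raw frame never travels without the conditions under which it was captured.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone

import numpy as np

@dataclass
class BeamProfileRecord:
    """A single CCD frame plus the context needed to trust it later.

    The field names are illustrative, not a standard schema.
    """
    camera_model: str
    attenuation_db: float      # total attenuation placed in front of the sensor
    pulse_energy_uj: float     # pulse energy measured upstream of the camera
    trigger_mode: str          # e.g. "external" or "internal"
    notes: str = ""
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def save_frame(frame: np.ndarray, meta: BeamProfileRecord, stem: str) -> None:
    """Store the raw frame and its metadata side by side, never apart."""
    np.save(f"{stem}.npy", frame)
    with open(f"{stem}.json", "w") as f:
        json.dump(asdict(meta), f, indent=2)

# Usage: a synthetic 2D Gaussian stands in for a real CCD frame.
yy, xx = np.mgrid[0:128, 0:128]
frame = np.exp(-((xx - 64) ** 2 + (yy - 64) ** 2) / (2 * 15.0 ** 2))

meta = BeamProfileRecord(
    camera_model="example-ccd",
    attenuation_db=40.0,
    pulse_energy_uj=120.0,
    trigger_mode="external",
    notes="ND filters recalibrated before this run",
)
save_frame(frame, meta, "beam_profile_0001")
```

Whether the data is a CCD frame or a customer-churn extract, a record like this answers the questions you will inevitably ask six months later.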
Data Communication Skills? 10/10
During my academic tenure, I didn't regard data communication as particularly noteworthy. However, years of research have equipped me with substantial skills in data communication across various levels.
Writing a scientific paper is one of the more demanding formal communication skills to acquire. Crafting a well-structured paper (abstract, introduction, literature review, methodology, results, discussion, conclusion, acknowledgments) requires both practice and an ability to produce engaging visual representations of data to convey messages effectively.
I rate this skill a perfect 10/10 for transferability, as successful commercial data science relies heavily on effective communication among individuals and presenting your findings clearly.
Conclusion
In summary, individuals with a scientific background possess unique insights and valuable skills that can significantly benefit the data science field. For those in academia worried that moving into commercial data science means discarding their expertise, I offer a different viewpoint: you bring immense value to the table. The most effective strategy is to leverage your existing skills while continuously learning new techniques and best practices in your new field, recognizing that this is an ongoing journey.