Data science is an exciting field with a lot of expected growth and interesting opportunities. But what exactly is a data scientist? What do they do?
One writer gave some insight into the day-to-day life of a data scientist in this article.
As you read the article, ask yourself some questions that might help you see how a data scientist working in industry thinks and works, and how you would fit into this kind of job:
To be an effective data scientist, you must become a problem solver. Everything you will do in your career will be about solving some kind of problem using data. This usually requires learning to think about problems in an organized way. This article discusses the practice of structured thinking:
Some things you may want to ponder:
Another article to help show how data scientists work and think: What I do when I get a new data set as told through tweets
Being a good data scientist requires a lot more than just being able to write code well, but not being able to code well is a sign of a poor data scientist. Currently, three programming languages drive the data science community. If you want to argue that you are a data scientist, you need to be proficient in at least one and able to use all three.
Knowing a language doesn’t make you a data scientist, just like knowing English doesn’t make you a poet. You will also need to have analytics and visualization capabilities.
The three languages are R, Python, and SQL.
BYU-I students can take MATH 325 to be introduced to R for statistics and MATH 335 to learn R for data wrangling and visualization.
BYU-I students can take CSE 110 to be introduced to Python and CSE 250 to be introduced to Python for data science.
BYU-I students can take CIT 111 or CIT 225 to be introduced to SQL.
It’s important to remember that most of the people that you’ll interact with in your career won’t be data scientists, and may not have any experience working with data in the way that a data scientist does. You may be the only “data person” on a team, and will need to communicate with your teammates about your work and present it in a way that they will understand. Read this article again for an example of this.
Similarly, you will need to get access to the data that you will be working with. Usually that will come from people who either aren’t familiar with your project, aren’t data people, or both. Good work doesn’t happen without good data. There’s an art to requesting data from and communicating with people unfamiliar with what you are doing. Getting good at it comes partly through time and practice, but there are things you can do from the get-go. Listed below are links to some articles that offer some good advice:
Here is a real example of some of the pains of requesting data that you should be prepared to handle in your career. Note the actions that the data scientist took to ensure that he had the data that he needed to solve the client’s problem.
- What questions did he have to ask?
- What data was important to him? What didn’t matter? Why?
- What principles can you learn from this about requesting and communicating data?
Apart from being a good demonstration of what it’s like to acquire data in real-world data science work, this is also an excellent example of great data science work in solving a client’s problem.
While data science is a rapidly growing field with a lot of opportunities for employment, it’s also very competitive. Simply having a degree isn’t enough to land the best jobs; you will need to be able to show employers that you’re capable of meeting their data needs. Three of the best tools for accomplishing this are your resume, personal GitHub repository, and LinkedIn profile.
In many cases, your resume will be the first thing that a potential employer will see about you. It should showcase your skills, contributions to past projects, and value as a data scientist. Here are some resources to help you gear your resume toward data science jobs, as well as some tools for helping you make a resume in general.
GitHub has become a staple in the software world for collaborative work, and data scientists also make great use of it. It’s a great way to show potential employers the work that you’ve done in the past so that they can see a concrete example of your technical abilities. Use it as a place to store your work for projects, both professional (where appropriate; don’t share anything sensitive in a public place) and personal. To show that you enjoy data science, do work that isn’t required by your job or school and post it to your personal GitHub repository.
You’ve probably heard of LinkedIn before; it’s a popular social media platform designed to help people connect with potential employers and other people in a professional setting. Many recruiters use it as a tool for finding potential employees to fill openings in companies, so it’s worth your time to build up a good LinkedIn profile. It’s a place where you can post your resume, experience, skills, and a link to your GitHub repository, and establish a professional presence as a data scientist. You should also use it as a networking tool to connect with potential employers and interact with professionals already in the industry who can offer advice and further connections and opportunities.