Data Scientist Spotlight: Jordan Meyer
Jordan Meyer is a Customer-Facing Data Scientist at DataRobot and winner of the Zillow Prize! We talked with Jordan to learn about his background and interests in data science, his experience competing in the Zillow Prize contest, and what he likes to do in his spare time.
Jordan Meyer (right) receiving the Zillow Prize from Stan Humphries.
What’s your background experience and how did you come to join DataRobot?
My career has been focused mostly on data engineering and data science with datasets that are typically found in relational databases and data warehouses. So, plenty of predictive models, but no search engines or self-driving cars.
I was contacted by a recruiter about joining DataRobot after getting to the ‘Master’ level on Kaggle. At that time, I was also in the middle of round two of the Zillow Prize contest, but I couldn’t pass on the opportunity to interview with such an awesome company.
Can you give us some background information about the Zillow Zestimate contest? What’s the contest all about?
The goal of the contest was to predict the value of about 10 million homes using attributes of each home, like square footage and lot size, as well as historical data, like tax values and previous sales. We submitted predictions for all 10 million homes at the end of July and were then scored on the homes that actually sold during September and October of 2018.
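To make that scoring setup concrete, here is a minimal sketch of how predictions submitted for every home could be scored only on the homes that later sold. The interview does not state the contest's metric, so the mean absolute log error used here is an assumption, and the function name, home IDs, and prices are all hypothetical.

```python
import math


def score_on_sold_homes(predictions, sale_prices):
    """Mean absolute log error over only the homes that sold.

    predictions: dict of home id -> predicted value (one per home submitted)
    sale_prices: dict of home id -> actual sale price (sold homes only)

    Assumption: the metric is illustrative, not the contest's official one.
    """
    errors = [
        abs(math.log(predictions[home_id]) - math.log(price))
        for home_id, price in sale_prices.items()
        if home_id in predictions
    ]
    if not errors:
        raise ValueError("no sold homes matched a submitted prediction")
    return sum(errors) / len(errors)


# Hypothetical data: predictions for every home, sales for a small subset.
predictions = {"home_a": 300_000.0, "home_b": 450_000.0, "home_c": 200_000.0}
sale_prices = {"home_a": 300_000.0, "home_c": 220_000.0}

score = score_on_sold_homes(predictions, sale_prices)
print(f"Scored on {len(sale_prices)} of {len(predictions)} homes: {score:.4f}")
```

Because only the sold homes count, a prediction for a home that never sells (home_b here) has no effect on the score, yet every home still needs a prediction at submission time since nobody knows in advance which ones will sell.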
How did you find out about the contest? And why did you join?
It was hard to miss. The million-dollar prize certainly got a lot of attention. I hadn’t participated in a Kaggle competition before, and I thought this competition would be a great place to apply some of the research I was doing into deep learning techniques for relational data.
How did you find your teammates for the Zillow Contest?
Nima and Chahhou are seasoned Kaggle veterans, who have worked together to finish in the top three of several competitions. I started round two of the contest as a solo competitor and they reached out to invite me to their team. I’m really glad they did!
Did you use DataRobot? If so, can you explain how that process went?
I didn’t, but I wish I had. I started working at DataRobot after the competition ended. Now that I’m familiar with the platform, I don’t like thinking about how much time I would have saved had I used it.
Can you share how you made the leap from BI and analytics into data science?
My coursework was always pretty focused on data science, though I wasn’t aware of that term at the time. I worked for university analytics departments while taking classes in Statistics, Information Science, and Operations Research. I transitioned from university jobs to consulting work, where I built predictive models and other data products using R, almost always as a component of much larger data warehousing and business intelligence engagements.
Any tips for others who want to make the same career evolution?
I’d say they should take advantage of the access they already have to potentially useful datasets. One of the big challenges when starting with such a broad discipline is finding a problem you care to focus on as you go. Toy examples can help in classroom settings, but having a problem you’re truly passionate about solving will make the research more tangible.
How can others get started with data science?
I’d recommend fast.ai as a starting place. Jeremy Howard and Rachel Thomas are doing amazing work at making data science accessible to anyone interested in learning. DataRobot lets learners take a similar approach to fast.ai: build models first, then dive into the details with the big picture in mind.
What must-have skills should new data scientists start learning?
If 80% of data science is preparing data, that means most BI professionals are already prepared for 80% of the workload. For the 20% that’s new, I’d recommend picking a supervised machine learning framework (Python/scikit-learn and R/caret are both great) and not getting distracted by other options after making the choice. A common trap I see is that new data scientists will scratch the surface of many different frameworks instead of devoting the time to master any one of them. Think of the tools as a means to an end, and focus on the end result.
Do you have any tips and tricks for joining data science competitions?
What worked for me was waiting for one with a dataset I could relate to. Definitely join a team and read the kernels and discussions. Kaggle is an amazing community with people willing to share their techniques and approaches, even though that hurts their chances of winning.
What do you enjoy most about working in data science?
I like the science in data science, which is why I love working with DataRobot. It automates all of the engineering and lets me focus on hypothesis testing and analyzing results.
When you aren’t spending hours perfecting algorithms, what do you enjoy doing in your spare time?
I enjoy playing music and making digital art. I’m excited about Google’s Magenta project, which I installed on my server after the Zillow Prize finished. It combines deep learning, art, and music into a single suite of tools.