Senior Data Scientist
Primer is pushing the edge of artificial intelligence. We are building machines that can read natural language text documents, understand them, and share what they learn by writing their own natural language text documents. Join us!
At Primer, we use machine learning and natural language processing to automate the analysis of very large corpora of unstructured text. We build systems that read documents, extract insights, and write reports comparable to those of a human analyst. Our objective is to help our customers understand the world around them, from geopolitical events and scientific research to changes in the risk profiles of companies. Our clients include some of the world’s largest corporations, financial institutions, and government agencies.
You can learn more about Primer's technology and the sort of problems we solve on our blog and in recent media coverage of our work. Primer was chosen by the World Economic Forum as one of a select group of Technology Pioneers.
General overview of the project(s)
As a Senior Data Scientist, you will take the lead on shipping new data-driven product features, leveraging cutting-edge algorithms, solid software engineering, and good intuition about data. You’ll be constantly learning and constantly teaching, expanding your skills across the stack and bringing new technologies and methods to the team. We’re looking for engineers who will build products that seem magical to our users and help us expand what we believe to be possible.
Train and deploy new ML models to structure new facts from news documents
Scale clustering algorithms to work on millions of documents at once
Build summarization algorithms that work on document types we’ve never seen before
Discover patterns of disinformation across news and social media, and propose product features that make it easier to uncover
Bring our algorithms to many different languages
Curiosity and enthusiasm, and a love for teaching and learning
Strong programming skills, including in Python, with at least 3 years of experience on a production engineering team
Master's/Ph.D. in a quantitative field or at least 5 years building analytical/data-driven products in an industry setting
Skills in data exploration, visualization, and cleaning
Experience applying machine learning algorithms to real data sets, preferably including text analysis and NLP
Experience using one or more of the following: NumPy, SciPy, scikit-learn, NLTK, spaCy, TensorFlow, Keras, PyTorch
Experience taking ambiguous problem statements through to delivered products
Experience interacting with end users/clients or strong interest in doing so
Skills considered a plus
Experience with Elasticsearch and Postgres
Foreign language proficiency / fluency