We are seeking a strong data science generalist with a background in computer science and either applied math or statistics to join our team of artificial intelligence system developers as a Data Scientist.
In this role, you can expect to work across the full lifecycle of data science: everything from project ideation and data collection through modeling, results presentation, productionization, and model refinement.
- Java: you don’t have to be a Java expert (we have plenty of those), but you should be capable of reading and processing data from a text file or REST endpoint, writing classes, working with time-series data, and writing JUnit tests.
- Python, pandas, scikit-learn, and your favorite data visualization library.
- Experience processing data using a “big data” platform that uses the map-reduce processing pattern (Spark, Hadoop, H2O, dask, etc.).
- Experience processing large CSV and JSON datasets.
Machine learning:
- Be able to explain the math and logic of your work to non-technical people in a way that they understand. This is admittedly a “squishy” requirement, but it is the most important requirement of all – because selling any AI product requires establishing customer trust in our methods. For example, the typical customer of our trade surveillance product is a lawyer, and the system is explicitly designed as a first step towards detecting potentially criminal behavior. We explain our calculations to customers in English, math, and code.
- Be able to work with various model types: classifier models, regression models, unsupervised models, time-series models, etc.
- Be familiar with common modeling trade-offs.
Math and Statistics:
- Basic linear algebra – mainly matrix math and solving systems of linear equations.
- Multivariable calculus.
- How to explore the statistical properties of data: visualize the distribution of attributes/features, test for Gaussian distribution, test for stationarity, check independence assumptions, detect outliers, etc.
- Familiarity with Markov processes.
- Familiarity with Git and GitHub.
- Basic familiarity with Linux and SSH.
- Basic familiarity with data security/privacy practices. In our surveillance product, our input data is confidential; our output is extremely confidential.
- You take ownership of your work, for good and for ill.
- You learn independently, yet aren’t afraid to ask for help when stuck.
Nice-to-haves, in descending order of importance:
- Any experience with the H2O.ai platform (distributed in-memory machine learning).
- Trading industry experience and/or familiarity.
- Any experience communicating/presenting to C-level executives and/or lawyers.
- Familiarity with network analysis.
- Familiarity with signal processing.
- Familiarity with hidden Markov models.
- Other data science languages: R, Scala, etc.
- Familiarity with developing multithreaded distributed systems.
- Any experience with differential equations (especially stochastic PDEs) may eventually be useful.
Trading Technologies (TT) is an equal opportunity employer. Equal employment has been, and continues to be, a required practice at the Company. Trading Technologies’ practice of equal employment opportunity is to recruit, hire, train, promote and base all employment decisions on ability, rather than race, color, religion, national origin, sex, age, disability, sexual orientation, genetic information or any other protected status. Additionally, TT participates in the E-Verify Program.