Data Engineer II

7 months ago
Job ID
Amazon Corporate LLC
Position Category
Business Intelligence

Job Description

Machine Learning is changing the way new products, programs and services are created and intelligently marketed to customers to increase customer engagement. The Consumer Analytics team is uniquely organized with SDEs, DEs, BIEs and Research Scientists to solve challenging predictive analytics problems at Amazon scale. Are you passionate about Big Data (Amazon scale), Machine Learning and Predictive Analytics software?

Big Data Processing
We are responsible for the production, processing, and analysis of several terabytes of customer-grain data on a daily basis. We analyze data coming from various traffic channels such as onsite, free search, paid search, social, paid social, email, associates, etc. We rely heavily on AWS services such as AWS Flow, S3, EC2, EMR (Hadoop/Spark), Kinesis, and DynamoDB to manage our data workflows.

Machine Learning
We build various Machine Learning solutions that learn and improve over time through the addition of new data and validation methodologies. We work with both supervised and unsupervised machine learning approaches, including but not limited to regression, classification, and clustering.

Our products have become Amazon-wide programs and are accelerating in adoption across businesses. We have services that provide predictions from our models to influence the Amazon customer experience. However, as we look forward, our rate of innovation depends on the quality and breadth of the data we feed into these models. As Amazon pivots towards worldwide simultaneous launches of programs like Prime and Amazon Video, our fundamental data models need to be re-engineered. We also need serious engineering behind the data pipelines to establish and support stronger SLAs as businesses take dependencies on our outputs.

We are looking for an outstanding individual who combines superb technical, communication, and analytical capabilities with a demonstrated ability to get the right things done quickly and effectively. This person must be comfortable working with a team of software development engineers to raise the bar of the data pipelines we build and maintain. Given the cross-Amazon nature of our products, the individual should be highly self-directed, with strong cross-team collaboration skills.

The ideal candidate for our team is a thinker and a doer: someone who loves sophisticated algorithms and mathematical precision, but at the same time enjoys implementing real systems, and is motivated by the prospect of spectacular business returns.

Basic Qualifications

* A desire to work in a collaborative, intellectually curious environment.
* Degree in Computer Science, Engineering, Mathematics, Physics, or a related field and at least 4 years of work experience.
* Industry experience as a Data Engineer or related specialty (e.g., Software Engineer, Business Intelligence Engineer, Data Scientist) with a track record of manipulating, processing, and extracting value from large datasets.
* Demonstrated ability in data modeling, ETL development, and data warehousing.
* Experience with Oracle, Redshift, Teradata, etc.
* Experience with Big Data Technologies (Hadoop, Hive, HBase, Pig, Spark, etc.)
* Coding proficiency in at least one modern programming language (Python, Ruby, Java, etc.)

Preferred Qualifications

* Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
* Experience building data products incrementally and integrating and managing datasets from multiple sources
* Experience leading large-scale data warehousing and analytics projects, including using AWS technologies (Redshift, S3, EC2, Data Pipeline) and other big data technologies
* Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
* Experience using machine learning and statistical tools such as Python/Pandas, R, etc.
* Experience with Linux/UNIX, including using it to process large data sets.