• Senior Data Engineer, Amazon Web Services

    Location US-WA-Seattle
    Posted Date 2 weeks ago (11/28/2018 12:09 PM)
    Job ID
    Amazon.com Services, Inc.
    Position Category
    Business Intelligence
    Company/Location (search) : Country (Full Name)
    United States
  • Job Description

    Amazon Web Services is seeking an outstanding Data Engineer to join the AWS Data Lake team in the AWS Commerce Platform.

    The AWS Commerce Platform provides the back-end and front-end services that enable AWS customers to purchase AWS services and to understand and manage their infrastructure costs. Our teams tackle some of the hardest scalability, performance, and distributed computing challenges in the world. We process trillions of events per month using stream processing techniques (Kinesis), process billions of line items via MapReduce (EMR), and manage artifacts through the latest in database technologies (DynamoDB and Aurora). We process big data and provide tools for customers to interactively understand their bills. We also provide the analytics that let customers manage billions of dollars of IT usage and spending. Because we sit at the nexus of all AWS services and interact directly with end customers, we also work closely across all AWS teams to ensure that we offer a great customer experience.

    The AWS Data Lake team's vision is to help customers manage the full life cycle of data at all levels of granularity, simplify data collection, integration, and aggregation of AWS data assets, and provide services (compute, storage, security) to access datasets at scale. We collect and process billions of usage and billing transactions every day into actionable information in the Data Lake and make it available to our internal service owners to analyze their business and service our external customers.

    We are truly leading the way to disrupt the data warehouse industry. We are accomplishing this vision by leveraging Big Data technologies like Elastic MapReduce (EMR) in addition to data warehouse technologies like Redshift to build a data platform capable of scaling with the ever-increasing volume of data produced by AWS services. The successful candidate will have the opportunity to shape and build AWS' data lake platform and supporting systems for years to come.

    You should have deep expertise in the design, creation, management, and business use of large datasets across a variety of data platforms. You should have excellent business and communication skills so that you can work with business owners to understand data requirements and build ETL to ingest the data into the data lake. You should be an expert at designing, implementing, and operating stable, scalable, low-cost solutions to flow data from production systems into the data lake. Above all, you should be passionate about working with huge datasets and love bringing datasets together to answer business questions and drive growth.

    Basic Qualifications

    This position requires a Bachelor's Degree in Computer Science or a related technical field, and 5+ years of relevant employment experience.

    • 5+ years of work experience with ETL, Data Modeling, and Data Architecture.
    • Expert-level skills in writing and optimizing SQL.
    • Experience with Big Data technologies such as Hive/Spark.
    • Proficiency in at least one scripting language, such as Python, Ruby, Linux shell scripting, or similar.
    • Experience operating very large data warehouses or data lakes.
    • Solid communication skills and the ability to work well as part of a team.
    • A passion for technology. We are looking for someone who is keen to leverage their existing skills while trying new approaches.

    Preferred Qualifications

    • Expert-level skills in ETL optimization and in designing, coding, and tuning big data processes using Apache Spark or similar technologies.
    • Experience with building data pipelines and applications to stream and process datasets at low latencies.
    • Demonstrated efficiency in managing data: tracking data lineage, ensuring data quality, and improving discoverability of data.
    • Sound knowledge of distributed systems and data architecture (lambda): able to design and implement batch and stream data processing pipelines, and to optimize the distribution, partitioning, and MPP of high-level data structures.
    • Knowledge of Engineering and Operational Excellence best practices.

    Amazon is an Equal Opportunity-Affirmative Action Employer – Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation
