The Catalog Quality group is looking for an exceptional software engineer with experience in big data processing to join us in making the world’s biggest and best product catalog even better!
The product catalog that powers Amazon’s growing global marketplace is the biggest, most comprehensive and most dynamic record of products in the world. With an item count that runs to 10 digits, the catalog is updated by millions of merchants, who range from small home-based businesses to large-scale wholesalers. Amazon’s merchants update their inventories in multiple languages several million times daily to react to the ever-changing demand and trends of the market. Yet each search result, each detail page and the overall shopping experience on Amazon is information-rich and accurate, and offers an unparalleled selection of buying choices to the global Amazon customer. The different color, size and buying choices for each product from several merchants are organized on the same detail page. Detail pages with inaccurate or missing information that matters to the customer are detected and fixed. The variety of ways in which merchants represent information (e.g. brands) is transformed into a canonical, customer-friendly representation. Navigation options appear dynamically in the left pane, helping customers refine their product search.
Catalog Quality is the group of highly motivated, innovation-oriented, data-driven teams that make this possible. We excel in big data processing, information retrieval, machine learning, artificial intelligence, crowdsourcing and statistical modelling. We apply these techniques on massive, carefully engineered distributed system architectures to bring Amazon’s product catalog to the highest quality and keep it there. A high-quality catalog is a key strategic asset for Amazon and a prerequisite for a delightful customer experience.
You bring your experience in full-cycle software development and a love of designing scalable systems and writing beautiful code. You enjoy diving deep into the data and coming up with innovative solutions. By participating in team meetings, design reviews, and code reviews, you will help drive a culture of performance and operational excellence. You will continuously evaluate metrics and strive to improve our tools and processes and maximize our impact. You will find challenges in:
Data science: We build data analysis workflows to dig into the huge amounts of data available at Amazon using data mining, machine learning, and statistics. We look for patterns, train thousands of models and use them to build solutions that improve catalog quality. We collect knowledge from experts and train models that scale that knowledge across the catalog.
Scalability: We process billions of records daily. We build systems and design algorithms that can handle these large amounts of data, and we make sure cloud usage scales sublinearly with the ever-growing data size. Where traditional solutions fail, we develop approximate, distributed, and streaming algorithms.