Amazon changed the face of retail by changing the way consumers shop and buy. It went from a bookstore to an everything-store with the largest product selection on earth provided by millions of merchants. Amazon strives to ensure that products are organized and represented with accurate information to help customers make the best buying decisions.
Yet product features can be hard to find, and shopping is riddled with ambiguity:
- Does this TV have built-in WiFi and which streaming services are supported?
- Can this wireless speaker play music from a flash drive?
- Is the “Pack of 4 with 6 bottles of 8 fluid ounce” bottled baby formula at $50 cheaper than the “3 packs of 36 ounces” powder version at $110?
- Which cell-phone case is the most durable, ultra-thin and the best value for my money?
- Is this mat made of silicone and what are its dimensions?
- Where is this product manufactured?
- When a customer searches for “apple case”, do they mean a cell-phone case compatible with an iPhone or a crate of apples?
The Catalog Knowledge Discovery team aims to build the next-generation product attribute knowledge graph. We want to understand every feature behind every product, and how products relate to one another, to build the world’s most comprehensive product knowledge base. We will capture and extract the semantics behind seller-provided information, customer reviews, product manufacturer data and sales data, and represent them in a product knowledge graph. We will then surface this knowledge to our customers in the most relevant way. We will also leverage it to enhance the shopping experience, from query understanding to price comparison.
If you are excited about making the Amazon catalog smarter and more dynamic, and about changing the way we model and understand products and help customers discover, compare and purchase them, come join us! We are looking for people with initiative who enjoy diving deep into the data and coming up with innovative solutions. You will find challenges in:
Data Analytics: We build analytical workflows to dig into the huge amounts of data available at Amazon using data mining and machine learning. We extract information and patterns and use them so that every product is uniquely represented with a complete, structured and accurate set of facts. We collect knowledge from experts and train models that scale it across our catalog.
Scalability: We process billions of records every day about products ranging from electronics to cosmetics. We build highly distributed systems and design algorithms that can handle these large amounts of data and operate with latencies in the tens of milliseconds for millions of transactions per second. Where traditional solutions fail, we develop approximate, distributed and streaming algorithms.
Responsibilities:
- Establish scalable, efficient, automated knowledge discovery and data mining systems, including building data processing pipelines and tools to collect training and evaluation data
- Analyze large amounts of data to discover patterns and build models that extract valuable information from various sources (e.g. product catalog, search queries ...)
- Own and improve customer-facing features derived from scalable, automated data mining systems