Tuesday, March 23, 2021

7 Steps to get Started with Large-Scale Labeling

    Instacart is one of America’s leading grocery shopping companies. It works through an app or a website where customers order the food they want, and a personal shopper does the shopping for them and delivers it to their doorstep. The article I chose to write about describes how Instacart created a crowdsourced data labeling process, drawing on different ways of collecting data, and lays out the steps so that other people can do the same.
    To collect crowdsourced data, Instacart follows seven key rules: assess the lay of the land, identify the use cases, understand the product data, design the human intelligence task, determine the guidelines, communicate the task, and maintain high quality. Assessing the lay of the land means first checking whether human-evaluated data has already been collected and, if so, examining those previous projects for strengths and weaknesses. Identifying the use cases matters because collecting the data can be time-consuming, yet the same data can later be used to test the outputs of future models. It is then important to understand the product data, because you cannot move forward in the process until you do. What data is presented to the user? What data is sent by the user? Do we have all the necessary data? To determine the guidelines, you have to gather information from your team and users so that you get the data you need to evaluate. Communication is key: if you cannot communicate with the people analyzing your data, the data will not be analyzed properly and your product will not be successful. Lastly, to maintain high quality, there have to be set instructions and guidelines to follow. The authors also say you should collect data about the people doing the analysis, such as language spoken, race, gender, and age, so that you are fully familiar with who is dealing with the data collected on your specific product.
    Crowdsourcing has been one of Instacart’s most successful ways of getting data. Services such as Amazon Mechanical Turk and FigureEight are examples of platforms where companies upload data sets, create different tasks, and pay people for the work. Even though it may sound simple, it is actually a lot of work to create these data sets before moving on to the final processes. Crowdsourcing is a great alternative for constructing training data sets compared to how companies normally collect data for their projects.
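
    To make the idea of turning crowd answers into a training data set more concrete, here is a small sketch of my own; it is not taken from the Instacart article. It assumes each item is labeled by several people and keeps the most common label only when enough of them agree. The function name `aggregate_labels`, the `min_agreement` threshold, and the sample labels are all made up for illustration.

```python
from collections import Counter


def aggregate_labels(judgments, min_agreement=0.6):
    """Combine several crowd labels per item with a simple majority vote.

    judgments: dict mapping an item id to the list of labels it received.
    Returns a dict mapping each item id to (label, agreement), where label
    is None when the agreement falls below min_agreement (a hypothetical
    threshold chosen for this example).
    """
    results = {}
    for item_id, labels in judgments.items():
        if not labels:
            # No one labeled this item, so there is nothing to aggregate.
            results[item_id] = (None, 0.0)
            continue
        label, votes = Counter(labels).most_common(1)[0]
        agreement = votes / len(labels)
        accepted = label if agreement >= min_agreement else None
        results[item_id] = (accepted, agreement)
    return results


# Made-up example data: two items, each judged by a few workers.
sample_judgments = {
    "item-1": ["relevant", "relevant", "not relevant"],
    "item-2": ["relevant", "not relevant"],
}
aggregated = aggregate_labels(sample_judgments)
```

    In this example "item-1" keeps the label "relevant" because two of the three workers agree, while "item-2" gets no label because the workers are split, so it would need another look before going into a training set.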
    Having worked for Instacart myself, I found this article interesting to read because I had never once thought about how the company goes about collecting data to make sure its shoppers are happy with the choices. While working, the only steps I have to take are to go on the app, choose an order, and then shop for the person. If something goes wrong, it affects the data collected, which I had never once thought about.

 Source 
 
https://tech.instacart.com/7-steps-to-get-started-with-large-scale-labeling-1a1eb2bf8141

