The topics covered in this article are based on a project from 2015. Many of the issues, such as the encoding confusion, still exist.
Amazon: hazardous selling
The times when brick and mortar businesses were the preferred place to buy things are slowly fading away. It is simpler and saves time to go online to do the shopping. Amazon as one of the global players offers most products anyone would want to purchase. It is easy to use and very popular. That, of course, makes it attractive for sellers, too. Amazon suggests that selling products using its platform is straightforward and simple. Alas, it wasn’t.
A retailer who had difficulties selling his products efficiently and reliably asked us to help him out by developing a software solution that would enable him to automate the way he could add or update his products on Amazon.
Amazon Marketplace Web Service - The API
There are three ways to upload product information to Amazon. To add a single product, one can use the simple and intuitive web interface Amazon offers. To sell a broader range of products, one can use the downloadable template, which is also available through the web interface. The template is a file in TSV format. Each record, i.e. each row in the file, represents either a product or a variant of a product. In addition, Amazon offers an API that can be used to upload product data without having to use the web interface. The data sent to the API is practically the same as with the aforementioned file-based method. We chose the API-based solution for two reasons:
- We could let our software talk directly to the API.
- The retailer would not have to upload the data of his products manually any more.
We asked the retailer about his experiences with uploading data to Amazon to gain a preliminary understanding of its workings and characteristics. The more our analysis of the Amazon API progressed, the more we came to the conclusion that the task at hand wouldn’t be an easy one after all.
In what follows we want to show just how right this hunch was.
The path through the encoding maze of the Amazon-API
Our software was supposed to take the data of the retailer - our client - and process it such that it could be sent to the Amazon API. This data contains various information about our client’s products, but in the end it is only text (either as a TSV file or as text sent to the API).
Let’s take a little detour to be better prepared for what follows: Just as we humans communicate in different tongues and are sometimes unable to understand one another, a computer can have difficulties correctly interpreting the data sent to it by another computer. To enable successful communication between computers, an agreement is made as to how the incoming data should be processed. In our case the agreement concerns how the textual data is encoded. There are many different encodings, among which Latin-1 (ISO 8859-1) and UTF-8 are probably the best known. (The encoding not only tells the computer how the incoming bytes are to be read to obtain the correct text, it also determines which characters the text can contain and which it cannot.)
The first thing we noticed was that the text we wanted to upload to the Amazon API had to be encoded in Latin-1. This was a surprise, because Amazon is a company that operates internationally (75 countries in 5 continents), and we assumed it would use an encoding that supports as many characters as possible. Latin-1 supports the Latin alphabet and not much more. So why not use an encoding that covers the wide range of characters used internationally? Our client encoded his data in UTF-8, so to upload it to Amazon we had to convert it to Latin-1. There were encoding errors, of course, since the client's data contained characters that Latin-1 does not support, for example the zero-width space and the ellipsis.
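The conversion problem can be made visible before anything is sent. The following is a minimal sketch in Python, not the code from the project: the function name find_unencodable and the sample description are made up for illustration, and only the two problem characters (zero-width space and ellipsis) come from the text above.

```python
def find_unencodable(text: str, encoding: str = "latin-1") -> list[tuple[int, str]]:
    """Return (index, character) pairs that cannot be represented in `encoding`."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, char))
    return problems


# Hypothetical product description containing a zero-width space (U+200B)
# and an ellipsis (U+2026), the two characters named in the text.
description = "Wall sticker\u200b in many colours\u2026"

for index, char in find_unencodable(description):
    print(f"position {index}: U+{ord(char):04X} is not in Latin-1")

# German umlauts are part of Latin-1 and convert without trouble.
payload = "Küchenmöbel".encode("latin-1")
```

Reporting the offending positions, rather than replacing characters silently, leaves the decision about each character (drop it, substitute it, or fix the source data) with the seller.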
At our client’s request we added the ability to export the data as a TSV file in the format Amazon requires for uploads via the web interface. And here we were amazed again: this file has to be encoded in Windows-1252. Yet again an encoding that supports little more than the Latin alphabet and thus far fewer characters than UTF-8. We still cannot think of any good reason why Amazon uses different encodings for the same information, or why it uses either of those two encodings at all instead of simply UTF-8.
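The two encodings are close but not identical: Windows-1252 contains a few printable characters, the ellipsis among them, that Latin-1 lacks. The sketch below shows this with a small TSV export. It is an illustration under assumptions: the helper to_tsv_bytes and the column names are invented here and do not reflect Amazon's actual template columns.

```python
import csv
import io


def to_tsv_bytes(rows: list[list[str]], encoding: str = "cp1252") -> bytes:
    """Serialize rows as tab-separated text and encode the result."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    # str.encode is strict by default: it raises UnicodeEncodeError instead
    # of silently replacing characters the target encoding cannot hold.
    return buffer.getvalue().encode(encoding)


# Column names and values are invented for illustration only.
rows = [["sku", "title"], ["A-1", "Sticker\u2026"]]

tsv = to_tsv_bytes(rows)  # works: cp1252 has a code point for the ellipsis
print(tsv)

try:
    to_tsv_bytes(rows, encoding="latin-1")
except UnicodeEncodeError as error:
    print("Latin-1 rejects the same rows:", error.reason)
```

Data that passes the Windows-1252 export can therefore still fail on the Latin-1 path, so each target needs its own check.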
But well, we went along with those different encodings and made progress. The next surprise, however, was waiting for us just around the corner.
The ground is moving
To better understand our next challenge, we first need to get a clear picture of the structure of Amazon’s marketplace. On the left side of the image below we tried to illustrate the (internal) hierarchy as it presented itself to us while working with the templates Amazon provided. On the right side we show the hierarchy as it appears on the Amazon website.
- Template type: Groups one or more of Amazon’s main categories, SportingGoods for example. This layer is an internal structure and invisible to the shopper. Every template type has its own upload template file.
- Main categories (root browsenodes): The first visible node on the Amazon website, e.g. “Sportschuhe” (sport shoes).
- (Parent) browsenodes: The respective subcategories, e.g. “Herren” (men) and “Damen” (women).
- Leaf browsenodes: The subcategories of the subcategories, e.g. “Pumps” or “Sport- & Outdoorschuhe” (sport and outdoor shoes). If “Sport- & Outdoorschuhe” itself had several subcategories, it would be a parent browsenode and a leaf browsenode at the same time.
And here we came to the second hurdle in developing the program. During our implementation, Amazon decided to move main categories into other template types. Because each template type has its own template for creating and updating products, we got an error message saying that the main category did not match the template type. What happened? The main category Kitchen originally belonged to the template type Home. Amazon promoted Kitchen to a template type of its own. Surprisingly, Amazon also called this template type Kitchen. So now we had a new template type Kitchen with only one main category, also called Kitchen. We adapted our software accordingly. However, the next hurdle did not take long to appear.
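A mismatch like the Kitchen move described above can be caught locally before uploading. This is a rough sketch under assumptions: the mapping table and the function check_template_type are hypothetical, and only the Kitchen/Home pair is taken from the text. The real assignments would have to come from Amazon's current templates.

```python
# Hypothetical mapping from main category to template type. Only the
# Kitchen entry is based on the text; it used to point to "Home".
TEMPLATE_TYPE_BY_MAIN_CATEGORY = {
    "Kitchen": "Kitchen",
}


def check_template_type(main_category: str, template_type: str,
                        mapping: dict[str, str] = TEMPLATE_TYPE_BY_MAIN_CATEGORY) -> None:
    """Raise if a main category is about to be uploaded with the wrong template type."""
    expected = mapping.get(main_category)
    if expected is None:
        raise KeyError(f"no template type known for main category {main_category!r}")
    if expected != template_type:
        raise ValueError(
            f"main category {main_category!r} belongs to template type "
            f"{expected!r}, not {template_type!r}"
        )


check_template_type("Kitchen", "Kitchen")  # passes after the move

try:
    check_template_type("Kitchen", "Home")  # the assignment from before the move
except ValueError as error:
    print(error)
```

Keeping the mapping in one place means that a change on Amazon's side requires a one-line update instead of changes scattered through the upload code.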
The opaque hike
During the development period we saw several times that Amazon moved parent and leaf browsenodes to other root browsenodes. Each browsenode has a unique identifier, which can be looked up in the "Browsenode tree guide". The identifier is supplied when a product is uploaded so that the product is placed in the correct category. For example, the browsenode "Household / Kitchen / Furniture & Home Accessories" has the identifier 123. Amazon provides tables that link browsenodes to their identifiers. The identifier also appears in the URL of the products.
We used this identifier when creating or updating, for example, the product Butterflysticker. After the upload, however, the product was not in the expected category; we found it in "Household / DIY / Painter's Supplies, Tools & Wallpapers" instead. What happened? For reasons unknown to us, and without any explanation or error message, Amazon had sorted the product into a different browsenode (category) with a different identifier during the upload. The identifier in the product URL did not match the one Amazon listed for the category we had chosen, at least not according to the most recent table for these links. Even when we manually entered the identifier (123 for "Household / Kitchen / Furniture & Home Accessories"), we were redirected to the other browsenode, i.e. "Household / DIY / Painter's Supplies, Tools & Wallpapers". It was not that the browsenode with the identifier 123 no longer existed: on the regular Amazon website the category could still be found. This redirection to other browsenodes is not transparent to the layman and can also lead to an economic disadvantage, as a seller might miss the target group.
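Since the re-sorting happens without any error message, the only defence is to compare the identifier that was sent with the one the product ends up under. A minimal sketch, assuming the observed identifiers have already been collected by some other means (for example from the product URLs); the function find_misplaced, the SKU and the identifier 456 are invented for illustration, while 123 is the example identifier from the text.

```python
from typing import Optional


def find_misplaced(expected: dict[str, int],
                   observed: dict[str, int]) -> dict[str, tuple[int, Optional[int]]]:
    """Map each SKU whose observed browsenode differs to (expected, observed).

    A SKU that is missing from `observed` is reported with None.
    """
    misplaced = {}
    for sku, expected_node in expected.items():
        observed_node = observed.get(sku)
        if observed_node != expected_node:
            misplaced[sku] = (expected_node, observed_node)
    return misplaced


# 123 is the example identifier from the text; 456 is a made-up value
# standing in for the browsenode the product was moved to.
expected = {"BUTTERFLY-STICKER": 123}
observed = {"BUTTERFLY-STICKER": 456}

for sku, (wanted, got) in find_misplaced(expected, observed).items():
    print(f"{sku}: uploaded with browsenode {wanted}, found under {got}")
```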
The longing for the sandbox
Some readers may have wondered why we did everything on the live system and not in a test environment. The simple answer: because this possibility does not exist, neither for the upload via the web interface nor for the API. A brief explanation: a sandbox is an isolated environment where you can test program code without changing the live system. Errors in the software or in the product data have no dramatic consequences in this environment, since only the tester has access to the sandbox. The absence of a sandbox at Amazon means that every change to products is applied directly to the live system. For vendors with many different items, changing descriptions, and frequent price changes, this can cause serious problems, because price or description errors are immediately visible to the customer. The following is an example.
Amazon API - Deletion impossible
We had filled an optional attribute incorrectly for some 100 products. After we noticed the error, we started a second update upload. This time we left the field for that attribute blank, hoping that this would delete the incorrect contents. The result of this attempt: nothing happened. Amazon interprets blank fields as "nothing to update". In principle this is reasonable: if an empty field in the upload meant "delete the value", every upload would have to contain all values, or valuable information would be lost. The real problem was that there was no way to empty a field at all. 100 products had to be fixed by hand. We shivered when we realized that the mistake could just as easily have hit several thousand products. Unfortunately, we had no exact documentation of how the Amazon upload works. And that brings us to our final boss, or rather to the meta enemy.
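The behaviour we observed can be modelled in a few lines. This is only an illustration of the semantics described above, not Amazon's implementation; the function apply_update and the field names are hypothetical.

```python
def apply_update(stored: dict[str, str], upload: dict[str, str]) -> dict[str, str]:
    """Merge an upload into stored product data the way we observed it:
    blank fields mean "nothing to update" and never clear a value."""
    result = dict(stored)
    for field, value in upload.items():
        if value.strip() == "":
            continue  # a blank field leaves the stored value untouched
        result[field] = value
    return result


# Field names and values are invented for illustration.
stored = {"title": "Butterfly sticker", "material": "wrong value"}
second_upload = {"title": "Butterfly sticker", "material": ""}

print(apply_update(stored, second_upload))
# The incorrect "material" value survives the second upload.
```

Under these semantics a wrong value can be overwritten with another value but never removed, which is why the faulty fields had to be cleared by hand.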
The meta enemy
The most annoying part of implementing our software for (automated) product management was the meager or completely missing documentation. Much time had to be spent researching the facts mentioned above. Many functions, e.g. the CSV file upload, we had to learn the painful way, as described in the previous section.
It is normal that the environment and the interfaces have to be analyzed when developing software. Good and complete documentation, however, makes the work easier and helps save time and resources. Unfortunately, the documentation in this case was neither always good nor complete. In addition to the problems mentioned above, here are just a few points:
- The documentation relevant to sellers was not kept in a central location. We had to find links in various forums that led to the appropriate document.
- Other sellers frequently raised the same problems in Amazon's seller forum. However, the solutions to these problems were not added to the related documentation.
- The interface descriptions did not mark fields as optional or mandatory. We could only find this out with the help of other sellers who had already gained experience with them.
Finally, a few pieces of advice from a veteran for the way through the maze
Since Amazon is, despite all this, one of the largest online marketplaces in the world, we give you this advice for the road:
- If you use the Amazon API, you need your data in Latin-1. If you use the web interface, you need your data in CP-1252.
- Always use the latest version of the template for your template type. There you can see whether the templates have changed.
- Update your product range step by step, so that no major corrections are needed in case of an error.
- Check once more, or better twice, what you have changed before you send anything. Changes always go live.
- After a successful upload, check whether the products have been placed in the correct browsenodes.
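The "step by step" advice can be put into practice by splitting the product range into small batches and checking the result on the live site before sending the next one. A minimal sketch; the helper in_batches, the SKUs and the batch size are arbitrary choices for illustration.

```python
from typing import Iterator, Sequence


def in_batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical SKUs; a real run would upload each batch and verify the
# result before continuing with the next one.
skus = [f"SKU-{number}" for number in range(1, 8)]

for batch in in_batches(skus, size=3):
    print("upload and verify:", list(batch))
```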