Experimenting with WoW data - Part 2
In the last part we went through how to get the WoW auction data using the developer APIs. The auction data dump (auctions.json) is updated once every hour. I noticed that the dump is refreshed just before the top of the hour (UTC), so a scheduled job that fetches the updated auction dump once an hour should work fine.

In this section we will use Spark to do some basic analysis on the auction data.

Let's take a look at a row of auction data. Simple items have a row like the one shown below. Items such as legendary equipment and pets have additional fields such as bonusLists and petSpeciesId.

{"auc":1018440074,"item":41114,"owner":"Lougincath","ownerRealm":"Dalaran","bid":507084,"buyout":1000000,"quantity":1,"timeLeft":"LONG","rand":0,"seed":0,"context":0}

Next we will put the auction JSON data into a DataFrame. As the data dump has some additional me...