MySQL and Storing Data
Hey everyone! Today, we're going to talk about data, the ways that it can be stored, and the reasons that it's so important to structure it correctly. A little over a year ago, Punchmark made a big change to the way we stored product data, so I'd like to use that as an example of some of the different principles behind data storage and management.
Punchmark uses a database query language called MySQL. MySQL is what's known as a relational database system. This means that we store data in pre-defined tables that can then be dynamically linked together in various ways. You can think of it as similar to Excel spreadsheets, except that at any time, you're able to join together and filter multiple spreadsheets at once to organize your data.
Originally, Punchmark clients all had their own product tables, and every product was stored as a row in these tables. This is a great way to store products that are entered manually or uploaded from The EDGE - it allows for quick lookups and easy organization. However, we quickly ran into a problem when we started growing the premium vendor program. For a while, we were able to get away with duplicating premium vendor products across every client table, but then we ran into Stuller's Ever & Ever line.
Ever & Ever was highly requested, so we really wanted to implement it - but there were 130,000 products (due to the large number of item variations). Before then, even our biggest clients only had maybe 20,000 products, so duplicating 130,000 across lots of databases simply wasn't an option because we didn't have the storage space for it. But fortunately, MySQL allows for a fun trick that helped us out a lot. Instead of duplicating across all client tables, we decided to store premium vendor items in one master items table. Then, when loading products on a website, we can get any products in the local items table (such as manually uploaded or EDGE products), grab the client's premium vendor products from the master table, and join them together. With this system, we have lots of clients running well over 150,000 products on their websites, and not using much more storage than we were before.
There is a downside to this. Because we have to join these tables together every time, it ends up being more work for the server when actually pulling the data together. This tradeoff of storage space vs. computational performance is a fundamental issue when storing data, and you'll always have tradeoffs. However, with indexing and query optimization (if you want a dev dive on those, let me know!), we can mitigate a lot of those concerns in this case.