Exploring the JS HTTP Virtual File System with Arcade Games Data
In this follow-up to the JS HTTP virtual file system post, I experimented with a new website built on a different data set. This time I used a 10 MB data set of arcade games from the Internet Archive's Arcade project. The data covers flash games, organized by title and name, and comes to about 11,000 to 12,000 records in a SQLite database. Most of the data is indexed, and I built a view to test the performance.
Performance Insights
CDN Performance
The CDN performance is outstanding, slightly faster than anything else I've deployed to any CDN I've used. This gives me a good idea of where the system excels and where it might fail. The website is impressively fast and offers infinite pagination for each category. It loads 30 records at a time using ordinary paginated SQL queries with LIMIT and OFFSET, which makes the process almost instantaneous. On a fast internet connection the loading indicator is practically invisible: subsequent requests for additional pages take only about 30-40 ms, which is imperceptible.
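The LIMIT and OFFSET paging described above comes down to a little arithmetic on the page index. Here is a minimal TypeScript sketch of that arithmetic. The games table, its category, title, and name columns, and the pageQuery helper are all hypothetical, since the real schema is not shown in this post.

```typescript
// Illustrative only: table and column names are assumptions, not the site's schema.
const PAGE_SIZE = 30; // the post loads 30 records at a time

interface PageQuery {
  sql: string;
  params: (string | number)[];
}

// Build the parameterized query for one page of a category (zero-based page index).
function pageQuery(category: string, page: number, pageSize: number = PAGE_SIZE): PageQuery {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError("page must be a non-negative integer");
  }
  return {
    sql: "SELECT title, name FROM games WHERE category = ? ORDER BY title LIMIT ? OFFSET ?",
    // OFFSET skips every row that belongs to an earlier page.
    params: [category, pageSize, page * pageSize],
  };
}
```

Each scroll step only needs to increment the page index and hand the resulting sql and params to whatever query function the SQLite driver exposes.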
Cold Start Challenge
However, there's a significant cold start problem. The WebAssembly module must be loaded before the first request, which takes anywhere from 900 ms to 1.5 seconds. The module is a 1.3 MB WebAssembly binary containing the SQLite database driver. This delay can be hidden by starting the download asynchronously as soon as the webpage loads, before the user actually needs it.
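One way to picture this is a loader that starts once at page load and hands the same promise to every later caller. The sketch below makes several assumptions: loadSqliteModule is a hypothetical stand-in for fetching and compiling the WebAssembly module, and preloadOnce and openCategory are illustrative names, not part of any library.

```typescript
// Wrap a loader so the expensive work starts once and every caller shares the result.
function preloadOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load();
    }
    return pending;
  };
}

// Hypothetical stand-in for downloading and compiling the ~1.3 MB module.
let loads = 0;
async function loadSqliteModule(): Promise<string> {
  loads += 1;
  return "sqlite-ready";
}

const getDb = preloadOnce(loadSqliteModule);

// Start the download as soon as the page script runs; the result is not awaited here.
void getDb();

// Later, when the user opens a category, the already-running load is reused.
async function openCategory(category: string): Promise<string> {
  const db = await getDb();
  return `${db}:${category}`;
}
```

If the user clicks before the module is ready, openCategory simply waits for the remainder of the download instead of starting a new one.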
Query Performance
The initial query requires loading about 1.2 MB of the 10 MB database, but subsequent requests are extremely fast, often around 5-10 ms, likely due to caching. Not every request is that quick: some come in at about 40-50 ms for 30 KB of data.
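The caching effect mentioned above can be illustrated with a small page cache placed in front of byte-range reads. This is a sketch under stated assumptions, not the actual implementation of the virtual file system: PageCache and FetchRange are hypothetical names, 4096 bytes is an assumed page size, and an in-memory array stands in for the remote database file. In a browser, the fetchRange function would typically issue an HTTP request with a Range: bytes=start-end header.

```typescript
// Reads bytes start..end (inclusive), mirroring the HTTP "Range: bytes=start-end" form.
type FetchRange = (start: number, end: number) => Promise<Uint8Array>;

class PageCache {
  // Caching the promise (not the bytes) lets concurrent readers share one request.
  private readonly pages = new Map<number, Promise<Uint8Array>>();
  private readonly fetchRange: FetchRange;
  private readonly pageSize: number;
  fetches = 0; // number of range requests actually issued

  constructor(fetchRange: FetchRange, pageSize: number = 4096) {
    this.fetchRange = fetchRange;
    this.pageSize = pageSize;
  }

  readPage(index: number): Promise<Uint8Array> {
    let page = this.pages.get(index);
    if (page === undefined) {
      const start = index * this.pageSize;
      const end = start + this.pageSize - 1;
      this.fetches += 1;
      page = this.fetchRange(start, end);
      // A fuller version would evict the entry if the request fails.
      this.pages.set(index, page);
    }
    return page;
  }
}

// In-memory stand-in for the remote file, so the sketch runs without a network.
const fakeFile = new Uint8Array(16384).map((_, i) => i % 256);
const fakeFetch: FetchRange = async (start, end) => fakeFile.slice(start, end + 1);
```

Only the first read of a page costs a request; later reads are served from memory, which is also why the cache can eventually grow to the size of the whole database.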
User Experience
When users land on the homepage, the cold start can be masked because it happens asynchronously. The webpage itself loads in under a second, and the WebAssembly download starts right away and takes about another second. As long as users need a couple of seconds to navigate to their desired category, the experience feels quick and seamless.
Considerations for Larger Data Sets
Optimizing further might not be feasible with a 10 MB data set. For larger data sets, the cold start problem can still be hidden effectively. I considered how this approach might work for platforms like Netflix or HBO Max, where the entire public catalog of titles could be stored in one such database, along with descriptions and URLs to images.
Memory Usage
One trade-off is that the data gets cached, potentially consuming a significant amount of memory in JavaScript, up to the size of the entire database. This could be an issue for platforms like webOS or lower-end Android devices, especially Amazon devices. Higher-end devices like Apple TV could handle it, though implementing such technology there might not be worth the effort, since everything would have to be reinvented from scratch.
Conclusion
This technique allows deploying a website directly to a static CDN without worrying about a database server, while still providing the ability to query a SQL database in read-only mode. It's a neat solution with potential use cases, especially in data science for exploring data sets.