This is mostly a thought experiment, but it's also something I'd like to do real experiments with if I were to find a practical path forward.
Say you're at a festival or a conference: you and the rest of the crowd are all accessing the same scheduling info through the website, perhaps along with a chat/discussion app, all localized to the event. But every person's data is going back and forth to some data center in Utah, and everybody's having a really shitty high-latency experience on the crappy network. Even though you're all accessing exactly the same data, you're all independently beaming it across the country or the world.
How can you overcome this problem in a practical way? Meaning nobody really has to know anything about the underlying technology: they access the website as usual, but under the hood you treat them differently?
Things that come to mind are some potential peer-to-peer solutions, or perhaps a local server that's plenty powerful to handle all the traffic and keep a warm cache for it. I have a grip on the way this might be architected, but I'm really not sure where to begin on how this kind of thing might be implemented.
Again, this is more of a personal interest than anything, but if there were a practical implementation, I'd love to try something like this with dev.to at a conference or something. The site is already built to serve a lot of widely shared caches from our CDN's edge nodes.
Thoughts?
Top comments (10)
I've thought of providing a local DNS server along with a local ultra-fast Wi-Fi network for your specific local event... especially when you mention something like: "they access the website as usual, but under the hood you treat them differently"
I am aware that some people might feel this kind of optimization is evil, but if everything is local... I mean everything, really... who cares?
Oh, I remember a real-world example of this: Comic Market.
Comic Market is the largest offline otaku event in Japan... roughly 590,000 people attend the single event over 3 days. The venue is basically 3-4 huge domes.
The network issue has been a big (and well-known) problem, so people started to solve it technically. In recent years, tech teams from the Japanese mobile network companies (e.g. SoftBank, au, NTT Docomo) attend the event officially; you can find their state-of-the-art vans with huge antennas. Yes, it's a van-sized yet huge base station for a huge network (i.e. the Japanese mobile network itself).
Rogue Wi-Fi access points (i.e. personal mobile Wi-Fi hotspots) are strongly discouraged at Comic Market because they cause congestion. Most people know that the normal mobile network is the best option there, and some people know about the hard work being done behind the scenes.
This would create a faster experience, but everyone would still ultimately be going to Utah and back, right? If your website could be edge-cached right at the event, how might we go about serving that HTML on-location?
Yes. My original thoughts are basically a workaround for getting things faster while being minimally evil™.
You can do edge-caching for your local event if you build the local network yourself like that, but you still need to take control of the local DNS server to point clients at that specific on-location cache.
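Just to make the "take control of the local DNS server" part concrete, here's a toy sketch of the idea in Python (the 192.168.0.10 cache address is made up; a real setup would use something like dnsmasq or unbound, and would only override the event's domain instead of answering everything):

```python
import socket
import struct

CACHE_IP = "192.168.0.10"  # hypothetical on-site cache machine

def build_response(query: bytes) -> bytes:
    """Answer a DNS query with a single A record pointing at the on-site cache."""
    tid = query[:2]                              # echo the transaction ID
    flags = struct.pack(">H", 0x8180)            # standard response, recursion available
    counts = struct.pack(">HHHH", 1, 1, 0, 0)    # 1 question, 1 answer
    end = 12                                     # question section starts after the header
    while query[end] != 0:                       # walk the QNAME labels
        end += query[end] + 1
    end += 5                                     # null label + QTYPE + QCLASS
    question = query[12:end]
    # Answer: pointer to the name at offset 12, type A, class IN, TTL 60s, 4-byte address.
    answer = struct.pack(">HHHLH", 0xC00C, 1, 1, 60, 4) + socket.inet_aton(CACHE_IP)
    return tid + flags + counts + question + answer

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 53))                       # needs root / CAP_NET_BIND_SERVICE
while True:
    data, addr = sock.recvfrom(512)
    sock.sendto(build_response(data), addr)      # toy: answers every query, even AAAA, with the same A record
```

The event's DHCP would hand this box out as the resolver. The big caveat is HTTPS: the on-location cache has to actually terminate TLS for the domain, otherwise certificate errors give the game away.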
I think you can't just do edge-caching (or on-site caching) for websites you don't control; that's real evil stuff IMHO. If you really need to speed up those external (yet specialized-to-your-event) websites, I think the best way is to make a contract with their company and do regular edge-caching as the event organizer.
EDIT: if you don't want to modify the network, why not change your domain's A & AAAA records to point to your local server machine at your event location, then make that machine an edge node for HTML caching?
I am not sure if this is what you are looking for, but a local (to the conference) proxy server is how this kind of thing is usually done.
It functions as a cache. If your requests are mostly GETs with no parameters, they are served from the local cache. The user is completely unaware of the caching. You can also make the caching more aggressive and cache some GETs with parameters (for example, search results).
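A bare-bones version of such a proxy might look like this (the origin URL and the /search allow-list are just placeholders, and there's no cache expiry or header handling; a real deployment would reach for nginx or Varnish instead):

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlsplit
from urllib.request import urlopen

ORIGIN = "https://dev.to"            # hypothetical origin site being fronted
CACHEABLE_WITH_PARAMS = {"/search"}  # allow-list of parameterized GETs worth caching
cache = {}                           # full request path -> (status, content type, body)

class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        parts = urlsplit(self.path)
        # Cache GETs with no parameters, plus allow-listed ones with parameters.
        cacheable = not parts.query or parts.path in CACHEABLE_WITH_PARAMS
        if cacheable and self.path in cache:
            status, ctype, body = cache[self.path]       # served on-site, no trip to the origin
        else:
            with urlopen(ORIGIN + self.path) as resp:    # cache miss: forward to the origin
                status = resp.status
                ctype = resp.headers.get("Content-Type", "text/html")
                body = resp.read()
            if cacheable and status == 200:
                cache[self.path] = (status, ctype, body)
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

ThreadingHTTPServer(("0.0.0.0", 8080), CachingProxy).serve_forever()
```

Users stay completely unaware of it as long as the event network routes the site's hostname to this box (which is where the DNS discussion above comes in).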
Usually, the biggest problem is the WiFi connection. I have never seen that kind of thing work flawlessly in large conferences such as Re:Invent. There are always some issues.
A peer-to-peer solution would probably create even more problems due to the increased WiFi traffic.
I don't know this for certain, but here's one way I imagine it could be done where a common WiFi is available:
Curious now if this could actually work :D
Edit: Clarifying the caching/forwarding server: it could simply be an instance of your webapp speaking with the main database over an isolated connection, using a local Redis/Tarantool/etc. for responding to floods of requests.
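If I'm reading that right, the on-site instance would do read-through caching against a local Redis before ever touching the main database. A minimal sketch of that pattern (the get_schedule_from_main_db function and the key/TTL choices are made up for illustration):

```python
import json
import redis  # pip install redis; assumes a Redis server running on the local box

local_cache = redis.Redis(host="localhost", port=6379)

def get_schedule_from_main_db(event_id: str) -> dict:
    # Hypothetical stand-in for the slow call back to the main database in Utah.
    return {"event": event_id, "sessions": []}

def get_schedule(event_id: str) -> dict:
    key = f"schedule:{event_id}"
    cached = local_cache.get(key)
    if cached is not None:
        return json.loads(cached)                  # flood of requests served locally
    data = get_schedule_from_main_db(event_id)     # only the occasional miss goes upstream
    local_cache.setex(key, 30, json.dumps(data))   # keep it warm for 30 seconds
    return data
```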
Well the first thing to ask yourself is where your bottleneck is. If the site is slow and high latency because your servers are collapsing under the sudden load, that's what you need to fix. The fact that everyone is making round trips to Utah or whatever is largely irrelevant.
If it's slow because the WiFi at the conference is congested, you could probably speed it up by caching and serving static assets locally, but this would require a lot of cooperation from the WiFi provider at the event. You could map the DNS for your CDN to a locally hosted server.
You could set up your own WiFi (if the organizers let you do that), but unless you want your users to have to constantly switch networks, you'll need a fat enough pipe to your own WiFi to serve ALL of the bandwidth needs of your users.
Since you mentioned p2p, two things pop into my mind:
Both solutions, however, require a special application that users need to download and install.
I'd first investigate whether the round trip is actually a problem. As long as the networking equipment isn't overloaded, a round trip from anywhere in the continental USA to Utah shouldn't impose any kind of human-relevant bottleneck. Your ping times could already be just a few tens of milliseconds. It's most likely your server that is causing the bottleneck, but is it data or is it processing?
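One quick way to check before building anything: time the TCP connect (roughly one network round trip) separately from the time to first byte (round trip plus server work). A rough sketch, assuming the site is reachable over HTTPS:

```python
import socket
import ssl
import time

HOST = "dev.to"  # the site under test

# Network cost: how long the TCP handshake takes (about one round trip).
start = time.perf_counter()
raw = socket.create_connection((HOST, 443), timeout=5)
connect_ms = (time.perf_counter() - start) * 1000

tls = ssl.create_default_context().wrap_socket(raw, server_hostname=HOST)

# Server cost: send a minimal request and wait for the first response byte.
start = time.perf_counter()
tls.sendall(f"GET / HTTP/1.1\r\nHost: {HOST}\r\nConnection: close\r\n\r\n".encode())
tls.recv(1)
ttfb_ms = (time.perf_counter() - start) * 1000
tls.close()

print(f"TCP connect: {connect_ms:.1f} ms (network round trip)")
print(f"TTFB:        {ttfb_ms:.1f} ms (round trip + server processing)")
```

If the TTFB dwarfs the connect time, it's server processing, not the trip to Utah, that's hurting.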
If data is your bottleneck, then storing files on edge nodes in a CDN is a good option for speeding it up. This even works for streaming live data: you can provide dynamic data on a CDN. If everybody is receiving the same synchronized data, then using multicast might be an option.
But say you can't find a CDN, or the processing overhead is on the server. The next option is peer-to-peer. The biggest problems here are discovery and connectivity. Discovery can be solved by registering peers on the server in Utah: as clients connect and attach to the same event, they get back a list of peers.
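The discovery half could be as small as an endpoint on the existing server, something like this sketch (a hypothetical Flask route and payload shape, just to show the register-then-get-a-peer-list idea):

```python
# pip install flask
from collections import defaultdict
from flask import Flask, jsonify, request

app = Flask(__name__)
peers_by_event = defaultdict(list)  # event id -> list of peer addresses seen so far

@app.route("/events/<event_id>/peers", methods=["POST"])
def register_peer(event_id):
    # A client announces how it can be reached (e.g. its IPv6 address and a port).
    peer = {"address": request.json["address"], "port": request.json["port"]}
    others = list(peers_by_event[event_id])   # everyone who registered before this client
    peers_by_event[event_id].append(peer)
    return jsonify(others)                    # hand back the current peer list

if __name__ == "__main__":
    app.run(port=5000)
```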
Now you just hope they have real addresses and can talk directly to each other (I'd go for IPv6 at this point to increase the chances). If there is a WLAN in the area and they all connect to the same network, this will be fastest. If they're connected to the cellular network, they'll still end up doing a round trip through a server, but hopefully something more local. This solution should work within a browser with JavaScript networking -- it won't be pleasant though.
If you're really adventurous, and can install software, you can look at using NFC and Bluetooth. If the crowd is dense enough you could possibly form a network with these. Unfortunately this would require the users to authorize the connections, as any sane phone won't auto-connect.
You could set up a local area network. All traffic coming from that network could be redirected to subdomains like local1.event.com, local2.event.com, and so on, which have DNS entries that resolve locally. Yes, to 192.168.xxx.xxx addresses.
That's how I make my dashboard available to my mobile phone on my local Wi-Fi network at home :-)