Hey there!
I've done some analysis on the issue but can't for the life of me figure out how to PM BearyBear. Does anybody else know how, and could you point him towards this post? Anyway, here it goes:
The issue is apparently caused by a combination of high packet loss and server-side gzip encoding. Well, either that or an obscure Flash issue triggered by those two factors (I am not familiar enough with Flash to say for sure).
Tools used:
- a network protocol analyzer (wireshark, etc)
- firefox (any recent version)
- iptables on Linux (or any Windows debugging network driver equivalent, no idea)
Steps to reproduce:
1. Induce a certain amount of packet loss. For this test, I forced about 20% of all incoming TCP packets to be dropped (iptables -I INPUT -p tcp -m statistic --mode random --probability 0.2 -j DROP). This is equal to or greater than the share of dropped packets (indicated by duplicate ACKs) when I first tried to play the game.
2. Start firefox (or any other browser, really) with a fresh profile to rule out the effects of caching.
3. Wait for the game to load, fire up wireshark and play a challenge.
The HTTP POST to /trivia/backend/api-interface.php will work as intended. As a result of the returned data, firefox will then download the question's video flv file:
GET /trivia/media/Season15/Q_1511_03_QuizGame.flv HTTP/1.1
Host: services.southparkstudios.com
[...]
Accept-Encoding: gzip, deflate
Connection: keep-alive

HTTP/1.1 200 OK
Server: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/0.9.8r-fips PHP/5.3.10
Last-Modified: Fri, 28 Oct 2011 23:25:26 GMT
ETag: "1b5b2f533-6a524-4b064343676d8"
Accept-Ranges: bytes
Content-Type: text/html
Content-Encoding: gzip
Content-Length: 423938
Cache-Control: max-age=7200
Expires: Tue, 03 Apr 2012 07:57:23 GMT
Date: Tue, 03 Apr 2012 05:57:23 GMT
Connection: keep-alive
Vary: Accept-Encoding
Due to the packet loss induced in step 1, the flv file will take a bit longer to download than usual, but the request will still finish successfully and within a very reasonable time. As you can see, firefox advertised its support for gzip and deflate encoding and the server did indeed send the data gzip encoded. For whatever reason (timeout? garbled data -> gzip decoding failure? tcp retransmit issues?), Flash will only play the first frame(s) and then stop, causing the game to get stuck.
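To make the "gzip decoding failure" guess a bit more concrete: a gzip stream ends with a CRC32 and length trailer, so a decoder that is handed a cut-off or altered body can tell that something is wrong. The sketch below uses Python's standard gzip module on a made-up payload. It is not the real flv and not Flash's actual decoder (which I haven't looked at), and can_decode is just a name I picked. It only shows what a decoder does if it gets incomplete or altered bytes; whether Flash really gets such data here is exactly the open question.

```python
import gzip
import zlib

# Made-up stand-in for the video data; the real flv is not used here.
payload = bytes(range(256)) * 64
compressed = gzip.compress(payload)


def can_decode(data: bytes) -> bool:
    """Return True if data is a complete, valid gzip stream."""
    try:
        gzip.decompress(data)
        return True
    except (EOFError, OSError, zlib.error):
        # EOFError: the stream ends early.
        # OSError: covers gzip.BadGzipFile, e.g. a CRC mismatch.
        # zlib.error: invalid deflate data.
        return False


# Case 1: the body stops early, as if the transfer were cut off.
truncated = compressed[: len(compressed) // 2]

# Case 2: one bit flipped in the CRC32 field of the 8-byte trailer.
bad_crc = bytearray(compressed)
bad_crc[-8] ^= 0x01
bad_crc = bytes(bad_crc)

print("intact:   ", can_decode(compressed))
print("truncated:", can_decode(truncated))
print("bad CRC:  ", can_decode(bad_crc))
```

Only the intact stream decodes. A player that stops at the first error instead of waiting or retrying would look pretty much like what I'm seeing, but again, that part is speculation.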
4. Open about:config in firefox and set the value of network.http.accept-encoding to the empty string. This will force firefox to no longer advertise its support for gzip and deflate encoding.
5. Repeat step 3 with a different challenge to avoid any effects of caching. The request should now look as follows:
GET /trivia/media/Season16/Q_1601_08_QuizGame.flv HTTP/1.1
Host: services.southparkstudios.com
[...]

HTTP/1.1 200 OK
Server: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/0.9.8r-fips PHP/5.3.10
Last-Modified: Thu, 15 Mar 2012 00:37:19 GMT
ETag: "1e5c14f71-c7e7b-4bb3d4b6de5ae"
Content-Type: text/html
Cache-Control: max-age=7117
Expires: Tue, 03 Apr 2012 07:54:48 GMT
Date: Tue, 03 Apr 2012 05:56:11 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Connection: Transfer-Encoding
Since gzip is no longer an option, the server has reverted to regular chunked encoding without compression. The video will lag and stutter but the game should be playable.
6. Revert the changes of step 4 and repeat step 5. The game should no longer work.
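For anyone who wants to repeat the header comparison from steps 3 to 6 without touching about:config, here is a rough Python sketch. It assumes the file is reachable over plain http at the host and path from the capture above; fetch_headers and summarize are names I made up for this post. Note that Python's http.client sends "Accept-Encoding: identity" when you don't set the header yourself, which is not exactly what firefox does with an empty pref, but it should equally tell the server not to compress. This only checks what the server sends; it does not reproduce the Flash stall itself.

```python
import urllib.request

# Host and path are taken from the capture; the http:// scheme is an assumption.
URL = ("http://services.southparkstudios.com"
       "/trivia/media/Season15/Q_1511_03_QuizGame.flv")


def fetch_headers(url, accept_encoding=None):
    """Request url and return the response headers with lowercase names."""
    headers = {}
    if accept_encoding:
        headers["Accept-Encoding"] = accept_encoding
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return {name.lower(): value for name, value in resp.headers.items()}


def summarize(headers):
    """Describe how the body was sent, based on lowercase response headers."""
    if headers.get("content-encoding", "").lower() == "gzip":
        return "gzip, Content-Length " + headers.get("content-length", "?")
    if "chunked" in headers.get("transfer-encoding", "").lower():
        return "uncompressed, chunked"
    return "uncompressed"


# Usage (needs network access, so left commented out):
# print(summarize(fetch_headers(URL, "gzip, deflate")))
# print(summarize(fetch_headers(URL)))
```

If the server still behaves as in my captures, the first call should report gzip and the second one chunked.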
Long story short, gzip together with packet loss seems to be the root cause. I see two ways to fix this:
1. The proper way: find out why so many duplicate ACKs occur, whether any network appliance mishandles retransmits, and in what way the gzipped data is mangled.
2. The sane way: disable transparent gzip encoding for the game's flv files in /trivia/media/.
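For option 1, a first step would be putting a number on the duplicate ACKs. A very rough sketch: given the ACK numbers sent by one side of one TCP connection, in capture order, count how often a number simply repeats the one before it. This is much cruder than wireshark's own duplicate ACK analysis, which as far as I know also looks at things like payload length and window size, and count_duplicate_acks is just a name I picked. The sample numbers are made up.

```python
def count_duplicate_acks(ack_numbers):
    """Count ACK numbers that repeat the immediately preceding one.

    ack_numbers: ACK values from ONE direction of ONE connection, in
    capture order. Simplification: any repeat counts, even if the
    packet carried data or a window update.
    """
    duplicates = 0
    previous = None
    for ack in ack_numbers:
        if ack == previous:
            duplicates += 1
        previous = ack
    return duplicates


# Made-up sequence: the receiver keeps asking for byte 2921 until the
# lost segment finally arrives.
sample = [1461, 2921, 2921, 2921, 2921, 7301]
print(count_duplicate_acks(sample))
```

The ACK numbers themselves could be exported from a capture with something like tshark's field output (-T fields -e tcp.ack), filtered down to a single stream and direction first.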
I'd rather not post the resulting pcap files in public, but if you want to have a look at them, just drop me a line.
And now that you made me work when all I was trying to do was procrastinate: let me know where to send the bill.
