I've updated the page with a MIPSR2 build.
QoS doesn't really reorder packets, at least not with pfifo or sfq.
pfifo is first in, first out, unless the packet is specially marked, like ECN or no_nodelay. Most TCP traffic is unmarked, though, so it gets plain FIFO.
sfq is also pretty much FIFO, except that it balances bandwidth between connections (to make sure packets at the bottom of the buffer don't starve for bandwidth).
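To make the difference concrete, here is a toy sketch in Python. It is not how the kernel implements sfq (the real thing hashes connections into buckets and works with byte quanta); it just assumes one packet per connection per round. The flow names "A" and "B" and the helper names are made up for illustration.

```python
from collections import OrderedDict, deque

def fifo_order(arrivals):
    """Plain FIFO: packets leave in exactly the order they arrived."""
    return list(arrivals)

def round_robin_order(arrivals):
    """Simplified fair queuing: one packet per connection per round.

    Order within a connection is preserved, so nothing is reordered
    inside a single TCP stream.
    """
    flows = OrderedDict()
    for flow, seq in arrivals:
        flows.setdefault(flow, deque()).append((flow, seq))
    out = []
    while flows:
        # Iterate over a snapshot of the keys so empty flows can be removed.
        for flow in list(flows):
            out.append(flows[flow].popleft())
            if not flows[flow]:
                del flows[flow]
    return out

# Hypothetical queue: connection A has four packets queued ahead of B's one.
arrivals = [("A", 1), ("A", 2), ("A", 3), ("A", 4), ("B", 1)]
print(fifo_order(arrivals))         # B's packet leaves last
print(round_robin_order(arrivals))  # B's packet leaves second
```

With FIFO, B's single packet waits behind everything A queued. With the round-robin version it goes out second, while A's packets still leave in their original order.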
With the default builds of Tomato (queue limits of 128 and 256 packets), QoS is pretty much just bandwidth limiting, like you say. Because the buffers are so large, packets practically never get dropped.
IMHO, packets should be dropped so that TCP knows to back off on speed and keep the link open.
What I've done is more like properly tuned QoS. Instead of making QoS ineffective, it should now be more effective for latency.
Another neat property of smaller buffers is that even if QoS is set up incorrectly, it now won't lag so much.
Essentially, we could tune for a "default" QoS behavior by specifying packet limits. 256 packets is just too much unless you have 100 Mbit upload.
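A rough back-of-the-envelope check of why 256 packets is too much on a slow uplink. This assumes every queued packet is a full 1500 bytes and that the queue is completely full, so it is a worst case. The function name and the rates tried are my own picks, not anything from Tomato.

```python
def queue_delay_ms(packets, uplink_bps, packet_bytes=1500):
    """Worst-case time in milliseconds to drain a full queue.

    Assumes `packets` full-size packets of `packet_bytes` each and a
    constant uplink rate of `uplink_bps` bits per second.
    """
    return packets * packet_bytes * 8 * 1000 / uplink_bps

# Hypothetical queue limits and upload rates to compare.
for limit in (256, 128, 32):
    for mbit in (1, 10, 100):
        delay = queue_delay_ms(limit, mbit * 1_000_000)
        print("%3d packets at %3d Mbit/s: %7.1f ms" % (limit, mbit, delay))
```

Under those assumptions a full 256-packet queue adds about 3 seconds of delay at 1 Mbit/s, but only about 31 ms at 100 Mbit/s.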