December 13th, 2008

best scam EVER!

Here's how it works:

* You purchase bidding tickets in pre-packaged blocks of at least 30. Each ticket costs you 75 cents, with no volume discount.
* When someone uses a ticket to place a bid, the purchase price goes up by 15 cents and the auction time increases by 15 seconds.
* Once the auction ends, the winner pays the final price and gets the item.
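The mechanics above can be sketched as a tiny state machine. This is my own illustrative model, not Swoopo's code; the constants come from the bullet points, and all names are hypothetical:

```python
# Minimal sketch of the bidding mechanics described above.
# Constants come from the post; class and method names are my own.

TICKET_COST_CENTS = 75    # price of one bidding ticket
INCREMENT_CENTS = 15      # each bid raises the sale price by this
TIME_EXTENSION_SECS = 15  # each bid pushes the deadline out by this

class PennyAuction:
    def __init__(self, start_price_cents=0, seconds_left=60):
        self.price_cents = start_price_cents
        self.seconds_left = seconds_left
        self.site_revenue_cents = 0  # ticket money pocketed by the site

    def place_bid(self):
        """One bidder spends one ticket: price and clock both go up."""
        self.site_revenue_cents += TICKET_COST_CENTS
        self.price_cents += INCREMENT_CENTS
        self.seconds_left += TIME_EXTENSION_SECS

auction = PennyAuction()
for _ in range(10):
    auction.place_bid()
# After 10 bids: price is up $1.50, but bidders have paid $7.50 in tickets.
```

Note the asymmetry that makes the scheme work: every 15-cent bump in the visible price quietly delivers 75 cents to the house.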

I just watched an 8GB Apple iPod Touch sell on swoopo for $187.65. The final price means a total of 1,251 bids were placed for this item, costing bidders a grand total of $938.25. So that $229 item ultimately sold for $1,125.90.
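As a sanity check on that arithmetic (a sketch; the variable names are mine), working in integer cents to avoid floating-point rounding:

```python
# Verify the iPod Touch numbers quoted above.
TICKET_COST_CENTS = 75
INCREMENT_CENTS = 15

final_price_cents = 18765  # $187.65 winning price
num_bids = final_price_cents // INCREMENT_CENTS
ticket_revenue_cents = num_bids * TICKET_COST_CENTS
total_cents = final_price_cents + ticket_revenue_cents

print(num_bids)                    # 1251 bids placed
print(ticket_revenue_cents / 100)  # 938.25 dollars spent on tickets
print(total_cents / 100)           # 1125.9 dollars collected in total
```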


Man, that is some evil-genius-level stuff right there! Doing the math: 75 = 15 * 5, so for every dollar the auction price goes up, Swoopo has made $5 on bid tickets! And their only real expense to run this whole thing is the cost of their internet connection! They could easily make 450% profit - a number so ridiculous it would make any casino owner (who typically makes only 15-25% on his games) blush in shame!

The real kicker, though - the one that really makes you laugh out loud - is the "auction for bid tickets". Wow.

"Nobody ever went broke underestimating the intelligence of the American public." -H. L. Mencken

See also: Dollar Auction, All-Pay Auction

Scaling memcached to 8 cores at Facebook.

Finally, as we started deploying and testing 8-core machines, we discovered new bottlenecks. First, memcached's stat collection relied on a global lock. A nuisance with 4 cores, the lock now accounted for 20-30% of CPU usage with 8 cores. We eliminated this bottleneck by moving stats collection per-thread and aggregating results on demand.

Second, we noticed that as we increased the number of threads transmitting UDP packets, performance decreased. We found significant contention on the lock that protects each network device's transmit queue. Packets are enqueued for transmission and dequeued by the device driver. This queue is managed by Linux's "netdevice" layer, which sits between IP and the device drivers. Packets are added to and removed from the queue one at a time, causing significant contention. One of our engineers changed the dequeue algorithm to batch dequeues for transmit, drop the queue lock, and then transmit the batched packets. This change amortizes the cost of the lock acquisition over many packets and reduces lock contention significantly, allowing us to scale memcached to 8 threads on an 8-core system.
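The per-thread stats pattern is worth spelling out. This is not Facebook's actual code (memcached is written in C), just a sketch of the idea in Python: each worker thread increments its own private counters with no shared lock on the hot path, and a stats read aggregates across threads only on demand.

```python
import threading
from collections import Counter

class PerThreadStats:
    """Each thread owns a private Counter; reads aggregate on demand.

    This sidesteps a single global stats lock: the only shared lock
    is taken once per thread, when its counter is first registered.
    """
    def __init__(self):
        self._local = threading.local()
        self._all = []  # one Counter per registered thread
        self._register_lock = threading.Lock()

    def incr(self, key, n=1):
        counter = getattr(self._local, "counter", None)
        if counter is None:
            counter = self._local.counter = Counter()
            with self._register_lock:  # rare: once per thread
                self._all.append(counter)
        counter[key] += n  # fast path: no shared lock, no contention

    def aggregate(self):
        """Sum every thread's counters; called only when stats are read."""
        total = Counter()
        with self._register_lock:
            snapshot = list(self._all)
        for c in snapshot:
            total.update(c)
        return total
```

The design trade-off is the same one the post describes: writes (which happen on every request) become contention-free, while reads (rare stats queries) pay the cost of walking every thread's counters.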

Since we’ve made all these changes, we have been able to scale memcached to handle 200,000 UDP requests per second with an average latency of 173 microseconds. The total throughput achieved is 300,000 UDP requests/s, but the latency at that request rate is too high to be useful in our system. This is an amazing increase from 50,000 UDP requests/s using the stock version of Linux and memcached.