Archive for August, 2007

Google hell continues!

Saturday, August 25th, 2007

Well, it’s been two weeks since I signed up for Google Sitemaps to use the console to remove the duplicate site that has trashed my rankings.

I logged on again today to make sure that the request to delete the site from the Google index had been processed only to see that it has in fact been denied! The reason was that I hadn’t put in a robots.txt to exclude all of the content.
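As a rough illustration of what "exclude all of the content" means, the sketch below feeds a blanket-disallow robots.txt to Python's standard robots parser and checks that a crawler would be refused. The domain name and user agent string are placeholders, not the real duplicate site, and this only shows how the rules are read, not how Google processes a removal request.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from every path.
BLOCK_EVERYTHING = [
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(BLOCK_EVERYTHING)

# Placeholder URL standing in for any page on the duplicate domain.
blocked = not parser.can_fetch("Googlebot", "http://duplicate.example/any/page")
print("crawler blocked:", blocked)
```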

This has now been done so with any luck over the next few days the duplicate site will be removed, allowing Fab Swingers to return to its rightful place in the Google index over the next couple of weeks.

I’ve fallen in love with mod_wsgi

Thursday, August 23rd, 2007

I think that I am about to embark on a long-term relationship with mod_wsgi. It is a huge improvement over mod_python for deployment of CherryPy on the basis of performance, flexibility and simplicity.

There are some benchmarks on the site showing that mod_wsgi is slightly faster than mod_python but it also allows finer control of the Python software so I can do things like share a cache more easily which will have a much bigger performance impact.

It was wonderfully simple to set up; the only gotcha for me was that I needed to add the directory containing my script to sys.path, with sys.path.append('/path/to'), to get my imports working properly.
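For anyone who has not seen one, a mod_wsgi script file is just a Python module that exposes a callable named application. The sketch below is a minimal, hypothetical example using a plain WSGI function rather than the real CherryPy application, and the APP_DIR path is a placeholder. Note that sys.path entries are directories, so it is the folder holding the modules that gets appended.

```python
import sys

# Placeholder for the directory that holds the site's own modules.
APP_DIR = "/path/to"
if APP_DIR not in sys.path:
    sys.path.append(APP_DIR)


def application(environ, start_response):
    """Plain WSGI callable; mod_wsgi looks for the name 'application' by default."""
    body = b"Hello from mod_wsgi\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

With the path set up before any application imports, modules living in that directory can be imported normally from the script.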

I also really like the depth and detail of the documentation and the maintainer is very active and very helpful.

My initial results have been brilliant. I’ve now got a test version of the site up and running which is absolutely leaving the original (mod_rewrite) in the dust in terms of speed. I’ve still got a little more work to do to finish the move from SQLObject to handcrafted SQL and then I’m there. Hopefully I’ll get it released this weekend.

CherryPy with mod_python

Thursday, August 23rd, 2007

I’ve now got a beta version of Fab Swingers running with CherryPy and mod_python and the results are absolutely fantastic! It’s given me more than a 100x increase in speed. One of the CherryPy guys suggested that it was because of a thread bottleneck in CherryPy’s WSGI server, which sounds sensible.

I haven’t really been doing proper releases of the code (I’m just using CVS checkouts) but I think that this will qualify as version 2.

I’ve now banished SQLObject in favour of handcrafted SQL, written a very fast method that converts database results to a list of dictionaries, and I am now switching off the Python webserver in favour of mod_python.
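A "results to list of dictionaries" helper can be very small when it leans on the column names that any DB-API cursor exposes. The sketch below is not the site's actual code: it uses an in-memory SQLite database so that it runs anywhere, and the table and column names are invented for the example. The same function should work with a MySQL cursor, since cursor.description is part of the DB-API.

```python
import sqlite3


def rows_to_dicts(cursor):
    """Turn the rows waiting on a DB-API cursor into a list of dictionaries."""
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# Hypothetical table, used only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO member VALUES (?, ?)", [(1, "alice"), (2, "bob")])

cursor = conn.execute("SELECT id, name FROM member ORDER BY id")
members = rows_to_dicts(cursor)
print(members)
```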

I’m pretty happy now that I’ve got the technical cost at a sensible percentage of revenue (about 2%) and I’m comfortable just to scale linearly from here.

Having said all of that I’m still going to bang away with further optimisations. There’s still quite a lot I can do, particularly on the templates (I could turn off useNameMapper and useAutocalling for example) and I should look at caching the expensive, frequently used content.
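Caching the expensive, frequently used content could be as simple as a small time-based cache held in the Python process. The sketch below is one possible shape for that, not the site's real cache: the TTLCache class, its method names and the 60 second lifetime are all assumptions for illustration, and a cache like this lives inside one process, so separate server processes would each hold their own copy.

```python
import threading
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._store = {}  # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        with self._lock:
            hit = self._store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]
        value = compute()  # do the expensive work outside the lock
        with self._lock:
            self._store[key] = (now + self._ttl, value)
        return value


# Hypothetical usage: rebuild the homepage at most once a minute.
cache = TTLCache(ttl=60)
homepage = cache.get_or_compute("homepage", lambda: "<html>expensive page</html>")
```

Two threads that miss at the same moment will both do the work once; that keeps the code short at the cost of an occasional duplicate computation.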

CherryPy Performance

Wednesday, August 22nd, 2007

I’ve been a bit confused by the huge gap between the time per request that Apache’s mod_status is showing and the time that CherryPy’s StatusTool is showing. In some cases mod_status was showing that the requests were taking hundreds of times longer to complete than StatusTool was showing!

Of course mod_status will also be including network latency, so as an experiment I tried using wget locally; however, there was still a huge difference. I posted to the CherryPy mailing list and got some advice pointing me to the WSGI code in CherryPy as being a likely culprit.

Normally this code is screaming-fast (1 ms or less) but with my multiple threads it seems to be slowing way down. Debugging this code is probably a bit beyond me so my plan is simply to bypass it completely and use mod_python instead.

I’ve been thinking about mod_python anyway; the only problem I’m going to have is that my present cache strategy will no longer work. I’m tempted to simply change the design of the homepage so that it’s much quicker to generate, removing the need to cache at all!

One year on…

Sunday, August 19th, 2007

Well, it’s a year now since I started Fab Swingers, although it’s less than a year since the site launched. I thought it would be good to review progress.

Well, my initial target was 1,000,000 members in the first year — we’re actually finishing with 10,000 members. Now, I actually think that 10,000 members in a year is great and it makes us the largest free swingers site in the UK. Even better, half of the new members joined in the last few months so the sign-up rate has been accelerating.

The original target was of course based on getting some search engine coverage. So far we’ve only had decent search engine results for about 2 weeks; I’m confident that eventually Google will place us ahead of all the irrelevant/low traffic sites that are currently ahead of us (I believe in their algorithm!) so although it may take months I’m sure it will happen.

I also now understand that there is no way that we could have handled 1 million members in the first year anyway; our business processes and technical infrastructure would have failed. As I write this we’re still not quite there but at least we understand what needs to be done and are much closer to having the processes/infrastructure in place.

So what for the next year? If we simply maintained the present growth we would finish with 25,000 members. But I would like to see Fab Swingers finally get some search engine traffic and we’ve also got the affiliate program, increased word of mouth, newspaper advertising and leaflets at swingers clubs/fetish clubs/sex events. So on that basis I’m going to target 100,000 for the next 12 months.

The great thing about Fab Swingers though is that it is a profitable business with no external pressure so even if it takes much longer to meet these targets than planned then it really doesn’t matter.

Thread safe problem

Monday, August 13th, 2007

It turned out today that some of the code in Fab Swingers was not thread safe. I hadn’t really given much thought to the threading that was going on in the background but it bit me on the profile display pages.

Unfortunately it only appears under load, and roughly one in 400 times that a profile is viewed, so I didn’t notice it until now. There had been some reports in the forums about mail sometimes going to the wrong people but I’d just ignored them, thinking it was user error.

My fix was to make sure that I was using cherrypy.thread_data rather than using an object that was created outside the thread.
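The difference between a shared object and per-thread storage can be shown with the standard library alone. The sketch below uses threading.local as a stand-in for cherrypy.thread_data, which plays a similar per-thread role in CherryPy; the handler function and profile IDs are invented for the example and are not the site's code. A barrier forces the bad interleaving that, in production, only turned up occasionally under load.

```python
import threading

# Created outside the threads, so every request thread shares this one dict.
shared = {"profile_id": None}

# Per-thread storage: attributes set here are visible only to the setting thread.
local = threading.local()


def handle_request(profile_id, results, barrier):
    local.profile_id = profile_id
    shared["profile_id"] = profile_id
    barrier.wait()  # make sure every thread has written before anyone reads
    results[profile_id] = (local.profile_id, shared["profile_id"])


profile_ids = [101, 102, 103, 104]
results = {}
barrier = threading.Barrier(len(profile_ids))
threads = [
    threading.Thread(target=handle_request, args=(pid, results, barrier))
    for pid in profile_ids
]
for t in threads:
    t.start()
for t in threads:
    t.join()

for pid, (from_local, from_shared) in sorted(results.items()):
    print(pid, "thread-local:", from_local, "shared:", from_shared)
```

Every thread reads back its own ID from the thread-local object, while the shared dict hands all of them whichever ID was written last.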

I’ve scanned through the rest of the code and it looks fine.

Scaling static content

Sunday, August 12th, 2007

I read a post today in the Plenty of Fish Blog about his Content Delivery Network (CDN) for his static content. He posts graphs showing that he has exceeded 1.1 TB/139 Mbps in a day, peaking at a fraction under 3,000 hits a second. All very impressive.

Anyway, it made me think about how we’ll scale our static content. I think that in the very near future it will move off our main server onto a dedicated server. I’ll start with the cheapest hardware and increase the specification as we need to.

I’m guessing that with a million members we would have about 30 GB of photos and something in the region of 600 GB a day of traffic. I’m pretty sure that we could serve this off a single server; one of the factors that would be strongly in our favour is that there would be a small number of photos that would account for most of this traffic so it would all be nicely cached internally.

ServePath sell a dedicated, unmetered 100 Mbps connection for $1500 per month and I think that the server would cost around $500 a month so this would all be very affordable.
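A quick back-of-the-envelope check of those figures, assuming decimal units (1 GB = 10^9 bytes) and traffic spread evenly over the day, which real traffic never is:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_mbps(gb_per_day):
    """Average line rate in megabits per second for a daily transfer volume."""
    bits_per_day = gb_per_day * 1e9 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


daily_traffic_gb = 600   # guessed traffic at a million members
link_mbps = 100          # the unmetered connection mentioned above

avg = average_mbps(daily_traffic_gb)
print("average: %.1f Mbps of %d Mbps" % (avg, link_mbps))  # about 55.6 Mbps
print("headroom over the average: %.1fx" % (link_mbps / avg))
```

The average sits comfortably inside 100 Mbps, but the headroom is under 2x, so evening peaks would be the thing to watch.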

I watched an excellent video on how YouTube scaled. On their static content they reported a big gain in moving from Apache to lighttpd to complement their CDN but I think that Fab Swingers is unlikely to be serving that volume of traffic so Apache will do fine.

They also mentioned in passing that they switched from Ext to Reiser because they had far too many files in a directory. I don’t want to make that switch but I think that I should restructure the photos directory; it presently has over 4000 subdirectories in it and as we scale up this is likely to get too big.
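One common way to keep any single directory small is to spread the entries over a couple of levels of hashed subdirectories. The sketch below shows the idea only; the photo_dir function, the "photos" root, and the two-level, two-character layout are assumptions for illustration, not the layout the site uses.

```python
import hashlib
import os


def photo_dir(root, member_id, levels=2, width=2):
    """Map a member ID to a nested directory such as root/ab/cd/<member_id>."""
    digest = hashlib.sha256(str(member_id).encode("utf-8")).hexdigest()
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *shards, str(member_id))


# Hypothetical member ID; the same ID always maps to the same directory.
path = photo_dir("photos", 12345)
print(path)
```

With two hex characters per level, each shard directory has at most 256 children, so two levels give 65,536 buckets before any bucket holds more than one member directory on average.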

It was also good to hear that our architecture is very similar to YouTube’s (Linux, Apache [on app servers], MySQL, Python) so I am pretty comfortable that we will cope.

I’m much more concerned about the app server side. Interestingly, he said that their aim was to complete every request in under 100 ms. This is something that we’re presently way over but I’ve got a few ideas for improvement.

Erotic Awards

Friday, August 10th, 2007

The Erotic Awards are on the 1st September but unfortunately we’re much too late to get nominated for the web category; the finalists have already been announced.

Must remember to get a nomination in next year!

Video Chatroom

Tuesday, August 7th, 2007

I’ve just finished getting video in the chatroom working. It turned out to be a pretty simple problem with the XML feed that we generate for our chatroom provider. The cameras in the room have already gone down very well with our members.

I think that in terms of features there’s nothing too major to add to the site now (although there are lots of minor things that need to be done). When I compare Fab Swingers with the commercial competitors there’s almost nothing that we’re missing feature-wise and in fact there are lots of unique features that we have.

The technical work now is mainly about scaling up so we can cope with higher volumes of traffic.

Google Duplicate Content Problems

Tuesday, August 7th, 2007

It really is a good thing that we don’t rely on Google for our marketing. We were on page 1 for “uk swingers” and yesterday we dropped to page 14!

This sort of thing is virtually always the result of a penalty being applied. It’s obviously not the dreaded PR0 penalty because if it was we wouldn’t even be appearing on page 14.

I have a Google Alert set up for “Fab Swingers” and this gave me a clue as to what was happening. I noticed that I was getting a lot of alerts for my content but on another domain. What’s happened is that my ISP has accidentally pointed a number of other domains to my server. Unfortunately these domains are more established than fabswingers.com so these domains have been treated as the authority for my content and the real Fab Swingers has been given a duplicate content penalty!

I’ve fixed this by changing the Apache configuration so that the Fab Swingers site is no longer the default site on the server. This means that any domain erroneously pointed to us will not get duplicate content.

I imagine that this will take a good few weeks to clear. In the meantime, I’m going to step up our non-SEO marketing to make sure that we don’t see a dip in traffic.