Burger King

July 27th, 2010

Me: Can I have a Whopper meal with Diet Coke, but with onion rings instead of chips?
BK: Sorry, we cannot do that.
Me: OK, can I have a Whopper burger on its own, a Diet Coke, and onion rings?
BK: OK.


Price of chips: 1.99
Price of onion rings: 1.99

WTF?

Categories: personal


New Project Launched – ChatUp.com

June 6th, 2010

I have launched a new Ruby on Rails website, ChatUp.com. This is a personal project that I’ve been working on in my spare time with a little help from my girlfriend, and it’s great to finally have it out there.

I’ve got some really cool stuff integrated, including Facebook authentication using the new OAuth2 API and HTML5 GeoLocation. Doing GeoLocation *well* and having it work for any user of the site turned out to be a bit tricky; I might post more on that later. Back-end searching is done using the latest Sphinx search daemon and the ThinkingSphinx plugin.

There are still 101 things I want to improve, and I’m working on getting a new design for the blog sorted right now, but as an initial version 1 it does all the things you would expect of any dating site.

I’m going to be writing about my efforts to build out the site further and, probably more importantly, how I am actually getting traffic to the website (let’s face it, that’s one of the hardest things to do; programming is easy).

If you want to follow the project along, please check out the ChatUp Blog and grab the RSS feed.

Please check it out, feedback welcome!

Categories: personal


Thinking Sphinx Performance – Split your re-indexing into separate tasks

June 3rd, 2010

This is part 2 of a 3 part series on getting the best performance out of Sphinx / Thinking Sphinx. Subscribe to my RSS feed for the last installment.

The previous article was How to configure Thinking Sphinx to index from your Slave MySQL database.
—-

If you have multiple indexes, you need to consider if they all need to be re-indexed at the same time.

For example, you might have a “users” index that changes frequently that you want to re-index from scratch or merge delta changes, and a “spare car parts” index that may only change once or twice a day.

The accepted way to re-index your data is of course to run “rake ts:index”, which will re-create ALL of your indexes, but that assumes all indexes are equal and all need re-indexing at the same time – usually that is not true. When you have to re-index a *huge* index when all you really want is to update the smaller one… it’s inefficient.

So, what does rake ts:index actually do?

The Thinking Sphinx index task works out which config file to use based on your Rails environment and Rails.root directory, re-generates the config file, and then calls the Sphinx indexer program with the path to that config file.
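
As a rough sketch of that flow (this is not Thinking Sphinx's actual source), the task boils down to something like the following. The helper names sphinx_config_path and indexer_command are made up for illustration, and the application path is an example; the config/<environment>.sphinx.conf naming matches the generated file names mentioned elsewhere in these posts.

```ruby
# Sketch only: approximates the command that "rake ts:index" ends up running.
# sphinx_config_path and indexer_command are hypothetical helper names,
# not part of the Thinking Sphinx API.

# The generated config lives at <Rails.root>/config/<environment>.sphinx.conf
def sphinx_config_path(rails_root, rails_env)
  File.join(rails_root, "config", "#{rails_env}.sphinx.conf")
end

# Build the indexer invocation as an argument list.
# With no index names given, fall back to --all (re-index everything).
# --rotate only makes sense when searchd is running, so it can be switched off.
def indexer_command(rails_root, rails_env, indexes = [], rotate: true)
  targets = indexes.empty? ? ["--all"] : indexes
  command = ["indexer", "--config", sphinx_config_path(rails_root, rails_env)] + targets
  command << "--rotate" if rotate
  command
end

puts indexer_command("/var/www/your-website.com/current", "production").join(" ")
```

Passing a list of index names instead of relying on --all is the hook that the rest of this post builds on.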

The indexer program has a whole range of options which you might like to get familiar with:

$ indexer
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff

Usage: indexer [OPTIONS] [indexname1 [indexname2 [...]]]

Options are:
--config <file>		read configuration from specified file
			(default is sphinx.conf)
--all			reindex all configured indexes
--quiet			be quiet, only print errors
--noprogress		do not display progress
			(automatically on if output is not to a tty)
--rotate		send SIGHUP to searchd when indexing is over
			to rotate updated indexes automatically
--buildstops <output.txt> <N>
			build top N stopwords and write them to given file
--buildfreqs		store words frequencies to output.txt
			(used with --buildstops only)
--merge <dst-index> <src-index>
			merge 'src-index' into 'dst-index'
			'dst-index' will receive merge result
			'src-index' will not be modified
--merge-dst-range <attr> <min> <max>
			filter 'dst-index' on merge, keep only those documents
			where 'attr' is between 'min' and 'max' (inclusive)
--merge-killlists			merge src and dst killlists instead of applying src killlist to dst
Examples:
indexer --quiet myidx1	reindex 'myidx1' defined in 'sphinx.conf'
indexer --all		reindex all indexes defined in 'sphinx.conf'

We want to re-index specific indexes at specific times. Following on from the example above, instead of running rake ts:in from cron to re-index everything every hour (for example), you can split the “users” and “spare car parts” indexes out into separate tasks like this:

indexer --config /var/www/your-website.com/current/config/production_slave.sphinx.conf spare_car_parts_core --rotate
indexer --config /var/www/your-website.com/current/config/production_slave.sphinx.conf users_core --rotate

Put those into cron with your desired time schedule, and you will have a happier database server. Note that you must include the _core suffix, as this is how the ThinkingSphinx gem names the index in the Sphinx configuration.
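
To make the cron side concrete, here is a small sketch that prints one crontab line per index. The schedules are example values I picked for illustration (every 15 minutes for users, once a day for spare car parts), and crontab_line / crontab_lines are hypothetical helper names.

```ruby
# Sketch only: print crontab lines that re-index each index on its own schedule.
# The schedules below are example values, not recommendations.

# Build one crontab entry: "<min> <hour> <day-of-month> <month> <day-of-week> <command>"
def crontab_line(schedule, index, config)
  # Thinking Sphinx names the main index "<name>_core" in the generated config
  "#{schedule} indexer --config #{config} #{index}_core --rotate"
end

# schedules maps an index name (without the _core suffix) to a cron schedule
def crontab_lines(schedules, config)
  schedules.map { |index, schedule| crontab_line(schedule, index, config) }
end

config = "/var/www/your-website.com/current/config/production_slave.sphinx.conf"
schedules = {
  "users"           => "*/15 * * * *", # every 15 minutes
  "spare_car_parts" => "0 3 * * *"     # once a day at 03:00
}

puts crontab_lines(schedules, config)
```

Cron usually runs with a minimal PATH, so you may need to write out the full path to the indexer binary in the generated lines.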

When you deploy, remember to run “rake ts:config” with the correct Rails environment to generate your config.

Categories: performance


Thinking Sphinx Performance – How to Index from Slave MySQL Database

May 31st, 2010

This is part 1 of a 3 part series on getting the best performance out of Sphinx / Thinking Sphinx. Subscribe to my RSS feed for the next 2 installments.
——

For one of the sites I work with, we are running a load balanced web server setup with separate MySQL database servers in master-slave configuration on the back end. Each of the web servers runs its own Sphinx searchd daemon (to reduce latency of client connection/queries) using a non-distributed Sphinx index.

While looking at our Munin graphs I was concerned to see the amount of bandwidth and disk IO on the master MySQL database server caused by periodic re-indexing, and that load increases in line with the number of front-end web servers currently in use.

So that got me thinking… these are huge read-only queries, a small time lag is acceptable, they *should* be happening on the Slave MySQL database.

ThinkingSphinx does not have any official documentation for how to configure this, so here is the secret to setting up the Sphinx indexer to use your slave MySQL database…

1/. Create a Read Only MySQL User
On your Master MySQL server create a read-only MySQL user if you do not have one already. It will be replicated automatically onto the slave servers. This is just good practice: you do NOT want any changes accidentally made on your slave databases, or you run the risk of breaking MySQL replication.

2/. Create new production_slave environment
We will use a separate Rails environment to hold our slave database and sphinx configuration, but you need to get that environment working. To do this in your rails project, copy config/environments/production.rb to config/environments/production_slave.rb.

3/. Configure database.yml
In your database.yml, create a new entry for your production_slave environment, pointing to your slave MySQL database, and use your read-only MySQL user.

4/. Edit your config/sphinx.yml file
Take a copy of your production section and duplicate it under production_slave. It is important that you use the same port number and settings as the production environment.

5/. Commit your changes and deploy live
Push out your code to the servers, so your servers now have access to your new production_slave environment. Edit your yml configs on the servers if your deploy process does not manage this for you.

6/. Setup Sphinx Indexer for new Environment
The Sphinx indexer should now use RAILS_ENV=production_slave. As this is the first time, you should now run “RAILS_ENV=production_slave rake ts:in”, which will automatically generate a new config/production_slave.sphinx.conf file for you that searchd and indexer can use, and generate your first set of indexes in db/sphinx/production_slave/.

7/. Restart Sphinx
Stop your production sphinx instance, and restart sphinx with the production_slave environment. It needs to read the same config file as your indexer in case of future schema changes etc.

8/. Update RAILS_ENV for sphinx elsewhere
This will largely be dependent on your setup, but you need to change your cron-jobs, init.d scripts, deploy scripts, process monitoring etc to use the production_slave environment for your sphinx daemon and sphinx indexing tasks.

That’s it!

The reason this works is because of step #4, where we use the same host & port for our Sphinx server for both the production and production_slave environments. Although your app continues to run as production, your Sphinx server runs as production_slave on the port expected in the production environment, so you are now querying the Sphinx daemon that is indexed from the slave MySQL database.
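
To illustrate steps 3 and 4, here is a sketch with example config fragments and a quick sanity check that the two Sphinx environments really do match. All hostnames, credentials, the database name and the port are placeholder values, the address and port keys are my assumption of a minimal sphinx.yml, and sphinx_mismatches is a hypothetical helper, not something ThinkingSphinx provides.

```ruby
require "yaml"

# Sketch only: example fragments for steps 3 and 4. Every value is a placeholder.
database_yml = <<~CONFIG
  production:
    adapter: mysql
    host: db-master.example.com
    database: myapp_production
    username: myapp
    password: secret

  production_slave:
    adapter: mysql
    host: db-slave.example.com
    database: myapp_production
    username: myapp_readonly
    password: secret
CONFIG

sphinx_yml = <<~CONFIG
  production:
    address: 127.0.0.1
    port: 9312

  production_slave:
    address: 127.0.0.1
    port: 9312
CONFIG

# Return the sphinx.yml keys whose values differ between two environments.
def sphinx_mismatches(sphinx_config, env_a, env_b)
  a = sphinx_config.fetch(env_a)
  b = sphinx_config.fetch(env_b)
  (a.keys | b.keys).reject { |key| a[key] == b[key] }
end

database = YAML.safe_load(database_yml)
puts "indexer will read from: #{database.fetch('production_slave').fetch('host')}"

mismatches = sphinx_mismatches(YAML.safe_load(sphinx_yml), "production", "production_slave")
puts mismatches.empty? ? "sphinx.yml: environments match" : "sphinx.yml differs on: #{mismatches.join(', ')}"
```

If the check reports a difference in address or port, the app (running as production) would be talking to a different searchd than the one indexed from the slave.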

Categories: performance


FFMpeg SWF File Conversion

April 18th, 2010

FFMpeg has multiple problems and bugs with writing out to SWF files. I ran into a few of these at my day job, wrote a patch for one of the problems, and submitted it back to FFMpeg. It was a nice change to go back and do some C programming for a while.

Anyway, the list of FFMpeg SWF audio problems is as follows:
1) Converting to SWF was hard-coding the number of audio frames to 6000 in the generated SWF file, whereas the SWF File Format Specification document from Adobe says this must match the number of frames in the file. It was also hard-coding an arbitrary file size into the SWF header, rather than the correct file size of the generated file. These are the problems I fixed.

2) The FFMpeg SWF encoder writes too many frames to a file. Yeah, really. It writes all of your audio, then a few minutes of empty sound as well. I haven’t got a fix for it yet; I need to step through an FFMpeg run in GDB to figure out why it’s doing it. You can open a SWF in a binary editor and manually fudge the number of frames in the SWF header if you have a scrobbler that is dependent upon the number of frames to determine play duration / progress, or do something a little more automated.

3) The SWF encoder in FFMpeg is writing out audio-only streams using version 4 of the SWF file format, whereas the file format specification is now up to version 10. Not sure why it’s writing out in such an old version (maximum compatibility?). Could probably use some documentation.
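
For the “something a little more automated” option in problem 2, a minimal sketch might look like the following. It assumes an uncompressed SWF (signature FWS) and the header layout from the SWF specification: signature and version in bytes 0-3, a 32-bit little-endian file length in bytes 4-7, then a variable-length RECT, a 2-byte frame rate and a 2-byte little-endian frame count. patch_swf_header and frame_count_offset are hypothetical helpers, not part of FFMpeg or swftools, and you have to supply the frame count you want yourself.

```ruby
# Sketch only: rewrite the FileLength and FrameCount header fields of an
# uncompressed ("FWS") SWF held in a binary string.

# Byte offset of the 16-bit FrameCount field.
def frame_count_offset(swf)
  nbits = swf.getbyte(8) >> 3           # top 5 bits of the RECT give the width of each field
  rect_bytes = (5 + 4 * nbits + 7) / 8  # 5 + 4*nbits bits, padded up to a whole byte
  8 + rect_bytes + 2                    # skip signature/version/length, RECT and frame rate
end

# Return a copy of swf with the header file length set to the real size
# and the frame count set to frame_count.
def patch_swf_header(swf, frame_count)
  raise ArgumentError, "not an uncompressed SWF" unless swf.byteslice(0, 3) == "FWS"
  raise ArgumentError, "frame count must fit in 16 bits" unless (0..0xFFFF).cover?(frame_count)

  patched = swf.dup.force_encoding(Encoding::BINARY)
  patched[4, 4] = [patched.bytesize].pack("V")                      # UI32, little-endian
  patched[frame_count_offset(patched), 2] = [frame_count].pack("v") # UI16, little-endian
  patched
end

# Usage (file names are examples):
#   data = File.binread("157346.swf")
#   File.binwrite("157346.fixed.swf", patch_swf_header(data, 4596))
```

This only touches the header; it does not remove the extra silent frames from the file body.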

This was the patch I submitted for problem #1.
http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100415/c02bf648/attachment.obj
——
On the mailing list:

Problem Description:
When converting any Audio to SWF, the File Size is always hard-coded
as 104857600, and Frame Count is always hard coded to 6000. The
incorrect frame count can cause problems when ffmpeg generated swfs
are used inside a flash application and the _totalframes method is
used.

Example command line:
ffmpeg -i 157346.mp3 -ar 44100 -ab 96k -ac 1 -y 157346.swf

When inspecting the generated swf using the swfdump tool
(swftools.org), the following is shown:

==== Error: Real Filesize (1495887) doesn't match header Filesize
(104857600) ====
[HEADER] File version: 4
[HEADER] File size: 104857600
[HEADER] Frame rate: 10.000000
[HEADER] Frame count: 6000
[HEADER] Movie width: 320.00
[HEADER] Movie height: 200.00
[02d] 6 SOUNDSTREAMHEAD2
[013] 317 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 1 (00:00:00,000)
[013] 318 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 2 (00:00:00,100)
………
[013] 318 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 4595 (00:07:39,382)
[013] 317 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 4596 (00:07:39,482)
[000] 0 END

As you can see, the last frame is 4596, and the file size specified in
the header does not match the real file size (swfdump gives a warning
about this).

File size should indicate 1495887 bytes:
$ ls -al 157346.swf
-rw-r--r-- 1 boomkat default 1495887 Apr 14 11:25 157346.swf
$

Problem Solution:
Correctly set the File Size in the SWF header section to the real
number of bytes written, and correctly set the number of frames
included in the SWF.

Details of the SWF file format header information can be confirmed on
page 25 of the SWF File Format Specification document from Adobe,
available here: http://www.adobe.com/devnet/swf/

After my patch is applied, using ffmpeg with the same command line
options as above and then inspecting the generated swf with swfdump
shows:

[HEADER] File version: 4
[HEADER] File size: 1495887
[HEADER] Frame rate: 10.000000
[HEADER] Frame count: 4596
[HEADER] Movie width: 320.00
[HEADER] Movie height: 200.00
[02d] 6 SOUNDSTREAMHEAD2
[013] 317 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 1 (00:00:00,000)
[013] 318 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 2 (00:00:00,100)
………
[013] 318 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 4595 (00:07:39,382)
[013] 317 SOUNDSTREAMBLOCK
[001] 0 SHOWFRAME 4596 (00:07:39,482)
[000] 0 END

Note that the Frame Count is correct, and the file size correctly
reflects the total number of bytes in the file:
$ ls -al 157346.swf
-rw-r--r-- 1 boomkat default 1495887 Apr 15 06:58 157346.swf
$

Categories: Uncategorized

