Last few weeks were somewhat crazy time. But I’m seeing finish line not so far away and I hope to find more time for blogging in upcoming weeks. At least I have few things to share, which could be interesting for other Rails developers.
So let’s start with MySQL and it’s collations. I have wrote about them few months ago. Few days ago, I had finally installed Linux on my notebook (last two months I was running Linux from USB drive). And it seems that there were some minor changes in my environment. I was working on CARtoteka, and when I was running migrations they were stopped in half with error like :
Mysql::Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and
(utf8_general_ci,COERCIBLE) for operation '=':
SELECT count(DISTINCT companies`.id) AS count_all FROM `companies`
LEFT OUTER JOIN `addresses` ON addresses.company_id = companies.id
WHERE (city = 'Pozna?' and moderated)
What the heck? I have doublechecked I have set connection collation and other – utf_general_ci as I could expect.
I do love MySQLs collate options. There are so many of them. One for server, one for database and one for client connection. There is one missing – One to rule them all ;))
I don’t know if it is FreeBSD specific (no it is not – I have checked only Linux box I got access to, and it is the same), but when You set UTF8 charset in database connection, You are not safe at all. Strange things still are waiting to happen :)
First – when You do use UTF8 You need give database hint how it should compare chars – it is important when somebody will send some non ASCII characters and want to compare against data You already have in database. This is what for collation is for – in most cases what You need is just utf8_general_ci. But MySQL, glorious product of Swedish engineering sets collation for latin1_swedish_ci. Do You smell cookings? :))
Now we enter into Rails world. You were smart and have started with MySQL server with --character-set-server=utf8 --collation-server=utf8_general_ci. You did MySQL dump and checked all tables that have set UTF8 as character set. Seems to be OK?
It seems so :) but now client uses some national char for example in tag (and it seems that user is not Swedish – character is for example one from Polish chars) and then:
Mysql::Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE)
for operation 'like': SELECT * FROM `tags`
WHERE (name LIKE '. Naj?wi?tszego Serca Pana Jezusa.') LIMIT 1
Now You have learned that connection.collation returns not connection collation but servers. But WTF? activerecord/lib/active_record/connection_adapters/mysql_adapter.rb:
def connect
encoding = @config[:encoding]
if encoding
@connection.options(Mysql::SET_CHARSET_NAME, encoding) rescue nil
end
@connection.ssl_set(@config[:sslkey], @config[:sslcert], @config[:sslca], @config[:sslcapath], @config[:sslcipher]) if @con
@connection.real_connect(*@connection_options)
execute("SET NAMES '#{encoding}'") if encoding
SET NAMES 'utf8' in line 472 should set all the collation variables, but somehow it does not, so as quick fix I just have added to environment.rb: