
Thursday, March 31, 2016

UNAUTHENTICATED USER causing MySQL 5.6 to hang

I upgraded one of our alpha servers from MySQL 5.1 to 5.5 and then to Percona Server 5.6. This server has a few hundred databases, and each database has a few hundred tables. The next day the server kept hanging and would eventually become so sluggish that it had to be restarted. When attempting to establish a new connection it would report that max connections had been reached. If I was able to log on before the max connections were all used up, I would see a few hundred "unauthenticated user" entries which would continue to build up without closing.

From searching online, I first saw some blogs telling you to make sure that skip-name-resolve is in the my.cnf file and that you are only using IP addresses when limiting user permissions.
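
For reference, the usual fix looks something like this in my.cnf (adjust the section and file location to your own layout); keep in mind that with skip-name-resolve on, grants have to use IP addresses or wildcards rather than hostnames:

[mysqld]
skip-name-resolve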

Here is another site that talks about this problem:
http://codeinthehole.com/writing/solving-mysql-connection-problems-caused-by-a-dead-name-server/

I had already done that so skip-name-resolve wasn't the problem.

This problem continued to happen, so I started pt-stalk and daemonized it so that if anything started to build up, it would capture forensic data.

I was finally able to capture this happening by using pt-stalk with a threshold of 10 on threads_running. At first I tried setting the threshold on threads_connected, but the "unauthenticated user" entries did not count as connections and I wasn't able to catch it happening.
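
The invocation was along these lines (flags reconstructed from memory, so double-check pt-stalk --help for your Percona Toolkit version):

# daemonize pt-stalk and collect forensic data whenever Threads_running exceeds 10
pt-stalk --daemonize --variable Threads_running --threshold 10 --dest /var/lib/pt-stalk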

pt-stalk generates a lot of files...I started reviewing each one of them.

I checked the error log and nothing looked like a problem.

I looked in processlist and saw a number of entries like this:

*************************** 110. row ***************************
           Id: 1021
         User: unauthenticated user
         Host: connecting host
           db: NULL
      Command: Connect
         Time: NULL
        State: login
         Info: NULL
    Rows_sent: 0
Rows_examined: 0

That was expected.

One of the files was called "2016_03_31_18_24_11-opentables2" and all it contained was "Too many open tables:".

That was interesting.

The next useful file was the "Output" file which was full of errors like this:
ERROR 2013 (HY000): Lost connection to MySQL server at 'sending authentication information', system error: 32

Those two errors were the key. I read up on some forums, such as this one:
http://stackoverflow.com/questions/17813630/mysql-5-6-headaches-on-mac-osx

Developers noted that they had this problem on their local testing MySQL instances because table_open_cache was too low.

I checked my system like this:

show global variables like '%table_open_cache%';


table_open_cache was set at 64.

I increased it like this:

SET GLOBAL table_open_cache = 32768;

I also updated it in the my.cnf file.
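
The my.cnf change was roughly this (section name and file location may differ on your distribution), so the setting survives a restart:

[mysqld]
table_open_cache = 32768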

Then I asked our quality engineering team to run some tests and I watched the open tables increase by running a command like this:

SHOW STATUS LIKE 'Open_tables';
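
To avoid re-running that by hand, you can poll it with something like this shell one-liner (illustrative, not the exact command I used):

# print extended status every 5 seconds and filter for the Open_tables counter
mysqladmin -u root -p -i 5 extended-status | grep -w Open_tables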

Within a few minutes it was past 17,000 open tables and all the queries were finishing too quickly to even catch with SHOW PROCESSLIST. It didn't crash.

MySQL 5.6 can handle a much larger table_open_cache than MySQL 5.1 could.

I'm not convinced that is the only problem with this instance but it seems to be doing better.

We also found a number of app servers that had outdated MySQL client tools running on the app side (MySQL 5.1). Those client tools were being used to connect to MySQL 5.6 and appeared to be contributing to the unauthenticated users hanging around and to the eventual hanging of MySQL. Since updating all the client tools to MySQL 5.6, we haven't seen this issue anymore.

Wednesday, March 18, 2015

ERROR 2013 (HY000): Lost connection to MySQL server during query

I was trying to run a MySQL import on a brand new VM running Percona Server 5.6, which had been built with Ansible, and I kept getting this error over and over again:

ERROR 2013 (HY000) at line 1525: Lost connection to MySQL server during query

I was confused so I did some searching and came upon this thread:

http://stackoverflow.com/questions/10563619/error-code-2013-lost-connection-to-mysql-server-during-query

One of the answers was to check the InnoDB buffer pool settings, which I did. The Ansible script had incorrectly set innodb_buffer_pool_size to megabytes instead of gigabytes. The person setting up the Ansible automation thought the values were in KB when they were in fact bytes. My system only had about 2 MB for the innodb_buffer_pool_size when it should have been about 2 GB. After upping the size allocated to innodb_buffer_pool_size, the problem went away.
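
For anyone hitting the same thing, the check looked like this (the value is reported in bytes):

show global variables like 'innodb_buffer_pool_size';

and the fix was roughly this in my.cnf (the 2G figure is what this particular VM needed; innodb_buffer_pool_size is not dynamic in 5.6, so a mysqld restart is required after the edit):

[mysqld]
innodb_buffer_pool_size = 2G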