Today I was trying to restore a backup onto a server and then add that new server into a replication cluster. I have a script that automatically does this by capturing the binary log position and log file that the slave should use when attaching to the master. Then the slave can use the binary log and relay file to catch up.
Each time I attempted this I would get this error:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
I was wondering, who the heck keeps purging the binary logs before my new slave has time to replicate the data?
I checked the my.cnf file for the master and the logs are set to expire after 14 days. I was using a backup taken from this morning.
I checked the root's crontab and there was a little bash script running every hour doing this:
#!/bin/bash
time_string=$(date +"%Y-%m-%d %H:%M:%S" -d "1 hour ago"); mysql -uUSERNAME --password=PASSWORD -e "purge binary logs before '${time_string}'"
That little bash script is purging all the binary logs from an hour before it starts so my script wasn't able to attach to the master and replicate because the log file which my backup was referencing files that had already been purged.
This is an extremely busy server that generates a lot of logs so I can see why this was here.
I temporary commented out the entry, started a new backup and finished adding my new slave into the server.
I also added an FYI to /etc/motd so that when someone logs in to the serverr via SSH they get a message indicating that this cron job is running.