When inserting a large amount of data into the DB in Rails, issuing one INSERT statement per record takes a lot of time. We can reduce this time dramatically using mysqlimport, a MySQL tool for loading bulk data into the DB quickly; in MySQL's own words, mysqlimport "reads rows from a text file into a table at a very high speed". To use mysqlimport for updating existing records as well, not only for inserting new ones, we used a common trick: create a temporary table, insert the updated records (with their ids) into it, join the temporary table with the target table on the id column, update the target table's columns accordingly, then drop the temporary table. You can find this idea illustrated, for example, here:
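In SQL terms, the trick boils down to something like the following minimal sketch (the users table and its name/email columns are hypothetical placeholders; bulk-loading the temp table is left to mysqlimport):

# A minimal sketch of the temp-table update trick, run through ActiveRecord.
ActiveRecord::Base.connection.execute("CREATE TABLE temp_users LIKE users")
# ... bulk-load the changed rows (including their ids) into temp_users,
# e.g. with mysqlimport ...
ActiveRecord::Base.connection.execute(
  "UPDATE users INNER JOIN temp_users ON users.id = temp_users.id " \
  "SET users.name = temp_users.name, users.email = temp_users.email"
)
ActiveRecord::Base.connection.execute("DROP TABLE temp_users")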
So here is a helper class to use this approach within a Rails application:
require 'securerandom'
require 'shellwords'

class SqlWriter
  ID_STR = 'id'
  CREATED_AT_STR = 'created_at'
  UPDATED_AT_STR = 'updated_at'
  NULL_STR = '\N'
  COMMA_STR = ','

  attr_accessor :insert_sql_file, :update_sql_file

  # klass is the class of the records we will deal with.
  # sql_dir_path is the directory which will contain the sql data files (text files).
  def initialize(klass, sql_dir_path)
    @klass = klass
    # mysqlimport derives the target table from the data file's basename, so the
    # insert file is named after the model's table and the update file after the
    # (uniquely named) temporary table.
    @temp_table_name = "temp_#{klass.table_name}_#{Time.now.to_s(:db).gsub(/-| |:/, '_')}_#{SecureRandom.hex[0..10]}"
    @insert_sql_file = File.new("#{sql_dir_path}/#{klass.table_name}.txt", 'w')
    @update_sql_file = File.new("#{sql_dir_path}/#{@temp_table_name}.txt", 'w')
    @current_time_in_db_format = Time.now.to_s(:db)
    @insert_fields = klass.new.attributes.except(ID_STR).keys
    @update_fields = klass.new.attributes.keys
    @records_need_update = false
  end

  # New records go to the insert file; persisted records go to the update file.
  def write_record_to_sql_file(record)
    row_data = get_sql_row(record)
    if record.new_record?
      @insert_sql_file.write("#{row_data}\n")
    else
      @update_sql_file.write("#{row_data}\n")
    end
  end

  def insert_records_to_database
    @insert_sql_file.close
    @update_sql_file.close
    config   = Rails.configuration.database_configuration
    database = config[Rails.env]["database"]
    username = config[Rails.env]["username"]
    password = config[Rails.env]["password"]
    host     = config[Rails.env]["host"]
    insert_columns_order = @insert_fields.join(',')
    `mysqlimport -u #{username} -p#{password} -h #{host} --columns='#{insert_columns_order}' --local --fields-terminated-by=',' #{database} #{Shellwords.escape(@insert_sql_file.path)}`
    if @records_need_update
      ActiveRecord::Base.connection.execute("CREATE TABLE #{@temp_table_name} LIKE #{@klass.table_name};")
      update_columns_order = @update_fields.join(',')
      `mysqlimport -u #{username} -p#{password} -h #{host} --columns='#{update_columns_order}' --local --fields-terminated-by=',' #{database} #{Shellwords.escape(@update_sql_file.path)}`
      # Copy every non-id column from the temporary table onto the target table.
      set_fields = @insert_fields.map { |field| "#{@klass.table_name}.#{field}=#{@temp_table_name}.#{field}" }.join(',')
      ActiveRecord::Base.connection.execute("UPDATE #{@klass.table_name} INNER JOIN #{@temp_table_name} ON #{@klass.table_name}.id = #{@temp_table_name}.id SET #{set_fields}")
      ActiveRecord::Base.connection.execute("DROP TABLE #{@temp_table_name}")
    end
    File.delete(@update_sql_file.path)
  end

  private

  # Serializes a record into one comma-separated line in the proper column order.
  def get_sql_row(record)
    if record.new_record?
      result = record.attributes.except(ID_STR).values
      fields = @insert_fields
    else
      result = record.attributes.values
      fields = @update_fields
      @records_need_update = true
    end
    result.each_with_index do |item, index|
      # is_a? (rather than comparing classes) also catches DateTime and
      # ActiveSupport::TimeWithZone values.
      if item.is_a?(Date) || item.is_a?(Time)
        result[index] = item.to_s(:db)
      elsif item == true || item == false
        result[index] = item ? 1 : 0
      elsif item.nil?
        if fields[index] == CREATED_AT_STR || fields[index] == UPDATED_AT_STR
          result[index] = @current_time_in_db_format
        else
          result[index] = NULL_STR
        end
      end
    end
    result.join(COMMA_STR)
  end
end
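To make the data file format concrete: each call to write_record_to_sql_file appends one comma-separated line in column order. For a hypothetical users table with name, admin, created_at and updated_at columns, users.txt might contain something like (values purely illustrative):

Alice,1,2015-06-01 10:00:00,2015-06-01 10:00:00
\N,0,2015-06-01 10:00:00,2015-06-01 10:00:00

Here \N is MySQL's marker for NULL (a nil name in this case), while nil timestamps are filled with the current time.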
For example, assume that we are inserting a large number of records of the User model:
sql_file_dir = "path/to/some/dir"
sql_writer = SqlWriter.new(User, sql_file_dir)
alot_of_data.each do |data|
  # ......
  user = User.new(user_attributes)
  sql_writer.write_record_to_sql_file(user)
end
sql_writer.insert_records_to_database

And you will have your data inserted into the DB!
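Updating works the same way: write persisted records and the helper routes them through the temp-table path. A minimal sketch, assuming users_to_update is a collection of already-persisted User records and new_name is a hypothetical value:

sql_writer = SqlWriter.new(User, sql_file_dir)
users_to_update.each do |user|
  user.name = new_name                       # change whatever attributes you need
  sql_writer.write_record_to_sql_file(user)  # persisted => goes to the update file
end
sql_writer.insert_records_to_database

Despite its name, insert_records_to_database also runs the temp-table update path whenever persisted records were written.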