Migrating big data to new database

Question:
I’d like to transfer a large amount of data (around 80 million records) from SQL Server to MongoDB using a solution I wrote in C#. I want to transfer, say, 200 000 records at a time, but my problem is keeping track of what has already been transferred. Normally I’d do it as follows:

Gather IDs from the destination to exclude from the source scope
Read from the source (excluding IDs already in the destination)
Write to the destination
Repeat

The problem is that I build a string in C# containing all the IDs that already exist in the destination, so that I can exclude them from the source selection, e.g.
select * from source_table where id not in ()

You can imagine what happens once I have inserted 600 000+ records and then build a string with all the IDs: the string gets large and slows things down even more. So I’m looking for a way to iterate through, say, 200 000 records at a time, like a cursor. I have never done something like this, so I am here looking for advice.

Just as a reference, I do my reads as follows:
SqlConnection conn = new SqlConnection(myConnStr);
conn.Open();
SqlCommand cmd = new SqlCommand("select * from mytable where id not in (" + bigListOfIDs + ")", conn);
SqlDataReader reader = cmd.ExecuteReader();
if (reader.HasRows)
{
    while (reader.Read())
    {
        // Populate objects for insertion into MongoDB
    }
}

So basically, I want to know how to iterate through large amounts of data without selecting all that data in one go, or having to filter the data using large strings. Any help would be appreciated.


Answer:
There are many ways of doing this, but I would first suggest that you don’t try to reinvent the wheel and instead look at existing tools. Many programs are designed to export and import data between different databases. Some are very flexible and expensive, others have free options, and most DBMS products include an export tool of some kind.

Option 1:
Use the Import and Export Wizard in SQL Server Management Studio (SSMS).

This allows you to export to different destinations. You can even write complex queries to select the data if required. More information here:

https://www.mssqltips.com/sqlservertutorial/202/simple-way-to-export-data-from-sql-server/

Option 2:
Export your data in ascending ID order, one batch at a time. Store the last exported ID in a table.

Export the next set of data where ID > lastExportedID, and repeat until no more rows are returned.
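
A minimal sketch of Option 2 in C#, assuming the id column is an indexed integer key and that System.Data.SqlClient is available (on newer .NET versions this is a separate package). The names KeysetMigration, Row, Migrate and FetchBatch are made up for illustration, and the MongoDB write and the checkpoint storage are left as delegates, because how you do those depends on your setup.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public static class KeysetMigration
{
    // One source row: its id plus all column values by name (hypothetical helper type).
    public sealed class Row
    {
        public long Id;
        public Dictionary<string, object> Values = new Dictionary<string, object>();
    }

    // Core loop: fetch rows with id > lastId, write them, then record the
    // highest id written. Returns the last id that was migrated.
    public static long Migrate(
        Func<long, int, List<Row>> fetchBatch,  // (lastId, batchSize) -> rows in ascending id order
        Action<List<Row>> writeBatch,           // e.g. an insert into MongoDB
        Action<long> saveCheckpoint,            // e.g. an UPDATE on a small progress table
        long lastId,
        int batchSize)
    {
        while (true)
        {
            List<Row> batch = fetchBatch(lastId, batchSize);
            if (batch.Count == 0) return lastId;   // nothing left to migrate
            writeBatch(batch);
            lastId = batch[batch.Count - 1].Id;    // last row has the highest id
            saveCheckpoint(lastId);
        }
    }

    // Reads the next batch from SQL Server. The ids are never concatenated
    // into the query text; only two small parameters are sent.
    public static List<Row> FetchBatch(string connStr, long lastId, int batchSize)
    {
        var rows = new List<Row>();
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(
            "SELECT TOP (@batch) * FROM mytable WHERE id > @lastId ORDER BY id", conn))
        {
            cmd.Parameters.AddWithValue("@batch", batchSize);
            cmd.Parameters.AddWithValue("@lastId", lastId);
            conn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                int idCol = reader.GetOrdinal("id");
                while (reader.Read())
                {
                    var row = new Row { Id = Convert.ToInt64(reader.GetValue(idCol)) };
                    for (int i = 0; i < reader.FieldCount; i++)
                        row.Values[reader.GetName(i)] = reader.GetValue(i);
                    rows.Add(row);
                }
            }
        }
        return rows;
    }
}
```

A caller would pass something like (last, n) => KeysetMigration.FetchBatch(myConnStr, last, n) as fetchBatch, start with lastId = 0 (or the stored checkpoint after a restart), and use a batch size of 200 000. Because the checkpoint is saved only after the write, a crash can repeat at most one batch, so the write step should tolerate duplicates, for example by using the source id as the MongoDB _id.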

Option 3:
Create a copy of your data in a back-up table. Export from this table, and delete the records as you export them.
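
Option 3 could be sketched along the same lines. Here the staging table itself is the progress record, so no separate checkpoint is needed. The table name mytable_staging, the index name, and the StagingDrain class are assumptions for illustration; the export step is again a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public static class StagingDrain
{
    // One-time setup in T-SQL (hypothetical names), run before draining:
    //   SELECT * INTO mytable_staging FROM mytable;
    //   CREATE UNIQUE CLUSTERED INDEX ix_staging_id ON mytable_staging (id);

    // Query a caller's fetch step would run to read the oldest remaining rows.
    public const string SelectOldestSql =
        "SELECT TOP (@batch) * FROM mytable_staging ORDER BY id";

    // Statement that removes everything up to the last exported id.
    public const string DeleteExportedSql =
        "DELETE FROM mytable_staging WHERE id <= @maxId";

    // Drain loop: read the oldest rows, export them, then delete them.
    // Rows are deleted only after the export call returns, so a failure
    // leaves them in the staging table for the next run.
    // Returns the number of rows exported.
    public static long Drain<T>(
        Func<int, List<T>> fetchOldest,   // batchSize -> rows in ascending id order
        Func<T, long> idOf,               // extracts the id from a row
        Action<List<T>> export,           // e.g. an insert into MongoDB
        Action<long> deleteUpTo,          // e.g. maxId => DeleteUpTo(connStr, maxId)
        int batchSize)
    {
        long exported = 0;
        while (true)
        {
            List<T> batch = fetchOldest(batchSize);
            if (batch.Count == 0) return exported;   // staging table is empty
            export(batch);
            deleteUpTo(idOf(batch[batch.Count - 1]));
            exported += batch.Count;
        }
    }

    // Deletes the exported rows from the staging table in SQL Server.
    public static void DeleteUpTo(string connStr, long maxId)
    {
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(DeleteExportedSql, conn))
        {
            cmd.Parameters.AddWithValue("@maxId", maxId);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

Compared with Option 2, this needs extra disk space for the copy and pays for the deletes, but the original table is never modified and the number of rows left in the staging table shows how much work remains.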