Scenario:
We have received text or csv file. The file has some company information,date created etc. before the data rows start. Our goal is to delete these information rows and regenerate file start from header row.Fig: Source File With Company Information
Solution:
We will be using Script Task to Delete the Top N rows from text file. We will be reading all the data from the file in string type array and then overwrite the original file after removing Top N rows.
Step 1:
Create two variables as shown in figure.VarFilePath is variable holding the file path. VarDeleteTopNRows is variable that will hold the number of rows you want to delete from starting of file.
Fig 1: Create variables in SSIS Package To Delete Top N Rows from Flat File by using Script Task
Step 2:
Bring the Script Task to Control Flow Pane. Map the above Variables as shown to ReadOnlyVariables in Script Task.Fig 2: Map Variables in Script Task
Paste the below Code. I have bold the code that I wrote, rest of the code is auto generated by Script Task.
#region Namespaces
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;
using System.Linq;
#endregion
namespace ST_c62f3dcfb0964917aade179aac4edfab
{
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
publicpartialclass ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
publicvoid Main()
{
// TODO: Add your code here
string FilePath = Dts.Variables["User::VarFilePath"].Value.ToString();
Int32 DeleteTopNRows = Convert.ToInt32(Dts.Variables["VarDeleteTopNRows"].Value);
string[] lines = System.IO.File.ReadAllLines(FilePath);
lines = lines.Skip(DeleteTopNRows).ToArray();
System.IO.StreamWriter file = new System.IO.StreamWriter(FilePath);
foreach (string line in lines)
{
// MessageBox.Show(line.ToString());
file.WriteLine(line);
}
file.Close();
Dts.TaskResult = (int)ScriptResults.Success;
}
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
}
}
Save the code. Run the script task and once completed. Check the file again.
Fig 3: Top 3 Rows Deleted By SSIS Package.
You can change the value of variable VarDeleteTopNRows according to your requirements.