Importing a File with Dynamic Columns - c#

I am new to SSIS and C#. In SQL Server 2008 I am importing data from a .csv file whose columns are dynamic: there are usually around 22 columns, but sometimes more or fewer, so each flat file I import can have a different number of columns. The files themselves are properly formatted. I created a staging table with 25 columns and import the data into it. My task is to import all the rows from a .csv flat file, including the headers. I want to put this in a job so I can import multiple files into the table daily.
So inside a Foreach Loop I have a Data Flow Task containing a Script Component. After researching online I came up with the C# code below, but I get this error:
Index was outside the bounds of the array.
I tried to find the cause using MessageBox and found that it reads the first line, and the index goes outside the bounds of the array after the first line.
1.) I need your help with fixing the code.
2.) My File1Conn is the flat file connection; instead, I want to read the file directly from the variable User::FileName, which my Foreach Loop keeps updating. Please help me modify the code below.
Thanks in advance.
This is my flat file:
https://drive.google.com/file/d/0B418ObdiVnEIRnlsZFdwYTRfTFU/view?usp=sharing
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Windows.Forms;
using System.IO;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
private StreamReader SR;
private string File1;
public override void AcquireConnections(object Transaction)
{
// Get the connection for File1
IDTSConnectionManager100 CM = this.Connections.File1Conn;
File1 = (string)CM.AcquireConnection(null);
}
public override void PreExecute()
{
base.PreExecute();
SR = new StreamReader(File1);
}
public override void PostExecute()
{
base.PostExecute();
SR.Close();
}
public override void CreateNewOutputRows()
{
// Declare variables
string nextLine;
string[] columns;
char[] delimiters;
int Col4Count;
String[] Col4Value = new string[50];
// Set the delimiter
delimiters = ";".ToCharArray();
// Read the first line (header)
nextLine = SR.ReadLine();
// Split the line into columns
columns = nextLine.Split(delimiters);
// Find out how many Col3 there are in the file
Col4Count = columns.Length - 3;
//MessageBox.Show(Col4Count.ToString());
// Read the second line and loop until the end of the file
nextLine = SR.ReadLine();
while (nextLine != null)
{
// Split the line into columns
columns = nextLine.Split(delimiters);
{
// Add a row
File1OutputBuffer.AddRow();
// Set the values of the Script Component output according to the file content
File1OutputBuffer.SampleID = columns[0];
File1OutputBuffer.RepNumber = columns[1];
File1OutputBuffer.Product = columns[2];
File1OutputBuffer.Col1 = columns[3];
File1OutputBuffer.Col2 = columns[4];
File1OutputBuffer.Col3 = columns[5];
File1OutputBuffer.Col4 = columns[6];
File1OutputBuffer.Col5 = columns[7];
File1OutputBuffer.Col6 = columns[8];
File1OutputBuffer.Col7 = columns[9];
File1OutputBuffer.Col8 = columns[10];
File1OutputBuffer.Col9 = columns[11];
File1OutputBuffer.Col10 = columns[12];
File1OutputBuffer.Col11 = columns[13];
File1OutputBuffer.Col12 = columns[14];
File1OutputBuffer.Col13 = columns[15];
File1OutputBuffer.Col14 = columns[16];
File1OutputBuffer.Col15 = columns[17];
File1OutputBuffer.Col16 = columns[18];
}
// Read the next line
nextLine = SR.ReadLine();
}
}
}

As you mentioned, the file has a dynamic number of columns, so in your script component you need to count the columns in each line by its delimiters and then redirect the rows to different outputs.
For your second question, you can assign your variable to the connection string property of the flat file connection manager. Then you can read the variable value in your script directly.
As an alternative to the script component, you can create a "one column" flat file source by using a dummy delimiter. Then, in the data flow task, you can read the number of columns into a variable, apply a conditional split to the data flow, and redirect the outputs to different destinations. An example can be found at http://sqlcodespace.blogspot.com.au/2015/03/ssis-design-pattern-handling-flat-file.html
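The "Index was outside the bounds of the array" error in the question is consistent with reading columns[18] on a row that has fewer than 19 fields. Below is a minimal sketch of one way to guard against that, assuming the file is semicolon-delimited as in the question's code. `ColumnPadding` and `SplitToFixedWidth` are hypothetical names introduced for illustration; they are not part of SSIS.

```csharp
using System;

public static class ColumnPadding
{
    // Hypothetical helper: splits a line and always returns exactly `width` fields.
    // Missing trailing fields become empty strings; extra fields are dropped.
    public static string[] SplitToFixedWidth(string line, char delimiter, int width)
    {
        string[] parts = line.Split(delimiter);
        string[] result = new string[width];
        for (int i = 0; i < width; i++)
        {
            result[i] = i < parts.Length ? parts[i] : string.Empty;
        }
        return result;
    }
}
```

In the script component, something like `columns = ColumnPadding.SplitToFixedWidth(nextLine, ';', 19);` would keep columns[0] through columns[18] valid regardless of how many fields a given row contains.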

Related

Keep optional pipe in HL7 after parsing

Original HL7
MSH|^~\&|RadImage^124|xxx|EI-ARTEFACT|xxx|123456789||ORM^O01|1234||2.3|||AL
PID|1|xxxxxx|xxxx||xxxxx^xxxxx xxxxx|xxx xxx|19391007|F|||104-430, xxx^^xxx^xx^xx^xx||(999)999-999|"||V|||||"||||||||"|N
PV1|1|A|11^11-1^^^^^2|||||123^xxx, xxx|||||||||123^xxx, xxx|||01|||||||||||||||||||NA|||||20191211082900|||||||
ORC|XO|"^"|xxx||CM||^^^xxx^^R||123456789|INTERF^INTERFACE||123^xxx, xxx|HOSPI^Hospitalisé|||KDICTE|3A^3A||"^"
OBR|1|"^"|xxx|82561^SCAN SINUS C+^^82561^SCAN SINUS C+|VU|xxx|"|"|||||"|||1234^xxx, xxx||xx|xxx|xxx|IMAGES^|xxxx||CT|"||^^^xxx^^VU||||AAAA~BBB~CCC|"^"||","~"|"|xxx|A|B|||
ZDS|1.11.11.11.1.11.1.1.11^RadImage^Application^DICOM
End result HL7
MSH|^~\&|RadImage^124|xxx|EI-ARTEFACT|xxx|123456789||ORM^O01|1234||2.3|||AL
PID|1|xxxxxx|xxxx||xxxxx^xxxxx xxxxx|xxx xxx|19391007|F|||104-430, xxx^^xxx^xx^xx^xx||(999)999-999|"||V|||||"||||||||"|N
PV1|1|A|11^11-1^^^^^2|||||123^xxx, xxx|||||||||123^xxx, xxx|||01|||||||||||||||||||NA|||||20191211082900
ORC|XO|"^"|xxx||CM||^^^xxx^^R||123456789|INTERF^INTERFACE||123^xxx, xxx|HOSPI^Hospitalisé|||KDICTE|3A^3A||"^"
OBR|1|"^"|xxx|82561^SCAN SINUS C+^^82561^SCAN SINUS C+|VU|xxx|"|"|||||"|||1234^xxx, xxx||xx|xxx|xxx|IMAGES^|xxxx||CT|"||^^^xxx^^VU||||AAAA~BBB~CCC|"^"||","~"|"|xxx|A|B|||
ZDS|1.11.11.11.1.11.1.1.11^RadImage^Application^DICOM
Hi,
I'm making a DLL in C# for parsing and modifying an HL7 message using the nHapi HL7 DLL.
The only thing I'm struggling with is keeping the empty pipes at the end of the PV1 segment. They are removed in the "End result HL7" compared with the "Original HL7".
I would like to keep those pipes.
This is my current code:
...
using NHapi.Base.Model;
using NHapi.Base.Parser;
using NHapi.Base.Util;
using System.Diagnostics;
using NHapi.Model.V23.Segment;
using NHapi.Model.V22.Segment;
using NHapi.Model.V21.Segment;
using NHapi.Model.V231.Segment;
...
...
public void PreAnalysis(ITratmContext ctx, MemBuf mb)
{
    var parser = new PipeParser();
    Debug.WriteLine(mb.ToString());
    var parsedMessage = parser.Parse(mb.ToString());
    var pipeDelimitedMessage = parser.Encode(parsedMessage);
    Debug.WriteLine(pipeDelimitedMessage); // The message loses the empty pipes HERE
    var genericMethod = parsedMessage as AbstractMessage;
    // create a terser object instance by wrapping it around the message object
    Terser terser = new Terser(parsedMessage);
    OurTerserHelper terserHelper = new OurTerserHelper(terser);
    String terserExpression = "MSH-12";
    String HL7Version = terserHelper.GetData(terserExpression);
    if (HL7Version == "2.3")
    {
        var obr = genericMethod.GetStructure("OBR") as NHapi.Model.V23.Segment.OBR;
        if (obr != null)
        {
            for (int i = 0; i < obr.ReasonForStudyRepetitionsUsed; i++)
            {
                obr.GetReasonForStudy(i).Identifier.Value = StringExtention.Clean(obr.GetReasonForStudy(i).Identifier.ToString());
            }
        }
        //var obrRep = obr.ReasonForStudyRepetitionsUsed;
        Debug.WriteLine(parser.Encode(genericMethod.Message));
        mb.Init(parser.Encode(genericMethod.Message));
    }
}
Thank you very much !!!!
There is no need to keep any field separators after the last populated field in a segment. They are superfluous and a waste of space.
I don't see the point of having a field separator after the last populated field. But if you insist on doing this, you could append a custom separator at the end.
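If the trailing separators really must survive, one option is to re-append them with plain string handling after encoding. The sketch below assumes you still hold the original segment text alongside the encoded one and that the two can be matched up (for example by position in the message). `Hl7TrailingPipes` and its methods are hypothetical helpers written for illustration; they are not part of nHapi.

```csharp
using System;

public static class Hl7TrailingPipes
{
    // Counts the consecutive field separators at the very end of a segment.
    public static int CountTrailing(string segment, char separator)
    {
        int count = 0;
        for (int i = segment.Length - 1; i >= 0 && segment[i] == separator; i--)
        {
            count++;
        }
        return count;
    }

    // Gives the encoded segment the same number of trailing separators
    // that the original segment had.
    public static string RestoreTrailing(string originalSegment, string encodedSegment, char separator = '|')
    {
        int wanted = CountTrailing(originalSegment, separator);
        string trimmed = encodedSegment.TrimEnd(separator);
        return trimmed + new string(separator, wanted);
    }
}
```

This only touches the text after the last populated field, so the parsed content of the segment is unchanged.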

Load an excel file that contains charts and insert new column using Infragistics.Documents.Excel

I would like to insert a new column in an existing file that contains charts.
It doesn't work: Visual Studio keeps running forever. I noticed that if I delete the charts from the loaded file, it works just fine and a new column with data is inserted. I just don't know whether I can conclude that the existing charts are what prevents new columns from being inserted.
Here is what I did:
private static void Main()
{
    string outputFile = "metrics.xlsx";
    Workbook workbook = Workbook.Load(outputFile);
    Workbook temporary = SetIndicatorsWorkbook();
    var values = new List<int>();
    for (int j = 0; j < 12; j++)
    {
        values.Add((int)temporary.Worksheets["Unit & Integration Tests"].Rows[j].Cells[0].Value);
    }
    var worksheet = workbook.Worksheets["Unit Testing"];
    var k = 9;
    var count = worksheet.Rows[14].Cells.Count(cell => cell.Value != null);
    worksheet.Columns.Insert(count + 1);
    foreach (var value in values)
    {
        worksheet.Rows[k].Cells[count + 1].Value = value;
        k++;
    }
    workbook.Save(outputFile);
}
Your code seems fine. I used a random Excel file that had a chart on the sheet, and the code executed without errors. I will be able to assist further if you provide the metrics.xlsx file.

Issues creating and writing data to a CSV file using C#

I'm using C# Code in Ranorex 5.4.2 to create a CSV file, have data gathered from an XML file and then have it write this into the CSV file. I've managed to get this process to work but I'm experiencing an issue where there are 12 blank lines created beneath the gathered data.
I have a file called CreateCSVFile which creates the CSV file and adds the headers in, the code looks like this:
writer.WriteLine("PolicyNumber,Surname,Postcode,HouseNumber,StreetName,CityName,CountyName,VehicleRegistrationPlate,VehicleMake,VehicleModel,VehicleType,DateRegistered,ABICode");
writer.WriteLine("");
writer.Flush();
writer.Close();
The next one to run is MineDataFromOutputXML. The program I am automating provides insurance quotes, and an output XML file is created containing the client's details. I've set up a mining process with a variable declared at the top, like this:
string _PolicyHolderSurname = "";
[TestVariable("3E92E370-F960-477B-853A-0F61BEA62B7B")]
public string PolicyHolderSurname
{
get { return _PolicyHolderSurname; }
set { _PolicyHolderSurname = value; }
}
and then there is another section of code which gathers the information from the XML file:
var QuotePolicyHolderSurname = (XmlElement)xmlDoc.SelectSingleNode("//cipSurname");
string QuotePolicyHolderSurnameAsString = QuotePolicyHolderSurname.InnerText.ToString();
PolicyHolderSurname = QuotePolicyHolderSurnameAsString;
Report.Info( "Policy Holder Surname As String = " + QuotePolicyHolderSurnameAsString);
Report.Info( "Quote Policy Holder Surname = " + QuotePolicyHolderSurname.InnerText);
The final file is called SetDataSource, and it puts the information into the CSV file. There is a variable declared at the top like this:
string _PolicyHolderSurname = "";
[TestVariable("222D47D2-6F66-4F05-BDAF-7D3B9D335647")]
public string PolicyHolderSurname
{
get { return _PolicyHolderSurname; }
set { _PolicyHolderSurname = value; }
}
This is then the code that adds it into the CSV file:
string Surname = PolicyHolderSurname;
Report.Info("Surname = " + Surname);
dataConn.Rows.Add(new string[] { Surname });
dataConn.Store();
There are multiple items in the Mine and SetDataSource files and the output looks like this in Notepad++:
[Image: the CSV file after the code has been run]
I believe the problem lies in the CreateCSVFile and the writer.WriteLine function. I have commented this region out but it then produces the CSV with just the headers showing.
I've asked some of the developers I work with but most don't know C# very well and no one has been able to solve this issue yet. If it makes a difference this is on Windows Server 2012r2.
Any questions about this please ask, I can provide the whole files if needed, they're just quite long and repetitive.
Thanks
Ben Jardine
I had to do exactly the same thing in Ranorex. Since the question is a bit old I didn't check your code, but here is what I did, and it works. I found an example (probably on Stack Overflow) of creating a CSV file in C#, so here is my adaptation for use in a Ranorex UserCodeCollection:
using System.Text;      // StringBuilder
using System.Xml.Linq;  // XDocument, XElement

[UserCodeCollection]
public class UserCodeCollectionDemo
{
    [UserCodeMethod]
    public static void ConvertXmlToCsv()
    {
        System.IO.File.Delete("E:\\Ranorex_test.csv");
        XDocument doc = XDocument.Load("E:\\lang.xml");
        string csvOut = string.Empty;
        StringBuilder sColumnString = new StringBuilder(50000);
        StringBuilder sDataString = new StringBuilder(50000);
        // GetServerLanguage() is a helper from my own project (not shown here)
        foreach (XElement node in doc.Descendants(GetServerLanguage()))
        {
            foreach (XElement categoryNode in node.Elements())
            {
                foreach (XElement innerNode in categoryNode.Elements())
                {
                    // "{0}," gives you the output in comma-separated format.
                    string sNodePath = categoryNode.Name + "_" + innerNode.Name;
                    sColumnString.AppendFormat("{0},", sNodePath);
                    sDataString.AppendFormat("{0},", innerNode.Value);
                }
            }
        }
        // Remove the trailing comma from both lines
        if ((sColumnString.Length > 1) && (sDataString.Length > 1))
        {
            sColumnString.Remove(sColumnString.Length - 1, 1);
            sDataString.Remove(sDataString.Length - 1, 1);
        }
        string[] lines = { sColumnString.ToString(), sDataString.ToString() };
        System.IO.File.WriteAllLines(@"E:\Ranorex_test.csv", lines);
    }
}
For your information, a simplified version of my XML looks like this:
<LANGUAGE>
<ENGLISH ID="1033">
<TEXT>
<IDS_TEXT_CANCEL>Cancel</IDS_TEXT_CANCEL>
<IDS_TEXT_WARNING>Warning</IDS_TEXT_WARNING>
</TEXT>
<LOGINCLASS>
<IDS_LOGC_DLGTITLE>Log In</IDS_LOGC_DLGTITLE>
</LOGINCLASS>
</ENGLISH>
<FRENCH ID="1036">
<TEXT>
<IDS_TEXT_CANCEL>Annuler</IDS_TEXT_CANCEL>
<IDS_TEXT_WARNING>Attention</IDS_TEXT_WARNING>
</TEXT>
<LOGINCLASS>
<IDS_LOGC_DLGTITLE>Connexion</IDS_LOGC_DLGTITLE>
</LOGINCLASS>
</FRENCH>
</LANGUAGE>

Editing record in delimited file using FileHelper

I have a simple delimited log file. I'm using the FileHelpers library to parse the file with the following code:
LogLine record;
FileHelperAsyncEngine<LogLines> engine = new FileHelperAsyncEngine<LogLines>();
engine.BeginReadFile(@"C:\logs\Log.log");
while (engine.ReadNext() != null)
{
    record = engine.LastRecord;
    //record.Reported = true; <---I want to be able to edit this!
    // Your Code Here
}
Is there any way I can edit this record?
Would something like this work for you?
It modifies the second record of the file and writes the result to a new file; I could not find a method similar to seek for that class.
public static void WriteExample()
{
    FileHelperEngine engine = new FileHelperEngine(typeof(SampleType));
    // to Read use:
    SampleType[] res = engine.ReadFile("source.txt") as SampleType[];
    res[1].Field1 = "test";
    res[1].Field2 = 9;
    // to Write use:
    engine.WriteFile("source2.txt", res);
}

openxml sdk excel how to parse and calculate formula

I have a formula cell in an Excel file that contains the formula =SUM(C2:C3).
From a web application hosted on a remote web server in the cloud, which does not have Excel installed, I would pass in values for C2 and C3.
I can also determine the exact formula in Excel. How do I parse this formula programmatically in C# so that I get a result of 6 if the input values of C2 and C3 are 2 and 4 respectively?
What if the formula is very complex? What is the best way to parse the formula and calculate it in C# on the server side in an ASP.NET MVC application?
A code sample would really help me in this case.
If you provide a tool that opens an Excel file and translates its content to HTML, you must deal with calculation.
If the file is "well created", for example manually with Excel, you can be sure you don't need to compute the formulas yourself, because Excel does that and stores both the formula (in the cell's CellFormula child element) and the result (in its CellValue child element); see the method GetValue_D11(). So basically you just need to show the result, which is always stored as a String.
Unfortunately, you have to deal with styles and data types if you want to maintain the behaviour.
In effect you have to build a complex web-based spreadsheet viewer/editor.
Here is a "fixed" sample (not dynamic at all) for retrieving String values and formula values. If you want to run the test, be sure to download this file (http://www.devnmore.com/share/Test.xlsx); otherwise it won't work.
ShowValuesSample svs = new ShowValuesSample("yourPath\\Test.xlsx");
String[] test = svs.GetDescriptions_A2A10();
Double grandTotal = svs.GetValue_D11();
ShowValuesSample class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using DocumentFormat.OpenXml.Packaging;
using Ap = DocumentFormat.OpenXml.ExtendedProperties;
using Vt = DocumentFormat.OpenXml.VariantTypes;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Spreadsheet;
using A = DocumentFormat.OpenXml.Drawing;
using System.Globalization;
namespace TesterApp
{
public class ShowValuesSample
{
public String FileName { get; private set; }
private SpreadsheetDocument _ExcelDocument = null;
public SpreadsheetDocument ExcelDocument
{
get
{
if (_ExcelDocument == null)
{
_ExcelDocument = SpreadsheetDocument.Open(FileName, true);
}
return _ExcelDocument;
}
}
private SheetData _SheetDataOfTheFirstSheet = null;
public SheetData SheetDataOfTheFirstSheet
{
get
{
if (_SheetDataOfTheFirstSheet == null)
{
WorksheetPart shPart = ExcelDocument.WorkbookPart.WorksheetParts.ElementAt(0);
Worksheet wsh = shPart.Worksheet;
_SheetDataOfTheFirstSheet = wsh.Elements<SheetData>().ElementAt(0);
}
return _SheetDataOfTheFirstSheet;
}
}
private SharedStringTable _SharedStrings = null;
public SharedStringTable SharedStrings
{
get
{
if (_SharedStrings == null)
{
SharedStringTablePart shsPart = ExcelDocument.WorkbookPart.SharedStringTablePart;
_SharedStrings = shsPart.SharedStringTable;
}
return _SharedStrings;
}
}
public ShowValuesSample(String fileName)
{
FileName = fileName;
}
//In the file, descriptions are stored as shared strings,
//so the CellValue is the zero-based index into the SharedStringTable.
//In my example I saved 9 different values.
//Shared strings are a trick to reduce the size of a file by writing
//repeated strings just once.
public String[] GetDescriptions_A2A10()
{
String[] retVal = new String[9];
for (int i = 0; i < retVal.Length; i++)
{
Row r = SheetDataOfTheFirstSheet.Elements<Row>().ElementAt(i + 1);
Cell c = r.Elements<Cell>().ElementAt(0);
Int32 shsIndex = Convert.ToInt32(c.CellValue.Text);
SharedStringItem shsItem = SharedStrings.Elements<SharedStringItem>().ElementAt(shsIndex);
retVal[i] = shsItem.Text.Text;
}
return retVal;
}
//The value is stored because Excel stores it.
//To be sure it's correct you should perform all the calculations yourself.
//In this case I'm sure Excel didn't store the wrong value, so..
public Double GetValue_D11()
{
Double retVal = 0.0d;
Int32 cellIndex = 0;
//cellIndex is 0 and not 3, because A11, B11, C11 are empty cells
//Another issue to deal with ;-)
Cell c = SheetDataOfTheFirstSheet.Elements<Row>().ElementAt(10).Elements<Cell>().ElementAt(cellIndex);
//as example take a look at the value of storedFormula
String storedFormula = c.CellFormula.Text;
String storedValue = c.CellValue.Text;
NumberFormatInfo provider = new NumberFormatInfo();
provider.NumberDecimalSeparator = ".";
provider.NumberGroupSeparator = ",";
provider.NumberGroupSizes = new Int32[] { 3 };
retVal = Convert.ToDouble(storedValue, provider);
return retVal;
}
}
}
spreadSheet.WorkbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
spreadSheet.WorkbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
worked for me.
I'm afraid it's not possible. In Open XML you can read or change the formula, but you cannot process the formula and get results through Open XML.
Change the values of C2 and C3 for the formula and save the document with Open XML, then open it in the Excel application. The values will be calculated and displayed.
Refer to this SO post related to this issue: "open xml sdk excel formula recalculate cache issue".
Also refer to this post: http://openxmldeveloper.org/discussions/formats/f/14/p/1806/158153.aspx
Hope this helps!
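Since the Open XML SDK does not evaluate formulas itself, any server-side result has to come from your own code or from a separate calculation library. For the specific =SUM(C2:C3) case in the question, a hand-rolled evaluator might look like the sketch below. It is deliberately narrow: it assumes the formula is exactly a SUM over a single-column range and that the cell values are supplied in a dictionary keyed by cell reference. `TinySumEvaluator` is a hypothetical name, and nothing here attempts to cover general Excel formulas.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TinySumEvaluator
{
    // Evaluates formulas of the exact form =SUM(<col><row>:<col><row>)
    // where both ends of the range are in the same column.
    public static double Evaluate(string formula, IDictionary<string, double> cells)
    {
        Match m = Regex.Match(formula.Trim(),
            @"^=SUM\(([A-Z]+)(\d+):([A-Z]+)(\d+)\)$", RegexOptions.IgnoreCase);
        if (!m.Success)
            throw new NotSupportedException("Only =SUM(X1:X9) is handled by this sketch.");

        string column = m.Groups[1].Value.ToUpperInvariant();
        if (column != m.Groups[3].Value.ToUpperInvariant())
            throw new NotSupportedException("Multi-column ranges are not handled.");

        int first = int.Parse(m.Groups[2].Value);
        int last = int.Parse(m.Groups[4].Value);

        double total = 0;
        for (int row = Math.Min(first, last); row <= Math.Max(first, last); row++)
        {
            double value;
            if (cells.TryGetValue(column + row, out value))
                total += value; // cells missing from the dictionary count as 0
        }
        return total;
    }
}
```

With C2 = 2 and C3 = 4 in the dictionary, `TinySumEvaluator.Evaluate("=SUM(C2:C3)", cells)` returns 6. Anything more complex than this would need a real formula parser.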
